Voice over Internet Protocol: Difference between revisions
imported>Howard C. Berkowitz (link to session layer)
imported>Howard C. Berkowitz (Started voice digitizing discussion, just up to PCM with no compression.)
Revision as of 07:26, 11 May 2008
Voice over Internet Protocol is a family of standards that permits carrying voice telephony not over dedicated telephony networks, but over Internet Protocol networks that handle both voice and data.
Voice Digitizing
When people speak to one another in person, the speech is conveyed as continuously varying (i.e., analog) sound waves. Most of the information in speech, as opposed to music, is carried in the frequency range from 300 to 4000 hertz, and a 4 kilohertz (kHz) analog channel is considered the basic unit of voice conversation bandwidth when the sound waves are converted to analog electrical signals.
From Bell's invention of the telephone to the early 1960s, the entire telephone system used analog transmission. This did not lend itself to the growing availability of computers and digital electronics, which offered a number of technical advantages. For example, whenever a weak analog signal is amplified, the amplifier adds noise to the signal. A digital signal, however, can be regenerated without adding noise, as long as the regeneration does not change the digital representation.
It had been known, since Nyquist's research in 1928, that an analog signal could be accurately converted to a digital one if it were sampled at a rate of at least twice the highest analog frequency. In the case of the 4 kHz analog signal, that meant that 8000 digital samples per second were needed to represent that signal in digital form without information loss.
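The sampling rule can be illustrated with a short sketch. It assumes pure cosine test tones and an 8000 samples-per-second rate; the function names are made up for this example. A 5 kHz tone lies above the 4 kHz limit, so its samples coincide with those of a 3 kHz tone and the two cannot be told apart after sampling.

```python
import math

SAMPLE_RATE_HZ = 8000  # twice the 4 kHz channel bandwidth


def sample_tone(freq_hz, count, sample_rate_hz=SAMPLE_RATE_HZ):
    """Return `count` samples of a unit-amplitude cosine at freq_hz."""
    return [
        math.cos(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(count)
    ]


def indistinguishable(freq_a_hz, freq_b_hz, count=64):
    """True if the two tones yield the same samples (within rounding error)."""
    a = sample_tone(freq_a_hz, count)
    b = sample_tone(freq_b_hz, count)
    return all(math.isclose(x, y, abs_tol=1e-9) for x, y in zip(a, b))
```

Calling `indistinguishable(3000, 5000)` compares a tone below the limit with one above it, while `indistinguishable(1000, 3000)` compares two tones that are both below the limit and therefore remain distinct.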
Until the availability of solid-state digital electronics, this knowledge largely remained a research area, except for very specialized applications such as voice encryption, some of which stayed analog. Once the electronic technology was available, digital voice received serious engineering attention. Nyquist's sampling rate alone did not characterize the digital bit stream that could reproduce a voice channel.
Each sample captures the amplitude of the analog signal at one instant, so the next question was how much precision each sample needed for an adequate digital channel. Ignoring some historical dead ends and certain overhead functions in telephony, the answer was 8 bits per sample, which gave 256 voltage levels. 8000 samples per second, multiplied by 8 bits, produces a 64 kbps digital channel as the representation of the analog voice channel. The technique that produces this stream is called pulse code modulation, the codes referring to the bit patterns that represent the analog amplitude of each sample.
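The arithmetic above can be written out as a minimal sketch. The quantizer here is a simplification: it spaces the 256 levels evenly over an assumed amplitude range of -1.0 to 1.0, whereas deployed telephony codecs use non-uniform (companded) level spacing. The function names are illustrative only.

```python
SAMPLE_RATE_HZ = 8000
BITS_PER_SAMPLE = 8


def pcm_bit_rate(sample_rate_hz=SAMPLE_RATE_HZ, bits=BITS_PER_SAMPLE):
    """Bits per second of an uncompressed PCM stream."""
    return sample_rate_hz * bits


def quantize(amplitude, bits=BITS_PER_SAMPLE):
    """Map an amplitude in [-1.0, 1.0] to an integer code in 0..2**bits - 1.

    Assumption: uniform level spacing, chosen for clarity only.
    """
    levels = 2 ** bits
    clipped = max(-1.0, min(1.0, amplitude))
    # Scale to 0..levels-1 and round to the nearest level.
    return int((clipped + 1.0) / 2.0 * (levels - 1) + 0.5)
```

With the defaults, `pcm_bit_rate()` gives 64000 bits per second, and `quantize` maps the extremes -1.0 and 1.0 to codes 0 and 255.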
For some years, when digital telephony traveled over a dedicated digital network, the amount of bandwidth did not present a major engineering challenge. The digital streams were combined, using time division multiplexing, into faster and faster channels carrying multiple voice streams.
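As a rough sketch of the idea, time division multiplexing can be modeled as taking one 8-bit sample from each voice channel in turn and placing them in a fixed order on the faster channel. Real carrier systems also add framing and signaling overhead, which this illustration leaves out; the function names are hypothetical.

```python
def multiplex(channels):
    """Interleave equal-length byte streams, one sample per channel per frame."""
    if len({len(ch) for ch in channels}) > 1:
        raise ValueError("all channels must carry the same number of samples")
    out = bytearray()
    for frame in zip(*channels):  # one sample from each channel
        out.extend(frame)
    return bytes(out)


def demultiplex(stream, channel_count):
    """Recover the individual channels from an interleaved stream."""
    return [bytes(stream[i::channel_count]) for i in range(channel_count)]
```

Because each channel always occupies the same position in every frame, the receiver needs only the channel count to separate the streams again.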
The Internet, however, does not offer continuous bit streams, and its bandwidth is limited. The next challenge in VoIP was determining whether there were more bandwidth-efficient means to digitize voice, such that adequate information could be put into packets of fixed size.
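A small calculation shows why bandwidth becomes a concern once voice is packetized. The sketch below assumes a 20 ms packet interval and 40 bytes of headers per packet (an IPv4 header without options at 20 bytes, UDP at 8, and a basic RTP header at 12). These figures are assumptions for the example rather than values given in this article, and link-layer overhead is ignored.

```python
PCM_BIT_RATE_BPS = 64000
HEADER_BYTES = 20 + 8 + 12  # assumed IPv4 + UDP + RTP headers


def payload_bytes(packet_ms, bit_rate_bps=PCM_BIT_RATE_BPS):
    """Voice bytes carried in one packet covering packet_ms milliseconds."""
    return bit_rate_bps * packet_ms // 8000


def wire_rate_bps(packet_ms, bit_rate_bps=PCM_BIT_RATE_BPS,
                  header_bytes=HEADER_BYTES):
    """Bits per second on the network, counting the per-packet headers."""
    packets_per_second = 1000 / packet_ms
    packet_bits = (payload_bytes(packet_ms, bit_rate_bps) + header_bytes) * 8
    return packet_bits * packets_per_second
```

Under these assumptions, a 64 kbps stream sent every 20 ms carries 160 bytes of voice per packet and occupies 80 kbps on the network, so a lower-rate codec reduces the payload while the header cost per packet stays the same.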
Real-time transport
Call control
Session Initiation Protocol
The Session Initiation Protocol (SIP), a modern session-layer network protocol, is key to deployed VoIP, where SIP may need to traverse a firewall-like function. Conventional firewalls make assumptions about port numbers, but the sessions that SIP sets up use dynamically assigned ports. SIP is the dominant protocol found inside the local multimedia border, and it is rapidly becoming the outside standard as well.
Session Border Controllers
A specialized class of security gateways, called Session Border Controllers (SBCs), deals with this problem. They are, again, controlled violations of the end-to-end principle: they terminate the SIP session coming from "inside" and create a new session to the outside. They may have firewalling or other security capabilities optimized for a session-layer protocol.
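The two-session arrangement can be pictured with a toy model. This is not a SIP implementation, and every class, field, and address in it is hypothetical. It only shows that the inside caller's session ends at the SBC and that a separate session runs from the SBC to the outside party.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Session:
    session_id: str
    local: str   # endpoint that originates this leg
    remote: str  # endpoint that terminates this leg


class ToySessionBorderController:
    """Illustrative only: pairs an inside leg with a newly created outside leg."""

    def __init__(self, address):
        self.address = address
        self._ids = itertools.count(1)
        self.pairs = {}  # inside session id -> outside Session

    def accept(self, inside_caller, outside_callee):
        n = next(self._ids)
        # The inside session terminates at the SBC itself.
        inside = Session(f"in-{n}", inside_caller, self.address)
        # A distinct session is created from the SBC to the outside.
        outside = Session(f"out-{n}", self.address, outside_callee)
        self.pairs[inside.session_id] = outside
        return inside, outside
```

Neither endpoint in this model ever holds a session directly with the other, which is the sense in which the end-to-end principle is deliberately set aside.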
Transcoding
Between those two session termination points, depending on the particular SBC, quite a number of things can happen. There can be deep packet inspection for security or accounting. If the codec used to convert analog voice to digitized packets on the inside is different from the one expected on the outside (e.g., high-bandwidth G.711 versus low-bandwidth G.729A), the SBC can convert, or "transcode", between them. Transcoding is best avoided where possible, because it adds delay and may decrease quality.
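The decision an SBC faces here can be sketched as a simple comparison of the codecs on each side. The function name is hypothetical, and the bit rates in the table (64 kbps for G.711, 8 kbps for G.729A) are the nominal rates of those codecs rather than figures taken from this article.

```python
# Nominal codec bit rates in bits per second.
CODEC_BIT_RATE_BPS = {"G.711": 64000, "G.729A": 8000}


def plan_media_path(inside_codec, outside_codec):
    """Return (action, outside_bit_rate_bps) for the two legs of a call."""
    for codec in (inside_codec, outside_codec):
        if codec not in CODEC_BIT_RATE_BPS:
            raise ValueError(f"unknown codec: {codec}")
    if inside_codec == outside_codec:
        # Same codec on both legs: relay the media untouched.
        action = "pass-through"
    else:
        # Different codecs: conversion is needed, at a cost in delay and quality.
        action = "transcode"
    return action, CODEC_BIT_RATE_BPS[outside_codec]
```

A call that uses G.711 inside and G.729A outside would be planned as a transcode down to 8 kbps, while matching codecs pass through unchanged.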
Security
Encrypted voice is a problem unless the SBC is trusted to decrypt, examine the plaintext, and re-encrypt in a new cryptosystem.
Lightweight call processing
An intelligent SBC, in the right topology, can considerably speed the processing of calls in the same part of the IP network.