Voice over IP

Factors that affect VOIP Call quality

In this article, we take a look at the various factors that affect call quality when voice is transported over IP networks: the type of audio codec used, latency, jitter and the jitter buffer, packet loss, packet size, silence suppression, echo and other network parameters that matter for VOIP applications.

Audio Codec:

Every system implementing VOIP/IP Telephony uses an audio codec to compress the audio signal at one end and decompress it at the other end. Although most codecs are standardised, some VOIP vendors implement proprietary codecs too. Popular standardised codecs include G.723 and G.729a.

The type of codec used is an important factor in VOIP call quality: the higher the compression, the less data has to be transmitted to the other side. But there is a flip side too: voice quality generally suffers at higher compression rates. Different codecs and codec modes target different bit rates, such as 8 Kbps, 6.4 Kbps and 5.3 Kbps (for comparison, a single uncompressed PCM voice channel on a T1 line needs the standard 64 Kbps). These bit rates are for the audio only. Protocol overhead is added on top of them, so the actual bit rate on the wire is considerably higher.

Codecs also introduce a digitizing delay, as each algorithm needs a certain amount of audio to be buffered before it can be processed. If the codec is computationally complex, it needs more CPU resources, and on an underpowered device this too affects the VOIP call quality.


Latency:

Network latency is caused both by the distance that a packet needs to travel and by changing network conditions. The greater the distance a packet has to traverse (e.g. across continents), the higher the delay. The delay also depends on the number of router hops a packet takes to reach its destination: the more hops, the higher the delay.

Compression algorithms also add their own delay. For example, the G.723 codec generally adds a fixed delay of about 30 ms. A VOIP call can remain intelligible with a total network latency (two-way, round trip delay) anywhere from around 150 ms up to 500 ms, but a delay of more than 250-300 ms is not preferred for most VOIP systems.
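The delay contributions described above can be added up into a rough budget. The following is a minimal Python sketch, not a measurement tool: the function names and the network and jitter buffer values are assumptions chosen for illustration, and only the 30 ms codec delay comes from the text.

```python
def one_way_delay_ms(codec_ms, network_ms, jitter_buffer_ms):
    """Sum the main contributors to delay in one direction."""
    return codec_ms + network_ms + jitter_buffer_ms


def round_trip_delay_ms(one_way_ms):
    """Assume a symmetric path, so the round trip is twice the one-way delay."""
    return 2 * one_way_ms


# Hypothetical call: 30 ms codec delay (G.723, from the text), 60 ms of
# network transit and a 40 ms jitter buffer (both assumed values).
one_way = one_way_delay_ms(codec_ms=30, network_ms=60, jitter_buffer_ms=40)
round_trip = round_trip_delay_ms(one_way)

# Compare against the upper end of the 250-300 ms preference mentioned above.
within_preferred = round_trip <= 300
print(one_way, round_trip, within_preferred)
```

Splitting the budget this way shows which component to attack first when the total is too high.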

Jitter and Jitter buffer:

After compression, the codec sends packets at a constant rate with equal spacing between them. The decompression algorithm at the other end expects the packets to arrive with the same equal spacing and in the same order in which they were sent. But since the network imposes a different delay on each packet, the packets may arrive at irregular intervals and out of order. The variation in arrival times is called jitter. To compensate for it, there is a small jitter buffer at the receiving end, which holds packets for a calculated amount of time before passing them on for decompression. During this small delay it collects a certain number of packets, rearranges them in the proper order and restores the equal spacing between them.
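The reordering role of a jitter buffer can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the class name `JitterBuffer` and its `depth` parameter are hypothetical, the buffer has a fixed depth, one call to `pop()` stands for one playout tick, and sequence number wraparound is ignored. Real jitter buffers are usually adaptive and driven by timestamps.

```python
import heapq


class JitterBuffer:
    """Holds packets briefly, then releases them in sequence order."""

    def __init__(self, depth):
        self.depth = depth       # packets to collect before playout starts
        self._heap = []          # min-heap of (sequence number, payload)
        self._next_seq = None    # sequence number due at the next playout tick
        self._started = False

    def push(self, seq, payload):
        # A packet whose playout slot has already passed is useless: drop it.
        if self._next_seq is not None and seq < self._next_seq:
            return
        heapq.heappush(self._heap, (seq, payload))

    def pop(self):
        """Return the payload due now, or None while filling or on a gap."""
        if not self._started:
            if len(self._heap) < self.depth:
                return None
            self._started = True
            self._next_seq = self._heap[0][0]
        # Discard duplicates of packets that were already played.
        while self._heap and self._heap[0][0] < self._next_seq:
            heapq.heappop(self._heap)
        if self._heap and self._heap[0][0] == self._next_seq:
            payload = heapq.heappop(self._heap)[1]
        else:
            payload = None       # late or lost packet: the caller conceals it
        self._next_seq += 1
        return payload


buf = JitterBuffer(depth=3)
played = []
for seq in [1, 3, 2, 5, 4, 6]:   # packets arrive out of order
    buf.push(seq, f"frame-{seq}")
    frame = buf.pop()
    if frame is not None:
        played.append(frame)
while (frame := buf.pop()) is not None:   # drain what is left
    played.append(frame)
print(played)
```

A deeper buffer tolerates more jitter but adds more delay, which is why the depth has to be weighed against the overall delay budget.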

Packet Loss:

Some packets are always lost in an IP network. This can happen for many reasons, such as excessive collisions, physical media errors and overloaded links. Some protocols, like TCP, detect such losses and recover the lost packets by retransmitting them, while others, like UDP, do not recover lost packets. Voice is normally carried in RTP over UDP, because a retransmitted packet would usually arrive too late to be played.

Codecs perform certain operations to compensate for lost packets, such as replaying the previous packet in place of the lost one or performing more sophisticated interpolation to approximate the missing audio. Generally, packet loss of up to 5% can be compensated for, and the user may not experience a significant degradation in voice quality. But packet loss of more than 5% might noticeably lower the voice quality or cause audible gaps.
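The simplest concealment strategy mentioned above, replaying the previous packet, can be sketched as follows. This is an illustrative Python example only: the function names are hypothetical, lost frames are represented by `None`, and real codecs use more sophisticated interpolation than plain repetition.

```python
def conceal_losses(frames):
    """Replace each lost frame (None) with the most recent good frame."""
    output, last_good = [], None
    for frame in frames:
        if frame is None:
            # Nothing received yet: fall back to an empty (silent) frame.
            output.append(last_good if last_good is not None else b"")
        else:
            output.append(frame)
            last_good = frame
    return output


def loss_percent(frames):
    """Percentage of frames that never arrived."""
    lost = sum(1 for frame in frames if frame is None)
    return 100.0 * lost / len(frames) if frames else 0.0


received = [b"A", b"B", None, b"D", None, None, b"G", b"H", b"I", b"J"]
print(loss_percent(received))    # well above the 5% guideline
print(conceal_losses(received))
```

Repeating one frame is barely audible, but repeating the same frame several times in a row, as in the double loss above, starts to sound robotic, which is one reason quality drops quickly at higher loss rates.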

Packet Size:

The packet size poses an interesting trade-off. With a larger (RTP) packet size, the overall bandwidth required is reduced, because more audio is packed into each packet and the substantial header overhead that must be added to every outgoing packet is paid less often. For low bit rate codecs, this header information can be almost thrice the size of the payload itself! So a bigger packet size is better for bandwidth, but if the packet is too big a packetization delay is induced, as the sender has to wait longer to fill up the payload.

It is generally better to send bigger packets, as the overall bandwidth required is reduced. But since this is done by increasing the interval between packets, check first that the delay budget allows for it. On certain point to point links, cRTP (Compressed RTP) is preferred, as it compresses the header information sent with every packet. cRTP brings down the size of each packet by almost half, but it creates additional processing load on the routers. It is used only on certain types of point to point WAN links, because the compressed packet no longer carries the IP address information and hence is not routable.
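The effect of packet size and header compression on bandwidth can be worked out with simple arithmetic. The Python sketch below counts only the IPv4, UDP and RTP headers (20 + 8 + 12 = 40 bytes) and ignores layer 2 framing. The function name and the 4-byte cRTP header are assumptions for illustration; the compressed header size varies in practice.

```python
IP_UDP_RTP_HEADER_BYTES = 40   # 20 (IPv4) + 8 (UDP) + 12 (RTP)
CRTP_HEADER_BYTES = 4          # assumed size of a compressed cRTP header


def call_bandwidth_kbps(codec_kbps, packet_ms,
                        header_bytes=IP_UDP_RTP_HEADER_BYTES):
    """One-way bandwidth of a call, excluding layer 2 overhead."""
    payload_bytes = codec_kbps * packet_ms / 8     # kbps * ms = bits
    packets_per_second = 1000 / packet_ms
    return (payload_bytes + header_bytes) * 8 * packets_per_second / 1000


# An 8 Kbps codec (such as G.729) with 20 ms and 40 ms of audio per packet.
small_packets = call_bandwidth_kbps(8, 20)
large_packets = call_bandwidth_kbps(8, 40)
with_crtp = call_bandwidth_kbps(8, 20, header_bytes=CRTP_HEADER_BYTES)
print(small_packets, large_packets, with_crtp)
```

Doubling the packet interval saves a third of the bandwidth in this example, at the cost of 20 ms of extra packetization delay, while header compression saves even more without adding delay.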

Silence suppression:

Since usually only one person talks at any given point in a two-way conversation, there is no need to send packets for the other person, who is silently listening. Several vendors take advantage of this to reduce the overall bandwidth required to transport voice packets across WAN links.
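A very basic form of silence suppression compares the energy of each audio frame with a threshold and skips the frames that fall below it. The Python sketch below assumes this energy-based approach; the function names, sample values and threshold are hypothetical, and production voice activity detectors are considerably more elaborate.

```python
def frame_energy(samples):
    """Mean squared amplitude of one frame of PCM samples."""
    return sum(s * s for s in samples) / len(samples)


def suppress_silence(frames, threshold):
    """Keep only (index, frame) pairs whose energy reaches the threshold."""
    return [(i, frame) for i, frame in enumerate(frames)
            if frame_energy(frame) >= threshold]


speech = [1000, -1200, 900, -800]   # made-up loud samples
silence = [3, -2, 1, 0]             # made-up background noise

frames = [speech, silence, silence, speech]
to_send = suppress_silence(frames, threshold=100)
sent_indexes = [i for i, _ in to_send]
saved_percent = 100 * (len(frames) - len(to_send)) / len(frames)
print(sent_indexes, saved_percent)
```

The frame index is kept alongside each frame so the receiver can tell how long the silent gap was and fill it, for example with comfort noise.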


Echo:

IP Telephony invariably involves the conversion of IP media to analog/digital and vice versa, and this conversion induces echo at various points in the network. There are two types of echo. Hybrid echo is generated by impedance mismatches at the various analog/digital conversion points in the network. Acoustic echo is generated at the phone, when the voice leaving the speaker is picked up by the microphone. Echo is generally difficult to monitor and contain, but certain vendors provide echo cancellation modules (hardware and software) at the gateway level, where the translation takes place.
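One common way to build a software echo canceller is an adaptive filter that learns the echo path and subtracts its estimate from the microphone signal. The Python sketch below uses a normalized LMS (NLMS) filter as an assumed technique; the article does not say which algorithm vendors use. The function name, tap count and step size are illustrative, and the simulation assumes the near-end talker is silent.

```python
import random


def nlms_echo_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-6):
    """Subtract an adaptive estimate of the far-end echo from the mic signal."""
    weights = [0.0] * taps
    history = [0.0] * taps        # recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate      # what remains after removing the estimate
        power = sum(h * h for h in history) + eps
        weights = [w + mu * error * h / power
                   for w, h in zip(weights, history)]
        residual.append(error)
    return residual


# Simulated echo path: the far-end signal returns 3 samples later at 60% level.
rng = random.Random(0)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
mic = [0.6 * far_end[n - 3] if n >= 3 else 0.0 for n in range(len(far_end))]

residual = nlms_echo_cancel(far_end, mic)
echo_energy = sum(m * m for m in mic[-200:])
residual_energy = sum(e * e for e in residual[-200:])
print(residual_energy < 0.01 * echo_energy)
```

The filter can only model echo that returns within its tap length, so longer echo paths need more taps and therefore more processing power.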

Most of the above parameters can be monitored with specialized tools, and adjustments can be made accordingly.

Network Parameters:

The overall network load is one important parameter that determines the quality of voice communications. The higher the network load, the more collisions and congestion there are, and the lower the quality of transmission. Though this aspect may not always be under the control of the network administrators, the following can be done to make the transport of voice packets more efficient.

It is always recommended to give voice packets traversing the network a higher priority than data packets (like mail traffic). This is because voice and video packets are delay sensitive, and even a slight delay might cause a degradation in quality, whereas a delay in data packets like mail (SMTP) makes no noticeable difference to the user. The prioritization of real time packets needs to be done at every stage of their transport (switches, routers, WAN links etc).

Another alternative is to use bandwidth reservation or bandwidth limiting techniques in the network, based on the application or protocol. This ensures that some bandwidth is always reserved for voice packets and that a sudden spurt in the usage of certain applications (like P2P) does not interfere with the sending of voice packets over the IP network.

Other parameters like call set-up time (the time taken from the initial dialing of digits to the establishment of a voice connection), call success ratio (the ratio of successful connects to dial attempts) and call set-up rate (the number of calls that can be set up per second in the network) are also important factors in the overall quality of a VOIP service. The type of signalling protocol used, such as SIP or H.323, may also affect performance, as each of them handles various processes differently.
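The idea of serving voice ahead of data can be illustrated with a strict priority queue. This Python sketch is a toy model, not how any particular switch or router is configured: the class name, the two traffic classes and the packet labels are all hypothetical. Real devices typically classify packets by markings such as DSCP values.

```python
import heapq
import itertools

VOICE, DATA = 0, 1    # lower number = higher priority (assumed classes)


class PriorityScheduler:
    """Always transmits queued voice packets before any data packet."""

    def __init__(self):
        self._queue = []
        self._order = itertools.count()   # keeps FIFO order within a class

    def enqueue(self, traffic_class, packet):
        heapq.heappush(self._queue, (traffic_class, next(self._order), packet))

    def dequeue(self):
        return heapq.heappop(self._queue)[2] if self._queue else None


link = PriorityScheduler()
link.enqueue(DATA, "smtp-1")
link.enqueue(VOICE, "rtp-1")
link.enqueue(DATA, "smtp-2")
link.enqueue(VOICE, "rtp-2")

sent = [link.dequeue() for _ in range(4)]
print(sent)
```

Strict priority keeps voice delay low but can starve data traffic under heavy voice load, which is why it is often combined with a limit on the priority class.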
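Bandwidth limiting for a bursty application is often implemented with a token bucket. The Python sketch below assumes that technique for illustration; the class name, rate and burst values are hypothetical, and time is passed in explicitly so the behaviour is easy to follow.

```python
class TokenBucket:
    """Allows traffic up to a sustained rate plus a bounded burst."""

    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s
        self.capacity = burst_bytes
        self.tokens = burst_bytes     # start with a full bucket
        self.last = 0.0               # time of the previous check, in seconds

    def allow(self, size_bytes, now):
        # Refill in proportion to the elapsed time, up to the burst size.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True
        return False                  # over the limit: drop or delay the packet


# Limit a hypothetical P2P flow to 1000 bytes/s with a 1500-byte burst.
p2p_limit = TokenBucket(rate_bytes_per_s=1000, burst_bytes=1500)
results = [
    p2p_limit.allow(1500, now=0.0),   # the burst allowance covers this packet
    p2p_limit.allow(1500, now=0.5),   # only 500 bytes of tokens refilled
    p2p_limit.allow(1500, now=1.5),   # bucket is full again
]
print(results)
```

Capping the competing traffic this way leaves the remaining link capacity available for voice even during a sudden spurt.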




  • Imran Malik

    Impressive piece of information, let me elaborate more on VoIP. Voice over Internet Protocol has been around for many years. But due to the lack of sufficient and affordable bandwidth, it was not possible to carry carrier grade voice over Internet Protocol. Since the arrival of low cost internet bandwidth and speech codecs such as G.729 and G.723, which use a very small payload to carry carrier class voice, it has become possible to leverage the true benefits of VoIP. The G.723 codec uses only about 6 Kbps (kilobits per second) and is capable of maintaining a constant stream of data between peers and delivering carrier grade voice quality. Let's put it this way: if you have an 8 Mbps internet connection, by using the G.723 codec you can run up to 100 telephone lines with crystal clear, carrier grade voice quality. I am also a user of VoIP and have set up a small PBX at home. Since I discovered VoIP, I have never used the traditional PSTN service.

    Dear readers, if you have not yet tried VoIP, I suggest that you try it, and I bet you will never want to use the traditional PSTN phone service ever again. VoIP has far superior features which the traditional PSTN sadly cannot offer.

    It has also recently become possible to carry video along with VoIP by using low payload video codecs. I cannot resist telling you that by using T.38 or fax pass-through and disabling VAD, VoIP can carry fax transmissions. But beware: fax pass-through will only work when using an uncompressed codec such as G.711 (a-law and u-law).

    By using an ATA (Analog Telephone Adapter), which converts VoIP signals into traditional PSTN signals, you can also use dial-up modems to connect to various dial-up services. I won't go into the details of what VoIP can offer. To cut my story short, VoIP is a must-have product for every business and individual.

    How VoIP Works

    When we make a VoIP call, a communication channel is established between the caller and the called party over IP (Internet Protocol), which runs on top of computer data networks. A telephone conversation that takes place over VoIP is converted into streams of binary data packets in real time and transmitted over the data network. When these data packets arrive at the destination, they are converted back into a standard telephone conversation. This whole process of converting voice into data, transmitting it and converting the data back into voice takes place within a matter of milliseconds. That is how a VoIP call is transmitted over data networks. I hope that you now understand the basics of how a VoIP call takes place.

    What are speech codecs and what role do they play in VoIP?

    Speech codecs play a vital role in VoIP, and the codec determines the quality and cost of the call. Let me explain what exactly VoIP codecs are and how they work. You may have heard about data compression, or perhaps about an air compressor, which compresses a volume of air into an enclosed container. VoIP codecs are no different from an air compressor. A speech codec compresses voice into data packets and decompresses it upon arrival at the destination. Some VoIP codecs can compress voice heavily while maintaining QoS, which means that using this type of codec will cost less because it consumes just a fraction of the data network's bandwidth. Other codecs cannot compress voice as much. They simply consume a large amount of the data network's bandwidth, and hence the cost goes up.

    Following is a list of VoIP codecs along with how much data network bandwidth they consume.

    * AMR Codec
    * BroadVoice Codec – 16 Kbps narrowband and 32 Kbps wideband
    * GIPS Family – 13.3 Kbps and up
    * GSM – 13 Kbps (full rate), 20 ms frame size
    * iLBC – 15 Kbps, 20 ms frame size; 13.3 Kbps, 30 ms frame size
    * ITU G.711 – 64 Kbps, sample-based. Also known as a-law/u-law
    * ITU G.722 – 48/56/64 Kbps, ADPCM, 7 kHz audio bandwidth
    * ITU G.722.1 – 24/32 Kbps, 7 kHz audio bandwidth (based on Polycom’s SIREN codec)
    * ITU G.722.1C – 32 Kbps, a Polycom extension, 14 kHz audio bandwidth
    * ITU G.722.2 – 6.6 Kbps to 23.85 Kbps. Also known as AMR-WB. CELP, 7 kHz audio bandwidth
    * ITU G.723.1 – 5.3/6.3 Kbps, 30 ms frame size
    * ITU G.726 – 16/24/32/40 Kbps
    * ITU G.728 – 16 Kbps
    * ITU G.729 – 8 Kbps, 10 ms frame size
    * Speex – 2.15 to 44.2 Kbps
    * LPC10 – 2.5 Kbps
    * DoD CELP – 4.8 Kbps

    Switch to VoIP Today and you will never want to use traditional PSTN ever again.



    Nice info, especially the part on codecs.

  • Brian

    Hey guys

    I am a student at UNISA and I am writing my network exam next week.
    Can someone please explain cabling to me:
    Twisted pair cable
    Cat 3, 5 and 6
    And lastly their throughput, like 10Base-T throughput

    Thanks in advance