Transmission Control Protocol
The fundamental Internet end-to-end protocol for implementing reliable delivery that accounts for packet loss is the Transmission Control Protocol (TCP).[1]
When the Internet was first designed, one of its basic axioms was the end-to-end assumption. Under this assumption, the transfer of information between Internet endpoints is the responsibility of the endpoints. In contrast to other network architectures such as X.25, the Internet proper (i.e., IP) has limited error protection. Different kinds of errors can take place in transmission, and TCP protects against some of them. For some requirements (e.g., Trivial File Transfer Protocol), the errors that TCP could correct are instead corrected at the application layer, because, for that specific use, the overhead and local resource demand of TCP would be intolerable.
Not all Internet applications need guaranteed delivery; some can be somewhat "lossy." Video on demand over the Internet, for example, can afford to let packets be lost en route to gain a speed advantage. When stronger error control is needed, such control is the responsibility of protocols running above IP in the protocol stack.
Returning to the example of video on demand, video applications cannot tolerate packets arriving out of order, which is a different type of error than individual bits being incorrect. TCP promises that bytes of data will be delivered in the order in which they were transmitted, or the connection will be dropped. TCP also guarantees that, as long as the connection stays up, bytes will be free of bit errors.
At a general level, assume that TCP guarantees a stream of error-free bytes. If TCP is unable to correct errors by repeated retransmission, it shuts down the connection. TCP does not guarantee the rate, or the variability of rate, of delivery; see differentiated services.
There is a cost to the error-free guarantee. TCP retransmits PDUs containing bit errors until either they are received correctly or some programmed limits are exceeded and the connection is shut down, so the delays introduced by retransmission can make end-to-end delay variable and unpredictable. For an application such as Voice over Internet Protocol (VoIP), highly variable delay makes the application unusable. Since VoIP tolerates some loss of data better than it tolerates variable delay, VoIP protocol stacks use User Datagram Protocol (UDP) rather than TCP. VoIP does have other mechanisms, at higher protocol levels, to deal with certain errors.
TCP assumes that any loss is due to congestion, so it strives not to transmit more data than the network and destination can accept. This is done with flow control and congestion control mechanisms, which also govern retransmission of data lost in transmission.
Segment fields
TCP's protocol data unit (PDU) is called a segment, which runs from the first byte of the header to the last byte of the data in the payload. A segment may be split up into smaller IP fragments, but the IP fragmentation mechanism guarantees that, if all the fragments making up the segment are delivered, the receiving IP code will put them into the correct order before notifying the receiving TCP that the entire segment has been delivered.
These are written in order of their sequence from the start of the header:
- Source port (16-bit integer)
- Destination port (16-bit integer)
The full header layout, as drawn in the RFC,[1] is:
  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |          Source Port          |       Destination Port        |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                        Sequence Number                        |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                    Acknowledgment Number                      |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |  Data |           |U|A|P|R|S|F|                               |
 | Offset| Reserved  |R|C|S|S|Y|I|            Window             |
 |       |           |G|K|H|T|N|N|                               |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |           Checksum            |         Urgent Pointer        |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                    Options                    |    Padding    |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                             data                              |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
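As an illustration of the layout, the fixed 20-byte part of the header can be unpacked with Python's standard struct module. This is a minimal sketch, not a full implementation: it ignores options and padding, does not verify the checksum, and the names TcpHeader, FLAG_BITS and parse_tcp_header are chosen here for illustration only.

```python
import struct
from typing import Dict, NamedTuple

# Bit masks for the six control flags in the low bits of the fourth 16-bit word.
FLAG_BITS = {"URG": 0x20, "ACK": 0x10, "PSH": 0x08,
             "RST": 0x04, "SYN": 0x02, "FIN": 0x01}


class TcpHeader(NamedTuple):
    source_port: int
    destination_port: int
    sequence_number: int
    acknowledgment_number: int
    data_offset: int            # header length in 32-bit words
    flags: Dict[str, bool]
    window: int
    checksum: int
    urgent_pointer: int


def parse_tcp_header(segment: bytes) -> TcpHeader:
    """Unpack the fixed 20-byte header; options and payload are ignored."""
    if len(segment) < 20:
        raise ValueError("segment is shorter than the 20-byte fixed header")
    # "!" selects network (big-endian) byte order: H = 16 bits, I = 32 bits.
    (src, dst, seq, ack, offset_and_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    return TcpHeader(
        source_port=src,
        destination_port=dst,
        sequence_number=seq,
        acknowledgment_number=ack,
        data_offset=offset_and_flags >> 12,   # top 4 bits
        flags={name: bool(offset_and_flags & bit)
               for name, bit in FLAG_BITS.items()},
        window=window,
        checksum=checksum,
        urgent_pointer=urgent,
    )


# Hypothetical SYN segment with no options (data offset = 5 words).
example_syn = struct.pack("!HHIIHHHH", 49152, 80, 1000, 0,
                          (5 << 12) | FLAG_BITS["SYN"], 65535, 0, 0)
header = parse_tcp_header(example_syn)
```

The payload of a real segment starts at byte offset data_offset * 4, which is larger than 20 whenever options are present.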
Connection establishment
Sequence numbering
Many of TCP's functions depend on the sequence numbering mechanism; many of the limitations of the original protocols come from the sequence numbering mechanism. Some of the enhancements center around an effective extension of this field, and there are some attacks on TCP where the miscreant successfully predicts the next sequence number.
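One consequence of the fixed 32-bit field is that sequence numbers wrap around, so they have to be compared with modular arithmetic rather than ordinary integer comparison. The following sketch shows one common way to do this; the helper names seq_add and seq_lt are hypothetical, and the half-space rule is an assumption that holds only while the two numbers are less than 2**31 apart.

```python
SEQ_MODULUS = 2 ** 32  # the sequence number field is 32 bits wide


def seq_add(seq: int, length: int) -> int:
    """Advance a sequence number by `length` bytes, wrapping at 2**32."""
    return (seq + length) % SEQ_MODULUS


def seq_lt(a: int, b: int) -> bool:
    """True if `a` comes before `b` in the circular sequence space.

    Assumes the two values are less than half the space (2**31) apart.
    """
    return a != b and (b - a) % SEQ_MODULUS < 2 ** 31


near_wrap = 0xFFFFFFF0
after_wrap = seq_add(near_wrap, 0x20)   # wraps past zero
```

Here after_wrap is numerically smaller than near_wrap, yet seq_lt(near_wrap, after_wrap) is true, which is the ordering the receiver needs.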
Windowing
There are both implicit and explicit flow control mechanisms. In the conventional slow start mode, TCP starts with a window size of one segment. As long as the connection stays up and delay does not exceed certain parameters, TCP keeps doubling the window size until it either reaches the 64K limit imposed by the 16-bit window size field in TCP (without high performance enhancements), or a transmission is not acknowledged.
If there is no acknowledgement, TCP assumes this is due to congestion, although TCP really does not know whether the problem is congestion or a transmission error. In either case, TCP sets the window back to one segment and starts increasing it again until it hits a limit. Individual TCP implementations may, for local reasons, limit the maximum window size, but this is not part of the standard.
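The grow-and-reset behaviour can be illustrated with a toy model. This sketch makes several simplifying assumptions that are not from the text: the window is counted in whole segments, it doubles exactly once per round trip, and the slow start threshold and congestion avoidance phase of real implementations are left out. The function name and the 1460-byte segment size are illustrative.

```python
from typing import Iterable, List


def slow_start_trace(rounds: int, max_window_segments: int,
                     loss_rounds: Iterable[int] = ()) -> List[int]:
    """Return the window size (in segments) used in each round trip.

    The window doubles after every fully acknowledged round, is capped at
    max_window_segments, and falls back to 1 after a round with a loss.
    """
    losses = set(loss_rounds)
    window = 1
    trace = []
    for round_number in range(rounds):
        trace.append(window)
        if round_number in losses:
            window = 1                       # unacknowledged: start over
        else:
            window = min(window * 2, max_window_segments)
    return trace


# A 16-bit window field holds at most 65535 bytes; with an assumed
# 1460-byte segment that is 44 whole segments.
MAX_SEGMENTS = 65535 // 1460

no_loss = slow_start_trace(8, MAX_SEGMENTS)
with_loss = slow_start_trace(8, MAX_SEGMENTS, loss_rounds=[5])
```

Without loss the window reaches the cap after six doublings; a single loss in round 5 throws the sender back to one segment.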
There are methods of WAN acceleration or TCP acceleration that may, in properly selected circumstances, improve performance, by initially using a large window size. For some environments, this may be effective, but it may break other ones; see TCP acceleration.
A good application for a large starting window is on a router-to-router link, when it is known that the first protocol that comes up will be Border Gateway Protocol, and the other router is transferring a full routing table. Until the routing table is transferred and the internal forwarding plane table converges, no other traffic will be competing with BGP. After routing converges, then normal dynamic window adjustment makes sense for regular flow; the implementation might restrict the large initial window to BGP at startup.[2]
TCP over paths with specific performance characteristics
TCP was intended to be independent of the underlying transmission system, as is the Internet Protocol (IP).
Demonstrated independence of transmission medium
Indeed, IP's independence of the underlying medium has been demonstrated in some extreme cases.[3][4][5]
When consistency is more important than maximum throughput
In certain applications, such as voice or video over the Internet, a consistent delay value is more important to the user experience than occasionally bursting for maximum throughput.[6] This can be done with TCP Friendly Rate Control (TFRC). The TFRC document does not define a new protocol; it simply specifies a congestion control mechanism that would be appropriate for systems using mechanisms such as the Real Time Transport Protocol,[7] for applications that manage congestion at the application level, or for the endpoint-wide congestion control features of endpoints with a common congestion control policy.
High performance extensions
Nevertheless, experience demonstrated that the original TCP design limited transfer rates over high-speed, long-delay paths, such as relays through geosynchronous communications satellites. Such channels have been called "Long Fat Networks" (LFNs), with "LFN" pronounced "elephan(t)".[8]
The concern is with the product of bandwidth and latency (i.e., delay in RFC1323). If its value becomes too large, the original sizes of the TCP window size and sequence numbers are too small; transmission has to stop until enough data is acknowledged to reuse those fields. Extensions in RFC1323 allow the window size to scale, and add timestamps to disambiguate sequence numbers.
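The arithmetic behind this limit is short enough to sketch. The example below assumes a hypothetical 45 Mbit/s path with a 0.5 second round-trip time, roughly the scale of a satellite relay; the function names are illustrative, and the maximum shift of 14 is the limit set for the window scale option in RFC1323.

```python
def bandwidth_delay_product_bytes(bits_per_second: float,
                                  rtt_seconds: float) -> float:
    """Bytes that must be in flight to keep the path full."""
    return bits_per_second * rtt_seconds / 8


def max_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput when one window is sent per round trip."""
    return window_bytes * 8 / rtt_seconds


def window_scale_shift(required_window_bytes: float, max_shift: int = 14) -> int:
    """Smallest shift count so a scaled 16-bit window covers the requirement."""
    shift = 0
    while (65535 << shift) < required_window_bytes and shift < max_shift:
        shift += 1
    return shift


LINK_BPS = 45_000_000   # assumed link rate
RTT = 0.5               # assumed round-trip time in seconds

bdp = bandwidth_delay_product_bytes(LINK_BPS, RTT)   # about 2.8 million bytes
unscaled_limit = max_throughput_bps(65535, RTT)      # about 1 Mbit/s
shift = window_scale_shift(bdp)
```

Under these assumptions an unscaled 64K window caps the connection at about 1 Mbit/s on a 45 Mbit/s path, and a shift count of 6 is the smallest that lets the advertised window cover the full bandwidth-delay product.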
Another problem is that traditional TCP, when retransmitting, will send all packets that had not been acknowledged when the error was detected. With high data rates, large packets, and low error rates, this can reduce throughput while the packets are retransmitted, then analyzed and acknowledged by the receiver. With a feature called selective acknowledgement, only the packets actually lost or in error need be retransmitted.[9]
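The difference between the two retransmission strategies can be shown with a small model. This is a deliberately simplified sketch: segments are identified by small integers rather than byte ranges, the function names are hypothetical, and the "traditional" case assumes that everything from the first missing segment onward is still unacknowledged when the loss is detected.

```python
from typing import Iterable, List, Set


def cumulative_ack_retransmit(sent: Iterable[int], lost: Set[int]) -> List[int]:
    """Resend every segment from the first lost one onward."""
    if not lost:
        return []
    first_lost = min(lost)
    return [segment for segment in sent if segment >= first_lost]


def selective_ack_retransmit(sent: Iterable[int], lost: Set[int]) -> List[int]:
    """Resend only the segments the receiver reports as missing."""
    return [segment for segment in sent if segment in lost]


sent_segments = list(range(1, 11))   # segments 1..10 in flight
lost_segments = {3, 7}

without_sack = cumulative_ack_retransmit(sent_segments, lost_segments)
with_sack = selective_ack_retransmit(sent_segments, lost_segments)
```

In this example eight segments are resent without selective acknowledgement and only two with it; the gap widens as the window grows.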
Header compression
When there is very little change between TCP segments carrying a particular application protocol, such as telnet sending one character at a time and thus always incrementing counters by one, the header can be compressed. Header compression requires more processing, but can save significant bandwidth on slow links. Telnet gives the most dramatic results, but header compression can also be helpful for FTP and other protocols with an inherent order in their payloads. [10]
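The underlying idea, sending only what changed relative to the previous header, can be sketched as follows. This is an illustration of difference encoding in general, not the wire format of the cited RFC: headers are modelled as dictionaries with identical numeric fields, and the function names are hypothetical.

```python
from typing import Dict


def compress_header(previous: Dict[str, int],
                    current: Dict[str, int]) -> Dict[str, int]:
    """Return only the changed fields, encoded as differences."""
    return {name: current[name] - previous[name]
            for name in current if current[name] != previous[name]}


def decompress_header(previous: Dict[str, int],
                      delta: Dict[str, int]) -> Dict[str, int]:
    """Rebuild the full header from the previous one plus the differences."""
    restored = dict(previous)
    for name, change in delta.items():
        restored[name] += change
    return restored


# Two consecutive headers of a telnet-like session sending one character.
previous_header = {"source_port": 49152, "destination_port": 23,
                   "sequence_number": 1000, "acknowledgment_number": 5000,
                   "window": 4096}
current_header = dict(previous_header, sequence_number=1001)

delta = compress_header(previous_header, current_header)
restored = decompress_header(previous_header, delta)
```

Only one small difference has to cross the link instead of five full fields, at the cost of both ends keeping the previous header as shared state.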
References
- ↑ Postel, J. (September 1981), Transmission Control Protocol, Internet Engineering Task Force, RFC0793
- ↑ Allman M., Floyd S., Partridge C. (October 2002), Increasing TCP's Initial Window, Internet Engineering Task Force, RFC3390
- ↑ Waitzman, D. (April 1 1990), Standard for the transmission of IP datagrams on avian carriers, Internet Engineering Task Force, RFC1149
- ↑ Waitzman, D. (April 1 1999), IP over Avian Carriers with Quality of Service, Internet Engineering Task Force, RFC2549
- ↑ Bergen Linux Users Group (April 28 2001, 12:00), The highly unofficial CPIP WG
- ↑ Handley, M. et al. (January 2003), TCP Friendly Rate Control (TFRC): Protocol Specification, Internet Engineering Task Force, RFC3448
- ↑ Schulzrinne, H.; S. Casner & R. Frederick et al. (July 2003), RTP: A Transport Protocol for Real-Time Applications, Internet Engineering Task Force, RFC3550
- ↑ Jacobson, V.; R. Braden & D Borman (May 1992), TCP Extensions for High Performance, Internet Engineering Task Force, RFC1323
- ↑ Mathis, M.; J. Mahdavi & S. Floyd et al. (October 1996), TCP Selective Acknowledgment Options, Internet Engineering Task Force, RFC2018
- ↑ Jacobson, V. (February 1990), Compressing TCP/IP Headers for Low-Speed Serial Links, Internet Engineering Task Force, RFC1144