
Connection-Oriented Transport: TCP

The Transmission Control Protocol (TCP), defined in RFC 793, is a foundational Transport Layer protocol in the TCP/IP suite. It operates on top of the Internet Protocol (IP) to provide applications with a reliable communication channel.

Data handling in TCP

From the application's perspective, TCP provides a stream-of-bytes service. An application writes bytes to the stream, and TCP ensures they are delivered correctly to the other side.

Internally, TCP receives data from the application layer, groups the bytes of this stream into chunks, and adds a TCP header to each chunk to form a TCP segment. It then passes the segments to the IP layer, where they are encapsulated in IP datagrams for transmission across the network.
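
The segmentation step described above can be sketched in a few lines of Python. This is a simplified illustration rather than how a real TCP stack works: the helper name `segment_stream`, the 4-byte `mss`, and the starting sequence number 100 are assumptions chosen for the example, and the "header" is reduced to a sequence number.

```python
def segment_stream(data: bytes, mss: int, isn: int = 0):
    """Split a byte stream into (sequence_number, payload) pairs.

    Hypothetical helper: a real TCP header carries many more fields.
    Each segment's sequence number is the stream position of its first byte.
    """
    segments = []
    for offset in range(0, len(data), mss):
        payload = data[offset:offset + mss]
        seq = (isn + offset) % 2**32  # sequence numbers are 32 bits wide
        segments.append((seq, payload))
    return segments


segments = segment_stream(b"hello world", mss=4, isn=100)
for seq, payload in segments:
    print(seq, payload)
```

Each payload is at most `mss` bytes long, and consecutive sequence numbers differ by the length of the previous payload, which is what lets the receiver put the stream back together.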

Features of TCP (Transmission Control Protocol)

TCP provides a reliable, connection-oriented end-to-end communication service between processes with numerous features to ensure robust data transfer.

  • Connection-Oriented: Before any data is sent, TCP establishes a connection between the two hosts using a process called the three-way handshake. This ensures both sides are ready to communicate.

  • Reliable and In-Order Delivery: TCP guarantees that data will arrive intact, without errors, and in the same order it was sent. It achieves this using:

    • Sequence Numbers: To track data bytes so the receiver can reorder segments and deliver data in order.

    • Checksums: To detect corrupted segments.

    • Cumulative Acknowledgments (ACKs): To confirm successful data delivery.

    • Retransmissions: A timer and ACKs are used to detect lost or corrupted segments, which are then resent.

  • Flow Control: Prevents a fast sender from overwhelming a slow receiver.

  • Congestion Control: Prevents the TCP connection from overwhelming the network, ensuring fair sharing of network resources among connections.

  • Full-Duplex Communication: Allows data to be sent and received simultaneously over the same connection.

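TCP uses a sliding window for flow and error control. Its error recovery is a hybrid of the Go-Back-N (GBN) and Selective Repeat protocols: like GBN it uses cumulative ACKs, but like Selective Repeat it typically buffers out-of-order segments and retransmits only the segments it believes were lost.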

Due to these features, TCP is more complex and has higher overhead than UDP.

TCP is ideal for applications where data integrity and reliability are essential, such as web browsing (HTTP), file transfer (FTP), email (SMTP, IMAP), and Secure Shell (SSH).

TCP Segment Structure

A TCP segment consists of two main parts: the TCP Header and the Data Payload.

  • Data Payload: This field contains a chunk of application-layer data. Its size is generally limited by the Maximum Segment Size (MSS) to avoid IP fragmentation. Large files are sent in MSS-sized chunks, while interactive applications (e.g., Telnet) may send segments with very small payloads, sometimes a single byte.

  • TCP Header: The header contains control information essential for managing the connection. It is typically 20 bytes long but can be larger if optional fields are used.

Key TCP Header Fields

The header is composed of several critical fields that manage reliability, flow control, and connection state.

  • Source Port and Destination Port (16 bits each): Used for multiplexing and demultiplexing data to the correct application process on the end hosts.

  • Sequence Number (32 bits): This is the byte-stream number of the first data byte in the segment. It is crucial for ensuring data is reassembled in the correct order. The Initial Sequence Number (ISN) for a connection is chosen randomly, which reduces the chance that a stale segment from an earlier connection between the same hosts is mistaken for valid data.

  • Acknowledgment Number (32 bits): This is the sequence number of the next byte that the host sending the ACK expects to receive from the other side. TCP uses cumulative acknowledgments, so this number implies that all prior bytes have been successfully received.

    • Example: If a host has received bytes 0–535 but is still missing the bytes starting at 536, it keeps sending an acknowledgment number of 536, even if it later receives a segment carrying bytes 900–1000.
  • Header Length (4 bits): Specifies the length of the TCP header in 32-bit words. A value of 5, for instance, indicates a 20-byte header (5×4 bytes).

  • Receive Window (16 bits): Used for flow control. It tells the sender how many bytes the receiver is currently able to accept.

  • Checksum (16 bits): Used for error detection. It is calculated over the TCP header, the data payload, and a pseudo-header that includes the source and destination IP addresses.

  • Flags (6 bits): Six single-bit control fields that manage the connection's state:

    • URG (Urgent): Indicates that some data is urgent (rarely used).

    • ACK (Acknowledgment): Indicates that the Acknowledgment number field is valid.

    • PSH (Push): Tells the receiver to pass the data to the application immediately.

    • RST (Reset): Abruptly aborts a connection.

    • SYN (Synchronize): Used to initiate a TCP connection.

    • FIN (Finish): Used to gracefully terminate a TCP connection.

  • Options Field (variable length): Used for negotiating parameters like the MSS, enabling window scaling for high-speed networks, or carrying timestamps.

  • Urgent Data Pointer (16 bits): Used only when the URG flag is set to indicate the location of "urgent" data.
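
To make the field layout above concrete, here is a minimal sketch that decodes the fixed 20-byte part of a header with Python's standard `struct` module. The function name `parse_tcp_header` and the sample values (ports 49152 and 80, sequence number 1000) are assumptions made for illustration, options are ignored, and only the six flag bits listed above are decoded.

```python
import struct


def parse_tcp_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (network byte order)."""
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urg_ptr) = struct.unpack("!HHIIHHHH", raw[:20])
    header_len = (offset_flags >> 12) * 4  # 4-bit field counts 32-bit words
    flag_bits = offset_flags & 0x3F        # low 6 bits hold the control flags
    names = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]  # bit 0 up to bit 5
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack,
        "header_len": header_len,
        "flags": [n for i, n in enumerate(names) if flag_bits & (1 << i)],
        "window": window, "checksum": checksum, "urgent_pointer": urg_ptr,
    }


# Hand-built header of a SYN segment: header length 5 words, SYN bit (0x02) set.
sample = struct.pack("!HHIIHHHH", 49152, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
header = parse_tcp_header(sample)
print(header)
```

The header length and the flags share one 16-bit word, which is why the sketch shifts and masks `offset_flags` instead of unpacking them separately.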

TCP: Round-Trip Time and Timeout Calculation

To recover from lost segments, TCP uses a timeout-and-retransmission mechanism. The timeout interval must be carefully calculated to be longer than the current Round-Trip Time (RTT) but short enough to react quickly to loss.

  1. Measuring the RTT: TCP measures SampleRTT—the time between sending a segment and receiving its ACK. To avoid ambiguity, SampleRTT is not measured for retransmitted segments (this is part of Karn's Algorithm).

  2. Smoothing the RTT: Because SampleRTT can fluctuate wildly, TCP calculates a smoothed average, EstimatedRTT, using an Exponentially Weighted Moving Average (EWMA). This gives more weight to recent samples, while the influence of older samples decays exponentially.

    EstimatedRTT = (1 − α) * EstimatedRTT + α * SampleRTT

    The recommended value is α = 0.125.

  3. Calculating RTT Variation: To create a safety margin, TCP measures the variation in the RTT, DevRTT.

    DevRTT = (1-β) * DevRTT + β * |SampleRTT - EstimatedRTT|

    The recommended value is β = 0.25.

  4. Setting the Timeout Interval: The final timeout value combines the estimated RTT with a safety margin based on its variation.

    TimeoutInterval = EstimatedRTT + 4 * DevRTT
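
The three formulas above can be combined into a small estimator. The sketch below is illustrative only: the class name `RttEstimator` is hypothetical, and two details the text leaves open are filled in with labeled assumptions (seeding the state from the first sample, and updating DevRTT with the EstimatedRTT from before the current sample, as RFC 6298 does).

```python
class RttEstimator:
    """Tracks EstimatedRTT and DevRTT and derives TimeoutInterval."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        # Assumption: seed the estimate with the first sample and the
        # deviation with half of it (the initialization RFC 6298 describes).
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2

    def update(self, sample_rtt):
        # Assumption: DevRTT is updated first, using the previous EstimatedRTT.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)

    @property
    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)  # all times in milliseconds
est.update(120.0)
print(est.estimated_rtt, est.dev_rtt, est.timeout_interval)
```

With these inputs EstimatedRTT moves from 100 ms to 102.5 ms, DevRTT becomes 42.5 ms, and the timeout is 272.5 ms. Per Karn's Algorithm, `update` should only be fed samples from segments that were not retransmitted.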

TCP: Acknowledgment Strategies

TCP uses two primary methods for acknowledging data.

| Feature | Cumulative Acknowledgment | Selective Acknowledgment (SACK) |
|---|---|---|
| What it Acknowledges | All bytes up to the first missing one in the stream. | Individual, out-of-order blocks of correctly received data. |
| Sender Feedback | Vague. The sender only knows which packet was lost first. | Precise. The sender knows exactly which segments are missing. |
| Retransmissions | Can cause unnecessary retransmissions of correctly delivered segments. | Minimizes retransmissions by targeting only lost segments. |
| Complexity | Simple. Requires minimal memory and logic at the receiver. | More complex. The receiver must buffer out-of-order data. |
| Usage | Default TCP behavior. | An optional TCP feature that must be negotiated. |
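
The difference between the two strategies is easy to see by computing both kinds of feedback from the same receiver state. In this sketch the function name `build_ack` and the inclusive `(first_byte, last_byte)` range representation are assumptions; SACK blocks are reported as a left edge plus the sequence number just past the right edge, which is the convention of the SACK option.

```python
def build_ack(received, start=0):
    """Return (cumulative_ack, sack_blocks) for the byte ranges received.

    received: list of inclusive (first_byte, last_byte) ranges.
    start:    first byte of the stream (hypothetical parameter).
    """
    next_expected = start
    sack_blocks = []
    for first, last in sorted(received):
        if first <= next_expected:
            # Extends the in-order prefix, so the cumulative ACK advances.
            next_expected = max(next_expected, last + 1)
        elif sack_blocks and first <= sack_blocks[-1][1]:
            # Touches or overlaps the previous out-of-order block: merge.
            left, right = sack_blocks[-1]
            sack_blocks[-1] = (left, max(right, last + 1))
        else:
            # A new out-of-order block beyond a gap.
            sack_blocks.append((first, last + 1))
    return next_expected, sack_blocks


ack, sack = build_ack([(0, 535), (900, 1000)])
print(ack, sack)
```

For a receiver holding bytes 0–535 and 900–1000, the cumulative ACK stays at 536, while the SACK block additionally tells the sender that bytes 900–1000 arrived and only the gap needs to be resent.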

TCP: Reliable Data Transfer Mechanism

TCP builds a reliable, in-order data delivery service on top of IP's unreliable, best-effort service. It achieves this primarily through sequence numbers, acknowledgments, and a retransmission timer.

Instead of using multiple timers, TCP uses a single retransmission timer.

A TCP sender reacts to three key events:

  1. Data Received from Application: TCP segments the data, assigns a sequence number, and sends the segment. If the retransmission timer is not already running, TCP starts it; the timer is always associated with the oldest unacknowledged segment.

  2. Timer Timeout: If the retransmission timer expires, TCP assumes the segment at SendBase was lost. It retransmits that segment (oldest unacknowledged segment) and restarts the timer.

  3. ACK Received:

    • Cumulative ACK: If an ACK for byte y is received where y > SendBase, it confirms all data up to y-1 has been received. The sender updates SendBase to y. If there are still unacknowledged segments, the timer is restarted.

    • Duplicate ACKs: Receiving three duplicate ACKs for the same byte is an early signal that the segment following it was lost. This triggers a Fast Retransmit, where TCP resends the missing segment before the timer expires, improving efficiency.

SendBase: This is the sequence number of the oldest unacknowledged byte. When an ACK with number y > SendBase arrives, the sender advances SendBase to y.
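
The three sender events can be modeled with a toy class. This is a sketch under several assumptions: the class and method names are hypothetical, there is no real network or clock (the timer is a boolean), ACKs always fall on segment boundaries, sequence-number wraparound is ignored, and fast retransmit fires on the third duplicate ACK, the conventional threshold.

```python
class TcpSender:
    """Toy model of the sender's reaction to data, timeouts, and ACKs."""

    def __init__(self, isn=0):
        self.send_base = isn   # oldest unacknowledged byte
        self.next_seq = isn    # sequence number for the next new byte
        self.unacked = {}      # seq -> payload of segments still in flight
        self.timer_running = False
        self.dup_acks = 0
        self.sent_log = []     # (seq, length) of every transmission

    def _transmit(self, seq):
        self.sent_log.append((seq, len(self.unacked[seq])))

    def on_app_data(self, payload):
        seq = self.next_seq
        self.unacked[seq] = payload
        self.next_seq += len(payload)
        self._transmit(seq)
        self.timer_running = True  # start the timer if it is not running

    def on_timeout(self):
        if self.send_base in self.unacked:
            self._transmit(self.send_base)  # resend oldest unacked segment
            self.timer_running = True       # restart the timer

    def on_ack(self, y):
        if y > self.send_base:              # new cumulative ACK
            for seq in [s for s in self.unacked
                        if s + len(self.unacked[s]) <= y]:
                del self.unacked[seq]
            self.send_base = y
            self.dup_acks = 0
            self.timer_running = bool(self.unacked)
        elif y == self.send_base and self.unacked:  # duplicate ACK
            self.dup_acks += 1
            if self.dup_acks == 3:
                self._transmit(self.send_base)      # fast retransmit


sender = TcpSender(isn=92)
sender.on_app_data(b"A" * 8)    # Seq = 92
sender.on_app_data(b"B" * 20)   # Seq = 100
sender.on_ack(100)              # first segment acknowledged
for _ in range(3):
    sender.on_ack(100)          # three duplicate ACKs
print(sender.sent_log)
```

After the third duplicate ACK the log shows segment 100 sent twice, without waiting for a timeout.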


Re-transmission Scenarios

Scenario 1: ACK Loss with Timeout

  • Host A sends a segment with sequence number 92 and 8 bytes of data.
  • Host B receives the segment and sends ACK 100, but the ACK is lost.
  • Host A times out and retransmits the same segment.
  • Host B sees duplicate data and discards it.
  • Host B sends ACK 100 again to confirm it has already received data up to byte 99.

Scenario 2: Two Segments, Both ACKs Lost

  • Host A sends:
    • Segment 1: Seq = 92, 8 bytes.
    • Segment 2: Seq = 100, 20 bytes.
  • Host B receives both segments and replies with:
    • ACK 100 (for segment 1)
    • ACK 120 (for segment 2)
  • Both ACKs are lost.
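  • Host A times out, retransmits segment 1 (Seq = 92), and restarts the timer.
  • Host B discards the duplicate and replies with the cumulative ACK 120. If this ACK arrives before the restarted timer expires, segment 2 is not retransmitted.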

Scenario 3: Later ACK Arrives Before Timeout

  • Same initial setup as Scenario 2.
  • ACK 100 is lost, but ACK 120 arrives before timeout.
  • Host A learns that Host B has received all bytes up to 119.
  • Host A does not retransmit either segment.
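
The receiver's side of these scenarios can be replayed with a small model. The class name `Receiver` is hypothetical, and for brevity this toy version discards out-of-order segments instead of buffering them, which is a simplification of what TCP implementations usually do.

```python
class Receiver:
    """Toy receiver that answers every segment with a cumulative ACK."""

    def __init__(self, isn):
        self.next_expected = isn
        self.delivered = bytearray()

    def on_segment(self, seq, payload):
        if seq == self.next_expected:
            # In-order data: deliver it and advance the expected byte.
            self.delivered += payload
            self.next_expected += len(payload)
        # Duplicates (seq < next_expected) and, in this simplified model,
        # out-of-order segments are discarded.
        return self.next_expected  # acknowledgment number


host_b = Receiver(isn=92)
first_ack = host_b.on_segment(92, b"A" * 8)     # ACK 100
second_ack = host_b.on_segment(100, b"B" * 20)  # ACK 120
# Both ACKs are lost, so Host A times out and resends segment 1.
retransmit_ack = host_b.on_segment(92, b"A" * 8)
print(first_ack, second_ack, retransmit_ack, len(host_b.delivered))
```

The retransmitted segment is recognized as a duplicate, nothing is delivered twice, and the reply carries ACK 120 because acknowledgments are cumulative.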

TCP Flow Control

TCP provides a flow control service to prevent a fast sender from overwhelming a slow receiver's buffer. This is a speed-matching mechanism: it matches the rate at which the sender transmits to the rate at which the receiving application reads.

Flow control should not be confused with congestion control, which throttles the sender based on congestion in the network rather than on the state of the receiver's buffer.

The process works as follows:

  1. The receiver maintains a receive buffer (RcvBuffer) and calculates the available buffer space, which it calls the receive window (rwnd).

    rwnd = RcvBuffer − (LastByteRcvd − LastByteRead)

  2. The receiver places this rwnd value in the TCP header of a segment sent to the sender.

  3. The sender must ensure that the amount of unacknowledged ("in-flight") data it sends does not exceed the rwnd value it received.

    LastByteSent − LastByteAcked ≤ rwnd

If the receiver's buffer fills up, it advertises rwnd = 0, and the sender must stop sending data.

To prevent a deadlock in this state, the sender will periodically send a 1-byte probe segment. This forces the receiver to send an ACK, which will contain the latest rwnd value, allowing the sender to resume when space becomes available.

In contrast, UDP provides no flow control. If an application reads too slowly from its UDP socket, the buffer will overflow and subsequent incoming packets will simply be dropped.

Key Variables of Receiver: LastByteRead – last byte read by the application from the buffer. LastByteRcvd – last byte placed into the buffer by TCP.

Key Variables of Sender: LastByteSent – last byte sent by the sender. LastByteAcked – last acknowledged byte.
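
Both flow-control formulas translate directly into code. The function names and all numeric values below are arbitrary illustrations, not values from any real connection.

```python
def receive_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Free space in the receive buffer: the rwnd value to advertise."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sendable_bytes(last_byte_sent, last_byte_acked, rwnd):
    """How many more bytes the sender may put in flight right now."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, rwnd - in_flight)


# Receiver side: 2000 bytes are buffered but not yet read by the application.
rwnd = receive_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)

# Sender side: 500 bytes are already in flight.
budget = sendable_bytes(last_byte_sent=3500, last_byte_acked=3000, rwnd=rwnd)
print(rwnd, budget)
```

Here the receiver advertises 2096 bytes, of which 500 are already used by in-flight data, leaving 1596 bytes the sender may still transmit. When `rwnd` is 0 the budget is 0, which is the situation in which the sender falls back to 1-byte probes.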

TCP Connection Establishment (Three-Way Handshake)

A TCP connection is established using a three-step process to ensure both sides are synchronized and ready to communicate.

  • Step 1 (Client → Server: SYN): The client initiates the connection by sending a TCP segment with the SYN (synchronize) flag set. This segment contains the client's randomly chosen initial sequence number (client_isn).

  • Step 2 (Server → Client: SYN-ACK): The server receives the SYN, allocates resources for the connection, and replies with a segment where:

    • The SYN and ACK flags are set.
    • It includes its own random initial sequence number (server_isn).
    • The acknowledgment number is set to client_isn + 1.
  • Step 3 (Client → Server: ACK): The client receives the SYN-ACK and sends a final segment to complete the handshake:

    • The ACK flag is set.
    • The acknowledgment number is set to server_isn + 1.
    • Data may be included in this segment.

After this exchange, the connection is established, and both sides can begin sending data.
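
The sequence and acknowledgment numbers exchanged in the three steps can be traced with a short sketch. The dictionary representation and the function name `three_way_handshake` are assumptions for illustration; no packets are actually sent. The example relies on the fact that a SYN consumes one sequence number, which is why each side acknowledges the other's ISN plus one.

```python
import random


def three_way_handshake(client_isn=None, server_isn=None):
    """Return the three handshake segments as simple dictionaries."""
    if client_isn is None:
        client_isn = random.getrandbits(32)
    if server_isn is None:
        server_isn = random.getrandbits(32)

    # Step 1: client -> server
    syn = {"flags": {"SYN"}, "seq": client_isn}
    # Step 2: server -> client, acknowledging the client's SYN
    syn_ack = {"flags": {"SYN", "ACK"}, "seq": server_isn,
               "ack": (syn["seq"] + 1) % 2**32}
    # Step 3: client -> server, acknowledging the server's SYN
    ack = {"flags": {"ACK"}, "seq": syn_ack["ack"],
           "ack": (syn_ack["seq"] + 1) % 2**32}
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```

With fixed ISNs of 1000 and 5000, the SYN-ACK carries acknowledgment number 1001 and the final ACK carries 5001. Calling the function without arguments picks random 32-bit ISNs instead.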

TCP Connection Termination (Four-Way Handshake)

Either the client or the server can initiate the closing of a connection. This is a graceful four-step process.

  • Step 1 (Client → Server: FIN): The initiating side (e.g., client) sends a segment with the FIN (finish) flag set, signaling it has no more data to send.

  • Step 2 (Server → Client: ACK): The server acknowledges the client's FIN. The connection is now in a half-closed state—the server can still send data to the client, but not vice-versa.

  • Step 3 (Server → Client: FIN): When the server is also finished sending data, it sends its own FIN segment.

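  • Step 4 (Client → Server: ACK): The client acknowledges the server's FIN. After sending this final ACK, the client enters the TIME_WAIT state, which lets it resend the ACK if it is lost and gives lingering segments time to expire. The server releases its connection resources when the ACK arrives; the client releases its own when TIME_WAIT ends.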
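
The four steps can also be viewed as state transitions on each side. Apart from TIME_WAIT, the state names below are not used in the text above; they are taken from the standard TCP state diagram, and the event labels and the `run` helper are hypothetical. The sketch covers only the simple case where the client closes first.

```python
# Side that closes first (the client in the steps above).
CLIENT_TRANSITIONS = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",  # client sends the final ACK here
    ("TIME_WAIT", "timer expires"): "CLOSED",
}

# Side that closes second (the server in the steps above).
SERVER_TRANSITIONS = {
    ("ESTABLISHED", "recv FIN"): "CLOSE_WAIT",  # server ACKs the client's FIN
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}


def run(transitions, events, state="ESTABLISHED"):
    """Apply events in order and return the list of states visited."""
    path = [state]
    for event in events:
        state = transitions[(state, event)]
        path.append(state)
    return path


client_path = run(CLIENT_TRANSITIONS,
                  ["send FIN", "recv ACK", "recv FIN", "timer expires"])
server_path = run(SERVER_TRANSITIONS, ["recv FIN", "send FIN", "recv ACK"])
print(" -> ".join(client_path))
print(" -> ".join(server_path))
```

The half-closed period in Step 2 corresponds to the client sitting in FIN_WAIT_2 while the server is in CLOSE_WAIT and may still send data.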
