
How do TCP/IP and HTTP work together?

I'm playing with Wireshark to debug some IoT home automation projects I'm working on. I think I'd benefit from understanding more about how HTTP and TCP/IP are actually working. Most explanations I'm finding describe HTTP as "riding on top of" TCP/IP, but I'm asking more specifically about what is actually being sent.

Here's an example of a client/server interaction I captured:

Client: [SYN]
Server: [SYN, ACK]
Client: [ACK]

If I understand so far, they've now successfully established a TCP connection. The next capture, though, shows me

Client: POST /whatever
Server: 200 OK

Okay now I'm lost. Examining that capture shows me that I have an Ethernet, IP, TCP, and HTTP layer all in one frame. Is it actually as simple as the client adding a bunch of text after the TCP packet ends and squirting those extra bytes over to the router? Which, presumably, then parses the TCP/IP out and forwards it accordingly? This is the source of my confusion. By "rides on top of" is it meant (in a physical sense) that HTTP is just a series of bytes that are sent in the same frame, after the TCP packet? Is the HTTP in this case considered to be the payload of the TCP/IP?

And of course to finish

Server: [FIN, ACK]
Client: [ACK]
Client: [FIN, ACK]
Server: [ACK]
//In this case the server terminates the connection.

Edit: A commenter below asked a question which makes me feel as if I haven't been very clear about what I'm asking.

Imagine that I could stand between my client and the server (or perhaps it would be more accurate to stand between my client and the router and again between the router and the server). Ignoring the considerations when one has to physically send raw data over a physical medium (checksums, error correction codes, etc), what would the actual traffic look like, with respect to time? Would I see bytes for an ethernet layer followed by bytes for an ip layer, tcp, http, and so on?

brenzo asked Oct 17 '17 16:10


1 Answer

The network layers use abstraction and encapsulation. The lower layers encapsulate the higher layers.

  • The Application layer can have its own protocols, e.g. HTTP. HTTP communicates with HTTP on the target device, and it is a protocol that transfers the application data (HTML).
  • The Transport layer (layer 4) encapsulates the application data, and it communicates with the same Transport layer protocol on the target device. Some transport protocols have guarantees and create connections for reliability, e.g. TCP (segments), but some are connectionless with no guarantees, e.g. UDP (datagrams). The purpose of this layer is to get the application data from one application to another application. Some transport protocols use addressing (ports) to accomplish this, and some use something else, or nothing at all.
  • The Network layer (layer 3) encapsulates the transport-layer segments or datagrams into packets, and it communicates with the network protocol on the target device. The purpose of this layer is to get packets from a device on one network to a device on another network. Routers use the addressing information in the packet headers to accomplish this (IPv4, IPX, IPv6, AppleTalk, etc. addresses).
  • The Data Link layer (layer 2) encapsulates the network packets into frames, and it communicates with the data link of a device on the same network. The purpose of this layer is to get frames to another device on the same network (PC, printer, router, etc.). Some data-link protocols use addressing (IEEE protocols use MAC addressing, either 48-bit or 64-bit MAC addresses), some use other addressing (frame relay uses DLCIs, ATM uses VPI/VCI, etc.), and some use no addressing (PPP only has two devices, so it needs no addressing). The data-link protocol can change as the encapsulated packet is sent from one network to another on its way to the destination device. Routers strip off the frame and discard it as they forward the packets from one network to another, creating a new frame to encapsulate the packet for the new network.
  • The Physical layer (layer 1) converts the frames of the Data Link layer (layer 2) into the "bits on the wire."

The destination device performs the reverse of the above, delivering the application data to the destination application.
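To make the nesting concrete, here is a rough Python sketch of what the bytes of a single captured frame look like for the POST in the question. Everything specific in it is an assumption for illustration: the MAC addresses, IP addresses, ports, sequence numbers, and the `example.test` host are made up, the checksums are left at zero, and the header builders are hypothetical helpers rather than anything a real stack exposes. It only shows the ordering: Ethernet header first, then the IPv4 header, then the TCP header, then the HTTP text as the TCP payload.

```python
import struct

def ethernet_header(dst_mac: bytes, src_mac: bytes, ethertype: int = 0x0800) -> bytes:
    # 6-byte destination MAC, 6-byte source MAC, 2-byte EtherType (0x0800 = IPv4).
    return struct.pack("!6s6sH", dst_mac, src_mac, ethertype)

def ipv4_header(src_ip: bytes, dst_ip: bytes, payload_len: int) -> bytes:
    version_ihl = (4 << 4) | 5          # IPv4, header length of 5 x 32-bit words = 20 bytes
    total_length = 20 + payload_len     # header plus everything it carries
    # Fields: version/IHL, DSCP/ECN, total length, identification, flags/fragment
    # offset, TTL, protocol (6 = TCP), header checksum (left at 0 here), src, dst.
    return struct.pack("!BBHHHBBH4s4s", version_ihl, 0, total_length,
                       0, 0, 64, 6, 0, src_ip, dst_ip)

def tcp_header(src_port: int, dst_port: int, seq: int, ack: int) -> bytes:
    offset_flags = (5 << 12) | 0x018    # 20-byte header, PSH and ACK flags set
    # Fields: ports, sequence number, acknowledgment number, offset/flags,
    # window, checksum (left at 0 here), urgent pointer.
    return struct.pack("!HHIIHHHH", src_port, dst_port, seq, ack,
                       offset_flags, 65535, 0, 0)

# Application layer: HTTP is just text handed to TCP.
http = b"POST /whatever HTTP/1.1\r\nHost: example.test\r\nContent-Length: 0\r\n\r\n"

# Each lower layer prepends its own header to what the layer above gave it.
segment = tcp_header(49152, 80, 1, 1) + http
packet = ipv4_header(bytes([192, 168, 1, 10]), bytes([192, 168, 1, 20]), len(segment)) + segment
frame = ethernet_header(bytes.fromhex("020000000002"), bytes.fromhex("020000000001")) + packet

def decapsulate(frame: bytes) -> bytes:
    """Reverse the process: peel off each header and return the application data."""
    packet = frame[14:]                          # drop the 14-byte Ethernet header
    ip_header_len = (packet[0] & 0x0F) * 4       # IHL is counted in 32-bit words
    segment = packet[ip_header_len:]
    tcp_header_len = (segment[12] >> 4) * 4      # data offset is also in 32-bit words
    return segment[tcp_header_len:]

print(frame.hex())
print(decapsulate(frame).decode())
```

So yes, in this simplified picture the HTTP request is the payload of the TCP segment, which is the payload of the IP packet, which is the payload of the Ethernet frame. On a real wire the frame is also surrounded by a preamble and followed by a frame check sequence, which capture tools usually do not show.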

Because of the abstraction and encapsulation at each layer, you can mix and match different protocols at different layers. For example, ethernet can carry any number of network protocols (IPv4, IPX, IPv6, AppleTalk, etc.) without knowing or caring what is in the payload of the ethernet frame. Conversely, IP doesn't know or care which data-link protocol (ethernet, Wi-Fi, token ring, PPP, frame relay, etc.) is carrying it.
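As a small illustration of that mix-and-match, the sketch below shows how a receiver can tell which network protocol an Ethernet II frame is carrying: it only looks at the 2-byte EtherType field and hands the payload on without interpreting it. The `make_frame` helper, the MAC addresses, and the payload bytes are assumptions for the example; the EtherType values in the table are the registered ones for those protocols.

```python
import struct

# A few registered EtherType values; the payload format is opaque to Ethernet.
ETHERTYPES = {
    0x0800: "IPv4",
    0x86DD: "IPv6",
    0x8137: "IPX",
    0x809B: "AppleTalk",
}

def make_frame(ethertype: int, payload: bytes) -> bytes:
    """Hypothetical helper: wrap any payload in an Ethernet II header."""
    dst = bytes.fromhex("020000000002")   # made-up destination MAC
    src = bytes.fromhex("020000000001")   # made-up source MAC
    return dst + src + struct.pack("!H", ethertype) + payload

def network_protocol(frame: bytes) -> str:
    """Name the network-layer protocol from bytes 12-13 of the frame."""
    (ethertype,) = struct.unpack("!H", frame[12:14])
    return ETHERTYPES.get(ethertype, f"unknown (0x{ethertype:04x})")

for ethertype in (0x0800, 0x86DD, 0x8137):
    print(network_protocol(make_frame(ethertype, b"opaque payload")))
```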

Your web browser uses HTTP to communicate the data (HTML) between it and the web server. HTTP uses TCP to transport it to the web server. The web browser will request that TCP assign it a TCP address (port), typically an ephemeral port chosen by the OS, while the web server likely listens on the well-known TCP port 80 for HTTP. TCP will create a connection with TCP on the OS of the web server and segment the stream of data from the application into TCP segments (do not confuse this with IPv4 fragmentation). TCP retransmits segments that are lost, so the data presented to the destination application will be complete and in order.
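The segmentation and in-order delivery can be pictured with a toy model. This is not how a real TCP implementation is written (there are no acknowledgments, retransmission timers, or windows here), and the function names, the 16-byte segment size, and the starting sequence number are assumptions picked for the example. It only shows the idea that sequence numbers count bytes, so the receiver can rebuild the stream even when segments arrive shuffled or duplicated.

```python
import random

def segment_stream(data: bytes, mss: int, initial_seq: int = 0):
    """Split a byte stream into (sequence_number, chunk) pairs of at most mss bytes."""
    return [(initial_seq + i, data[i:i + mss]) for i in range(0, len(data), mss)]

def reassemble(segments, initial_seq: int = 0) -> bytes:
    """Rebuild the stream in sequence-number order, skipping duplicates."""
    by_seq = {}
    for seq, chunk in segments:
        by_seq.setdefault(seq, chunk)      # keep the first copy of each segment
    out = bytearray()
    expected = initial_seq
    for seq in sorted(by_seq):
        if seq != expected:                # a gap: stop and wait for the missing bytes
            break
        out += by_seq[seq]
        expected += len(by_seq[seq])
    return bytes(out)

request = b"POST /whatever HTTP/1.1\r\nHost: example.test\r\n\r\n"
segments = segment_stream(request, mss=16, initial_seq=1000)

random.shuffle(segments)                   # simulate out-of-order arrival
segments.append(segments[0])               # simulate a duplicate delivery
print(reassemble(segments, initial_seq=1000) == request)
```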

TCP can theoretically use any network-layer protocol, but in practice it only uses IPv4 or IPv6. IP will encapsulate the TCP segments into IP packets.

IP will use the data-link protocol of the interface through which the packets will be sent. On a PC, this is most likely either ethernet or Wi-Fi, but it can be something else like PPP. The data-link protocol will encapsulate the packets into frames for the data-link protocol. Each data-link protocol has a different frame format. If the destination device is on the same network, the frames are addressed and delivered directly to the destination. If the destination is on a different network, the frames are addressed and delivered to the gateway (router) configured in the source OS.

The interface will encode the bits in the frame and signal them onto the medium of the interface.
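The same-network decision can be sketched with Python's standard `ipaddress` module. The `next_hop` function and the addresses below are assumptions for illustration; a real host consults its routing table, which may hold more than a single default gateway. The point is that the frame is addressed to whichever device this returns, while the IP packet inside still carries the final destination address.

```python
import ipaddress

def next_hop(src_interface: str, destination: str, gateway: str) -> str:
    """Return the address the frame should be delivered to on the local network."""
    iface = ipaddress.ip_interface(src_interface)   # e.g. "192.168.1.10/24"
    dest = ipaddress.ip_address(destination)
    if dest in iface.network:
        return str(dest)        # same network: deliver the frame directly
    return gateway              # different network: hand the frame to the router

# Hypothetical home network: host 192.168.1.10/24 with a router at 192.168.1.1.
print(next_hop("192.168.1.10/24", "192.168.1.20", "192.168.1.1"))
print(next_hop("192.168.1.10/24", "203.0.113.5", "192.168.1.1"))
```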

Ron Maupin answered Sep 25 '22 00:09