What is a socket, physically?


I always prefer the physical meaning of a programming concept to its logical meaning. So here comes this question.

While reviewing the socket programming paradigm, I noticed that the bind() and connect() functions seem to just tune the socket created by the socket() function. So I guess that socket() merely creates a data structure (possibly in kernel space) that holds the details of the end-to-end communication settings between the client and the server, and that bind() and connect() just fill in that data structure.

I am not familiar with the implementation of the socket API, so I hope someone could address my concern.

506

asked Feb 14 '11 15:02

smwikipedia

2 Answers

This is highly platform-dependent. The point of the API is so that you DON'T need to know these details.

If you're really interested in learning this (which you shouldn't need to be for ordinary application or even systems programming), you can download a Linux kernel source archive from kernel.org and examine Linux's TCP/IP implementation by looking under net/ipv4.

To add some clarity,

To transport data across the network, we usually adhere to standards defined by the International Organization for Standardization (ISO). They have a standard called the OSI, or Open Systems Interconnection, model.

This model defines 7 layers of abstraction for applications to move data across a network. I'll only talk about the first 4, as they are the pertinent ones for your question.

Physical Layer:

This layer defines how the data is actually transmitted over the media. Hardware vendors adhere to defined standards on how to move the data. The standards agree on electrical signals and the electronic aspects of the data moving.

How it fits into the system:

Hopefully, there's very little software support required for this layer. Whatever programming is done here is likely to be done on-module and not in the kernel or application.

Data Link Layer:

This is the first layer that arguably involves some sort of programming. This layer defines the line-level protocols that operate on the physical links. Ethernet is one protocol. Frame Relay is another. Token Ring is another. Each end of the link must be running the same data link protocol. This layer combines with a compatible physical layer standard to give a means to actually transfer data from one host to another. In many regards it can be thought of more as an appendix to the physical layer than as its own layer, but because link-level protocols are defined here that's not a great analogy. This layer gives physical addresses to nodes on a network.

How it fits into the system:

You'd need to write a driver to talk to the interface module that runs these data-link protocols. Depending on the module and the system, the module may have all that it needs to actually work, or it may need some system-level help. Ideally, you just create a set of code interfaces (perhaps implemented as structs that contain function pointers for the appropriate handling of I/O; I don't know the details), and when you install a new physical module, its driver need only implement those code interfaces for the module to become usable.
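To make the "struct of function pointers" idea a bit more concrete, here is a minimal sketch in Python. It is not how any real kernel does it; LinkDriver, make_loopback_driver and transmit are hypothetical names made up for illustration, and the "hardware" is just an in-memory queue.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class LinkDriver:
    """Hypothetical driver interface: a record of callables,
    analogous to a C struct of function pointers."""
    name: str
    send_frame: Callable[[bytes], int]
    recv_frame: Callable[[], bytes]


def make_loopback_driver() -> LinkDriver:
    """Build a fake driver whose 'link' is an in-memory queue."""
    queue: List[bytes] = []

    def send_frame(frame: bytes) -> int:
        queue.append(frame)          # pretend this put the frame on the wire
        return len(frame)

    def recv_frame() -> bytes:
        return queue.pop(0) if queue else b""

    return LinkDriver("loopback", send_frame, recv_frame)


def transmit(driver: LinkDriver, payload: bytes) -> int:
    """Upper-layer code: it only calls through the interface and never
    needs to know which link technology sits behind it."""
    return driver.send_frame(payload)


driver = make_loopback_driver()
sent = transmit(driver, b"hello")
received = driver.recv_frame()
```

Swapping in a different link technology would then mean supplying another LinkDriver with the same two callables, while transmit() stays unchanged.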

Network Layer:

This is the layer that provides the ability to move data between networks (in the case of TCP/IP). The Internet Protocol is defined at this layer. This layer gives logical addresses to nodes so that they can be grouped into networks. By knowing what network (also called a subnet, determined programmatically using the subnet mask) the host is on, we run algorithms that correctly move data from one network to another. If one host is on network A in China and one host is on network B in Australia, algorithms at this level are in charge of providing a path that links these networks and therefore these hosts.
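As a small illustration of "determined programmatically using the subnet mask", the sketch below uses Python's standard ipaddress module. The addresses and the helper names network_of and same_network are made up for the example; the operation itself is a bitwise AND of the address with the mask.

```python
import ipaddress


def network_of(ip: str, netmask: str) -> str:
    """Return the network address: the host address ANDed with the mask."""
    addr = int(ipaddress.ip_address(ip))
    mask = int(ipaddress.ip_address(netmask))
    return str(ipaddress.ip_address(addr & mask))


def same_network(ip_a: str, ip_b: str, netmask: str) -> bool:
    """True if both hosts fall in the same subnet under this mask."""
    net = ipaddress.ip_network(f"{ip_a}/{netmask}", strict=False)
    return ipaddress.ip_address(ip_b) in net


net = network_of("192.168.1.10", "255.255.255.0")
local = same_network("192.168.1.10", "192.168.1.200", "255.255.255.0")
remote = same_network("192.168.1.10", "192.168.2.5", "255.255.255.0")
```

When the destination is not on the same network, the host hands the packet to a router instead of delivering it directly on the local link.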

An important thing about programming for this layer is that you should be able to just "plug in" any data link layer to transmit over. This means that once you have code on your system to transmit over Ethernet, Token Ring, 3G, or Frame Relay, you should be able to use any of them without the network layer needing to know which data link technology it is using. The logic of moving data between networks should not depend on the actual physical link it is operating on.

This layer puts your data into packets, and packets are what are routed over the internet.

How it fits into the system:

All of this layer must be coded as part of the system. It is entirely a software construct and should be isolated as much as possible from the data link layer. I am not enough of an expert to say in practice how well this is accomplished. Because the functionality of this layer is system-defined, we have total control over what the software must support. This makes the construction of the code interfaces that allow using this layer by higher-layer protocols rather simple compared to the ones in the data link layer.

Transport Layer:

This layer defines segmentation of data (because if you just sent giant pieces of data all at once, hardly anything would make it in order). This layer also defines TCP, which provides handshaking, checksums, packet ordering, variable data window sizes, and guaranteed reliability. TCP gives you the ability to create multiple logical channels of communication over the same physical link. It differentiates one conversation on a link from another conversation on the same link. UDP is also defined at this level, and can be thought of as an extremely lightweight TCP. UDP provides almost none of the benefits of TCP but still provides the multiplexing over the physical channel.

If your transport layer is written well, your applications don't need (speaking from a code architecture standpoint) to worry about whether the transport layer is using TCP or UDP (just mentioning these two because they are the most popular on IP). While you may pick one or the other based on timing performance needs or reliability needs (and in practice, applications often make an assumption about which one they are running), your application doesn't need to have exact knowledge of which one is running.
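The "multiple logical channels over the same link" point can be observed on the loopback interface. This is only a sketch under a few assumptions: the address 127.0.0.1, the 5-second timeouts and the variable names are mine, and port 0 asks the OS to pick a free port. Both clients talk to the same server address and port; the conversations are told apart by the clients' different source ports.

```python
import socket

# One listening socket; port 0 lets the OS choose a free port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(2)
server.settimeout(5)
server_addr = server.getsockname()

# Two separate conversations to the same server address and port.
client_a = socket.create_connection(server_addr, timeout=5)
client_b = socket.create_connection(server_addr, timeout=5)
addr_a = client_a.getsockname()   # (ip, source port) of conversation A
addr_b = client_b.getsockname()   # (ip, source port) of conversation B

# Index the accepted connections by the peer address the server sees.
conns = {}
for _ in range(2):
    conn, peer = server.accept()
    conn.settimeout(5)
    conns[peer] = conn

client_a.sendall(b"from A")
client_b.sendall(b"from B")
data_a = conns[addr_a].recv(1024)
data_b = conns[addr_b].recv(1024)

for s in (client_a, client_b, server, *conns.values()):
    s.close()
```

Each conversation is identified by the combination of source address, source port, destination address and destination port, which is how the data from A and B stay separate even though they share one interface.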
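Here is a rough sketch of what that can look like from the sending side, using Python's socket module. The helper send_message, the loopback address and the timeouts are my own assumptions for the example. The sender's code is identical for both transports and only the socket type passed in differs; the receiving side still differs (TCP needs listen() and accept()), and TCP delivers a byte stream while UDP delivers individual datagrams.

```python
import socket


def send_message(sock_type, address, payload):
    """Send one payload; the only transport-specific input is sock_type."""
    with socket.socket(socket.AF_INET, sock_type) as sock:
        sock.settimeout(5)
        # TCP: performs the handshake. UDP: only records the default peer.
        sock.connect(address)
        sock.sendall(payload)


# UDP receiver: a bound datagram socket is enough.
udp_rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_rx.bind(("127.0.0.1", 0))
udp_rx.settimeout(5)

# TCP receiver: must listen and later accept a connection.
tcp_rx = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp_rx.bind(("127.0.0.1", 0))
tcp_rx.listen(1)
tcp_rx.settimeout(5)

send_message(socket.SOCK_DGRAM, udp_rx.getsockname(), b"ping")
send_message(socket.SOCK_STREAM, tcp_rx.getsockname(), b"ping")

udp_data = udp_rx.recv(1024)
conn, _ = tcp_rx.accept()
conn.settimeout(5)
tcp_data = conn.recv(1024)

for s in (conn, udp_rx, tcp_rx):
    s.close()
```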

Because this layer is built on top of the network layer, we don't need to worry about how our data will get from one host to another if they are on different networks. If a router is running a standard routing protocol, augmented by some statically-defined routes, we don't need to worry about that. It's all taken care of for us by the network layer. If the network-layer configuration changes on the host that we are running, it doesn't matter. We don't need to change our entire application to account for this.

How it fits into the system:

Very similar to the network layer, except that it provides different functionality. Additionally, these interfaces are used more in user space than the network layer interfaces are. This is the layer that actually defines the sockets that you use in TCP/IP networking.

Hope this helps and you can understand why your question is a little confusing to most of us.

106

answered Sep 22 '22 10:09

San Jacinto

Are you familiar with the OSI model? bind() specifies the local IP address (layer 3) and port (layer 4) to use, so when a packet is physically sent out, it carries that IP address and port as the sender, and connect() specifies the remote IP address and port to physically place in those packets.
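You can watch those fields being filled in from user space. The sketch below uses Python's socket module, which wraps the same socket(), bind() and connect() calls; the loopback address and variable names are assumptions for the example, and port 0 asks the OS to choose a free port. It shows only what the API reports back, not how any particular kernel stores it.

```python
import socket

# A listener to connect to; port 0 lets the OS choose a free port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
listen_addr = listener.getsockname()

# socket(): the application gets back a handle (a file descriptor on
# Unix-like systems) that refers to an OS-managed object.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.settimeout(5)
fd = client.fileno()

# bind(): fills in the local address and port.
client.bind(("127.0.0.1", 0))
local_before = client.getsockname()

# connect(): fills in the remote address and port.
client.connect(listen_addr)
local_after = client.getsockname()
remote = client.getpeername()

client.close()
listener.close()
```

After connect(), getsockname() still reports the address chosen at bind() time and getpeername() reports the listener's address, which matches the picture of bind() and connect() each setting one half of the connection's addressing.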

As an aside, a lot of programming is pure "logic", and doesn't really have a "physical" meaning, unless by "physical" you actually mean "implementation detail", which will vary from platform to platform. If you're actually asking about the physical implementation meaning how "meaning" is transformed into electrical signals, you would probably be happier as a computer engineer than as a programmer.

answered Sep 20 '22 10:09

Sam Skuce
