
Why do we cast sockaddr_in to sockaddr when calling bind()?

Tags: c, linux, sockets



No, it's not just convention.

sockaddr is a generic descriptor for any kind of socket address, whereas sockaddr_in is a struct specific to IP-based communication (IIRC, "in" stands for "InterNet"). As far as I know, this is a kind of "polymorphism": bind() is declared to take a struct sockaddr *, but it actually expects whichever address structure corresponds to the type of socket you give it as the first argument.
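To make the cast concrete, here is a minimal sketch of a typical IPv4 bind() call. The helper name bind_ipv4_ephemeral is made up for illustration, and the choice of INADDR_ANY with port 0 (let the kernel pick a free port) is an assumption so the call does not collide with a port already in use.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: binds a TCP socket to any local IPv4 address on a
 * kernel-chosen port. Returns 0 on success, -1 on failure. */
int bind_ipv4_ephemeral(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    struct sockaddr_in name;
    memset(&name, 0, sizeof name);            /* also clears sin_zero */
    name.sin_family = AF_INET;
    name.sin_port = htons(0);                 /* 0 = let the kernel pick a port */
    name.sin_addr.s_addr = htonl(INADDR_ANY);

    /* bind() is declared with a generic struct sockaddr *, so the
     * IPv4-specific pointer is cast. The length argument tells bind()
     * how many bytes of the structure are valid. */
    int rc = bind(fd, (struct sockaddr *)&name, sizeof name);

    close(fd);
    return rc == 0 ? 0 : -1;
}
```

Because the socket was created with AF_INET, bind() interprets the bytes behind the generic pointer as a struct sockaddr_in.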


I don't know if it's very relevant to this question, but I would like to provide some extra info that may make the typecast more understandable, since many people who haven't spent much time with C get confused when they see such a cast.

I use macOS, so I am taking examples based on header files from my system.

struct sockaddr is defined as follows:

struct sockaddr {
    __uint8_t       sa_len;         /* total length */
    sa_family_t     sa_family;      /* [XSI] address family */
    char            sa_data[14];    /* [XSI] addr value (actually larger) */
};

struct sockaddr_in is defined as follows:

struct sockaddr_in {
    __uint8_t       sin_len;
    sa_family_t     sin_family;
    in_port_t       sin_port;
    struct  in_addr sin_addr;
    char            sin_zero[8];
};

Starting from the very basics, a pointer just contains an address. So struct sockaddr * and struct sockaddr_in * are pretty much the same: they both just store an address. The only relevant difference is how the compiler interprets the memory they point to.

So when you write (struct sockaddr *) &name, you are just telling the compiler to treat this address as pointing to a struct sockaddr.


So let's say the pointer points to location 1000. If a struct sockaddr * stores this address, the compiler treats the sizeof(struct sockaddr) bytes starting at 1000 as holding the members from that structure's definition. If a struct sockaddr_in * stores the same address, it treats the sizeof(struct sockaddr_in) bytes starting at 1000 that way instead.


When you cast that pointer, the compiler views the same sequence of bytes, up to sizeof(struct sockaddr), through the struct sockaddr layout.

struct sockaddr *a = (struct sockaddr *) &name; // consider &name = 1000

Now if I access a->sa_len, the compiler reads sizeof(__uint8_t) bytes starting at location 1000, which is the same size and position as sin_len in sockaddr_in. So this accesses the same bytes.

The same holds for sa_family.
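The overlap of the leading fields can be checked with a small sketch. The three helper names are hypothetical, and the offset comparison assumes a platform where the two structures share their leading layout, as in the headers quoted above (Linux has no length byte, but the family field still lines up).

```c
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper: reads the address family through the generic view. */
int family_via_generic(const struct sockaddr_in *in)
{
    const struct sockaddr *generic = (const struct sockaddr *)in;
    return generic->sa_family;
}

/* Returns 1 if the family field sits at the same offset in both structs. */
int family_offsets_match(void)
{
    return offsetof(struct sockaddr, sa_family) ==
           offsetof(struct sockaddr_in, sin_family);
}

/* Returns 1 if the cast leaves the stored address unchanged. */
int same_address_after_cast(const struct sockaddr_in *in)
{
    return (const void *)in == (const void *)(const struct sockaddr *)in;
}
```

With sin_family set to AF_INET, family_via_generic should return AF_INET, because both views read the same bytes at the same offset.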

After that, struct sockaddr has a 14-byte character array, which covers the data from in_port_t sin_port (a typedef'd 16-bit unsigned integer = 2 bytes), struct in_addr sin_addr (simply a 32-bit IPv4 address = 4 bytes), and char sin_zero[8] (8 bytes). These three add up to 14 bytes.

These three are stored in that 14-byte character array, and we can access any of them by reading the appropriate indices and converting the bytes back to the right type.
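As a rough illustration of reading those fields back out of sa_data, the sketch below copies the port from bytes 0-1 and the IPv4 address from bytes 2-5. The helper names are invented, and the byte positions are an assumption based on the layout described above, where sin_port immediately follows the family field. In real code you would normally cast back to struct sockaddr_in * instead.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: recovers the port (host byte order) from sa_data. */
uint16_t port_from_sa_data(const struct sockaddr *sa)
{
    uint16_t port_net;
    memcpy(&port_net, sa->sa_data, sizeof port_net);      /* bytes 0-1 */
    return ntohs(port_net);
}

/* Hypothetical helper: recovers the IPv4 address (host byte order). */
uint32_t ipv4_from_sa_data(const struct sockaddr *sa)
{
    uint32_t addr_net;
    memcpy(&addr_net, sa->sa_data + 2, sizeof addr_net);  /* bytes 2-5 */
    return ntohl(addr_net);
}
```

memcpy is used rather than a pointer cast so that the reads do not depend on the alignment of sa_data.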

user529758's answer already explains the reason to do this.


This is because bind() can bind other types of sockets than IP sockets, for instance Unix domain sockets, which use sockaddr_un as their address type. The address of an AF_INET socket consists of a host and a port, whereas the address of an AF_UNIX socket is a filesystem path.
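A function that accepts the generic pointer can look at sa_family to decide which concrete structure it was really given. The sketch below shows that idea in user code. describe_address is a made-up helper, not how any particular kernel implements bind(), and the output format is arbitrary.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: formats an address according to its family.
 * Returns 0 on success, -1 for an unsupported family or a failure. */
int describe_address(const struct sockaddr *sa, char *out, size_t outlen)
{
    switch (sa->sa_family) {
    case AF_INET: {
        /* The family says this is really a struct sockaddr_in. */
        const struct sockaddr_in *in = (const struct sockaddr_in *)sa;
        char ip[INET_ADDRSTRLEN];
        if (inet_ntop(AF_INET, &in->sin_addr, ip, sizeof ip) == NULL)
            return -1;
        snprintf(out, outlen, "inet %s:%u", ip, (unsigned)ntohs(in->sin_port));
        return 0;
    }
    case AF_UNIX: {
        /* The family says this is really a struct sockaddr_un. */
        const struct sockaddr_un *un = (const struct sockaddr_un *)sa;
        size_t len = strnlen(un->sun_path, sizeof un->sun_path);
        snprintf(out, outlen, "unix %.*s", (int)len, un->sun_path);
        return 0;
    }
    default:
        return -1;
    }
}
```

Passing a sockaddr_in for 127.0.0.1 port 8080 should give "inet 127.0.0.1:8080", and a sockaddr_un with the path /tmp/demo.sock should give "unix /tmp/demo.sock".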