
Which addrinfo struct should be used in connect()?

I am writing a program that connects to different websites and requests and downloads web pages. I am doing this in large part to learn and properly understand web programming. I would like to know whether the linked list of struct addrinfo returned by getaddrinfo is arranged in any particular order, and if so, whether the IP address chosen to connect to matters in any way.

For example, if I run getaddrinfo("google.com", "http", &hints, &res), res will sometimes have up to seven internet addresses. Does it make a difference in any way if I connect to the first one or the last one? Please note that I have studied the manual pages for this function and to my understanding, my question is not answered there.

FutureSci asked Feb 07 '15 15:02



1 Answer

Since you have multiple addrinfo structures organized in a linked list, you should iterate over it and try to connect until a connection is successful. That is:

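struct addrinfo *ptr = res;

while (ptr != NULL) {
     // socket_fd should be a socket created for this entry
     int rc = connect(socket_fd, ptr->ai_addr, ptr->ai_addrlen);
     if (rc == 0)
         break; // we managed to connect successfully
     // handle error (check errno), then move on to the next address
     ptr = ptr->ai_next;
}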

The loop is needed because a DNS lookup can return multiple entries, which is why getaddrinfo hands them back as a linked list. If connect succeeds, you're done; if it fails, advance the pointer to the next element (ai_next) and try the next address the lookup returned. Moreover, connect can fail for several reasons, so check errno to decide whether a further attempt makes sense. As @R.. pointed out, you also have to release the previous socket and pass connect a new one for each attempt, because the address family may change between entries; getaddrinfo helps here, since each addrinfo node carries the values to pass to socket() (ai_family, along with ai_socktype and ai_protocol).
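To make the point about creating a new socket per entry concrete, here is a minimal sketch of the whole sequence. The function name connect_to_first_working is hypothetical (it is not part of any library), and the choice of SOCK_STREAM in the hints is an assumption that suits an HTTP client; treat it as an illustration rather than a definitive implementation.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L /* expose getaddrinfo under strict -std=c99 */
#endif
#include <assert.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the answer): resolve host/service and
 * return a connected TCP socket, or -1 if every address failed. */
int connect_to_first_working(const char *host, const char *service)
{
    struct addrinfo hints, *res, *ptr;
    int fd = -1;

    assert(host != NULL && service != NULL);

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;     /* accept both IPv4 and IPv6 */
    hints.ai_socktype = SOCK_STREAM; /* assumption: TCP only */

    int rc = getaddrinfo(host, service, &hints, &res);
    if (rc != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
        return -1;
    }

    /* Walk the list in the order getaddrinfo returned it. */
    for (ptr = res; ptr != NULL; ptr = ptr->ai_next) {
        /* New socket per entry: the address family may differ. */
        fd = socket(ptr->ai_family, ptr->ai_socktype, ptr->ai_protocol);
        if (fd == -1)
            continue; /* this family/type is unusable here, try the next */
        if (connect(fd, ptr->ai_addr, ptr->ai_addrlen) == 0)
            break;    /* connected */
        perror("connect");
        close(fd);    /* release the failed socket before retrying */
        fd = -1;
    }

    freeaddrinfo(res);
    return fd;
}
```

A caller would use it as int fd = connect_to_first_working("google.com", "http"); and check for -1 before writing the request. Closing the socket on every failed attempt keeps descriptors from leaking when several addresses are tried.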

However, this is typically unnecessary: the first result will generally work. Personally, I have never needed to loop through the linked list, but it is good to know how in case you ever do.

From getaddrinfo(3):

There are several reasons why the linked list may have more than one addrinfo structure, including: the network host is multihomed, accessible over multiple protocols (e.g., both AF_INET and AF_INET6); or the same service is available from multiple socket types (one SOCK_STREAM address and another SOCK_DGRAM address, for example). Normally, the application should try using the addresses in the order in which they are returned. The sorting function used within getaddrinfo() is defined in RFC 3484; the order can be tweaked for a particular system by editing /etc/gai.conf (available since glibc 2.5).
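To see the order described in the manual page for a given host, the list can be walked and each entry turned into a numeric string with getnameinfo. The sketch below is only an illustration: list_addresses and ADDR_TEXT_LEN are hypothetical names, and setting ai_socktype to SOCK_STREAM is an assumption that removes the duplicate entries you would otherwise get per socket type.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L /* expose getaddrinfo under strict -std=c99 */
#endif
#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

#define ADDR_TEXT_LEN 64 /* assumed buffer size, enough for a numeric IPv6 address */

/* Hypothetical helper: copy up to `max` numeric addresses for host/service
 * into `out`, in the order getaddrinfo returned them. Returns the number
 * of entries written, or -1 if the lookup failed. */
int list_addresses(const char *host, const char *service,
                   char out[][ADDR_TEXT_LEN], int max)
{
    struct addrinfo hints, *res, *ptr;
    int n = 0;

    assert(out != NULL && max >= 0);

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;     /* both AF_INET and AF_INET6 */
    hints.ai_socktype = SOCK_STREAM; /* one entry per address, not per socket type */

    if (getaddrinfo(host, service, &hints, &res) != 0)
        return -1;

    for (ptr = res; ptr != NULL && n < max; ptr = ptr->ai_next) {
        /* NI_NUMERICHOST: format the address itself, no reverse lookup. */
        if (getnameinfo(ptr->ai_addr, ptr->ai_addrlen, out[n], ADDR_TEXT_LEN,
                        NULL, 0, NI_NUMERICHOST) == 0)
            n++;
    }

    freeaddrinfo(res);
    return n;
}
```

Calling it with "google.com" and "http" and printing out[0] through out[n-1] shows the addresses in the order an application is expected to try them; the first entry is the one a single connect attempt would use.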

edmz answered Oct 05 '22 16:10
