
On Linux, when performing a socket bind with port 0 (pick a random port) using C, I get errno 98, Address already in use. How is that possible?

We have a long-standing, well-established commercial product, and I've never seen this type of issue before. We use a client program to send data to a server. Because of firewalls in customer environments, we sometimes allow the end user to specify a range of outbound ports to bind. In this particular case, however, we're not doing that; we're binding with port 0. From everything I've read, this means the kernel picks a random port. What I can't find out is what that means to the kernel/OS: if I'm asking for a random port, how can it already be in use? Strictly speaking, only the combination of source IP/port and destination IP/port makes a connection unique. I believe the same source port can be reused when talking to a different destination IP, but maybe that's not relevant here.

Also, this doesn't happen on all of the customer's systems, only some, so it may be load related. I'm told the systems are fairly busy.

Here is the code we're using. I left out some of the #ifdef code for Windows and, for brevity, what we do after the bind.

    _SocketCreateClient(Socket_pwtP sock, SocketInfoP sInfo )
    {
        int nRetries;                       /* number of times to try connect()  */
        unsigned short port;
        BOOL success = FALSE;
        BOOL gotaddr = FALSE;
        char buf[INET6_ADDRSTRLEN] ="";
        int connectsuccess =1;
        int ipv6compat =0;

    #ifdef SOCKET_SEND_TIMEOUT
        struct timeval time;
    #endif /* SOCKET_SEND_TIMEOUT */

        nRetries = sInfo->si_nRetries;
        sock->s_hostName = strdup(sInfo->si_hostName);

    #ifdef DEBUG_SOCKET
        LogWrite(LogF,LOG_WARNING,"Socket create client");
        LogWrite(LogF,LOG_WARNING,"Number of retries = %d", nRetries);
    #endif

        ipv6compat = GetIPVer();
        if (ipv6compat == -1) /* ipv6 not supported */
            gotaddr = GetINAddr(sInfo->si_hostName, &sock->s_sAddr.sin_addr);
        else
            gotaddr = GetINAddr6(sInfo->si_hostName, &sock->s_sAddr6.sin6_addr);

        /* translate supplied host name to an internet address */
        if (!gotaddr) {
            /* print this message only once */
            if ( sInfo->si_logInfo && ( sInfo->si_nRetries == 1 ) )
            {
                LogWrite(LogF, LOG_ERR,
                         "unable to resolve ip address for host '%s'", sInfo->si_hostName);
            }
            sock = _SocketDestroy(sock);
        }
        else {

            if (ipv6compat == 1) /* ipv6 supported */
            {
                /* try to print the address in sock->s_sAddr6.sin6_addr to make sure it's good.  from call above */
                LogWrite(LogF, LOG_DEBUG2, "Before call to inet_ntop");
                inet_ntop(AF_INET6, &sock->s_sAddr6.sin6_addr, buf, sizeof(buf));
                LogWrite (LogF, LOG_DEBUG2, "Value of sock->s_sAddr6.sin6_addr from GetINAddr6: %s", buf);

                LogWrite (LogF, LOG_DEBUG2, "Value of sock->s_sAddr6.sin6_scope_id from if_nametoindex: %d", sock->s_sAddr6.sin6_scope_id);

                LogWrite (LogF, LOG_DEBUG2, "Value of sock->s_type: %d", sock->s_type);
            }

            /* try to create the socket nRetries times */
            while (sock && sock->s_id == INVALID_SOCKET) {
                int socketsuccess = FALSE;

                /* create the actual socket */
                if (ipv6compat == -1) /* ipv6 not supported */
                    socketsuccess = sock->s_id = socket(AF_INET, sock->s_type, 0);
                else
                    socketsuccess = sock->s_id = socket(AF_INET6, sock->s_type, 0);

                if ((socketsuccess) == INVALID_SOCKET) {
                    GETLASTERROR;
                    LogWrite(LogF, LOG_ERR, "unable to create socket: Error %d: %s", errno,
                             strerror(errno) );
                    sock = _SocketDestroy(sock);
                }
                else
                {
                    /* cycle through outbound port range for firewall support */
                    port = sInfo->si_startPortRange;
                    while ( !success && port <= sInfo->si_endPortRange ) {
                        int bindsuccess = 1;

                        /* bind to outbound port number */
                        if ( ipv6compat == -1) /* ipv6 not supported */
                        {
                            sock->s_sourceAddr.sin_port   = htons(port);
                            bindsuccess = bind(sock->s_id, (struct sockaddr *) &sock->s_sourceAddr,
                                               sizeof(sock->s_sourceAddr));
                        }
                        else {
                            sock->s_sourceAddr6.sin6_port   = htons(port);
                            inet_ntop(AF_INET6, &sock->s_sourceAddr6.sin6_addr, buf, sizeof(buf));
                            LogWrite(LogF, LOG_DEBUG,
                                     "attempting bind to s_sourceAddr6 %s ", buf);

                            bindsuccess = bind(sock->s_id, (struct sockaddr *) &sock->s_sourceAddr6,
                                               sizeof(sock->s_sourceAddr6));
                        }

                        if (bindsuccess == -1) {
                            GETLASTERROR;
                            LogWrite(LogF, LOG_ERR,
                                     "unable to bind port %d to socket: Error %d: %s. Will attempt next port if protomgr port rules configured(EAV_PORTS).", port, errno, strerror(errno) );

                            /* if port in use, try next port number */
                            port++;
                        }
                        else {
                            /* only log if outbound port was specified */
                            if (port != 0)
                            {
                                if ( sInfo->si_sourcehostName ) {
                                    LogWrite(LogF, LOG_DEBUG,
                                             "bound outbound address %s:%d to socket",
                                             sInfo->si_sourcehostName, port);
                                }
                                else {
                                    LogWrite(LogF, LOG_DEBUG,
                                             "bound outbound port %d to socket", port);
                                }
                            }
                            success = TRUE;
                        }
                    }
                }
            }
        }
        return(sock);
    }

The errors we're seeing in our log file look like this. It makes two attempts, and both fail:

protomgr[628453] : ERROR: unable to bind port 0 to socket: Error 98: Address already in use. Will attempt next port if protomgr port rules configured(EAV_PORTS).

protomgr[628453] : ERROR: unable to bind port(s) to socket: Error 98: Address already in use. Consider increase the number of EAV_PORTS if this msg is from protomgr.

protomgr[628453] : ERROR: unable to bind port 0 to socket: Error 98: Address already in use. Will attempt next port if protomgr port rules configured(EAV_PORTS).

protomgr[628453] : ERROR: unable to bind port(s) to socket: Error 98: Address already in use. Consider increase the number of EAV_PORTS if this msg is from protomgr.

asked Nov 19 '15 20:11 by Jeffery K



1 Answer

It turned out this was caused by the system running out of available ports: it was configured so that only about 9000 ports were available.

This setting in /etc/sysctl.conf controls the range of available ports: net.ipv4.ip_local_port_range = 9000 65500

The first number is the lowest port in the range and the second is the highest. This example was taken from an unaltered SUSE Linux Enterprise Server 11.0 system. The customer who reported this problem had theirs configured so that only around 9000 ports were available in the range they defined, and all of them were in use.

Hopefully, this helps someone else in the future.

answered Mar 13 '23 08:03 by Jeffery K