I'm reading and trying to get an idea of C, and I tried to program a Java chat with UDP and TCP a couple of years back, but as much as I tried, I could not pull it off.
I want to program sockets and I'm reading tons of documentation, but there is always a part that is unclear; every piece of documentation I find has a flaw.
For example, there is one about
int socket(int domain, int type, int protocol);
The domain I will use is clearly AF_INET, and if I want a TCP socket I think type should be SOCK_STREAM, but what is protocol? The documentation says it should be 0... why? What is it?
From the man page for socket:
The protocol specifies a particular protocol to be used with the socket. Normally only a single protocol exists to support a particular socket type within a given protocol family, in which case protocol can be specified as 0. However, it is possible that many protocols may exist, in which case a particular protocol must be specified in this manner. The protocol number to use is specific to the “communication domain” in which communication is to take place; see protocols(5). See getprotoent(3) on how to map protocol name strings to protocol numbers.
According to the man page for protocols:
This file is a plain ASCII file, describing the various DARPA internet protocols that are available from the TCP/IP subsystem. It should be consulted instead of using the numbers in the ARPA include files, or, even worse, just guessing them. These numbers will occur in the protocol field of any IP header.
Each line is of the following format:
protocol number aliases ...
...
/etc/protocols The protocols definition file.
And in the /etc/protocols file on my linux box:
ip 0 IP # internet protocol, pseudo protocol number
hopopt 0 HOPOPT # hop-by-hop options for ipv6
icmp 1 ICMP # internet control message protocol
igmp 2 IGMP # internet group management protocol
ggp 3 GGP # gateway-gateway protocol
ipencap 4 IP-ENCAP # IP encapsulated in IP (officially ``IP'')
st 5 ST # ST datagram mode
tcp 6 TCP # transmission control protocol
cbt 7 CBT # CBT, Tony Ballardie <[email protected]>
egp 8 EGP # exterior gateway protocol
igp 9 IGP # any private interior gateway (Cisco: for IGRP)
bbn-rcc 10 BBN-RCC-MON # BBN RCC Monitoring
...
And according to the man page for getprotobyname:
The getprotobyname() function returns a protoent structure for the entry from the database that matches the protocol name name. A connection is opened to the database if necessary.
...
The protoent structure is defined in <netdb.h> as follows:
struct protoent {
    char  *p_name;       /* official protocol name */
    char **p_aliases;    /* alias list */
    int    p_proto;      /* protocol number */
}
So if you pass "ip" to getprotobyname(), the p_proto field of the protoent it returns is 0, which is the number you are using anyway. But using 0 directly is always safe even if you don't know the name of the protocol.
The last parameter of socket(), protocol, also matters for raw sockets. I will try to explain it practically.
If you are using raw sockets to get packets from the TCP stack, this parameter controls how much of the packet you build yourself when sending and how much you get back when receiving.
socket (AF_INET, SOCK_RAW, IPPROTO_TCP);
The above call gives you a raw socket on which the kernel takes care of the packet up to the IP header. When sending, you have to fill in the rest of the packet manually; when reading, the kernel provides the contents of the TCP header along with the data.
On the other hand:
socket (AF_INET, SOCK_RAW, IPPROTO_RAW);
Using IPPROTO_RAW, you control the packet from the IP layer upwards, i.e. the kernel provides services only up to the Ethernet header, and the rest of the packet, including the IP header, is in your control.
There may be different protocols to support a particular socket type, which is why you can also specify the protocol in socket(2).
From the manpage (emphasis mine):
The protocol specifies a particular protocol to be used with the socket. Normally only a single protocol exists to support a particular socket type within a given protocol family, in which case protocol can be specified as 0. However, it is possible that many protocols may exist, in which case a particular protocol must be specified in this manner.
So it is not mandatory to specify the protocol as 0. Actually, 0 means that the system will pick the default protocol for the given domain and type for you. But you can also specify it explicitly, and it is perfectly valid to do so.
On Linux, you can see the available protocols by doing:
$ cat /etc/protocols
# Internet (IP) protocols
#
# Updated from http://www.iana.org/assignments/protocol-numbers and other
# sources.
# New protocols will be added on request if they have been officially
# assigned by IANA and are not historical.
# If you need a huge list of used numbers please install the nmap package.
ip 0 IP # internet protocol, pseudo protocol number
hopopt 0 HOPOPT # IPv6 Hop-by-Hop Option [RFC1883]
icmp 1 ICMP # internet control message protocol
igmp 2 IGMP # Internet Group Management
ggp 3 GGP # gateway-gateway protocol
ipencap 4 IP-ENCAP # IP encapsulated in IP (officially ``IP'')
st 5 ST # ST datagram mode