This is just a general question relating to some high-performance computing I've been wondering about. A certain low-latency messaging vendor's supporting documentation talks about using raw sockets to transfer data directly from the network device to the user application, and claims that this reduces messaging latency even further than its other (admittedly carefully thought-out) design decisions already do.
My question is therefore to those who grok the networking stacks on Unix or Unix-like systems. How much difference is this method actually likely to make? Feel free to answer in terms of memory copies, numbers of whales rescued, or areas the size of Wales ;)
Their messaging is UDP-based, as I understand it, so there's no problem with establishing TCP connections etc. Any other points of interest on this topic would be gratefully received!
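For concreteness, here is a minimal sketch of what I understand "receiving UDP over a raw socket" to mean on Linux: the kernel hands the application the whole IP packet, and the application parses the IP and UDP headers itself. The port number is just a placeholder, and it needs root (CAP_NET_RAW):

```c
/* Sketch: receiving UDP datagrams via a raw IPv4 socket on Linux.
 * The kernel delivers the full IP packet, so we parse headers ourselves.
 * Port 5555 is a placeholder. Requires CAP_NET_RAW. */
#include <stdio.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/ip.h>
#include <netinet/udp.h>
#include <sys/socket.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_RAW, IPPROTO_UDP);
    if (fd < 0) { perror("socket"); return 1; }

    unsigned char buf[65536];
    for (;;) {
        ssize_t n = recv(fd, buf, sizeof(buf), 0);
        if (n < 0) { perror("recv"); break; }

        struct iphdr *ip = (struct iphdr *)buf;
        size_t iphl = ip->ihl * 4;                 /* IP header length */
        if ((size_t)n < iphl + sizeof(struct udphdr))
            continue;                              /* runt packet */
        struct udphdr *udp = (struct udphdr *)(buf + iphl);

        if (ntohs(udp->dest) != 5555)              /* placeholder port */
            continue;                              /* not our traffic */

        size_t plen = n - iphl - sizeof(struct udphdr);
        printf("got %zu payload bytes from %s\n",
               plen, inet_ntoa((struct in_addr){ .s_addr = ip->saddr }));
    }
    close(fd);
    return 0;
}
```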
Best wishes,
Mike
To reduce latency in high-performance messaging, you should avoid going through a kernel driver at all. The smallest latency is achieved with user-space drivers (Myrinet's MX does this, and InfiniBand probably does too).
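Those are proprietary stacks, but stock Linux has a mechanism with a similar flavour that gives a taste of taking the kernel off the per-packet fast path: PACKET_MMAP maps a ring of received frames straight into user space, so the hot path needs no per-packet syscall and no copy out of the ring. A rough sketch (TPACKET_V1; interface binding and error recovery omitted, buffer sizes are arbitrary):

```c
/* Sketch: zero-copy-ish receive via an AF_PACKET mmap'ed RX ring.
 * Frames land in a shared ring; we flip status flags instead of
 * making a syscall per packet. Requires CAP_NET_RAW. */
#include <stdio.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/mman.h>
#include <linux/if_packet.h>
#include <linux/if_ether.h>
#include <arpa/inet.h>

int main(void)
{
    int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (fd < 0) { perror("socket"); return 1; }

    struct tpacket_req req = {
        .tp_block_size = 4096,
        .tp_block_nr   = 64,
        .tp_frame_size = 2048,
        .tp_frame_nr   = 128,   /* block_size * block_nr / frame_size */
    };
    if (setsockopt(fd, SOL_PACKET, PACKET_RX_RING, &req, sizeof(req)) < 0) {
        perror("PACKET_RX_RING"); return 1;
    }

    size_t ring_len = (size_t)req.tp_block_size * req.tp_block_nr;
    unsigned char *ring = mmap(NULL, ring_len, PROT_READ | PROT_WRITE,
                               MAP_SHARED, fd, 0);
    if (ring == MAP_FAILED) { perror("mmap"); return 1; }

    unsigned frame = 0;
    for (;;) {
        struct tpacket_hdr *hdr =
            (struct tpacket_hdr *)(ring + (size_t)frame * req.tp_frame_size);

        if (!(hdr->tp_status & TP_STATUS_USER)) {
            /* Nothing ready: block in poll(). A latency-obsessed
             * receiver would pin a core and busy-spin here instead. */
            struct pollfd pfd = { .fd = fd, .events = POLLIN };
            poll(&pfd, 1, -1);
            continue;
        }

        /* Packet data starts tp_mac bytes into the frame. */
        printf("frame %u: %u bytes on the wire\n", frame, hdr->tp_len);

        hdr->tp_status = TP_STATUS_KERNEL;   /* hand the slot back */
        frame = (frame + 1) % req.tp_frame_nr;
    }
}
```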
There is a rather good (but slightly outdated) overview of Linux networking internals, "A Map of the Networking Code in Linux Kernel 2.4.20", which includes diagrams of the TCP/UDP data paths.
Using raw sockets will make the path of TCP packets a bit shorter (thanks for the idea): the kernel's TCP code will not add its latency, but the user must then handle the whole TCP protocol themselves. There is some chance of optimizing it for specific situations: code for clusters doesn't need to handle long-distance or slow links the way the default TCP/UDP stack must.
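As a taste of what "handling the protocol yourself" means even for plain UDP, here is a sketch that builds the IP and UDP headers by hand and sends them through a raw socket. The address and ports are placeholders; note that on Linux, leaving the IP checksum and source address zero lets the kernel fill them in, whereas a real user-space stack would compute its own checksums:

```c
/* Sketch: sending one UDP datagram with hand-built IP/UDP headers,
 * i.e. doing the work the kernel normally does. Needs CAP_NET_RAW.
 * Destination 192.0.2.1 (TEST-NET) and both ports are placeholders. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/ip.h>
#include <netinet/udp.h>
#include <sys/socket.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_RAW, IPPROTO_RAW); /* IP_HDRINCL implied */
    if (fd < 0) { perror("socket"); return 1; }

    const char payload[] = "hello";
    unsigned char pkt[sizeof(struct iphdr) + sizeof(struct udphdr)
                      + sizeof(payload)];

    struct iphdr *ip = (struct iphdr *)pkt;
    memset(ip, 0, sizeof(*ip));
    ip->version  = 4;
    ip->ihl      = 5;
    ip->ttl      = 64;
    ip->protocol = IPPROTO_UDP;
    ip->tot_len  = htons(sizeof(pkt));
    ip->daddr    = inet_addr("192.0.2.1");
    /* ip->saddr and ip->check left zero: Linux fills them in. */

    struct udphdr *udp = (struct udphdr *)(pkt + sizeof(struct iphdr));
    udp->source = htons(40000);                /* placeholder ports */
    udp->dest   = htons(5555);
    udp->len    = htons(sizeof(struct udphdr) + sizeof(payload));
    udp->check  = 0;                           /* optional for IPv4 UDP */

    memcpy(pkt + sizeof(struct iphdr) + sizeof(struct udphdr),
           payload, sizeof(payload));

    struct sockaddr_in dst = { .sin_family = AF_INET,
                               .sin_addr.s_addr = ip->daddr };
    if (sendto(fd, pkt, sizeof(pkt), 0,
               (struct sockaddr *)&dst, sizeof(dst)) < 0)
        perror("sendto");
    close(fd);
    return 0;
}
```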
I'm very interested in this topic too.