Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

"Lost" UDP packets (JBoss + DatagramSocket)

Tags:

java

udp

datagram

I develop part of some JBoss+EJB based enterprise application. My module needs to process huge amount of incoming UDP packets. I've done some load testing and it looks that in case of sending packets with 11ms interval everything is fine, but in case of 10ms interval some packets are lost. It's rather strange in my opinion, but I done 10/11ms interval load tests comparison several times and it is always the same result (10 ms - some "lost" packets, 11ms - everything's fine).

If it was something wrong with synchronization, I'd expect that it will also be visible in case of 11ms tests (at least one packet lost, or at least one wrong counter value). So if it is not because of synchronization, then maybe DatagramSocket through which I receive packets doesn't work as expected.

I found that receive buffer size (SO_RCVBUF) has default 57344 value (probably it's underlying IO network buffers dependent). I suspect, that maybe when this buffer goes full, then new incoming UDP datagrams are rejected. I tried set this value to some higher, but I noticed that if I exaggerate, buffer returns to its default size. If it's underlying layer dependent how can I find out maximum buffer size for certain OS/network card from JBoss level?

Is it possible that it is caused by receive buffer size, or maybe 57344 value is big enough to handle most cases? Do you have any experience with such issues?

There is no timeout set on my DatagramSocket. My UDP datagrams contains about 70 bytes of data (value without datagram header included).

[Edited] I have to use UDP because I receive Cisco Netflow data - it is protocol used by network devices to send some traffic statistics. Also, I have no influence on sent bytes format (e.g. I cannot add counters for packets and so on). It is not expected that all packets will be processed (some datagrams may be lost), but I'd expect that I will process most of packets. During 10ms interval tests, about 30% of packets were lost.

It is not very possible that slow processing causes this issue. Currently singleton component holds reference to DatagramSocket calling receive method in a loop. When packet is received, it is passed to the queue, and processed by picked from pool stateless component. "Facade" Singleton is only responsible for receiving packets and passing it on to the processing (it does not wait for processing complete event).

Thanks in advance, Piotr

like image 251
omnomnom Avatar asked Feb 23 '11 18:02

omnomnom


2 Answers

UDP does not guarantee delivery, so you can tweak parameters, but you can't guarantee that the message will get delivered, especially in the case of very large data transfers.

If you need to guarantee delivery, you should use TCP instead.

If you need (or want) to use UDP, you can encode each packet with a number, and also send the number of packets expected. For example, if you sent 10 large packets, you could include the information: packet 1/10, packet 2/10, etc. This way you can at least tell if you have not received all of the packets. If you have not received them, you could send a request to resend those missing packets.

like image 111
Jeff Storey Avatar answered Sep 25 '22 10:09

Jeff Storey


UDP is inherently unreliable.

Datagrams can be thrown away at any point between sender and receiver, even within the receiver at a level below your code. Setting the recv buffer to a larger size is likely to help the networking code within your machine buffer more datagrams but you should expect that some datagrams will be lost anyway.

If your recv logic takes too long (i.e. longer than it takes for a new datagram to arrive) then you'll always be behind and you'll always miss datagrams eventually. All you can do is make sure that your recv code runs as fast as possible, perhaps move the inbound datagram to a queue and process it 'later' or on another thread but then that will just move your problem to being one where you have a queue that keeps growing.

[Re your edit...] And what's processing your queue and how does the locking work between the producer and the consumers? Change your code so that the recv logic simply increments a count and discards the data and loops back around and see if you're losing fewer datagrams; either way, UDP is unreliable, you WILL have datagrams that are discarded and you should just expect that and deal with it. Worrying about it means you're focusing on the wrong problem; make use of the data you DO get and assume that you wont get much of it and then your program will work even if the network gets congested and MOST of your datagrams get discarded.

In summary, that's just how is it with UDP.

like image 40
Len Holgate Avatar answered Sep 25 '22 10:09

Len Holgate