
Severe UDP packet loss on some Android devices

I have scoured the interwebz with no result. We are facing a problem where some Android devices experience severe packet loss. To give some background, the application connects to a specific Wi-Fi network and listens for UDP packets broadcast on port 17216. Each packet carries 832 bytes of payload, excluding headers, and packets are sent at a steady rate of four per second.

We have only met the problem on two devices, a low-end Turbox Rubik II tablet and an ASUS Memo Pad HD 7. The other devices we've tested (phones and tablets) all gather the packets at the stipulated regular interval.

The function that receives the packets is this:

public void run()
{
    while (isUDPServerRunning)
    {
        try
        {
            socket.receive(packet);

            ProcessRawPacketData();

            DisplayLoggingInfo();

        }
        catch (IOException e)
        {
            Log.e("receive", e.getMessage());
            e.printStackTrace();
        }
    }
}

And that is part of a Runnable. The socket is created thus:

byte[] buffer = new byte[1024];

DatagramSocket socket;
DatagramPacket packet = new DatagramPacket(buffer, buffer.length);

with the socket being initialized in the onCreate() method of our Service extension:

socket = new DatagramSocket(SERVERPORT);

The packets are being received by the Wifi module. We've confirmed that by rooting one of the devices and installing a packet sniffer, so the problem must somehow be code related.

On the affected devices packets are received correctly for a couple of seconds and then there is complete dropout that lasts for several seconds, so I estimate the loss to exceed 50%.

Any help would be much appreciated. We are pulling our hair out.

Update: I was mistaken about the packet sniffer. It seems that the packet sniffer is also missing several relevant packets on the rooted device. Sometimes, though, simply starting the packet sniffer fixes the issue! Turning Bluetooth on and off, as suggested below, does not seem to make a difference. Could this be another hardware issue?

Update 2: Here is an example of the logs I'm printing immediately after the socket.receive() line. Notice how it skips half a minute's worth of packets and then works fine for a few seconds.

05-25 15:44:38.670: D/LOG(4393): Packet Received
05-25 15:44:38.941: D/LOG(4393): Packet Received
05-25 15:45:09.482: D/LOG(4393): Packet Received
05-25 15:45:09.716: D/LOG(4393): Packet Received
05-25 15:45:09.928: D/LOG(4393): Packet Received
05-25 15:45:10.184: D/LOG(4393): Packet Received
05-25 15:45:10.451: D/LOG(4393): Packet Received
05-25 15:45:10.661: D/LOG(4393): Packet Received
Kristian D'Amato asked May 25 '15 07:05



2 Answers

Packet loss (as you know, of course) can happen at multiple stages along the transmission:

  1. Sending from the server
  2. Transmission over the network
  3. Physical reception at the client and handling in hardware
  4. Processing/buffering of the packet in the kernel/OS
  5. Handling/buffering of the packet in your app.

You can quickly check whether point 1 or 2 is the issue by having other devices listen for the same broadcast while connected to the same Wi-Fi router. It sounds like you already did this and found no issue. (Note that a packet dropped in step 2, or sometimes even step 1, might still appear in a Wireshark dump taken on the server.)

Points 3 through 5 are therefore likely to be the problem and they might be a little harder to separate out.

Here are a couple of things that might help:

  • Like @Mick suggested, don't just print out when you received the packet, but give every packet an increasing ID number to figure out whether you actually lost a packet or whether it was just delayed.
  • Move your packet-receiving code into its own thread (if it isn't already) and set the priority of that thread to MAX_PRIORITY to minimize the chance that your code is holding up the lunch line. Given that the Memo Pad is a quad-core 1.2GHz machine, MAX_PRIORITY shouldn't even be necessary, but if you aren't currently running the receive loop in its own dedicated thread, you might see hiccups anyway. If this fixes things, simply have a minimal receive loop stick the packets into your own buffer queue and have an independent thread process them.
  • Check and, if needed, increase the size of the socket's receive buffer via setReceiveBufferSize(...) on the DatagramSocket. Make sure you specify a size that can hold many packets. Given that running the packet sniffer sometimes seems to help, it does sound like there might be some socket setting that improves things and that the sniffer happens to set.
  • On the server you can also add a tag to the packet that tells all involved devices how to treat the packet. If you call setTrafficClass(...) on the sending socket with the IPTOS_RELIABILITY value (0x04), you are asking everyone involved to optimize their packet handling for maximum reliability. Not all devices will care, but it may make a difference.
  • You can try to use DatagramChannels instead of DatagramSockets and then use select() to wait for the next packet to read. While this technically should not make a difference, sometimes using a different API call can provide a work-around for an issue.
  • Unfortunately Android is a very heterogeneous environment where many manufacturers will provide their own kernel modules, etc. This also introduces various incompatibilities or non-standard behavior everywhere. You might be able to find a custom ROM (Cyanogen, etc.?) for one or both of your problem-devices. If installing that instead of the factory ROM fixes your problem, then it's a bug in the manufacturer provided (kernel) network drivers, in which case, you might get lucky to find a work-around, or you could maybe file a bug-report with them, but in general, you might just have to select those devices as unsupported in the Play Store to avoid bad reviews...
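The dedicated receive thread and the larger socket buffer suggested in the list above could look roughly like this. It is a minimal sketch in plain Java, not the asker's actual service: the class name QueuedUdpReceiver, the 256 KB buffer request, and the 1024-entry queue are assumptions chosen for illustration, and the OS is free to grant a smaller buffer than requested.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.SocketException;
import java.util.Arrays;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical helper: receives datagrams on a dedicated thread and queues them. */
final class QueuedUdpReceiver implements Runnable {
    private final DatagramSocket socket;
    // Bounded queue (size is an assumption) so a stalled consumer cannot exhaust memory.
    private final BlockingQueue<byte[]> queue = new LinkedBlockingQueue<>(1024);
    private volatile boolean running = true;

    QueuedUdpReceiver(int port) throws SocketException {
        socket = new DatagramSocket(port);
        // Ask for room for many 832-byte packets; this is only a hint to the OS.
        socket.setReceiveBufferSize(256 * 1024);
    }

    BlockingQueue<byte[]> queue() { return queue; }

    int localPort() { return socket.getLocalPort(); }

    /** The buffer size the OS actually granted, which may differ from the request. */
    int grantedBufferSize() throws SocketException { return socket.getReceiveBufferSize(); }

    Thread start() {
        Thread t = new Thread(this, "udp-receiver");
        t.setPriority(Thread.MAX_PRIORITY);
        t.setDaemon(true);
        t.start();
        return t;
    }

    @Override
    public void run() {
        byte[] buffer = new byte[1024];
        DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
        while (running) {
            try {
                socket.receive(packet);
                // Copy only the received bytes, because the buffer is reused.
                int from = packet.getOffset();
                byte[] payload = Arrays.copyOfRange(packet.getData(), from, from + packet.getLength());
                queue.offer(payload);            // silently drops if the queue is full
                packet.setLength(buffer.length); // defensive reset before the next receive
            } catch (IOException e) {
                if (running) {
                    e.printStackTrace();
                }
            }
        }
    }

    void stop() {
        running = false;
        socket.close(); // unblocks a pending receive()
    }
}
```

A separate consumer thread would then call queue().take() and do the parsing and logging there, so that slow processing never delays the next socket.receive(). Logging grantedBufferSize() once at startup shows whether the device honored the request.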

Finally, here is a work-around that should fix the issue for sure:

Add some code to your client that detects dropped packets and, if the drop-rate goes too high, opens a TCP connection to the server instead, which will then guarantee packet delivery. Given that your packets are small and infrequent and that only a few devices will ever need to use this mechanism, I don't think that this should cause a problem for your server load. If you don't have a way to change the server code to provide a TCP stream, you could write an independent proxy-server that collects the UDP packets and makes them available via TCP. If you can run it on the same machine as the original server, you even know what IP address it is at (the same as the source address of the UDP packets that did arrive).
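The drop detection that would trigger the TCP fallback might be sketched as follows. It assumes the sender stamps every packet with an increasing sequence number, which the original packets may not contain; the class name LossMonitor, the window size, and the threshold are all hypothetical.

```java
/**
 * Hypothetical helper: counts gaps in sender-assigned sequence numbers and
 * reports whether the drop rate in the last completed window was too high.
 */
final class LossMonitor {
    private final int windowSize;      // packets expected per evaluation window
    private final double maxDropRate;  // e.g. 0.25 means 25% loss is tolerated
    private long nextExpected = -1;    // -1 until the first packet is seen
    private long expectedInWindow = 0;
    private long lostInWindow = 0;
    private boolean useTcp = false;

    LossMonitor(int windowSize, double maxDropRate) {
        this.windowSize = windowSize;
        this.maxDropRate = maxDropRate;
    }

    /** Call once per received packet with the sequence number it carries. */
    void onPacket(long seq) {
        if (nextExpected < 0) {
            nextExpected = seq; // first packet defines the starting point
        }
        if (seq < nextExpected) {
            return; // late or duplicate packet; this sketch simply ignores it
        }
        long gap = seq - nextExpected; // packets skipped before this one
        lostInWindow += gap;
        expectedInWindow += gap + 1;
        nextExpected = seq + 1;

        if (expectedInWindow >= windowSize) {
            double rate = (double) lostInWindow / expectedInWindow;
            useTcp = rate > maxDropRate;
            expectedInWindow = 0;
            lostInWindow = 0;
        }
    }

    /** True if the most recently completed window exceeded the allowed drop rate. */
    boolean shouldUseTcp() {
        return useTcp;
    }
}
```

Note that this sketch only evaluates when a packet arrives, so a total dropout is noticed at the first packet after it. A real client would probably also want a timeout that fires when nothing has arrived for a few seconds.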

Markus A. answered Nov 15 '22 12:11


Just a wild guess, but how long do your computations on the packet take? Is it possible that the allocated buffer for the socket fills up and starts to drop packets?

I know this sounds unlikely for a transfer rate of roughly 3.3 KB/s (832 bytes, four times per second). But if your computations take longer than 250 ms, then this would occur sooner or later. This would also explain why some devices work like a charm and others don't.

Have you tried removing the computations and just printing the "Packet Received" message for debugging?
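One way to test this guess without removing the computations is to time them against the 250 ms gap between packets. The following is a rough sketch; ProcessingTimer is a hypothetical helper, and the 250 ms budget follows from the four-packets-per-second rate stated in the question.

```java
/** Hypothetical helper: measures how long per-packet work takes against a time budget. */
final class ProcessingTimer {
    private final long budgetNanos;
    private long maxNanos = 0;
    private int overBudget = 0;
    private int samples = 0;

    ProcessingTimer(long budgetMillis) {
        this.budgetNanos = budgetMillis * 1_000_000L;
    }

    /** Runs the given work and records how long it took. */
    void time(Runnable work) {
        long start = System.nanoTime();
        work.run();
        record(System.nanoTime() - start);
    }

    /** Records one measurement; exposed separately so durations can be fed in directly. */
    void record(long elapsedNanos) {
        samples++;
        if (elapsedNanos > maxNanos) {
            maxNanos = elapsedNanos;
        }
        if (elapsedNanos > budgetNanos) {
            overBudget++;
        }
    }

    long maxMillis() { return maxNanos / 1_000_000L; }

    int overBudgetCount() { return overBudget; }

    int samples() { return samples; }
}
```

In the receive loop from the question this could wrap the existing calls, for example timer.time(() -> { ProcessRawPacketData(); DisplayLoggingInfo(); }) with a timer created as new ProcessingTimer(250). If overBudgetCount() keeps climbing on the affected devices only, slow processing is a plausible cause.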

bratkartoffel answered Nov 15 '22 12:11