I have scoured the interwebz with no result. We are facing a problem where some Android devices experience severe packet loss. To give some background, the application connects to a specific Wifi and looks for UDP packets broadcast on port 17216. These packets are of size 832 bytes, excluding the wrapped headers, and are sent at a regular rate of four per second.
We have only met the problem on two devices, a low-end Turbox Rubik II tablet and an ASUS Memo Pad HD 7. The other devices we've tested (phones and tablets) all gather the packets at the stipulated regular interval.
The function that receives the packets is this:
public void run()
{
    while (isUDPServerRunning)
    {
        try
        {
            socket.receive(packet);
            ProcessRawPacketData();
            DisplayLoggingInfo();
        }
        catch (IOException e)
        {
            Log.e("receive", e.getMessage());
            e.printStackTrace();
        }
    }
}
That is part of a Runnable. The socket is created thus:
byte[] buffer = new byte[1024];
DatagramSocket socket;
DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
with the socket being initialized in the onCreate() method of our Service extension:
socket = new DatagramSocket(SERVERPORT);
The packets are being received by the Wifi module. We've confirmed that by rooting one of the devices and installing a packet sniffer, so the problem must somehow be code related.
On the affected devices packets are received correctly for a couple of seconds and then there is complete dropout that lasts for several seconds, so I estimate the loss to exceed 50%.
Any help would be much appreciated. We are pulling our hair out.
Update: I was mistaken about the packet sniffer. It seems that the packet sniffer is also missing several relevant packets on the rooted device. Sometimes, though, simply starting the packet sniffer fixes the issue! Turning Bluetooth on/off as suggested below does not seem to make a difference. Could this be a hardware issue after all?
Update 2: Here is an example of the logs I'm printing immediately after the socket.receive() line. Notice how it skips half a minute's worth of packets and then works fine for a few seconds.
05-25 15:44:38.670: D/LOG(4393): Packet Received
05-25 15:44:38.941: D/LOG(4393): Packet Received
05-25 15:45:09.482: D/LOG(4393): Packet Received
05-25 15:45:09.716: D/LOG(4393): Packet Received
05-25 15:45:09.928: D/LOG(4393): Packet Received
05-25 15:45:10.184: D/LOG(4393): Packet Received
05-25 15:45:10.451: D/LOG(4393): Packet Received
05-25 15:45:10.661: D/LOG(4393): Packet Received
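To put a number on gaps like the one in the log above, a small parser along these lines could flag every interval well above the expected 250 ms. This is only a rough sketch: the class name `GapFinder`, the 1000 ms threshold and the fixed column positions of the timestamp are assumptions based on the log format shown here, and it does not handle a log that crosses midnight.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

public class GapFinder {
    // Lines look like "05-25 15:44:38.670: D/LOG(4393): Packet Received";
    // characters 6..17 hold the time of day.
    private static final DateTimeFormatter TIME =
            DateTimeFormatter.ofPattern("HH:mm:ss.SSS");

    public static LocalTime parseTime(String logLine) {
        return LocalTime.parse(logLine.substring(6, 18), TIME);
    }

    // Returns every gap (in ms) between consecutive lines that exceeds thresholdMs.
    public static List<Long> gapsLongerThan(List<String> lines, long thresholdMs) {
        List<Long> gaps = new ArrayList<>();
        for (int i = 1; i < lines.size(); i++) {
            long ms = Duration.between(parseTime(lines.get(i - 1)),
                                       parseTime(lines.get(i))).toMillis();
            if (ms > thresholdMs) {
                gaps.add(ms);
            }
        }
        return gaps;
    }

    public static void main(String[] args) {
        List<String> log = List.of(
                "05-25 15:44:38.670: D/LOG(4393): Packet Received",
                "05-25 15:44:38.941: D/LOG(4393): Packet Received",
                "05-25 15:45:09.482: D/LOG(4393): Packet Received",
                "05-25 15:45:09.716: D/LOG(4393): Packet Received");
        for (long gap : gapsLongerThan(log, 1000)) {
            // At four packets per second, one packet is expected every 250 ms.
            System.out.println("gap of " + gap + " ms, roughly "
                    + gap / 250 + " packets missed");
        }
    }
}
```

For the excerpt above this reports a single gap of 30541 ms, which at four packets per second corresponds to roughly 122 missed packets.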
Congestion in the network is the primary reason for packet loss in UDP, as every communication network has a flow limit. For example, network congestion is similar to a traffic jam on the road, where exceeding the maximum number of vehicles allowed on a given road may cause traffic to slow or stop during peak hours.
Packet loss with UDP: UDP is used in real-time streaming applications that can tolerate some amount of packet loss (or out-of-order reception). If an application requires retransmission over UDP, it must implement it on its own or switch to TCP.
UDP simply sends the packets without acknowledgements or retransmissions, which gives it much lower bandwidth overhead and latency than TCP. With UDP, packets may take different paths between sender and receiver. As a result, some packets may be lost or received out of order.
Network congestion - The primary cause of network packet loss is congestion. All networks have space limitations, so in simple terms, network congestion is very much the same as peak hour traffic. Think of the queues on the road at certain times of the day, like early mornings and the end of the working day.
Packet loss (as you know, of course) can happen at multiple stages along the transmission:
You can quickly check whether point 1 or 2 is an issue by having other devices listen for the same broadcast while connected to the same Wifi router. It sounds like you already did this and found no issue. (Note that a packet that gets dropped in step 2 (or sometimes even 1) might not be missing from the Wireshark dump if you run it on the server.)
Points 3 through 5 are therefore likely to be the problem and they might be a little harder to separate out.
Here are a couple of things that might help:
Finally, here is a work-around that should fix the issue for sure:
Add some code to your client that detects dropped packets and, if the drop-rate goes too high, opens a TCP connection to the server instead, which will then guarantee packet delivery. Given that your packets are small and infrequent and that only a few devices will ever need to use this mechanism, I don't think that this should cause a problem for your server load. If you don't have a way to change the server code to provide a TCP stream, you could write an independent proxy-server that collects the UDP packets and makes them available via TCP. If you can run it on the same machine as the original server, you even know what IP address it is at (the same as the source address of the UDP packets that did arrive).
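The detection half of that work-around could look something like the sketch below. Since the packets arrive at a fixed rate, the client can estimate loss just by counting arrivals in a sliding time window, without needing sequence numbers in the payload. The class name `DropRateMonitor`, the window length and the loss threshold are all my own assumptions, not anything from your code, and the actual TCP fallback is left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class DropRateMonitor {
    private final long expectedIntervalMs; // 250 ms for four packets per second
    private final long windowMs;           // how far back to look
    private final Deque<Long> arrivals = new ArrayDeque<>();

    public DropRateMonitor(long expectedIntervalMs, long windowMs) {
        this.expectedIntervalMs = expectedIntervalMs;
        this.windowMs = windowMs;
    }

    // Call right after socket.receive() returns, with the current time in ms.
    public void onPacket(long nowMs) {
        arrivals.addLast(nowMs);
    }

    // Fraction of expected packets that did not arrive in the last windowMs.
    // Note: this reads high until one full window has passed since start-up,
    // so the caller should wait that long before acting on it.
    public double lossRatio(long nowMs) {
        while (!arrivals.isEmpty() && arrivals.peekFirst() <= nowMs - windowMs) {
            arrivals.removeFirst();
        }
        double expected = (double) windowMs / expectedIntervalMs;
        return Math.max(0.0, 1.0 - arrivals.size() / expected);
    }

    public boolean shouldFallBackToTcp(long nowMs, double maxLoss) {
        return lossRatio(nowMs) > maxLoss;
    }
}
```

With a 10 second window and a 250 ms interval, 40 packets are expected per window; if only 20 show up, `lossRatio` returns 0.5 and a threshold of, say, 0.3 would trigger the switch to TCP.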
Just a wild guess, but how long do your computations on the packet take? Is it possible that the buffer allocated for the socket fills up and starts to drop packets?
I know this sounds unlikely for a transfer rate of about 4 KB/s... But if your computations take longer than 250 ms, then this would occur sooner or later. It would also explain why some devices work like a charm and others don't.
Have you tried removing the computations and just printing the "Packet Received" message for debugging?
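If the processing does turn out to be the bottleneck, one way to test the theory is to keep the receive loop as tight as possible and hand a copy of each datagram to a worker thread through a queue. Below is a minimal sketch of that idea in plain Java. The class name `QueuedUdpReceiver`, the queue capacity and the 64 KB buffer request are assumptions of mine; `setReceiveBufferSize` is only a hint that the OS may ignore or clamp.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.SocketException;
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class QueuedUdpReceiver implements Runnable {
    private final DatagramSocket socket;
    private final BlockingQueue<byte[]> queue;
    private volatile boolean running = true;

    public QueuedUdpReceiver(DatagramSocket socket, int queueCapacity)
            throws SocketException {
        this.socket = socket;
        this.queue = new ArrayBlockingQueue<>(queueCapacity);
        // Ask for a larger kernel buffer; the OS treats this as a hint.
        socket.setReceiveBufferSize(64 * 1024);
    }

    public BlockingQueue<byte[]> queue() {
        return queue;
    }

    public void stop() {
        running = false;
        socket.close(); // unblocks a pending receive()
    }

    @Override
    public void run() {
        byte[] buffer = new byte[1024];
        DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
        while (running) {
            try {
                // Defensive reset of the usable length; harmless if not needed.
                packet.setLength(buffer.length);
                socket.receive(packet);
                // Copy the payload so the shared buffer can be reused at once.
                byte[] copy = Arrays.copyOfRange(packet.getData(),
                        packet.getOffset(),
                        packet.getOffset() + packet.getLength());
                if (!queue.offer(copy)) {
                    System.err.println("queue full, dropping packet");
                }
            } catch (IOException e) {
                if (running) {
                    e.printStackTrace();
                }
            }
        }
    }
}
```

A separate worker thread would then loop on `queue().take()` and do the equivalent of `ProcessRawPacketData()` on each copy. If the "queue full" message shows up on the affected devices, the processing is too slow; if packets still go missing with an empty queue, the loss is happening before your code ever sees them.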