 

The XBox 360 TCP stack does not respond to TCP Zero Window Probes with a 0-byte payload

I'm experimenting with an Android app that streams music via UPnP to an XBox. The streaming works for the most part, but quite frequently, after a minute or two, the streaming stalls, especially when there is other activity on the network. This never happens when streaming to other non-XBox devices. I've confirmed this behavior with a number of different UPnP server apps.

After analyzing lots of Wireshark traces, I've found the root cause. It seems that after the TCP receiver window has filled on the XBox, it only explicitly re-announces a window update in response to Zero Window Probes that contain 1 byte of payload data.

While Windows-based machines send Zero Window probes that contain a 1-byte payload, Linux-based machines send probes that contain 0-byte payloads (pure ACKs).

Under ideal network conditions, this isn't a problem, since a receiver will always send a single Window Update ACK once it has freed up enough space in its window to avoid silly window syndrome. However, if that single Window Update packet is lost, the XBox will never respond again to a Linux-based Android device, because the TCP stack on those devices uses Zero Window Probes with a 0-byte payload (they look like Keep-Alive packets to Wireshark).

A TCP stall between the XBox and WMP looks like this:


   4966 92.330358   10.0.2.214            10.0.2.133            TCP      [TCP ZeroWindow] 27883 > 10243 [ACK] Seq=183 Ack=1723007 Win=0 Len=0
   4971 92.648068   10.0.2.133            10.0.2.214            TCP      [TCP ZeroWindowProbe] 10243 > 27883 [ACK] Seq=1723007 Ack=183 Win=64240 Len=1
   4972 92.649009   10.0.2.214            10.0.2.133            TCP      [TCP ZeroWindowProbeAck] [TCP ZeroWindow] 27883 > 10243 [ACK] Seq=183 Ack=1723007 Win=0 Len=0
   4977 93.256579   10.0.2.133            10.0.2.214            TCP      [TCP ZeroWindowProbe] 10243 > 27883 [ACK] Seq=1723007 Ack=183 Win=64240 Len=1
   4978 93.263118   10.0.2.214            10.0.2.133            TCP      [TCP ZeroWindowProbeAck] [TCP ZeroWindow] 27883 > 10243 [ACK] Seq=183 Ack=1723007 Win=0 Len=0
   4999 94.310534   10.0.2.214            10.0.2.133            TCP      [TCP Window Update] 27883 > 10243 [ACK] Seq=183 Ack=1723007 Win=16384 Len=0

Note that the Xbox is actively responding to the Zero Window Probe packets.

A normal TCP stall between the XBox and the Android client looks like this:


7099 174.844077  10.0.2.214            10.0.2.183            TCP [TCP ZeroWindow] [TCP ACKed lost segment] 20067 > ssdp [ACK] Seq=143 Ack=2962598 Win=0 Len=0
 7100 175.067981  10.0.2.183            10.0.2.214            TCP [TCP Keep-Alive|TCP Keep-Alive] ssdp > 20067 [ACK] Seq=2962597 Ack=143 Win=6912 Len=0
 7107 175.518024  10.0.2.183            10.0.2.214            TCP [TCP Keep-Alive|TCP Keep-Alive] ssdp > 20067 [ACK] Seq=2962597 Ack=143 Win=6912 Len=0
 7108 175.894079  10.0.2.214            10.0.2.183            TCP [TCP Window Update] 20067 > ssdp [ACK] Seq=143 Ack=2962598 Win=16384 Len=0

Note that the XBox does not respond to the KeepAlive packets.
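The two probe styles in the traces above can be told apart from the sequence number and payload length alone. Here is a minimal sketch; the function name `classify_probe` is hypothetical, and the rule is inferred only from the traces shown here (the Windows probe carries 1 byte at the receiver's ACK number, the Linux probe carries 0 bytes at one below it, which is why Wireshark labels it as a Keep-Alive).

```python
def classify_probe(receiver_ack: int, probe_seq: int, probe_len: int) -> str:
    """Classify a sender's probe against the receiver's last ACK number.

    receiver_ack: the Ack= value from the receiver's ZeroWindow packet
    probe_seq:    the Seq= value of the sender's probe
    probe_len:    the Len= value of the sender's probe
    """
    # 1-byte style: one byte of new data at the next expected sequence number.
    if probe_len == 1 and probe_seq == receiver_ack:
        return "1-byte probe"
    # 0-byte style: a pure ACK one below the next expected sequence number
    # (modulo 2**32, since TCP sequence numbers wrap).
    if probe_len == 0 and probe_seq == (receiver_ack - 1) % 2**32:
        return "0-byte probe"
    return "other"


# Values taken from packets 4966/4971 (WMP) and 7099/7100 (Android) above.
print(classify_probe(1723007, 1723007, 1))
print(classify_probe(2962598, 2962597, 0))
```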

A TCP stall between the XBox and my Android device looks like this if the initial Window Update announcement is missed:


 7146 175.925019  10.0.2.214            10.0.2.183            TCP [TCP ZeroWindow] 20067 > ssdp [ACK] Seq=143 Ack=3000558 Win=0 Len=0
 7147 176.147901  10.0.2.183            10.0.2.214            TCP [TCP Keep-Alive|TCP Keep-Alive] ssdp > 20067 [ACK] Seq=3000557 Ack=143 Win=6912 Len=0
 7155 176.597820  10.0.2.183            10.0.2.214            TCP [TCP Keep-Alive|TCP Keep-Alive] ssdp > 20067 [ACK] Seq=3000557 Ack=143 Win=6912 Len=0
 7165 177.498087  10.0.2.183            10.0.2.214            TCP [TCP Keep-Alive|TCP Keep-Alive] ssdp > 20067 [ACK] Seq=3000557 Ack=143 Win=6912 Len=0
 7218 179.297763  10.0.2.183            10.0.2.214            TCP [TCP Keep-Alive|TCP Keep-Alive] ssdp > 20067 [ACK] Seq=3000557 Ack=143 Win=6912 Len=0
 7297 182.897804  10.0.2.183            10.0.2.214            TCP [TCP Keep-Alive|TCP Keep-Alive] ssdp > 20067 [ACK] Seq=3000557 Ack=143 Win=6912 Len=0
 7449 190.097780  10.0.2.183            10.0.2.214            TCP [TCP Keep-Alive|TCP Keep-Alive] ssdp > 20067 [ACK] Seq=3000557 Ack=143 Win=6912 Len=0
 7759 204.498070  10.0.2.183            10.0.2.214            TCP [TCP Keep-Alive|TCP Keep-Alive] ssdp > 20067 [ACK] Seq=3000557 Ack=143 Win=6912 Len=0
 8412 233.298081  10.0.2.183            10.0.2.214            TCP [TCP Keep-Alive|TCP Keep-Alive] ssdp > 20067 [ACK] Seq=3000557 Ack=143 Win=6912 Len=0
 9617 290.898134  10.0.2.183            10.0.2.214            TCP [TCP Keep-Alive|TCP Keep-Alive] ssdp > 20067 [ACK] Seq=3000557 Ack=143 Win=6912 Len=0
11326 358.047838  10.0.2.214            10.0.2.183            TCP      20067 > ssdp [FIN, ACK] Seq=143 Ack=3000558 Win=16384 Len=0

Note that the XBox never re-announces its open window, and eventually terminates the connection.
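The timestamps in the trace above also show that the Android side keeps probing with an increasing delay. A small sketch that computes the gaps between consecutive packets follows; the timestamps are copied from packets 7146 through 9617, and `probe_gaps` is a hypothetical helper name.

```python
# Timestamps (seconds) of the ZeroWindow ACK (packet 7146) followed by the
# nine 0-byte probes (packets 7147 through 9617), copied from the trace above.
timestamps = [
    175.925019, 176.147901, 176.597820, 177.498087, 179.297763,
    182.897804, 190.097780, 204.498070, 233.298081, 290.898134,
]


def probe_gaps(times):
    """Return the gap in seconds between each pair of consecutive packets."""
    return [round(later - earlier, 3) for earlier, later in zip(times, times[1:])]


gaps = probe_gaps(timestamps)
for gap in gaps:
    print(f"{gap:.3f}")
```

After the first gap, each interval is roughly double the previous one, so the sender is backing off rather than giving up. The stall is on the receiving side.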

I've confirmed my theory by writing a small packet-injection program. When I get a stall, I can fire off a hand-crafted TCP Zero Window Probe packet. When I do this, the XBox instantly springs back to life and continues on as normal. Unfortunately, I can't do this from my application, because crafting such a packet requires the CAP_NET_RAW capability, and I'm not able to grant that to my application.

Here's the above case, with a manually-injected Zero Window Probe (packet 7258). The right seq/ack numbers aren't even required. The only thing that's required is one byte of data.


   7253 373.274394  10.0.2.214            10.0.2.186            TCP      [TCP ZeroWindow] 39378 > ssdp [ACK] Seq=3775184695 Ack=1775679761 Win=0 Len=0
   7254 375.367317  10.0.2.186            10.0.2.214            TCP      [TCP Keep-Alive] ssdp > 39378 [ACK] Seq=1775679760 Ack=3775184695 Win=3456 Len=0
   7255 379.562480  10.0.2.186            10.0.2.214            TCP      [TCP Keep-Alive] ssdp > 39378 [ACK] Seq=1775679760 Ack=3775184695 Win=3456 Len=0
   7256 387.953095  10.0.2.186            10.0.2.214            TCP      [TCP Keep-Alive] ssdp > 39378 [ACK] Seq=1775679760 Ack=3775184695 Win=3456 Len=0
   7257 404.703312  10.0.2.186            10.0.2.214            TCP      [TCP Keep-Alive] ssdp > 39378 [ACK] Seq=1775679760 Ack=3775184695 Win=3456 Len=0
   7258 406.571301  10.0.2.186            10.0.2.214            TCP      [TCP ACKed lost segment] [TCP Retransmission] ssdp > 39378 [ACK] Seq=1 Ack=1 Win=1 Len=1
   7259 406.603512  10.0.2.214            10.0.2.186            TCP      39378 > ssdp [ACK] Seq=3775184695 Ack=1775679761 Win=16384 Len=0

Since the TCP Seq/Ack numbers are incorrect, Wireshark interprets the packet as a wayward data transmission with an invalid ACK, but the XBox nonetheless snaps back to life, and starts streaming again.
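For illustration, here is a sketch of how a segment like packet 7258 could be assembled. This is not the original injection program. The helper names are hypothetical, port 1900 is assumed for what Wireshark displays as "ssdp", and the code only builds the bytes. Actually sending them would need a raw socket, which is exactly what requires CAP_NET_RAW.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """One's-complement checksum over 16-bit words (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad to an even length
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_probe(src_ip, dst_ip, src_port, dst_port, seq, ack, window,
                payload=b"\x00"):
    """Build a TCP segment (20-byte header plus payload) with only ACK set."""
    offset_flags = (5 << 12) | 0x010  # data offset = 5 words, ACK flag
    header = struct.pack("!HHIIHHHH", src_port, dst_port, seq, ack,
                         offset_flags, window, 0, 0)
    # The TCP checksum also covers an IPv4 pseudo-header.
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BBH", 0, socket.IPPROTO_TCP,
                            len(header) + len(payload)))
    checksum = internet_checksum(pseudo + header + payload)
    header = header[:16] + struct.pack("!H", checksum) + header[18:]
    return header + payload


# Field values mirror packet 7258: Seq=1 Ack=1 Win=1 Len=1.
segment = build_probe("10.0.2.186", "10.0.2.214", 1900, 39378, 1, 1, 1)
print(segment.hex())
```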

  • Is there any way to get CAP_NET_RAW capabilities in an Android app without requiring the device to be rooted?
  • Is there any other trick I can use to force the Linux TCP layer to send its Zero Window Probes with 1 byte of payload data?
  • Is there any other obscure TCP option I could try that would let me wake the XBox's TCP stack up?
  • Is there some other out-of-band approach to convincing the XBox to send another Window update?
  • Is there some other completely unrelated approach that I might consider?

Edit: This is a description of why the provided suggestions won't work.

  1. TCP_NODELAY only affects how packets are sent while the window is open. Specifically, setting this option prevents the TCP stack from waiting for a few ms for more data in an attempt to create a TCP packet that fills up the MSS. It doesn't allow data to be sent when the receiver window is closed.

  2. TCP_QUICKACK affects the way the host ACKs packets it's receiving. The problem I'm facing is that I need to change the way the sender ACKs the packets it is receiving.

  3. MSG_OOB only sets the TCP urgent flag. Urgent data isn't treated any differently as far as windowing goes, and still will not be sent when the receiver's window is closed.

  4. Changing the TCP congestion control algorithm won't help either. Because the XBox is forcibly limiting the data send rate to the play rate of the MP3, it's virtually impossible to avoid filling the congestion window. It might be possible to reduce the congestion window by inferring the throughput, but this would only reduce the likelihood of a filled congestion window, not prevent it completely.

  5. Using UDP is not an option, since using the UPnP stack is a requirement, and UPnP delivers data via HTTP, and thus, TCP.

Jason LeBrun asked Jan 28 '11 23:01


1 Answer

I found a few things that may help:

  1. The TCP socket option TCP_NODELAY (set with setsockopt(2)) will cause the kernel to send an immediate PSH packet. It might unstick the connection.

  2. The TCP socket option TCP_QUICKACK (also set with setsockopt(2)) will do something funny with ACK packets. It might unstick the connection.

  3. If you use send(2) you can set the MSG_OOB flag, which might poke the XBox right in the eye, get its attention, and maybe things can start over. Cisco wrote a nice summary of how different platforms respond to TCP URG, and their advice is to avoid using URG, but it's crazy enough that it just might work.

  4. TCP socket option TCP_CONGESTION lets you select different congestion-avoidance algorithms. Maybe you could find one that helps avoid the filled windows in the first place? (At least TCP Vegas is implemented as a module; it might not be possible to change away from the default congestion avoidance algorithm on the Android platform.)
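For reference, here is a rough sketch of how the options in this list are set on a Linux socket. The function name `apply_suggested_options` is hypothetical. TCP_QUICKACK and TCP_CONGESTION are Linux-specific, so the code looks them up defensively and treats a failure as "not applied". Whether any of them changes how the XBox behaves is untested here.

```python
import socket


def apply_suggested_options(sock, congestion=None):
    """Try the suggested socket options and report which ones were applied."""
    applied = {}

    # Suggestion 1: disable Nagle's algorithm.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    applied["TCP_NODELAY"] = (
        sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
    )

    # Suggestion 2: quick ACKs (Linux only, so look the constant up defensively).
    quickack = getattr(socket, "TCP_QUICKACK", None)
    applied["TCP_QUICKACK"] = False
    if quickack is not None:
        try:
            sock.setsockopt(socket.IPPROTO_TCP, quickack, 1)
            applied["TCP_QUICKACK"] = True
        except OSError:
            pass

    # Suggestion 3 (urgent data) is a per-call flag rather than an option:
    # on a connected socket it would be sock.send(b"!", socket.MSG_OOB).

    # Suggestion 4: pick a congestion control algorithm by name, if the
    # kernel has it available (for example "reno" or "vegas").
    cong = getattr(socket, "TCP_CONGESTION", None)
    applied["TCP_CONGESTION"] = None
    if cong is not None and congestion is not None:
        try:
            sock.setsockopt(socket.IPPROTO_TCP, cong, congestion.encode())
            applied["TCP_CONGESTION"] = congestion
        except OSError:
            pass

    return applied


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
result = apply_suggested_options(sock)
sock.close()
print(result)
```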

sarnold answered Oct 31 '22 17:10