Why is cURL returning "additional stuff not fine"?

I am writing a Python application that queries social media APIs via cURL. Most of the different servers I query (Google+, Reddit, Twitter, Facebook, others) have cURL complaining:

additional stuff not fine transfer.c:1037: 0 0

The unusual thing is that when the application first starts, each service's response throws this line once or twice. After a few minutes, the line appears several times per response, so cURL is evidently identifying something that it doesn't like. After about half an hour, the servers begin to time out and the line is repeated many tens of times, which suggests it points to a real problem.

How might I diagnose this? I tried using Wireshark to capture the request and response headers to search for anomalies that might cause cURL to complain, but for all Wireshark's complexity there does not seem to be a way to isolate and display only the headers.

Here is the relevant part of the code:

    output = cStringIO.StringIO()
    c = pycurl.Curl()
    c.setopt(c.URL, url)
    c.setopt(c.USERAGENT, 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:17.0) Gecko/20100101 Firefox/17.0')
    c.setopt(c.WRITEFUNCTION, output.write)
    c.setopt(c.CONNECTTIMEOUT, 10)
    c.setopt(c.TIMEOUT, 15)
    c.setopt(c.FAILONERROR, True)
    c.setopt(c.NOSIGNAL, 1)

    try:
        c.perform()
        toReturn = output.getvalue()
        output.close()
        return toReturn

    except pycurl.error, error:
        errno, errstr = error
        print 'The following cURL error occurred: ', errstr
dotancohen asked Dec 18 '12 at 23:12

1 Answer

I'm 99.99% sure this is not actually in any HTTP headers, but is rather being printed to stderr by libcurl. Possibly this happens in the middle of you logging the headers, which is why you were confused.
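
One way to check the stderr theory is to run the application as a child process and capture its two output streams separately. The following is only a sketch, written for Python 3 (unlike the Python 2 code in the question); `streams_containing` is a made-up helper name and `your_script.py` in the usage note is a placeholder for the real application.

```python
import subprocess


def streams_containing(cmd, needle):
    """Run cmd and report which of its output streams contain needle.

    Capturing at the process level also catches text that a C library
    writes straight to file descriptor 2, which replacing sys.stderr
    inside the Python program would miss.
    """
    result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    wanted = needle.encode()
    found = []
    if wanted in result.stdout:
        found.append("stdout")
    if wanted in result.stderr:
        found.append("stderr")
    return found
```

Calling `streams_containing(["python", "your_script.py"], "additional stuff not fine")` and getting back `["stderr"]` would support the idea that the line is diagnostic output from the library and not part of any response.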

Anyway, a quick search for "additional stuff not fine" curl transfer.c turned up a recent change in the source where the description is:

Curl_readwrite: remove debug output

The text "additional stuff not fine" text was added for debug purposes a while ago, but it isn't really helping anyone and for some reason some Linux distributions provide their libcurls built with debug info still present and thus (far too many) users get to read this info.

So, this is basically harmless, and the only reason you're seeing it is that you got a build of libcurl (probably from your Linux distro) that had full debug logging enabled (despite the curl author thinking that's a bad idea). So you have three options:

  1. Ignore it.
  2. Upgrade to a later version of libcurl.
  3. Rebuild libcurl without debug info.
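
Before upgrading or rebuilding, it helps to know which libcurl your pycurl is actually linked against. A small sketch, assuming the usual space-separated `name/version` layout of the `pycurl.version` string; `libcurl_version` and `installed_libcurl_version` are illustrative helper names, and the sample string in the usage note is made up for illustration.

```python
try:
    import pycurl  # the binding used in the question; may not be installed here
except ImportError:
    pycurl = None


def libcurl_version(version_string):
    """Pull the libcurl version out of a pycurl.version-style string."""
    for token in version_string.split():
        name, _, version = token.partition("/")
        if name.lower() == "libcurl":
            return version
    return None


def installed_libcurl_version():
    """Return the linked libcurl version, or None if pycurl is unavailable."""
    if pycurl is None:
        return None
    return libcurl_version(pycurl.version)
```

For example, `libcurl_version("PycURL/7.19.0 libcurl/7.22.0 zlib/1.2.3.4")` picks out `7.22.0`. You would then compare the result of `installed_libcurl_version()` against the release that contains the commit quoted above.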

You can look at the libcurl source for transfer.c (as linked above) to try to understand what curl is complaining about, and possibly look for threads on the mailing list for around the same time—or just email the list and ask.

However, I suspect that the message may not actually be relevant to the real problem at all, given that you're seeing it right from the start.

There are three obvious things that could be going wrong here:

  1. A bug in curl, or the way you're using it.
  2. Something wrong with your network setup (e.g., your ISP cuts you off for making too many outgoing connections or using too many bytes in 30 minutes).
  3. Something you're doing is making the servers think you're a spammer/DoS attacker/whatever and they're blocking you.

The first one actually seems the least likely. If you want to rule it out, just capture all of the requests you make, and then write a trivial script that uses some other library to replay the exact same requests, and see if you get the same behavior. If so, the problem obviously can't be in the implementation of how you make your requests.
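
A replay script along these lines could be as small as the following sketch. It uses Python 3's `urllib` in place of pycurl; `replay` is a made-up name, the list of URLs is whatever you captured from the real application, and the user agent and 15-second timeout are copied from the question. Note that `urllib`'s timeout applies to each blocking socket operation, not to the whole transfer as `TIMEOUT` does in curl, so the two are only roughly comparable.

```python
import time
import urllib.error
import urllib.request

# Same user agent string as in the question's pycurl code.
USER_AGENT = ("Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:17.0) "
              "Gecko/20100101 Firefox/17.0")


def replay(urls, timeout=15):
    """Fetch each URL without curl; return (url, outcome, seconds) tuples.

    outcome is the HTTP status code, or an "error: ..." string when the
    request failed before any status was received (timeout, refused, DNS).
    """
    results = []
    for url in urls:
        request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
        started = time.monotonic()
        try:
            with urllib.request.urlopen(request, timeout=timeout) as response:
                response.read()
                outcome = response.status
        except urllib.error.HTTPError as error:
            # The server answered, but with a 4xx/5xx status.
            outcome = error.code
        except OSError as error:
            # Covers URLError, socket timeouts and connection failures.
            outcome = "error: %s" % error
        results.append((url, outcome, time.monotonic() - started))
    return results
```

If the replayed requests start timing out after roughly the same interval as the pycurl version, curl itself is an unlikely culprit.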

You may be able to distinguish between cases 2 and 3 based on the timing. If all of the services time out at once—especially if they all do so even when you start hitting them at different times (e.g., you start hitting Google+ 15 minutes after Facebook, and yet they both time out 30 minutes after you hit Facebook), it's definitely case 2. If not, it could be case 3.
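
To make that comparison less of a judgment call, you could log when each service first times out and look at the spread. This is only a sketch of the reasoning above: `classify_timeouts` is a made-up helper, and the 60-second window is an arbitrary assumption to tune for your own polling interval.

```python
def classify_timeouts(first_timeout, window=60.0):
    """Compare when each service first timed out.

    first_timeout maps a service name to the number of seconds after
    application start at which that service first timed out.
    Returns "simultaneous" (points toward case 2), "staggered"
    (case 3 is possible) or "undetermined" (fewer than two services).
    """
    if len(first_timeout) < 2:
        return "undetermined"
    times = first_timeout.values()
    spread = max(times) - min(times)
    # window is an assumed tolerance, not a value from curl or any API.
    return "simultaneous" if spread <= window else "staggered"
```

The check is more convincing if the services were started at different times, as in the Google+ and Facebook example, because a shared cutoff then cannot be explained by each server's own rate limit.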

If you rule out all three of these, then you can start looking for other things that could be wrong, but I'd start here.

Or, if you tell us more about exactly what your app does (e.g., do you try to hit the servers over and over as fast as you can? do you try to connect on behalf of a slew of different users? are you using a dev key or an end-user app key? etc.), it might be possible for someone else with more experience with those services to guess.

abarnert answered Sep 23 '22 at 08:09
