
Passing the '+' character in a POST request in Python

I am trying to do some automation in a Python script and I have run into a problem. I am trying to do a POST to a server.

import urllib

url = 'http://www.example.com'
params = {'arg0': 'value', 'arg1': '+value'}

# Passing a data argument makes urlopen issue a POST (Python 2)
f = urllib.urlopen(url, urllib.urlencode(params))
print f.read()

I captured the equivalent browser operation in Wireshark, where the second argument, arg1, is sent as +value. When I do it with Python, however, the + gets changed to %2B, i.e.

Line-based text data: application/x-www-form-urlencoded
arg0=value&arg1=%2Bvalue

when it should be:

Line-based text data: application/x-www-form-urlencoded
arg0=value&arg1=+value

I have also used the Requests module and it seems to do the same thing.

import requests

url = 'http://www.example.com'
params = {'arg0': 'value', 'arg1': '+value'}

f = requests.post(url, data=params)

Google is not your friend when you have a problem related to '+', as it seems to be a catch-all for so much else.

Douglas Kastle asked Sep 21 '12 09:09


2 Answers

import urllib2
urllib2.quote(' ')     # '%20'
urllib2.unquote('%20') # ' '

So why not just unquote the parameter part:

f = urllib.urlopen(url, urllib.unquote(urllib.urlencode(params)))
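
One thing worth checking before relying on this: unquoting the whole encoded string decodes every escape, not just %2B. The sketch below uses Python 3's urllib.parse (an assumption, since the snippets in this thread are Python 2) and a made-up field 'q' whose value contains '&' and '='. It only shows what the body looks like after unquoting.

```python
from urllib.parse import urlencode, unquote

# 'q' is a hypothetical field whose value contains form delimiters
params = {'q': 'a&b=c', 'arg1': '+value'}

encoded = urlencode(params)   # delimiters inside values are escaped here
decoded = unquote(encoded)    # ...and every escape is undone here

# A receiver splitting on '&' now sees three fields instead of two
fields = decoded.split('&')
```

If none of the values can contain '&', '=', '%' or spaces, the one-liner above is unaffected by this.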

Andy Hayden answered Oct 10 '22 21:10


The + character is the proper encoding for a space when quoting GET or POST data. Thus, a literal + character needs to be escaped as well, lest it be decoded to a space on the other end. See RFC 2396, sections 2.2 and 3.4, and the application/x-www-form-urlencoded section of the HTML specification:

Control names and values are escaped. Space characters are replaced by `+', and then reserved characters are escaped as described in [RFC1738], section 2.2.
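
To see what a conforming decoder does with the two request bodies from the question, here is a minimal sketch using Python 3's urllib.parse.parse_qs (an assumption; the code in this thread is Python 2, where the equivalent lives in urlparse). It stands in for whatever form parser the server actually uses.

```python
from urllib.parse import parse_qs

# Body as the browser sent it: the '+' is read as an encoded space
literal_plus = parse_qs('arg0=value&arg1=+value')

# Body as urlencode produced it: %2B decodes to a real plus sign
escaped_plus = parse_qs('arg0=value&arg1=%2Bvalue')

print(literal_plus['arg1'])
print(escaped_plus['arg1'])
```

A server that follows the form-encoding rules would therefore receive ' value' from the browser-style body and '+value' from the Python one.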

If you are posting data to an application that does not decode a + character to a space but treats it as a literal plus sign, you need to encode your parameters yourself with the urllib.quote function, specifying that the + character is not to be encoded:

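import urllib

def urlencode_withoutplus(query):
    # Accept either a dict or a sequence of (key, value) pairs
    if hasattr(query, 'items'):
        query = query.items()
    pairs = []
    for k, v in query:
        # safe=' /+' leaves spaces, slashes and plus signs unescaped
        k = urllib.quote(str(k), safe=' /+')
        v = urllib.quote(str(v), safe=' /+')
        pairs.append(k + '=' + v)
    return '&'.join(pairs)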

Demo:

>>> urlencode_withoutplus({'arg0': 'value', 'arg1': '+value'})
'arg0=value&arg1=+value'
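
On Python 3 the same effect can probably be had without a hand-written loop, since urllib.parse.urlencode accepts safe and quote_via arguments. The following is a sketch under that assumption; urlencode_keep_plus is a hypothetical name, and unlike the function above it escapes spaces as %20 and also escapes '/'.

```python
from urllib.parse import quote, urlencode

def urlencode_keep_plus(query):
    # quote_via=quote avoids quote_plus, so spaces become %20 rather than '+';
    # safe='+' tells quote to leave literal plus signs alone
    return urlencode(query, safe='+', quote_via=quote)

encoded = urlencode_keep_plus({'arg0': 'value', 'arg1': '+value'})
with_space = urlencode_keep_plus({'a': 'x y+z'})
```

Escaping spaces as %20 keeps the body unambiguous for a receiver that reads '+' literally.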

When using requests, you can simply pass in the result of the above function as the data value, but in that case you need to manually set the content type:

requests.post(url, urlencode_withoutplus(query),
    headers={'Content-Type': 'application/x-www-form-urlencoded'})
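
For the plain urllib route used in the question, a comparable sketch on Python 3 might look like the following. It builds the request without sending it, so the body can be inspected first; the Python 3 module layout (urllib.parse, urllib.request) is an assumption relative to the Python 2 code in this thread.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

url = 'http://www.example.com'
params = {'arg0': 'value', 'arg1': '+value'}

# Encode the form fields while leaving '+' untouched
body = urlencode(params, safe='+', quote_via=quote)

# Supplying data makes this a POST; the request is built but not sent
req = Request(
    url,
    data=body.encode('ascii'),
    headers={'Content-Type': 'application/x-www-form-urlencoded'},
)

print(req.get_method(), req.data)
```

Passing req to urllib.request.urlopen would then send it.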

Martijn Pieters answered Oct 10 '22 19:10