URL encoding/decoding with Python

I am trying to encode, store, and decode arguments in Python and am getting lost somewhere along the way. Here are my steps:

1) I use Google Toolbox for Mac's gtm_stringByEscapingForURLArgument to escape an NSString so it can be passed as an HTTP argument.

2) On my server (Python), I store these string arguments as something like u'1234567890-/:;()$&@".,?!\'[]{}#%^*+=_\\|~<>\u20ac\xa3\xa5\u2022.,?!\'' (note that these are the standard keys on an iPhone keypad in the "123" view and the "#+=" view; the \u and \x escapes are currency symbols such as the euro, pound, and yen, plus a bullet).

3) I call urllib.quote(myString,'') on that stored value to %-escape it for transport to the client, which can then unescape it.

The result is that I get an exception when I try to log the result of the %-escaping. Is there some crucial step I am overlooking that needs to be applied to the stored value, with its \u and \x escapes, in order to properly convert it for sending over HTTP?

Update: The suggestion marked as the answer below worked for me. I am providing some updates to address the comments below to be complete, though.

The exception I received cited an issue with \u20ac. I don't know whether the problem was with that character specifically or simply that it was the first non-ASCII character in the string.

That \u20ac char is the unicode for the 'euro' symbol. I basically found I'd have issues with it unless I used the urllib2 quote method.

URL-encoding a "raw" unicode string doesn't really make sense. What you need to do is call .encode("utf8") on it first, so you have a known byte encoding, and then quote() the result.

The output isn't very pretty, but it should be a correct URI encoding.

>>> import urllib2
>>> s = u'1234567890-/:;()$&@".,?!\'[]{}#%^*+=_\\|~<>\u20ac\xa3\xa5\u2022.,?!\''
>>> urllib2.quote(s.encode("utf8"))
'1234567890-/%3A%3B%28%29%24%26%40%22.%2C%3F%21%27%5B%5D%7B%7D%23%25%5E%2A%2B%3D_%5C%7C%7E%3C%3E%E2%82%AC%C2%A3%C2%A5%E2%80%A2.%2C%3F%21%27'
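For readers on Python 3, here is a minimal sketch of the same encode-then-quote step. It assumes quote is imported from urllib.parse (where it lives in Python 3) and uses a shortened sample string, not the full keypad string from the question.

```python
from urllib.parse import quote

# Shortened sample (an assumption for brevity): two ASCII symbols plus
# the euro, pound, yen and bullet characters from the question.
s = '$&\u20ac\xa3\xa5\u2022'

# Encode to UTF-8 bytes first, then percent-escape those bytes.
# safe='' mirrors the urllib.quote(myString, '') call in the question,
# so no character (not even '/') is left unescaped.
encoded = quote(s.encode('utf-8'), safe='')
print(encoded)  # %24%26%E2%82%AC%C2%A3%C2%A5%E2%80%A2
```

Each non-ASCII character becomes two or three %XX escapes because that is how many bytes it occupies in UTF-8.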

Remember that you will need to both unquote() and decode() this to print it out properly if you're debugging or whatever.

>>> print urllib2.unquote(urllib2.quote(s.encode("utf8")))
1234567890-/:;()$&@".,?!'[]{}#%^*+=_\|~<>â‚¬Â£Â¥â€¢.,?!'
>>> # oops, the nasty Â means we've got a utf8 byte stream being displayed as a single-byte encoding such as cp1252
>>> print urllib2.unquote(urllib2.quote(s.encode("utf8"))).decode("utf8")
1234567890-/:;()$&@".,?!'[]{}#%^*+=_\|~<>€£¥•.,?!'
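The round trip can be sketched in Python 3 as well. This is only an illustration under assumptions: it uses unquote_to_bytes from urllib.parse to get the raw bytes back, and it picks cp1252 as the "wrong" codec to reproduce garbling like the above. The codec that actually causes it depends on your terminal or logger.

```python
from urllib.parse import quote, unquote_to_bytes

original = '\u20ac\xa3\xa5\u2022'  # euro, pound, yen, bullet
quoted = quote(original.encode('utf-8'), safe='')

# Undoing the percent-escaping yields raw UTF-8 bytes, not text.
raw = unquote_to_bytes(quoted)

# Reading those bytes with a single-byte codec produces mojibake.
garbled = raw.decode('cp1252')

# Decoding them as UTF-8 recovers the original characters.
restored = raw.decode('utf-8')

print(restored == original)  # True
print(garbled == original)   # False
```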

This is, in fact, what the django functions mentioned in another answer do.

The functions django.utils.http.urlquote() and django.utils.http.urlquote_plus() are versions of Python’s standard urllib.quote() and urllib.quote_plus() that work with non-ASCII characters. (The data is converted to UTF-8 prior to encoding.)
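The idea described in that quote can be sketched as a small wrapper. This is not Django's implementation: quote_utf8 is a hypothetical name, and the sketch targets Python 3, where quote comes from urllib.parse.

```python
from urllib.parse import quote


def quote_utf8(value, safe='/'):
    """Hypothetical helper: force text to UTF-8 bytes, then percent-escape."""
    if isinstance(value, str):
        # Text has no byte representation until an encoding is chosen.
        value = value.encode('utf-8')
    return quote(value, safe=safe)


print(quote_utf8('\u20ac'))         # %E2%82%AC
print(quote_utf8('a/b', safe=''))   # a%2Fb
```

Putting the encode step inside one function means callers cannot forget it, and text and byte inputs are handled the same way.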

If you apply any further quoting or encoding, be careful not to mangle things, for example by quoting a value that has already been quoted.
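One common way to mangle a value is to quote it twice. A minimal Python 3 sketch, assuming quote and unquote from urllib.parse, shows what that looks like:

```python
from urllib.parse import quote, unquote

once = quote('\u20ac'.encode('utf-8'), safe='')  # '%E2%82%AC'
twice = quote(once, safe='')                     # '%25E2%2582%25AC'

# A single unquote only removes the outer layer of escaping...
print(unquote(twice) == once)               # True
# ...so it takes two passes to get back to the euro sign.
print(unquote(unquote(twice)) == '\u20ac')  # True
```

If the client only unescapes once, it will display the literal text %E2%82%AC where the character should be.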

Tags: python, url-encoding

Asked by Joey; 1 answer, by pycruft.
