
Python encoding characters with urllib.quote


I'm trying to encode non-ASCII characters so I can put them inside a URL and pass it to urlopen. I want the same encoding JavaScript produces, which for example encodes ó as %C3%B3:

    encodeURIComponent('ó')  // '%C3%B3'

But urllib.quote in Python encodes ó as %F3:

    urllib.quote('ó')  # '%F3'

How can I get an encoding like JavaScript's encodeURIComponent in Python, and can I also encode characters outside ISO 8859-1, such as Chinese? Thanks!

Saúl Pilatowsky-Cameo asked Jun 21 '11 19:06



2 Answers

Make sure you're working with a unicode string, then encode it to UTF-8 before quoting it.

Example:

    # -*- coding: utf-8 -*-
    import urllib

    s = u"ó"
    # Python 2: encode the unicode string to UTF-8 bytes before quoting
    print urllib.quote(s.encode("utf-8"))

Outputs:

%C3%B3
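
To see where the %F3 in the question comes from, here is a minimal sketch written for Python 3 (an assumption, since the example above is Python 2). It percent-encodes the same character from two different byte encodings; the variable names are only illustrative.

```python
from urllib.parse import quote

char = "ó"

# ISO 8859-1 (Latin-1) stores ó as the single byte 0xF3.
latin1_quoted = quote(char.encode("latin-1"))

# UTF-8 stores ó as the two bytes 0xC3 0xB3, which is what
# JavaScript's encodeURIComponent percent-encodes.
utf8_quoted = quote(char.encode("utf-8"))

print(latin1_quoted)  # %F3
print(utf8_quoted)    # %C3%B3
```

So the difference is in the bytes handed to quote, not in the quoting itself.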

Bryan answered Sep 19 '22 13:09



In Python 3, urllib.quote has moved to urllib.parse.quote.

Also, in Python 3 all strings are Unicode strings (byte strings are a separate type called bytes), and quote encodes a str as UTF-8 by default.

Example:

    from urllib.parse import quote

    print(quote('ó'))  # output: %C3%B3
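
For the rest of the question (matching encodeURIComponent and handling Chinese), the following is a rough Python 3 sketch, not an exact port. The helper name encode_uri_component and the constant JS_SAFE are made up for illustration. By default quote leaves "/" unescaped, while encodeURIComponent escapes it but keeps ! ~ * ' ( ), so the sketch passes those characters as the safe set instead.

```python
from urllib.parse import quote

# Characters encodeURIComponent leaves unescaped, besides letters,
# digits, "-", "_" and "." (which quote never escapes anyway).
JS_SAFE = "!~*'()"


def encode_uri_component(text):
    """Hypothetical helper: percent-encode text as UTF-8, roughly like
    JavaScript's encodeURIComponent."""
    return quote(text, safe=JS_SAFE)


print(encode_uri_component("ó"))      # %C3%B3
print(encode_uri_component("中文"))    # %E4%B8%AD%E6%96%87
print(encode_uri_component("a/b c"))  # a%2Fb%20c
```

Because the text is encoded as UTF-8, characters outside ISO 8859-1 such as Chinese work the same way as ó.
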
Messa answered Sep 18 '22 13:09
