Build query string using urlencode python

Tags: python, dictionary, urllib

I am trying to build a URL so that I can send a GET request to it using the urllib module.

Let's suppose my final_url should be

url = "www.example.com/find.php?data=http%3A%2F%2Fwww.stackoverflow.com&search=Generate+value"

Now to achieve this I tried the following way:

>>> initial_url = "http://www.stackoverflow.com"
>>> search = "Generate+value"
>>> params = {"data":initial_url,"search":search}
>>> query_string = urllib.urlencode(params)
>>> query_string
'search=Generate%2Bvalue&data=http%3A%2F%2Fwww.stackoverflow.com'

Now if you compare my query_string with the format of final_url, you can observe two things:

1) The order of the params is reversed: instead of data=()&search= it is search=()&data=

2) urlencode also encoded the + in Generate+value

I believe the first change is due to the arbitrary ordering of a dictionary. So I thought of using OrderedDict to reverse the dictionary. As I am using Python 2.6.5, I did

pip install ordereddict

But I am not able to use it in my code when I try

>>> od = OrderedDict((('a', 'first'), ('b', 'second')))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'OrderedDict' is not defined

So, my question is: what is the correct way to use OrderedDict in Python 2.6.5, and how do I make urlencode ignore the + in Generate+value?

Also, is this the correct approach to building a URL?

699

asked May 26 '12 11:05

RanRag

3 Answers

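You shouldn't worry about the + being encoded; the server restores it when it unescapes the URL. The order of named parameters shouldn't matter either.

As for OrderedDict, it is not a Python built-in. You should import it from collections (available from Python 2.7 on; the ordereddict package installed with pip on 2.6 is imported with from ordereddict import OrderedDict instead):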

from urllib import urlencode, quote
# from urllib.parse import urlencode # python3
from collections import OrderedDict

initial_url = "http://www.stackoverflow.com"
search = "Generate+value"
# pass a list of pairs: keyword arguments do not keep their order on older Pythons
query_string = urlencode(OrderedDict([("data", initial_url), ("search", search)]))
url = 'www.example.com/find.php?' + query_string
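Since the question is about Python 2.6.5, here is a minimal sketch of an import that works both with and without OrderedDict in collections. It assumes the ordereddict package from the question has been installed with pip; the try/except fallbacks are my own addition, not part of the original answer.

```python
# OrderedDict joined the standard library in Python 2.7; on 2.6 the
# separately installed "ordereddict" package provides the same class.
try:
    from collections import OrderedDict
except ImportError:
    # assumption: `pip install ordereddict` has been run
    from ordereddict import OrderedDict

try:
    from urllib import urlencode        # Python 2
except ImportError:
    from urllib.parse import urlencode  # Python 3

# Build the mapping from a list of pairs so insertion order is explicit.
params = OrderedDict([
    ("data", "http://www.stackoverflow.com"),
    ("search", "Generate value"),
])
query_string = urlencode(params)
print(query_string)
```

The NameError in the question most likely comes from installing the package without importing the class from it.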

If your Python is too old and does not have OrderedDict in the module collections, use:

# parameters is the dict of query values; sorted() gives a stable key order
encoded = "&".join("%s=%s" % (key, quote(parameters[key], safe="+"))
    for key in sorted(parameters.keys()))

Anyway, the order of parameters should not matter.

Note the safe parameter of quote. It prevents + from being escaped, but it means the server will interpret Generate+value as Generate value. You can manually escape + by writing %2B and marking % as a safe character.
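A rough sketch of the two options described above, written against Python 3's urllib.parse (in Python 2 the same functions live in urllib). The variable names are only for illustration, and unquote_plus stands in for whatever form decoding the server performs.

```python
from urllib.parse import quote, unquote_plus

# Option 1: mark "+" as safe so quote() leaves it alone.
kept_plus = quote("Generate+value", safe="+")
# A form-decoding server then reads the "+" back as a space.
decoded_kept = unquote_plus(kept_plus)

# Option 2: write the plus as %2B yourself and mark "%" as safe,
# so quote() does not escape the percent sign a second time.
pre_escaped = quote("Generate%2Bvalue", safe="%")
# This time the server recovers a literal "+".
decoded_pre = unquote_plus(pre_escaped)

print(kept_plus, "->", decoded_kept)
print(pre_escaped, "->", decoded_pre)
```

Marking % as safe means any other percent sign in the value also passes through unescaped, so this only suits values you have fully pre-encoded yourself.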

116

answered Oct 18 '22 22:10

Aleš Kotnik

First, the order of parameters in an HTTP request should be completely irrelevant. If it isn't, then the parsing library on the other side is doing something wrong.

Second, of course the + is encoded. + is used as a placeholder for a space in an encoded URL, so if your raw string contains a +, it has to be escaped. urlencode expects an unencoded string; you can't pass it a string that is already encoded.
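To make that concrete, here is a small sketch using Python 3's urllib.parse (an assumption on my part; the question uses the Python 2 urllib module). parse_qs plays the role of the server-side parser.

```python
from urllib.parse import urlencode, parse_qs

# A literal "+" in the raw value is escaped, and decodes back to "+".
plus_query = urlencode({"search": "Generate+value"})
plus_decoded = parse_qs(plus_query)

# A raw space becomes "+", and decodes back to a space.
space_query = urlencode({"search": "Generate value"})
space_decoded = parse_qs(space_query)

# Passing an already-encoded string encodes it a second time:
# every "%" turns into "%25".
double_query = urlencode({"data": "http%3A%2F%2Fwww.stackoverflow.com"})

print(plus_query, plus_decoded)
print(space_query, space_decoded)
print(double_query)
```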

answered Oct 18 '22 23:10

mata

Some comments on the question and other answers:

1. If you want to preserve order with urllib.urlencode, submit an ordered sequence of key/value pairs instead of a mapping (dict). When you pass in a dict, urlencode just calls foo.items() to grab an iterable sequence.

# urllib.urlencode accepts a mapping or sequence
# the output of this can vary, because `items()` is called on the dict
urllib.urlencode({"data": initial_url,"search": search})
# the output of this will not vary
urllib.urlencode((("data", initial_url), ("search", search)))

You can also pass in a second argument, doseq, to adjust how iterable values are handled.
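A quick sketch of what doseq changes, using Python 3's urllib.parse (assumed here; Python 2's urllib.urlencode takes the same argument). The foo key is just a placeholder.

```python
from urllib.parse import urlencode

pairs = [("foo", [1, 2, 3])]

# Without doseq the list is converted with str() and encoded as one value.
as_single_value = urlencode(pairs)

# With doseq=True each element becomes its own key=value pair, in order.
as_repeated_keys = urlencode(pairs, doseq=True)

print(as_single_value)
print(as_repeated_keys)
```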

2. The order of parameters is not irrelevant. Take these two URLs for example:

https://example.com?foo=bar&bar=foo
https://example.com?bar=foo&foo=bar

An HTTP server should consider the order of these parameters irrelevant, but a function designed to compare URLs would not. In order to safely compare URLs, these params would need to be sorted.
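One way such a comparison could look, as a sketch in Python 3. The same_query helper is hypothetical, not a library function, and it compares only the query part of each URL.

```python
from urllib.parse import urlsplit, parse_qsl

def same_query(url_a, url_b):
    """Hypothetical helper: True if both URLs carry the same
    key/value pairs in their query strings, ignoring order."""
    pairs_a = sorted(parse_qsl(urlsplit(url_a).query))
    pairs_b = sorted(parse_qsl(urlsplit(url_b).query))
    return pairs_a == pairs_b

first = "https://example.com?foo=bar&bar=foo"
second = "https://example.com?bar=foo&foo=bar"

plain_equal = (first == second)           # naive string comparison
sorted_equal = same_query(first, second)  # comparison after sorting

print(plain_equal, sorted_equal)
```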

However, consider duplicate keys:

https://example.com?foo=3&foo=2&foo=1

The URI specs support duplicate keys, but don't address precedence or ordering.

In a given application, these could each trigger different results and be valid as well:

https://example.com?foo=1&foo=2&foo=3
https://example.com?foo=1&foo=3&foo=2
https://example.com?foo=2&foo=3&foo=1
https://example.com?foo=2&foo=1&foo=3
https://example.com?foo=3&foo=1&foo=2
https://example.com?foo=3&foo=2&foo=1
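As an illustration of why the order of duplicate keys can matter, this sketch parses one of the query strings above with Python 3's urllib.parse. Whether an application takes the first or the last value is its own choice; the first_wins and last_wins names are hypothetical.

```python
from urllib.parse import parse_qs, parse_qsl

query = "foo=3&foo=2&foo=1"

# parse_qsl keeps every pair in the order it appeared.
pairs = parse_qsl(query)

# parse_qs groups the values per key, still in order of appearance.
grouped = parse_qs(query)

# Two plausible application policies that give different answers.
first_wins = grouped["foo"][0]
last_wins = grouped["foo"][-1]

print(pairs)
print(grouped)
print(first_wins, last_wins)
```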

3. The + is a reserved character that represents a space in a urlencoded form (vs. %20 in the path part of a URL). urllib.urlencode escapes using urllib.quote_plus(), not urllib.quote(). The OP most likely wanted to just do this:

initial_url = "http://www.stackoverflow.com"
search = "Generate value"
urllib.urlencode((("data", initial_url), ("search", search)))

Which produces:

data=http%3A%2F%2Fwww.stackoverflow.com&search=Generate+value

as the output.
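The difference between the two escaping functions can be seen side by side in this sketch. It uses Python 3's urllib.parse, and the quote_via argument shown at the end exists only in Python 3.5 and later, so it does not apply to the Python 2 setup in the question.

```python
from urllib.parse import quote, quote_plus, urlencode

# quote() is meant for URL paths: a space becomes %20.
path_style = quote("Generate value")

# quote_plus() is meant for form data: a space becomes "+".
form_style = quote_plus("Generate value")

# urlencode() uses the form style by default...
default_query = urlencode([("search", "Generate value")])

# ...but on Python 3.5+ it can be told to use quote() instead.
percent_query = urlencode([("search", "Generate value")], quote_via=quote)

print(path_style, form_style)
print(default_query, percent_query)
```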

answered Oct 18 '22 23:10

Jonathan Vanasco
