Python requests call with URL using parameters

I am trying to make a call to the import.io API. This call needs to have the following structure:

'https://extraction.import.io/query/extractor/{{crawler_id}}?_apikey=xxx&url=http://www.example.co.uk/items.php?sortby=Price_LH&per_page=96&size=1%2C12&page=35'

As you can see, the call itself takes a "url" parameter:

http://www.example.co.uk/items.php?sortby=Price_LH&per_page=96&size=1%2C12&page=35

This secondary URL has parameters of its own. But if I pass it as a plain string, as in the example above, the API response only includes the part up to the first ampersand:

http://www.example.co.uk/items.php?sortby=Price_LH

This is not correct. It looks as if the API makes the call with the truncated URL instead of the one I passed in.

I am using Python and requests to do the call in the following way:

import requests
import json

row_dict = {'url': u'http://www.example.co.uk/items.php?sortby=Price_LH&per_page=96&size=1%2C12&page=35', 'crawler_id': u'zzz'}
url_call = 'https://extraction.import.io/query/extractor/{0}?_apikey={1}&url={2}'.format(row_dict['crawler_id'], auth_key, row_dict['url'])
r = requests.get(url_call)
rr = json.loads(r.content)

And when I print the result:

"url" : "http://www.example.co.uk/items.php?sortby=Price_LH",

but when I print r.url:

https://extraction.import.io/query/extractor/zzz?_apikey=xxx&url=http://www.example.co.uk/items.php?sortby=Price_LH&per_page=96&size=1%2C12&page=35

So in the URL it all seems to be fine but not in the response.

I tried this with other URLs, and they all get cut off after the first parameter.

johan855 asked Jul 20 '16 08:07



2 Answers

The requests library will handle all of your URL encoding needs. This is the proper way to add parameters to a URL using requests:

import requests

base_url = "https://extraction.import.io/query/extractor/{{crawler_id}}"
params = dict()
params["_apikey"] = "xxx"
params["url"] = "http://www.example.co.uk/items.php?sortby=Price_LH&per_page=96&size=1%2C12&page=35"

r = requests.get(base_url, params=params)
print(r.url)

An arguably more readable way to format your parameters:

params = {
    "_apikey" : "xxx",
    "url" : "http://www.example.co.uk/items.php?sortby=Price_LH&per_page=96&size=1%2C12&page=35"
}

Note that the {{crawler_id}} piece above is not a URL parameter but part of the base URL. Requests does not perform general string templating, so fill that part in yourself (for example with str.format) before making the request.
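As a minimal sketch of that last point, the crawler ID can be filled in with str.format before the query parameters are attached. The helper name build_call is hypothetical, and "zzz" and "xxx" are the placeholder values from the question. The query string is built here with the standard library's urlencode only to show what the encoded result looks like.

```python
from urllib.parse import urlencode

BASE_TEMPLATE = "https://extraction.import.io/query/extractor/{crawler_id}"


def build_call(crawler_id, api_key, target_url):
    """Return (base_url, params) for the extractor endpoint (hypothetical helper)."""
    # The crawler ID is part of the path, so it is templated into the base URL.
    base_url = BASE_TEMPLATE.format(crawler_id=crawler_id)
    # Everything else travels as query parameters and is left unencoded here.
    params = {"_apikey": api_key, "url": target_url}
    return base_url, params


base_url, params = build_call(
    "zzz",  # placeholder crawler ID from the question
    "xxx",  # placeholder API key from the question
    "http://www.example.co.uk/items.php?sortby=Price_LH&per_page=96&size=1%2C12&page=35",
)

# Percent-encode the parameters; the inner URL's "&" and "=" become %26 and %3D.
query_string = urlencode(params)
print(base_url + "?" + query_string)
```

With requests, the same pair would be passed as requests.get(base_url, params=params), which takes care of the encoding.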

Demitri answered Oct 05 '22 15:10


You will need to URL-encode the URL you are sending to the API.

The reason is that the server interprets the ampersands as separators between parameters of the outer URL https://extraction.import.io/query/extractor/XXX?

This is why everything after the first ampersand is stripped from the url value:

http://www.example.co.uk/items.php?sortby=Price_LH
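The truncation can be reproduced locally without calling the API. The sketch below parses the unencoded call from the question with Python's standard urllib.parse. It assumes the server splits the query string on ampersands the way parse_qs does, which is conventional but not confirmed for import.io.

```python
from urllib.parse import urlsplit, parse_qs

# The unencoded call from the question, with placeholder ID and key.
raw_call = (
    "https://extraction.import.io/query/extractor/zzz"
    "?_apikey=xxx"
    "&url=http://www.example.co.uk/items.php?sortby=Price_LH&per_page=96&size=1%2C12&page=35"
)

# Everything after the first "?" is the query string of the outer call.
outer_query = urlsplit(raw_call).query

# parse_qs splits on "&", so the inner URL's parameters become
# separate parameters of the outer call.
parsed = parse_qs(outer_query)

print(parsed["url"][0])  # the truncated inner URL
print(sorted(parsed))    # per_page, size and page now belong to the outer call
```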

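Try the following, which passes row_dict['url'] through quote (urllib.quote in Python 2, urllib.parse.quote in Python 3):

import json
import requests

try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote  # Python 2

auth_key = 'xxx'  # your import.io API key

row_dict = {
  'url': u'http://www.example.co.uk/items.php?sortby=Price_LH&per_page=96&size=1%2C12&page=35',
  'crawler_id': u'zzz'}
# Percent-encode the inner URL so its "?", "&" and "=" are not read as part of the outer query string.
url_call = 'https://extraction.import.io/query/extractor/{0}?_apikey={1}&url={2}'.format(
  row_dict['crawler_id'], auth_key, quote(row_dict['url']))
r = requests.get(url_call)
rr = json.loads(r.content)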
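To check that the encoded value survives the round trip, the sketch below encodes the inner URL, builds the call, and parses it back with the standard library. Passing safe="" is a choice made here so that slashes are encoded as well, and the decoding step assumes the server parses query strings the conventional way; "zzz" and "xxx" are the placeholders from the question.

```python
from urllib.parse import quote, urlsplit, parse_qs

inner_url = "http://www.example.co.uk/items.php?sortby=Price_LH&per_page=96&size=1%2C12&page=35"
auth_key = "xxx"  # placeholder API key

# Encode every reserved character, including "/", ":" and "%".
encoded = quote(inner_url, safe="")

url_call = "https://extraction.import.io/query/extractor/{0}?_apikey={1}&url={2}".format(
    "zzz", auth_key, encoded
)

# Decode the outer query string the way a typical server would.
decoded = parse_qs(urlsplit(url_call).query)

# True when the inner URL comes back intact, parameters included.
print(decoded["url"][0] == inner_url)
```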
Bam4d answered Oct 05 '22 14:10