
Is there an alternative to parse_qs that handles semi-colons?

TL;DR

What libraries/calls are available to handle query strings containing semi-colons differently than parse_qs?

>>> urlparse.parse_qs("tagged=python;ruby")
{'tagged': ['python']}

Full Background

I'm working with the StackExchange API to search for tagged questions.

Search is laid out like so, with tags separated by semi-colons:

/2.1/search?order=desc&sort=activity&tagged=python;ruby&site=stackoverflow

Interacting with the API is just fine. The problem comes in when I want to test the calls, particularly when using httpretty to mock HTTP.

Under the hood, httpretty uses urlparse.parse_qs from the Python standard library to parse the query string.

>>> urlparse.parse_qs("tagged=python;ruby")
{'tagged': ['python']}

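Clearly that doesn't work well: parse_qs treats the semi-colon as a pair separator, so ruby is discarded as a field with no value. That's the small example; here's a snippet using httpretty (outside of a testing context).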

import requests
import httpretty

httpretty.enable()

httpretty.register_uri(httpretty.GET, "https://api.stackexchange.com/2.1/search", body='{"items":[]}')
resp = requests.get("https://api.stackexchange.com/2.1/search", params={"tagged":"python;ruby"})
httpretty_request = httpretty.last_request()
print(httpretty_request.querystring)

httpretty.disable()
httpretty.reset()

I want to use the machinery from httpretty, but need a workaround for parse_qs. I can monkey patch httpretty for now, but would love to see what else can be done.

Kyle Kelley asked Nov 11 '22 15:11


1 Answer

To get around this, I temporarily monkey patched httpretty.core.unquote_utf8 (technically httpretty.compat.unquote_utf8).

#
# To get around how parse_qs works (urlparse, under the hood of
# httpretty), we'll leave the semi-colon quoted.
#
# See https://github.com/gabrielfalcao/HTTPretty/issues/134
orig_unquote = httpretty.core.unquote_utf8
httpretty.core.unquote_utf8 = (lambda x: x)

# It should handle tags as a list
httpretty.register_uri(httpretty.GET,
                       "https://api.stackexchange.com/2.1/search",
                       body=param_check_callback({'tagged': 'python;dog'}))
search_questions(since=since, tags=["python", "dog"], site="pets")

...

# Back to normal for the rest
httpretty.core.unquote_utf8 = orig_unquote
# Test the test by making sure this is back to normal
assert httpretty.core.unquote_utf8("%3B") == ";"
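
One caveat with restoring by hand: if an assertion fails between the patch and the restore, the identity function stays installed for every later test. Below is a minimal sketch of the same swap using unittest.mock.patch.object, which puts the original back on exit even when an exception is raised. Both fake_core and run_with_identity_unquote are hypothetical names: fake_core stands in for httpretty.core so the snippet runs without httpretty installed. In a real test the target would be httpretty.core, assuming the installed version still looks up unquote_utf8 there.

```python
from types import SimpleNamespace
from unittest import mock
from urllib.parse import unquote

# Hypothetical stand-in for httpretty.core, so this sketch is self-contained.
fake_core = SimpleNamespace(unquote_utf8=unquote)


def run_with_identity_unquote(module, check):
    """Call check() while module.unquote_utf8 is replaced by the identity function.

    patch.object restores the original attribute when the block exits,
    whether check() returns normally or raises.
    """
    with mock.patch.object(module, "unquote_utf8", lambda x: x):
        return check()


# While patched, the percent-encoded semi-colon is left alone.
inside = run_with_identity_unquote(
    fake_core, lambda: fake_core.unquote_utf8("%3B")
)

# After the block, unquoting is back to normal.
after = fake_core.unquote_utf8("%3B")
```

With this shape, the "back to normal" step and its sanity check no longer depend on the test body reaching the last line.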

This assumes nothing else in the query string needs to be unquoted while the patch is active. Another option is to leave only the semi-colons percent-encoded before the query string reaches parse_qs.
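
For code you control (rather than httpretty's internals), a third route is a small parser that splits on "&" only, so a literal or percent-encoded semi-colon stays inside the value. The sketch below is an illustration, not a drop-in replacement for parse_qs; parse_qs_amp_only is a hypothetical name. Newer Python 3 releases also changed parse_qs to split on "&" only by default and added a separator argument, so the behavior in the question is specific to older interpreters.

```python
from urllib.parse import unquote_plus


def parse_qs_amp_only(query):
    """Parse a query string, treating only '&' as a pair separator.

    Like parse_qs with its default arguments, pairs with an empty or
    missing value are dropped, and repeated names collect into a list.
    """
    result = {}
    for pair in query.split("&"):
        name, _, value = pair.partition("=")
        if not value:
            continue  # skip "name=", bare "name", and empty fragments
        # Unquote only after splitting, so "%3B" and ";" both survive.
        result.setdefault(unquote_plus(name), []).append(unquote_plus(value))
    return result


parsed = parse_qs_amp_only("order=desc&tagged=python;ruby&site=stackoverflow")

# The StackExchange-style tag list can then be split explicitly.
tags = parsed["tagged"][0].split(";")
```

Here parsed["tagged"] holds the single value "python;ruby", and tags is the list of the two tag names.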

Kyle Kelley answered Nov 15 '22 06:11