
How can you create a dict-like object that returns duplicate values via items()

Tags:

python

Framework

I am using the hyper framework to generate HTTP/2 traffic. I currently send requests with hyper.HTTP20Connection.request and responses with h2.H2Connection.send_headers.

My Requirement

I need to be able to send HTTP/2 requests and responses with duplicated fields. Here, for example, is a YAML-specified request that contains two x-test-duplicate fields:

  headers:
    fields:
    - [ :method, GET, equal ]
    - [ :scheme, https, equal ]
    - [ :authority, example.data.com, equal ]
    - [ :path, '/a/path?q=3', equal ]
    - [ Accept, '*/*' ]
    - [ Accept-Language, en-us ]
    - [ Accept-Encoding, gzip ]
    - [ x-test-duplicate, first ]
    - [ x-test-duplicate, second ]
    - [ Content-Length, "0" ]

Note that the HTTP/2 header compression specification (HPACK) explicitly allows this. See, for example, RFC 7541 section 2.3.2:

The dynamic table can contain duplicate entries (i.e., entries with the same name and same value). Therefore, duplicate entries MUST NOT be treated as an error by a decoder.

My Problem

The problem is that while h2.H2Connection.send_headers properly handles an iterable of tuples that can contain duplicate fields (e.g., (("name1", "value1"), ("name2", "value2"), ("name1", "another_value"))), hyper.HTTP20Connection.request requires a dictionary which, of course, is not designed for duplicate keys. The documentation isn't clear about the type requirements for headers, but in the source code for HTTP20Connection:request, line 261, items() is called on it. If I pass an iterable of tuples, I get AttributeError: 'tuple' object has no attribute 'items'.

Note how sad this is: the hyper framework forces the user to pass in a dictionary, which doesn't allow duplicates, then immediately turns that dictionary into an iterable of tuples via items(), which would have allowed duplicate fields. If it just took an iterable of tuples to start with, like h2's interface, I would not have this problem.
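As a small illustration of the limitation (plain Python only, no hyper involved), the sketch below shows a dict dropping the repeated field while a list of tuples keeps it. The field values are borrowed from the YAML example above; the variable names are just for illustration.

```python
# Header fields as a list of (name, value) tuples: duplicates survive.
fields = [
    ("x-test-duplicate", "first"),
    ("x-test-duplicate", "second"),
    ("content-length", "0"),
]

# Converting to a dict keeps only the last value for each repeated key.
as_dict = dict(fields)

print(list(as_dict.items()))
# [('x-test-duplicate', 'second'), ('content-length', '0')]
print(len(fields), len(as_dict))
# 3 2
```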

My Question

I filed issue 437 in the hyper GitHub project about this limitation. In the meantime, I am hoping that I can work around this problem. I have an iterable of tuples representing the HTTP/2 headers with duplicate fields. Can I somehow wrap that in an object such that when HTTP20Connection:request, line 261, calls items() on it, it just returns the iterable of tuples?

firebush asked Jan 25 '23 13:01


2 Answers

The Python standard library typically uses an email.message.EmailMessage instance for this (see the email.message documentation).

The EmailMessage dictionary-like interface is indexed by the header names, which must be ASCII values. The values of the dictionary are strings with some extra methods. Headers are stored and returned in case-preserving form, but field names are matched case-insensitively. Unlike a real dict, there is an ordering to the keys, and there can be duplicate keys. Additional methods are provided for working with headers that have duplicate keys.

In basic usage, you would set items using EmailMessage.add_header and retrieve them with EmailMessage.items. You may leave the payload blank.

>>> from email.message import EmailMessage
>>> headers = EmailMessage()
>>> headers.add_header("x-test-duplicate", "first")
>>> headers.add_header("x-test-duplicate", "second")
>>> headers.items()
[('x-test-duplicate', 'first'), ('x-test-duplicate', 'second')]

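This approach is battle-tested in the standard library itself: http.client, which urllib builds on, stores HTTPResponse headers in http.client.HTTPMessage, a subclass of email.message.Message (the base class that EmailMessage also derives from).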

The advantage over rolling your own multidict or using a simple list of pairs is that you'll get the correct behavior for headers (RFC 5322 and RFC 6532 style field names and values), for example case-insensitivity:

>>> headers.add_header("aBc", "val1")
>>> headers.add_header("AbC", "val2")
>>> headers.get_all("ABC")
['val1', 'val2']
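
Since the question starts from an iterable of tuples, a minimal sketch of copying those pairs into an EmailMessage might look like the following. The helper name headers_from_pairs is hypothetical (it is not part of the standard library or hyper), and whether hyper accepts the resulting object is an assumption that has not been verified here. HTTP/2 pseudo-header fields such as :method are deliberately left out of the example.

```python
from email.message import EmailMessage
from typing import Iterable, Tuple


def headers_from_pairs(pairs: Iterable[Tuple[str, str]]) -> EmailMessage:
    """Hypothetical helper: copy (name, value) pairs into an EmailMessage."""
    message = EmailMessage()
    for name, value in pairs:
        # add_header appends a new field rather than replacing an existing
        # one, so repeated names are preserved in insertion order.
        message.add_header(name, value)
    return message


headers = headers_from_pairs([
    ("name1", "value1"),
    ("name2", "value2"),
    ("name1", "another_value"),
])

# items() yields every field, duplicates included.
print(headers.items())
# get_all() collects the values for one name, matched case-insensitively.
print(headers.get_all("NAME1"))
```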

wim answered Jan 27 '23 03:01


Since Python supports duck typing, a class that has an items method is sufficient:

from typing import Iterable, Tuple

class Wrapper:
    __slots__ = ["pairs"]  # Minor optimization
    def __init__(self, pairs: Iterable[Tuple[str, str]]):
        self.pairs = pairs

    def items(self) -> Iterable[Tuple[str, str]]:
        # Return the pairs unchanged, duplicates included
        return self.pairs

Then:

header_pairs = [("name1", "value1"), ("name2", "value2"), ("name1", "another_value")]
wrapped = Wrapper(header_pairs)
request(method, url, body, headers=wrapped)

The major issue with this is that it relies entirely on implementation details of request. If the maintainers choose in the future to use headers for other dict-like purposes, this will break quite easily.
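
One way to soften that fragility is to give the wrapper a few more dict-like methods, so it keeps working if request starts iterating, indexing, or taking the length of headers. The class below is only a sketch: the name DuplicateHeaders and the choice that lookups return the last value for a repeated name are assumptions, and it has not been tested against hyper. It also copies the pairs into a list, so a generator passed in is not exhausted by the first items() call.

```python
from typing import Iterable, Iterator, List, Tuple


class DuplicateHeaders:
    """Hypothetical dict-like wrapper whose items() keeps duplicate fields."""

    def __init__(self, pairs: Iterable[Tuple[str, str]]):
        # Copy into a list so the pairs can be read more than once.
        self._pairs: List[Tuple[str, str]] = list(pairs)

    def items(self) -> List[Tuple[str, str]]:
        return list(self._pairs)

    def keys(self) -> List[str]:
        return [name for name, _ in self._pairs]

    def values(self) -> List[str]:
        return [value for _, value in self._pairs]

    def __iter__(self) -> Iterator[str]:
        return iter(self.keys())

    def __len__(self) -> int:
        return len(self._pairs)

    def __contains__(self, name: object) -> bool:
        return any(key == name for key, _ in self._pairs)

    def __getitem__(self, name: str) -> str:
        # Mirror dict(pairs): the last value for a repeated name wins.
        for key, value in reversed(self._pairs):
            if key == name:
                return value
        raise KeyError(name)


headers = DuplicateHeaders(
    [("name1", "value1"), ("name2", "value2"), ("name1", "another_value")]
)
print(headers.items())
# [('name1', 'value1'), ('name2', 'value2'), ('name1', 'another_value')]
print(headers["name1"])
# another_value
```

This still depends on which methods request actually calls, so it narrows the risk rather than removing it.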

Carcigenicate answered Jan 27 '23 02:01