
Insert Binary data into Mongo field in pymongo

I'm trying to do something that I feel ought to be pretty trivial, so forgive me if there's some easy solution out there elsewhere.

I'm writing tests for some content indexing, and for this I'm trying to insert some binary data (a PDF) into a Mongo collection that I have. However, I'm having a good deal of trouble with this. This is the current state of my relevant code:

pseudo_file = StringIO()
pdf = pisa.CreatePDF("This is a test", pseudo_file)
test = {"data": pseudo_file}
test.update({"files_id": {"name": "random_asset_name"}, "category": "asset"})
self.chunk_collection.insert(json.dumps(test))

I managed to find an old thread on the PyMongo Google group addressing this problem (https://groups.google.com/forum/#!topic/mongodb-user/uBAbY1wdQbs), but I can't seem to find the Binary object that was used to fix that problem, and it doesn't seem to be included in Python (I'm using 2.7).

Right now the problem I'm getting is that the StringIO object is not JSON serializable, which is sensible, but pymongo needs a valid utf8 object passed to it. I tried using a base64 encoding of the StringIO.getvalue(), and just directly serializing the same value.

Of course the PDF is not valid UTF-8, so I'm wondering if there's another way to have pymongo recognize that I am sending it raw binary data. Any help is appreciated.

Slater Victoroff asked Aug 13 '13 15:08



1 Answer

The Google group thread is actually correct; however, sometime after that post the Binary class was moved to the bson namespace, so you must import it from there.

Good examples exist on the documentation page: http://api.mongodb.org/python/current/api/bson/binary.html

Sammaye answered Oct 04 '22 08:10
