
python tempfile + gzip + json dump

I want to dump a very large dictionary into a compressed JSON file using Python 3 (3.5).

import gzip
import json
import tempfile

data = {"verylargedict": True}

with tempfile.NamedTemporaryFile("w+b", dir="/tmp/", prefix=".json.gz") as fout:
    with gzip.GzipFile(mode="wb", fileobj=fout) as gzout:
        json.dump(data, gzout)

However, I get this error:

Traceback (most recent call last):
  File "test.py", line 13, in <module>
    json.dump(data, gzout)
  File "/usr/lib/python3.5/json/__init__.py", line 179, in dump
    fp.write(chunk)
  File "/usr/lib/python3.5/gzip.py", line 258, in write
    data = memoryview(data)
TypeError: memoryview: a bytes-like object is required, not 'str'

Any thoughts?

asked Nov 17 '25 08:11 by Amir


1 Answer

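A gzip.GzipFile object has no text mode: its write method only accepts bytes, while json.dump writes str chunks, hence the TypeError. So I would create a wrapper and pass it as the file object. This wrapper takes the text produced by json and encodes it to bytes before writing it to the gzip file: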

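class wrapper:
    """Write-only adapter: accepts str and forwards encoded bytes."""
    def __init__(self, gzout):
        self.__handle = gzout  # binary file object, e.g. a gzip.GzipFile
    def write(self, data):
        # json.dump passes str chunks; encode them (UTF-8 by default)
        self.__handle.write(data.encode())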

Use it like this:

json.dump(data, wrapper(gzout))

Each time json.dump wants to write to the object, the wrapper.write method is called, which encodes the text to bytes and writes them to the binary gzip stream.
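
Putting the pieces together, here is a minimal end-to-end sketch of the wrapper approach. It repeats the wrapper class so that it runs on its own, and it reads the file back only to check the round trip. Note one assumption: it passes suffix=".json.gz" rather than the question's prefix=, guessing that the intent was a file name ending in .json.gz, and it drops dir="/tmp/" so the platform default temp directory is used.

```python
import gzip
import json
import tempfile


class wrapper:
    """Write-only adapter: accepts str and forwards UTF-8 bytes."""

    def __init__(self, gzout):
        self.__handle = gzout

    def write(self, data):
        self.__handle.write(data.encode("utf-8"))


data = {"verylargedict": True}  # stand-in for the real dictionary

# suffix= is an assumption about intent; the question used prefix=.
with tempfile.NamedTemporaryFile("w+b", suffix=".json.gz") as fout:
    with gzip.GzipFile(mode="wb", fileobj=fout) as gzout:
        json.dump(data, wrapper(gzout))
    # Closing the GzipFile writes the gzip trailer but leaves fout open,
    # because fout was supplied through fileobj=.
    fout.seek(0)
    with gzip.GzipFile(mode="rb", fileobj=fout) as gzin:
        restored = json.loads(gzin.read().decode("utf-8"))

print(restored == data)
```

The temporary file is deleted when the outer with block exits, so anything that needs the compressed data has to read it before that point.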

(Some built-in wrappers from the io module may fit too, but this implementation is simple and works.)
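
As one candidate for such a built-in wrapper, io.TextIOWrapper can sit on top of the GzipFile and do the encoding itself. The following is a sketch under the same assumptions as the question (a small stand-in dictionary, a named temporary file), with suffix= used in place of the question's prefix= as a guess at the intended file name; it is not the answer author's code.

```python
import gzip
import io
import json
import tempfile

data = {"verylargedict": True}  # stand-in for the real dictionary

# suffix= is an assumption about intent; the question used prefix=.
with tempfile.NamedTemporaryFile("w+b", suffix=".json.gz") as fout:
    with gzip.GzipFile(mode="wb", fileobj=fout) as gzout:
        # TextIOWrapper accepts str and writes encoded bytes to gzout.
        # Leaving this block flushes the text layer and closes gzout.
        with io.TextIOWrapper(gzout, encoding="utf-8") as text_out:
            json.dump(data, text_out)

    # Read the file back through the same two layers to check it.
    fout.seek(0)
    with gzip.GzipFile(mode="rb", fileobj=fout) as gzin:
        with io.TextIOWrapper(gzin, encoding="utf-8") as text_in:
            restored = json.load(text_in)

print(restored == data)
```

gzip.open(fout, mode="wt", encoding="utf-8") should give the same result in one call, since in text mode it wraps a GzipFile in a TextIOWrapper and it accepts an existing file object as well as a file name.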

answered Nov 19 '25 22:11 by Jean-François Fabre


