Append list of Python dictionaries to a file without loading it

Suppose I need to have a database file consisting of a list of dictionaries:

file:

[
  {"name":"Joe","data":[1,2,3,4,5]},
  {   ...                         },
           ...
]

I need to have a function that receives a list of dictionaries as shown above and appends it to the file. Is there any way to achieve that, say using json (or any other method), without loading the file?

EDIT1: Note: what I need is to append new dictionaries to an already existing file on disk.

jazzblue asked Aug 06 '13 18:08




2 Answers

If you want to avoid actually loading the file, json is not really the right approach. You could use a memory-mapped file instead and never load the file into memory: a memmap array can open the file and build an array "on disk" without loading anything into memory.

Create a memory-mapped array of dicts:

>>> import numpy as np
>>> a = np.memmap('mydict.dat', dtype=object, mode='w+', shape=(4,))
>>> a[0] = {'name':"Joe", 'data':[1,2,3,4]}
>>> a[1] = {'name':"Guido", 'data':[1,3,3,5]}
>>> a[2] = {'name':"Fernando", 'data':[4,2,6,9]}
>>> a[3] = {'name':"Jill", 'data':[9,1,9,0]}
>>> a.flush()
>>> del a

Now read the array, without loading the file:

>>> a = np.memmap('mydict.dat', dtype=object, mode='r')

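Calling tolist() loads the contents of the file into memory, but that's not required: you can index the array on disk without loading it. Be aware that an array with dtype=object stores references to Python objects rather than the dictionaries' contents, so the file is only meaningful within the interpreter session that wrote it; for data that must persist, a fixed-size numeric or structured dtype is the safer choice.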

>>> a.tolist()
[{'data': [1, 2, 3, 4], 'name': 'Joe'}, {'data': [1, 3, 3, 5], 'name': 'Guido'}, {'data': [4, 2, 6, 9], 'name': 'Fernando'}, {'data': [9, 1, 9, 0], 'name': 'Jill'}]

Creating a memory-mapped array takes a negligible amount of time and does not depend on the size of the file (e.g. 100 GB), because nothing is read from disk until you index into the array.

Mike McKerns answered Oct 20 '22 14:10


You can use json to dump the dicts, one per line, so that each line of the file is a single JSON dict. You lose the outer list, but you can add records with a simple append to the existing file.

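import json

def append_record(record):
    # 'a' mode adds to the end of the file without reading what is already there
    with open('my_file', 'a') as f:
        json.dump(record, f)
        # write '\n', not os.linesep: text mode already translates the newline
        f.write('\n')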

# demonstrate a program writing multiple records
for i in range(10):
    my_dict = {'number':i}
    append_record(my_dict)

The list can be assembled later:

with open('my_file') as f:
    my_list = [json.loads(line) for line in f]
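
Since the question asks for a function that receives a whole list of dictionaries, the same one-record-per-line idea can be sketched for a list as below. This is only a sketch: the names append_records and iter_records and the path parameter are made up for illustration and are not part of the json module.

```python
import json

def append_records(records, path='my_file'):
    """Append each dict in `records` to `path`, one JSON document per line.

    Hypothetical helper: the existing file content is never read.
    """
    with open(path, 'a') as f:
        for record in records:
            # json.dumps (without indent) never emits a raw newline,
            # so each record occupies exactly one line
            f.write(json.dumps(record) + '\n')

def iter_records(path='my_file'):
    """Yield records one at a time instead of building the full list."""
    with open(path) as f:
        for line in f:
            if line.strip():  # skip blank lines
                yield json.loads(line)
```

Reading through iter_records keeps only one record in memory at a time, which helps when the file is too large to assemble into a list.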

The file looks like

{"number": 0}
{"number": 1}
{"number": 2}
{"number": 3}
{"number": 4}
{"number": 5}
{"number": 6}
{"number": 7}
{"number": 8}
{"number": 9}
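
If the outer list has to stay in the file, as in the question, a different technique from the one above is to overwrite only the tail of the file: find the closing bracket, cut the file there, and write the new records followed by a fresh bracket. The sketch below assumes the file already holds a valid top-level JSON array encoded as UTF-8; append_to_json_array and _last_non_blank are hypothetical names, and the rewrite is not crash-safe.

```python
import json
import os

def _last_non_blank(f, end):
    """Return (offset, byte) of the last non-whitespace byte before `end`."""
    pos = end
    while pos > 0:
        pos -= 1
        f.seek(pos)
        byte = f.read(1)
        if byte not in b' \t\r\n':
            return pos, byte
    return -1, b''

def append_to_json_array(records, path):
    """Append dicts to a file containing a JSON array, touching only its tail.

    Hypothetical helper; assumes the top-level value in `path` is an array.
    """
    if not records:
        return
    payload = ',\n'.join(json.dumps(r) for r in records).encode('utf-8')
    with open(path, 'rb+') as f:
        size = f.seek(0, os.SEEK_END)
        close_pos, byte = _last_non_blank(f, size)
        if byte != b']':
            raise ValueError('file does not end with a JSON array')
        # The byte before ']' tells us whether the array is empty
        prev_pos, prev = _last_non_blank(f, close_pos)
        separator = b'\n' if prev == b'[' else b',\n'
        # Drop the old closing bracket, then write the new tail
        f.seek(prev_pos + 1)
        f.truncate()
        f.write(separator + payload + b'\n]\n')
```

Only the last few bytes are read, so the cost does not grow with the size of the file, but a crash between truncate and write would leave the file without its closing bracket.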

tdelaney answered Oct 20 '22 14:10