
how to append data to existing LMDB?

I have around 1 million images to put in this dataset, appended to the set 10,000 at a time.

I'm sure the map_size is wrong, going by this article.

I used this line to create the set:

env = lmdb.open(Path+'mylmdb', map_size=int(1e12))

I call this line every 10,000 samples to write data to the file, where X and Y are placeholders for the data to be put in the LMDB:

env = create(env, X[:counter,:,:,:],Y,counter)


def create(env, X,Y,N):
    with env.begin(write=True) as txn:
        # txn is a Transaction object
        for i in range(N):
            datum = caffe.proto.caffe_pb2.Datum()
            datum.channels = X.shape[1]
            datum.height = X.shape[2]
            datum.width = X.shape[3]
            datum.data = X[i].tobytes()  # or .tostring() if numpy < 1.9
            datum.label = int(Y[i])
            str_id = '{:08}'.format(i)

            # The encode is only essential in Python 3
            txn.put(str_id.encode('ascii'), datum.SerializeToString())
        #pdb.set_trace()
    return env

How can I edit this code so that new data is appended to this LMDB instead of replacing what is already there? The present method writes each batch to the same positions, so every batch overwrites the previous one. I have checked the number of entries after generation with env.stat().

Arsenal Fanatic asked Jan 16 '16 00:01


1 Answer

Let me expand on my comment above.

All entries in LMDB are stored under unique keys, and your database already contains keys for i = 0, 1, 2, .... Writing a second batch with the same values of i overwrites those entries, so each new sample needs a key that is not yet in use. The simplest way to do that is to find the largest key in the existing DB and keep counting up from it.

Assuming that the existing keys are consecutive and start at 0, the largest key is one less than the number of entries:

max_key = env.stat()["entries"] - 1

Otherwise, a more thorough approach is to iterate over all keys. (Check this.)

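max_key = -1  # -1 means "no keys yet", so the first new key is 0
with env.begin() as txn:  # cursors are created from a transaction
    for key, value in txn.cursor():
        max_key = max(max_key, int(key))  # keys come back as bytes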
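
LMDB keeps keys in sorted byte order, and fixed-width zero-padded keys such as '00000042' sort the same way numerically, so the largest key can also be read from the end of the database without scanning every entry. Below is a minimal sketch of that idea, assuming every existing key was written with the '{:08}' format from the question. The helper name last_index is made up for illustration and is not part of the lmdb package.

```python
def last_index(env):
    """Return the largest integer key in the database, or -1 if it is empty.

    Assumes all keys are fixed-width, zero-padded ASCII digits, so the
    last key in LMDB's sorted order is also the numerically largest.
    """
    with env.begin() as txn:          # read-only transaction
        cursor = txn.cursor()
        if cursor.last():             # False when the database is empty
            return int(cursor.key())  # e.g. b'00009999' -> 9999
        return -1
```

With this helper, max_key = last_index(env) could stand in for the loop above. If the keys were not zero-padded to a common width, byte order and numeric order would differ and the full scan would still be needed.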

Finally, simply replace line 7 of your for loop,

str_id = '{:08}'.format(i)

by

str_id = '{:08}'.format(max_key + 1 + i)

to append to the existing database.
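
Put together, the append step can be sketched as a function that takes the index to start from and returns the next free one, so the caller can carry a running counter across batches instead of recomputing the largest key each time. This is only a sketch under assumptions: make_key and append_batch are hypothetical names, values stands for already serialized records (for example the output of datum.SerializeToString()), and env is an environment opened with lmdb.open as in the question.

```python
def make_key(index, width=8):
    """Zero-padded ASCII key, matching the '{:08}' format in the question."""
    return '{:0{w}d}'.format(index, w=width).encode('ascii')


def append_batch(env, values, start):
    """Store values under keys start, start + 1, ... and return the next free index."""
    with env.begin(write=True) as txn:
        for offset, value in enumerate(values):
            key = make_key(start + offset)
            # overwrite=False makes put() return False rather than
            # silently replacing an entry that already exists.
            if not txn.put(key, value, overwrite=False):
                # Raising inside the with-block aborts the transaction.
                raise KeyError('key already exists: %r' % key)
    return start + len(values)
```

A typical loop would start with next_index = env.stat()["entries"] (valid when the existing keys are consecutive from 0) and then call next_index = append_batch(env, batch, next_index) for every batch of 10,000 samples. Refusing to overwrite is a design choice for this sketch: it turns an accidental key collision into an error instead of lost data.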

Sudeep Juvekar answered Sep 29 '22 00:09