
How do I count and enumerate the keys in an lmdb with python?

Tags: python, lmdb

import lmdb
env = lmdb.open(path_to_lmdb)

Now I seem to need to create a transaction and a cursor, but how do I get a list of keys that I can iterate over?

Doug asked Sep 09 '15 21:09


3 Answers

As Sait pointed out, you can iterate over a cursor to collect all keys. However, this may be a bit inefficient, as it also loads the values. You can avoid this by calling cursor.iternext() with values=False.

with env.begin() as txn:
  keys = list(txn.cursor().iternext(values=False))

I ran a short benchmark of both methods on a DB with 2^20 entries, each with a 16 B key and a 1024 B value.

Retrieving keys by iterating over the cursor (including values) took 874 ms on average over 7 runs, while the second method, which returns only the keys, took 517 ms. These results may differ depending on the size of the keys and values.

critop answered Oct 04 '22 07:10


You can get the total number of keys without enumerating them individually. Named sub-databases are also counted, each as one entry in the main database:

with env.begin() as txn:
    length = txn.stat()['entries']

Test results with a hand-made database of 1,000,000 entries on my laptop:

  • the method above is instantaneous (0.0 s)
  • the iteration method takes about 1 second.
sytrus answered Nov 14 '22 03:11


Are you looking for something like this?

with env.begin() as txn:
    with txn.cursor() as curs:
        # do stuff
        # keys are bytes in Python 3; get() returns the value, or None if the key is absent
        print('value is:', curs.get(b'key'))

Update:

This may not be the fastest:

with env.begin() as txn:
    myList = [key for key, _ in txn.cursor()]
    print(myList)

Disclaimer: I don't know anything about the library; I just searched its docs for "key".

Sait answered Nov 14 '22 02:11