Does anyone have experience working with pycassa? I have a question about it: how do I get all the keys that are stored in the database?
In this small snippet we need to supply the keys in order to get the associated columns (here the keys are 'foo' and 'bar'). That is fine, but my requirement is to get all the keys (only the keys) at once as a Python list or similar data structure.
cf.multiget(['foo', 'bar'])
{'foo': {'column1': 'val2'}, 'bar': {'column1': 'val3', 'column2': 'val4'}}
Thanks.
Try this:
list(cf.get_range().get_keys())
More good stuff here: http://github.com/vomjom/pycassa
You can try cf.get_range(column_count=0, filter_empty=False).
# get_range() returns a generator of (key, columns) pairs, so print only the keys.
for value in cf.get_range(column_count=0, filter_empty=False):
    print(value[0])
get_range([start][, finish][, columns][, column_start][, column_finish][, column_reversed][, column_count][, row_count][, include_timestamp][, super_column][, read_consistency_level][, buffer_size])
Get an iterator over rows in a specified key range.
http://pycassa.github.com/pycassa/api/pycassa/columnfamily.html#pycassa.columnfamily.ColumnFamily.get_range
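The loop above can be wrapped in a small helper that returns the keys as a list. This is only a sketch: FakeColumnFamily is a hypothetical stand-in (it is not part of pycassa) that imitates the (key, columns) pairs get_range() yields, so the snippet runs without a Cassandra cluster. With a real connection you would pass your own ColumnFamily object to all_keys() instead.

```python
class FakeColumnFamily(object):
    """Hypothetical stand-in for a pycassa ColumnFamily (not part of pycassa)."""

    def __init__(self, rows):
        self._rows = rows  # dict mapping row key -> dict of columns

    def get_range(self, column_count=100, filter_empty=True):
        # Imitates the (key, columns) pairs that get_range() yields.
        # Keys are sorted here only to make the output deterministic;
        # a real cluster returns rows in the partitioner's order.
        for key in sorted(self._rows):
            columns = dict(list(self._rows[key].items())[:column_count])
            if filter_empty and not columns:
                continue
            yield key, columns


def all_keys(cf):
    """Collect only the row keys from a get_range() scan."""
    return [key for key, _columns in
            cf.get_range(column_count=0, filter_empty=False)]


cf = FakeColumnFamily({'foo': {'column1': 'val2'},
                       'bar': {'column1': 'val3', 'column2': 'val4'}})
keys = all_keys(cf)
print(keys)
```

Because the helper consumes the whole generator, it reads every row key into memory; for a very large column family, iterating over get_range() directly may be the safer choice.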
A minor improvement on Santhosh's solution:
dict(cf.get_range(column_count=0,filter_empty=False)).keys()
If you care about order:
from collections import OrderedDict
OrderedDict(cf.get_range(column_count=0, filter_empty=False)).keys()
get_range returns a generator of (key, columns) pairs. We can build a dict from the generator and take its keys.
column_count=0 limits the results to the row key. However, because these rows then carry no columns, we also need filter_empty.
filter_empty=False lets those column-less rows through. The trade-off is that empty rows and range ghosts may now be included in the result.
If we don't mind more overhead, fetching just the first column filters out the empty rows and range ghosts:
dict(cf.get_range(column_count=1)).keys()
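To make the trade-off concrete, here is a rough sketch that runs without a cluster. fake_get_range is a hypothetical function (not a pycassa API) that only imitates how column_count and filter_empty are described above, and the 'ghost' row is made-up data standing in for a deleted row that still appears in a range scan.

```python
from collections import OrderedDict


def fake_get_range(rows, column_count=100, filter_empty=True):
    """Hypothetical stand-in imitating the (key, columns) pairs of get_range()."""
    for key, columns in rows:
        limited = OrderedDict(list(columns.items())[:column_count])
        if filter_empty and not limited:
            continue  # empty rows and range ghosts are skipped
        yield key, limited


# 'ghost' models a row with no live columns (assumed example data).
rows = [('foo', {'column1': 'val2'}),
        ('ghost', {}),
        ('bar', {'column1': 'val3', 'column2': 'val4'})]

# Keys only: cheapest, but the ghost row is included.
with_ghosts = list(OrderedDict(
    fake_get_range(rows, column_count=0, filter_empty=False)).keys())

# First column only: slightly more data per row, ghost row filtered out.
without_ghosts = list(OrderedDict(fake_get_range(rows, column_count=1)).keys())

print(with_ghosts)
print(without_ghosts)
```

The first list contains 'ghost' and the second does not, which mirrors the difference between the column_count=0 and column_count=1 variants described above.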