
Cassandra Hector: How to retrieve all rows of a column family?

I am looking for a code example to retrieve all rows and all columns of a column family. Something like:

SELECT * FROM MyTable

I see that this can be done using a RangeSlicesQuery, but you still have to provide a certain range. And I think you have to specify the column names too. Is there a clean and safe way to do this?

Using Hector 1.0 and Cassandra 1.0.

J. Volkya asked Dec 07 '11 16:12



2 Answers

Try something like this:

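import java.util.Iterator;
import java.util.UUID;

import me.prettyprint.cassandra.model.QuorumAllConsistencyLevelPolicy;
import me.prettyprint.cassandra.serializers.LongSerializer;
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.cassandra.serializers.UUIDSerializer;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.OrderedRows;
import me.prettyprint.hector.api.beans.Row;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.query.QueryResult;
import me.prettyprint.hector.api.query.RangeSlicesQuery;

public class Dumper {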
    private final Cluster cluster;
    private final Keyspace keyspace;

    public Dumper() {
        this.cluster = HFactory.getOrCreateCluster("Name", "hostname");
        this.keyspace = HFactory.createKeyspace("Keyspace", cluster, new QuorumAllConsistencyLevelPolicy());
    }

    public void run() {
        int row_count = 100;

        RangeSlicesQuery<UUID, String, Long> rangeSlicesQuery = HFactory
            .createRangeSlicesQuery(keyspace, UUIDSerializer.get(), StringSerializer.get(), LongSerializer.get())
            .setColumnFamily("Column Family")
            .setRange(null, null, false, 10)
            .setRowCount(row_count);

        UUID last_key = null;

        while (true) {
            rangeSlicesQuery.setKeys(last_key, null);
            System.out.println(" > " + last_key);

            QueryResult<OrderedRows<UUID, String, Long>> result = rangeSlicesQuery.execute();
            OrderedRows<UUID, String, Long> rows = result.get();
            Iterator<Row<UUID, String, Long>> rowsIterator = rows.iterator();

            // The start key of a range query is inclusive, so every page after the first
            // begins with the last row of the previous page; skip it.
            if (last_key != null && rowsIterator.hasNext()) rowsIterator.next();

            while (rowsIterator.hasNext()) {
              Row<UUID, String, Long> row = rowsIterator.next();
              last_key = row.getKey();

              // Deleted rows can still be returned with no columns ("range ghosts"); skip them.
              if (row.getColumnSlice().getColumns().isEmpty()) {
                continue;
              }


              System.out.println(row);
            }

            if (rows.getCount() < row_count)
                break;
        }
    }

    public static void main(String[] args) {
        new Dumper().run();
    }
}

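This pages through the column family 100 rows at a time. Because the start key of a range query is inclusive, each page after the first begins with the last row of the previous page, which is why the loop skips it. Only the first 10 columns of each row are fetched, so very wide rows need column-level paging as well.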

The example assumes a column family with UUID row keys, string column names and long values. For a different schema, change the type parameters and the serializers to match.
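
The paging logic in the loop above can be tried without a cluster. What follows is a minimal sketch, not Hector code: a TreeMap stands in for the column family, and the helpers fetchPage and readAllKeys are hypothetical names that mimic a start-inclusive range query and the skip-the-first-row loop. A real cluster using the random partitioner returns rows in token order rather than key order, but the paging pattern should be the same.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

class PagingSketch {

    // Stand-in for one range query (assumption: not a Hector call).
    // Returns up to `limit` entries with key >= startKey, or from the
    // beginning of the store when startKey is null.
    static List<Map.Entry<String, Long>> fetchPage(
            NavigableMap<String, Long> store, String startKey, int limit) {
        NavigableMap<String, Long> tail =
                (startKey == null) ? store : store.tailMap(startKey, true);
        List<Map.Entry<String, Long>> page = new ArrayList<>();
        for (Map.Entry<String, Long> entry : tail.entrySet()) {
            if (page.size() == limit) break;
            page.add(entry);
        }
        return page;
    }

    // Walks the whole store page by page and returns every key exactly once.
    static List<String> readAllKeys(NavigableMap<String, Long> store, int pageSize) {
        if (pageSize < 2) {
            // With a page size of 1 each later page would hold only the
            // already-seen row, so the loop could never advance.
            throw new IllegalArgumentException("pageSize must be at least 2");
        }
        List<String> seen = new ArrayList<>();
        String lastKey = null;
        while (true) {
            List<Map.Entry<String, Long>> page = fetchPage(store, lastKey, pageSize);
            // Every page after the first starts with the previous page's last row.
            int from = (lastKey != null && !page.isEmpty()) ? 1 : 0;
            for (int i = from; i < page.size(); i++) {
                lastKey = page.get(i).getKey();
                seen.add(lastKey);
            }
            // A short page means the end of the data was reached.
            if (page.size() < pageSize) break;
        }
        return seen;
    }

    public static void main(String[] args) {
        NavigableMap<String, Long> store = new TreeMap<>();
        for (int i = 0; i < 7; i++) store.put("row" + i, (long) i);
        System.out.println(readAllKeys(store, 3));
    }
}
```

Note the guard on the page size: the same constraint applies to row_count in the Hector version, which would loop forever with a value of 1.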

tom.wilkie answered Sep 29 '22 20:09



Try this out:

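    // MAX is assumed to be a constant defined elsewhere. It caps both the number
    // of rows returned and the number of columns fetched per row.
    int rowCount = MAX;
    // Assumption: the original STRINGSERIALIZER constant held StringSerializer.get().
    StringSerializer stringSerializer = StringSerializer.get();
    RangeSlicesQuery<String, String, String> rangeSlicesQuery = HFactory
            .createRangeSlicesQuery(keyspace2, stringSerializer,
                    stringSerializer, stringSerializer)
            .setColumnFamily(columnFamily)
            .setRange(null, null, false, rowCount)
            .setRowCount(rowCount);
    String lastKey = null;
    // A null start and end key covers the whole column family, but this single
    // query returns at most rowCount rows; it does not page.
    rangeSlicesQuery.setKeys(lastKey, null);
    QueryResult<OrderedRows<String, String, String>> result = rangeSlicesQuery
            .execute();
    OrderedRows<String, String, String> rows = result.get();
    for (Row<String, String, String> row : rows) {
        String cassandraKey = row.getKey();
        // Process the row here.
    }
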
Jimmy answered Sep 29 '22 19:09
