Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Inserting arbitrary columns in Cassandra using CQL3

Prior to CQL3 one could insert arbitrary columns such as columns that are named by a date:

cqlsh:test>CREATE TABLE seen_ships (day text PRIMARY KEY)
                WITH comparator=timestamp AND default_validation=text;
cqlsh:test>INSERT INTO seen_ships (day, '2013-02-02 00:08:22')
                VALUES ('Tuesday', 'Sunrise');

Per this post It seems that things are different in CQL3. Is it still somehow possible to insert arbitrary columns? Here's my failed attempt:

cqlsh:test>CREATE TABLE seen_ships (
    day text,
    time_seen timestamp,
    shipname text,
    PRIMARY KEY (day, time_seen)
);

cqlsh:test>INSERT INTO seen_ships (day, 'foo') VALUES ('Tuesday', 'bar');

Here I get Bad Request: line 1:29 no viable alternative at input 'foo'

So I try a slightly different table because maybe this is a limitation of compound keys:

cqlsh:test>CREATE TABLE seen_ships ( day text PRIMARY KEY );
cqlsh:test>INSERT INTO seen_ships (day, 'foo') VALUES ('Tuesday', 'bar');

Again with the Bad Request: line 1:29 no viable alternative at input 'foo'

What am I missing here?

like image 998
JnBrymn Avatar asked Jul 03 '13 03:07

JnBrymn


People also ask

How do I add a column in Cassandra?

You can add a column in the table by using the ALTER command. While adding column, you have to aware that the column name is not conflicting with the existing column names and that the table is not defined with compact storage option.

How do I insert multiple rows in Cassandra?

There is a batch insert operation in Cassandra. You can batch together inserts, even in different column families, to make insertion more efficient. In Hector, you can use HFactory. createMutator then use the add methods on the returned Mutator to add operations to your batch.

How do you add values to a table in Cassandra?

Cassandra Create Data INSERT command is used to insert data into the columns of the table. Syntax: INSERT INTO <tablename> (<column1 name>, <column2 name>....)


1 Answers

There's a good blog post over on the Datastax blog about this: http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows

The answer is that yes, CQL3 supports dynamic colums, just not the way it worked in earlier versions of CQL. I don't really understand your example, you mix datestamps with strings in a way I don't see how it worked in CQL2 either. If I understand you correctly you want to make a timeline of ship sightings, where the partition key (row key) is the day and each sighting is a time/name pair. Here's a suggestion:

CREATE TABLE ship_sightings (
  day TEXT,
  time TIMESTAMP,
  ship TEXT,
  PRIMARY KEY (day, time)
)

And you insert entries with

INSERT INTO ship_sightings (day, time, ship) VALUES ('Tuesday', NOW(), 'Titanic')

however, you should probably use a TIMEUUID instead of TIMESTAMP (and the primary key could be a DATE), since otherwise you might add two sightings with the same timestamp and only one will survive.

This was an example of wide rows, but then there's the issue of dynamic columns, which isn't exactly the same thing. Here's an example of that in CQL3:

CREATE TABLE ship_sightings_with_properties (
  day TEXT,
  time TIMEUUID,
  ship TEXT,
  property TEXT,
  value TEXT,
  PRIMARY KEY (day, time, ship, property)
)

which you can insert into like this:

INSERT INTO ship_sightings_with_properties (day, time, ship, property, value)
VALUES ('Sunday', NOW(), 'Titanic', 'Color', 'Black')
# you need to repeat the INSERT INTO for each statement, multiple VALUES isn't
# supported, but I've not included them here to make this example shorter
VALUES ('Sunday', NOW(), 'Titanic', 'Captain', 'Edward John Smith')
VALUES ('Sunday', NOW(), 'Titanic', 'Status', 'Steaming on')
VALUES ('Monday', NOW(), 'Carapathia', 'Status', 'Saving the passengers off the Titanic')

The downside with this kind of dynamic columns is that the property names will be stored multiple times (so if you have a thousand sightings in a row and each has a property called "Captain", that string is saved a thousand times). On-disk compression takes away most of that overhead, and most of the time it's nothing to worry about.

Finally a note about collections in CQL3. They're a useful feature, but they are not a way to implement wide rows or dynamic columns. First of all they have a limit of 65536 items, but Cassandra can't enforce this limit, so if you add too many elements you might not be able to read them back later. Collections are mostly for small multi-values fields -- the canonical example is an address book where each row is an entry and where entries only have a single name, but multiple phone numbers, email addresses, etc.

like image 144
Theo Avatar answered Sep 27 '22 23:09

Theo