How can I write a query to find all records in a table that have a null/empty field? I tried the query below, but it doesn't return anything.
SELECT * FROM book WHERE author = 'null';
You cannot query by nulls in Cassandra (like you can in a relational database), for two reasons. First, Cassandra requires all fields in the WHERE clause to be part of the primary key. Second, Cassandra will not allow a part of a primary key to hold a null value.
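Since the server cannot filter on a missing value, one possible workaround is to read the rows and filter them on the client. The sketch below only illustrates that idea under some assumptions: the rows are plain dictionaries standing in for whatever a driver would return, the sample data is invented, and rows_missing is a hypothetical helper, not part of any Cassandra API. It scans every row, so it is only reasonable for small tables.

```python
def rows_missing(rows, column):
    """Return the rows whose `column` is absent, None, or a blank string."""
    result = []
    for row in rows:
        value = row.get(column)  # an absent key is treated like null
        if value is None or (isinstance(value, str) and value.strip() == ""):
            result.append(row)
    return result


# Invented sample data standing in for the result of SELECT * FROM book.
books = [
    {"title": "Dune", "author": "Frank Herbert"},
    {"title": "Untitled draft", "author": None},
    {"title": "Anonymous pamphlet", "author": ""},
    {"title": "Lost scroll"},
]

missing = rows_missing(books, "author")
print([b["title"] for b in missing])
```

The helper treats a missing key, None, and a blank string the same way, which matches the "null/empty" wording of the question; drop the string check if only true nulls matter.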
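SELECT * FROM yourTableName WHERE yourSpecificColumnName IS NULL OR yourSpecificColumnName = ' ';
In a relational database, the IS NULL condition matches rows where the column has no value, and the comparison with ' ' matches rows where the column holds a blank string. CQL does not accept IS NULL or OR in the WHERE clause of a SELECT, so this query works in SQL but not in Cassandra.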
null fields don't exist in Cassandra unless you add them yourself.
You might be thinking of the CQL data model, which hides certain implementation details in order to have a more understandable data model. Cassandra is sparse, which means that only data that is used is actually stored. You can visualize this by adding in some test data to Cassandra through CQL.
cqlsh> CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1 } ;
cqlsh> use test ;
cqlsh:test> CREATE TABLE foo (name text, age int, pet text, primary key (name)) ;
cqlsh:test> insert into foo (name, age, pet) values ('yves', 81, 'german shepherd') ;
cqlsh:test> insert into foo (name, pet) values ('coco', 'ferret') ;
cqlsh:test> SELECT * FROM foo ;
name | age  | pet
-----+------+-----------------
coco | null | ferret
yves | 81   | german shepherd
So even though it appears that there is a null value, no value is actually stored; CQL shows you a null because this makes more sense intuitively.
If you take a look at the table from the Thrift side, you can see that the table contains no such value for coco's age.
$ bin/cassandra-cli
[default@unknown] use test;
[default@test] list foo;
RowKey: coco
=> (name=, value=, timestamp=1389137986090000)
=> (name=pet, value=666572726574, timestamp=1389137986090000)
-------------------
RowKey: yves
=> (name=, value=, timestamp=1389137973402000)
=> (name=age, value=00000051, timestamp=1389137973402000)
=> (name=pet, value=6765726d616e207368657068657264, timestamp=1389137973402000)
Here, you can clearly see that yves has two columns: age and pet, while coco only has one: pet.
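The behaviour above can be modelled with a small Python sketch. This is not how Cassandra is implemented; it only mimics the idea that each row stores just the cells that were written, and that the CQL layer shows null for any declared column without a cell. The names COLUMNS, storage, and select_all are made up for this illustration.

```python
# Declared non-key columns of the table foo.
COLUMNS = ["age", "pet"]

# Sparse storage: each row key maps only to the cells that were written.
storage = {
    "yves": {"age": 81, "pet": "german shepherd"},
    "coco": {"pet": "ferret"},  # no age cell was ever written
}


def select_all(storage, columns):
    """Mimic SELECT *: report None (null) for declared columns with no cell."""
    rows = []
    # Sorted by name only to make the output deterministic in this sketch.
    for name in sorted(storage):
        cells = storage[name]
        row = {"name": name}
        for col in columns:
            row[col] = cells.get(col)  # missing cell is presented as null
        rows.append(row)
    return rows


for row in select_all(storage, COLUMNS):
    print(row)
```

In this model the null for coco's age exists only in the output of select_all; the storage dictionary itself has no entry for it, which is why there is nothing for a query to match against.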