
How to insert into Cassandra without a null value in a column

I'm trying to store some tweets in a Cassandra database using Python and the DataStax driver (Python -> Cassandra).

Everything works well, but there's something I can't understand: how do I insert a row without a null value?

As an example:

CREATE TABLE tweets (
    id_tweet text PRIMARY KEY,
    texttweet text,
    hashtag text,
    url text
);

If I insert a row without a url value, the insert works, but in Cassandra I see "null" in the url column.

I checked this doc:

http://datastax.github.io/python-driver/getting_started.html#passing-parameters-to-cql-queries

So I tried two different ways:

In the first one, I build the whole query as a single string and execute it.

requete = "insert into Tweets(id_tweet,texttweet,hashtag,url) values ('%s','%s','%s','%s')"%(id_tweet,texttweet,hashtag,url)
session.execute(requete)

Or, I pass the parameters to the execute function.

requete2 = "insert into Tweets(id_tweet,texttweet,hashtag,url) values (%s,%s,%s,%s)"
session.execute(requete2, (id_tweet, texttweet, hashtag, url))

The problem is that both ways give me a null value if, for example, my tweet has no URL or hashtag.

Is it possible for the column not to appear at all when it's empty in a row, as I see in a lot of tutorials?


Thanks.

Gohmz asked Jan 06 '16 16:01



1 Answer

This is something you can do if you are using Cassandra 2.2 or later. Cassandra 2.2 introduced the concept of 'UNSET', which lets you use the same statement to insert a row even if you don't want to provide some of the values. Here's how you would do it:

from cassandra.query import UNSET_VALUE

# Prepared statements use ? placeholders; UNSET_VALUE means "write nothing for this column".
ps = session.prepare("insert into tweets(id_tweet,texttweet,hashtag,url) values (?,?,?,?)")
session.execute(ps, ("id", "hello world!", UNSET_VALUE, UNSET_VALUE))

This tells Cassandra that you don't want to insert these values as null; rather, they should be omitted altogether, so no 'null' values (internally these are tombstones) are inserted into Cassandra.

On your side, I think you would need some preprocessing logic to convert any incoming None values into UNSET_VALUE. The pre-2.2 solution would be to adjust your query based on which columns are absent, i.e. insert into tweets(id_tweet,texttweet) values (?,?) if hashtag and url are None.
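The two approaches above could be sketched roughly as follows. This is only a minimal illustration, not something taken from the driver: the helper names none_to_unset and build_insert are hypothetical, and the fallback sentinel exists only so the snippet can run on a machine without the driver installed. The second helper interpolates table and column names directly into the CQL text, so it assumes those names come from your own code and never from user input.

```python
try:
    from cassandra.query import UNSET_VALUE
except ImportError:
    # Assumption: stand-in sentinel so this sketch runs without the driver installed.
    UNSET_VALUE = object()


def none_to_unset(values):
    """Cassandra 2.2+: replace None with UNSET_VALUE before binding to a prepared statement."""
    return tuple(UNSET_VALUE if v is None else v for v in values)


def build_insert(table, row):
    """Pre-2.2: build an INSERT that names only the columns whose value is not None.

    Returns (cql, params) using the driver's %s placeholder style for
    non-prepared statements. Table and column names must be trusted.
    """
    present = {col: val for col, val in row.items() if val is not None}
    columns = ", ".join(present)
    placeholders = ", ".join(["%s"] * len(present))
    cql = "INSERT INTO {} ({}) VALUES ({})".format(table, columns, placeholders)
    return cql, tuple(present.values())
```

With the prepared statement ps from above, usage would look something like session.execute(ps, none_to_unset((id_tweet, texttweet, hashtag, url))) on 2.2 or later, or cql, params = build_insert("tweets", row) followed by session.execute(cql, params) on older versions. Note that the primary key (id_tweet here) always has to be supplied.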

On the retrieval end, there should technically be a way to distinguish between null and unset values (I'll look into this), but I don't think such a mechanism exists in the Python driver. I'll open a ticket if it's possible to do in the protocol but the feature isn't present in the driver. EDIT: It doesn't look like Cassandra differentiates, when returning data, between values that were explicitly set to null (which are marked internally as tombstones) and those that were never set.

You can read more about 'UNSET' and other 2.2 features in the Python driver in this blog post.

Andy Tolbert answered Oct 27 '22 08:10