Getting correct timestamp from cassandra using datastax python-driver

I am retrieving timestamps from a table using the datastax python-driver. I want to store the previously retrieved timestamp in a variable and use it in the next query to retrieve the first timestamp greater than it. The query basically looks like this:

cqlsh> SELECT insert_time, message FROM cf WHERE message_key='q1' AND insert_time>'2013-10-30 10:32:44+0530' ORDER BY insert_time ASC LIMIT 1;


 insert_time              | message
--------------------------+----------------------------------
 2013-10-30 10:32:45+0530 | 83500612412011e3ab6c1c3e84abd9db

As you can see, the timestamp from CQL is 2013-10-30 10:32:45+0530. But when I retrieve it via python-driver, the result is different (I am executing the Python query on a different system, not on any of the Cassandra nodes):

>>> from cassandra.cluster import Cluster
>>> c = Cluster(['10.60.60.2'])
>>> session = c.connect()
>>> q = "SELECT insert_time, message FROM cf WHERE message_key='q1' AND insert_time>'2013-10-30 10:32:44+0530' ORDER BY insert_time ASC LIMIT 1"
>>> rows = session.execute(q)
>>> print rows
[Row(insert_time=datetime.datetime(2013, 10, 30, 5, 2, 45, 4000), message=u'83500612412011e3ab6c1c3e84abd9db')]
>>> timestamp = rows[0][0]
>>> print timestamp
2013-10-30 05:02:45.004000

As you can see, the timestamp from python-driver is 2013-10-30 05:02:45.004000, which differs from the CQL one. Not only is the time different, but the representation has changed too: there is no UTC offset, and the value now includes a sub-second part. In this form I cannot use it for comparison in subsequent queries.

QUESTIONS

  1. What am I doing wrong while retrieving timestamps in Python?
  2. Is there a way to output epoch time as an int instead of the datetime format?
  3. Does this have something to do with clock sync or time zones?
  4. How can I reuse the timestamps retrieved in Python to compare against Cassandra timestamps?

Thanks in advance. Appreciate your help

SETUP

  • single host machine running VMs;
  • Cassandra sandbox: 3 headless VMs running as a single-DC cluster;
  • Python code executed from the host machine;
  • VM date and time synchronized with the host using NTP;
  • [cqlsh 4.0.0 | Cassandra 2.0.0 | CQL spec 3.1.0 | Thrift protocol 19.37.0]
Asked Oct 30 '13 at 07:10 by quicksilvermd


1 Answer

It looks like cqlsh is displaying the timestamp in your local timezone (which is +0530). The Python driver returns datetimes in UTC, as naive datetime objects with no offset attached, so 05:02:45 UTC and 10:32:45+0530 are the same instant. For what it's worth, the data is stored in Cassandra as a Unix timestamp, which doesn't have a concept of timezones.

My suggestion is that you always use UTC for datetimes until just before displaying it to the user.
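As a minimal sketch of that suggestion, using only the standard library and the value from the question: it assumes the driver hands back a naive datetime that represents UTC, as described above. The helper names `utc_naive_to_epoch_ms` and `utc_naive_to_local` are made up for illustration and are not part of the driver.

```python
import calendar
from datetime import datetime, timedelta, timezone


def utc_naive_to_epoch_ms(dt):
    """Convert a naive datetime that represents UTC to epoch milliseconds."""
    # calendar.timegm treats the time tuple as UTC, unlike time.mktime,
    # which would apply the local timezone of the machine running the code.
    return calendar.timegm(dt.utctimetuple()) * 1000 + dt.microsecond // 1000


def utc_naive_to_local(dt, offset_minutes):
    """Attach UTC to a naive datetime, then shift it to a fixed offset for display."""
    display_tz = timezone(timedelta(minutes=offset_minutes))
    return dt.replace(tzinfo=timezone.utc).astimezone(display_tz)


# The value the driver returned in the question (assumed to be UTC).
driver_value = datetime(2013, 10, 30, 5, 2, 45, 4000)

epoch_ms = utc_naive_to_epoch_ms(driver_value)
local = utc_naive_to_local(driver_value, 330)  # +0530 is 330 minutes

print(epoch_ms)
print(local.isoformat())
```

The second print shows the same wall-clock time that cqlsh displayed, plus the 4 ms that cqlsh dropped. That dropped sub-second part is probably why comparing against the cqlsh string seems off: the stored value is 10:32:45.004, not 10:32:45.000. CQL also accepts a timestamp as an integer number of milliseconds since the epoch, so `epoch_ms` should be usable in the `insert_time >` comparison of the next query without any timezone formatting.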

Answered Oct 14 '22 at 00:10 by Tyler Hobbs