
How to do join queries with 2 or more tables in Cassandra CQL

Tags:

cassandra

cql

I am new to Cassandra. I have two tables, EVENTS and TOWER, and I need to join them for some queries, but I am not able to do it.

Structure of EVENTS table:

eid int PRIMARY KEY,
a_end_tow_id text,
a_home_circle text,
a_home_operator text,
a_imei text,
a_imsi text,

Structure of TOWER table:

 tid int PRIMARY KEY,
 tower_address_1 text,
 tower_address_2 text,
 tower_azimuth text,
 tower_cgi text,
 tower_circle text,
 tower_id_no text,
 tower_lat_d text,
 tower_long_d text,
 tower_name text,

Now I want to join these tables on EID and TID so that I can fetch the data from both tables.

BlueShark asked Jun 22 '13 07:06




2 Answers

Cassandra = No Joins. Your model is 100% relational. You need to rethink it for Cassandra. I would advise you to take a look at these slides. They dig deep into how to model data for Cassandra. Also, here is a webinar covering the topic. But stop thinking in terms of foreign keys and joining tables, because if you need relations, Cassandra isn't the tool for the job.

But why?
Because then you need to check consistency and do many other things that relational databases do, and so you lose the performance and scalability that Cassandra offers.

What can I do?
DENORMALIZE! Lots of data in one table? But the table will have too many columns!
So? Cassandra can handle a very large number of columns in a table.
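
As a rough sketch of what denormalizing could look like here: the table name `events_with_tower`, the choice of columns, and the helper `denormalize` are all hypothetical, and the example assumes (as in the question) that an event and its tower are matched outside this code. The idea is to copy the tower columns you query into each event row at write time, so one read returns everything.

```python
# Hypothetical denormalized table: one row per event, with the tower
# columns the queries need copied in when the event is written.
CREATE_EVENTS_WITH_TOWER = """
CREATE TABLE IF NOT EXISTS events_with_tower (
    eid int PRIMARY KEY,
    a_end_tow_id text,
    a_imei text,
    a_imsi text,
    tower_name text,
    tower_lat_d text,
    tower_long_d text
)
"""

# Columns taken from each source row (an illustrative subset).
EVENT_COLUMNS = ("eid", "a_end_tow_id", "a_imei", "a_imsi")
TOWER_COLUMNS = ("tower_name", "tower_lat_d", "tower_long_d")


def denormalize(event, tower):
    """Merge an event row and its tower row (both dicts) into one flat row."""
    row = {name: event[name] for name in EVENT_COLUMNS}
    row.update({name: tower[name] for name in TOWER_COLUMNS})
    return row
```

The cost is duplicated tower data and extra work on writes; if a tower's details change, every copied row has to be updated too.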

The other thing you can do is to simulate the join in your client application. Match the two datasets in your code, but this will be very slow because you'll have to iterate over all your information.
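
A minimal sketch of such a client-side join, assuming both tables have already been read into lists of dicts and that rows match when `eid` equals `tid`, as the question describes. The function name is made up for illustration.

```python
def join_events_with_towers(events, towers):
    """Inner-join event rows to tower rows where eid == tid."""
    # Index the towers once so each event lookup is a dict access
    # instead of a scan over the whole tower list.
    towers_by_id = {tower["tid"]: tower for tower in towers}

    joined = []
    for event in events:
        tower = towers_by_id.get(event["eid"])
        if tower is not None:
            # Column names do not overlap between the two tables,
            # so merging the dicts loses nothing.
            joined.append({**event, **tower})
    return joined
```

Indexing one side avoids a nested loop, but both tables still have to be pulled to the client in full, which is why this does not scale to large data sets.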

Another way is to carry out multiple queries. Select the event you want, then the matching tower.
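
Something like the following, as a sketch only. It assumes `session` is a connected session from the DataStax Python driver (where `execute` takes a query with `%s` placeholders and the result has `.one()`), that the tables are named `events` and `tower`, and that the matching tower is the one whose `tid` equals the event's `eid`, as in the question. The function name is hypothetical.

```python
def fetch_event_with_tower(session, eid):
    """Look up one event by primary key, then its matching tower."""
    # First query: a single-partition read on the events table.
    event = session.execute(
        "SELECT * FROM events WHERE eid = %s", (eid,)
    ).one()
    if event is None:
        return None

    # Second query: use the key taken from the event row to read the tower.
    # Assumption: tower.tid carries the same value as events.eid.
    tower = session.execute(
        "SELECT * FROM tower WHERE tid = %s", (event.eid,)
    ).one()
    return event, tower
```

Both reads hit a primary key, so each one is cheap; the price is two round trips per lookup instead of one.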

Lyuben Todorov answered Oct 20 '22 12:10


There are a couple of ways to join tables in Cassandra and query them, but you will still have to rethink the data model.

  1. Use Apache Spark’s SparkSQL™ with Cassandra (either open source or in DataStax Enterprise – DSE).
  2. Use DataStax provided ODBC connectors with Cassandra and DSE.
Mayank Raghav answered Oct 20 '22 13:10