I have tables like Job, JobInfo. And i want to perform queries like below -
"SELECT J.JobID FROM Job J, JobInfo B WHERE B.JobID = J.JobID AND BatchID=5850 AND B.Status=0 AND J.JobType<>2"
How shall i go about writing my redis data types so that i can map such queries in redis?
IF i try to map the rows of table job in a redis hash for e.g. (hash j jobid 1 status 2) & similarly the rows of table JobInfo in again a redis hash as (hash jinfo jobid 1 jobtype 3.)
So my tables can be a set of hashes. Job table can be set with entries JobSet:jobid & JobInfo table can be set with entries like JobInfoSet:jobid
But i am confused in when i will do a SINTER on JobSet & JobInfoSet. how am i going to query that hash to get keys? As in the hash content of set jobSet is not identical to hash content of table JobInfoSet (they may have different key value pair.
So what exactly am i going to get as an output of SINTER? And how am i going to query that output as key-value pair?
So the tables will be a collection of redis hashes
Redis supports Pub/Sub with pattern matching and a variety of data structures such as lists, sorted sets, and hashes.
Redis is an open-source, in-memory key-value data store. A NoSQL database, Redis doesn't use structured query language, otherwise known as SQL. Redis instead comes with its own set of commands for managing and accessing data.
While short keys will obviously consume a bit less memory, your job is to find the right balance. Try to stick with a schema. For instance "object-type:id" is a good idea, as in "user:1000". Dots or dashes are often used for multi-word fields, as in "comment:1234:reply.to" or "comment:1234:reply-to".
Redis is not designed to structure the data in SQL way. Beside a in-memory key value store, it supports five types of data structures: Strings, Hashes, Lists, Sets and Sorted Sets. At high level this is a sufficient hint that Redis is designed to solve performance problems that arises due to high computation in relational data models. However, if you want to execute sql query in a in-memory structure, you may want to look at memsql.
Let's break down the SQL statement into different components and I'll try to show how redis can accomplish various parts.
Select J.JobID, J.JobName from Job J;
We translate each row in "Job" into a hash in redis using the SQL primary index as the redis natural index in redis. For example: SQL
==JobId==|==Name==
123 Fred
Redis HSET Job:123 Name Fred which can be conceptualized as
Job-123 => {"Name":"Fred"}
Thus we can store columns as hash fields in redis
Let's say we do the same thing for JobInfo. Each JobInfo object has its own ID
JobInfo-876 => {"meta1": "some value", "meta2": "bla", "JobID": "123"}
In sql normally we would make a secondary index on JobInfo.JobID but in NoSql land we maintain our own secondary indexes.
Sorted Sets are great for this. Thus when we want to fetch JobInfo objects by some field, JobId in this case we can add it to a sorted set like this ZADD JobInfo-JobID 123 JobInfo-876
This results in a set with 1 element in it {JobInfo-876} which has a score of 123. I realize that forcing all JobIDs into the float range for the score is a bad idea, but work with me here.
Now when we want to find all JobInfo objects for a given JobID we just do a log(N) lookup into the index. ZRANGEBYSCORE JobInfo-JobID 123 123 which returns "JobInfo-876"
Now to implement simple joins we simply reuse this JobInfo-JobID index by storing Job keys by their JobIDs. ZADD JobInfo-JobID 123 Job-123
Thus when doing something akin to SELECT J.JobID, J.Name, B.meta1 FROM Job, JobInfo USING (JobID).
This would translate to scanning through the JobInfo-JobID secondary index and reorganizing the Job and JobInfo objects returned. ZRANGEBYSCORE JobInfo-JobID -inf +inf WITHSCORES 5 -> (Job-123, JobInfo-876)
These objects all share the same JobID. CLient side you'd then asynchronously fetch the needed fields. Or you could embed these lookups in a lua script. This lua script could make redis hang for a long time. Normally redis tries to be fair with clients and prefers you to have short batched queries instead of one long query.
Now we come to a big problem, what if we want to combine secondary indexes. Let's say we have a secondary index on JobInfo.Status, and another on Job.JobType. If we make a set of all jobs with the right JobType and use that as a filter on the JobInfo-JobID shared secondary index then we not only eliminate the bad Job elements but also every JobInfo element. We could, I guess fetch the scores(JobID) on the intersection and refetch all JobInfo objects with those scores, but we lose some of the filtering we did.
It is at this point where redis breaks down.
Here is an article on secondary indexes from the creator of redis himself: http://redis.io/topics/indexes He touches multi-dimensional indexes for filtering purposes. As you can see he designed the data structures in a very versatile way. One that is the most appealing is the fact that sorted set elements with the same score are stored in lexicographical order. Thus you can easily have all elements have a score of 0 and piggyback on Redis's speed and use it more like cockroachDB, which relies on a global order to implement many SQL features.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With