I'm currently struggling to query a large chunk of data that is stored in a partitioned table (one partition per date).
The data looks like this:
date, product_id, orders
2019-11-01, 1, 100
2019-11-01, 2, 200
2019-11-02, 1, 300
I have hundreds of date-partitions and millions of rows per date.
Now, if I want to query, for instance, the total orders for products 1 and 2 over a period of two weeks, grouped by date (to show in a graph with one point per date), the database has to go through two weeks of partitions and fetch the data from each of them.
That process can take a long time when the number of products is large or the requested time frame is long.
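For illustration, the kind of query I mean looks roughly like this (the table name daily_product_orders is just a placeholder for my actual table, and the two-week range is an example):

-- Total orders per date for products 1 and 2 over a two-week window
SELECT date,
       SUM(orders) AS total_orders
FROM   daily_product_orders
WHERE  product_id IN (1, 2)
  AND  date BETWEEN '2019-11-01' AND '2019-11-14'
GROUP  BY date
ORDER  BY date;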
I have read that AWS Redshift is suitable for this kind of task. I'm considering moving my partitioned tables (aggregated analytics per date) to that technology, but I wonder whether that's really what I should do to make those queries run much faster.
Thanks!
For your use case Redshift is a good choice. To get the best performance out of Redshift, it is important to set a proper distribution key and sort key. In your case the "date" column should be the distribution key and "product_id" should be the sort key. One more important note: do not apply compression encoding to the "date" and "product_id" columns; you should get better performance that way.
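A minimal sketch of what that table definition could look like in Redshift (the table name daily_product_orders and the column types are assumptions based on your sample data):

CREATE TABLE daily_product_orders (
    date        DATE   ENCODE RAW,  -- distribution key, left unencoded
    product_id  BIGINT ENCODE RAW,  -- sort key, left unencoded
    orders      BIGINT
)
DISTSTYLE KEY
DISTKEY (date)
SORTKEY (product_id);

With this layout, rows with the same date are stored on the same slice, and within a slice rows are sorted by product_id, so a query that filters on product_id and groups by date can skip blocks using the zone maps instead of scanning everything.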