Amazon Redshift: Best way to compare dates

I have a sample table in Redshift and want to generate a report with month-wise data. So far I have found the following three ways to filter a single month:

  1. trunc(created_at_date) between '2014-06-01' and '2014-06-30'

  2. created_at_date like '2014-06%'

  3. created_at_date >= '2014-06-01 00:00:00' and created_at_date <= '2014-06-30 23:59:59'

Which of these is the best (most efficient) way to do this?

asked Jan 27 '15 11:01 by untitled



2 Answers

Not the first one, as it performs an unnecessary truncation on every row (unless your data really does contain unprocessed timestamps that need it).

 1. trunc(created_at_date) between '2014-06-01' and '2014-06-30';

Definitely not this one, for obvious reasons: LIKE is a string pattern match, so the date has to be treated as text rather than compared as a date.

 2. created_at_date like '2014-06%'

Maybe this one:

 3. created_at_date >= '2014-06-01 00:00:00' and created_at_date <= '2014-06-30 23:59:59'
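One caveat with this form: if the column stores fractional seconds, a row stamped after 23:59:59 on the last day (for example 23:59:59.5) falls outside the range. A half-open range avoids that and also removes the need to know the last day of the month. A minimal sketch, assuming a hypothetical table named sample_table whose created_at_date column is a timestamp:

```sql
-- sample_table is a hypothetical name used only for illustration;
-- created_at_date is assumed to be a TIMESTAMP column.
select count(*) as rows_in_june
from sample_table
where created_at_date >= '2014-06-01'   -- inclusive: first instant of June
  and created_at_date <  '2014-07-01';  -- exclusive: first instant of July
```

The comparison is done on the raw column, so no per-row function call is needed.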

However, since the requirement is to generate monthly reports, which I assume is a recurring task run against multiple data sources, I would suggest creating a one-time calendar table.

This table would map each date to a month value. You can then simply join your source data to that table and group by the "month" column.
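As a rough sketch of that idea (the table names calendar and sample_table, the column names, and the handful of rows are all assumptions made up for illustration; a real calendar table would hold one row per date for the whole reporting period):

```sql
-- Hypothetical source table, defined here only so the example is self-contained.
create table sample_table (
    id              integer,
    created_at_date timestamp
);

-- Hypothetical calendar table: one row per date, mapped to the first day of its month.
create table calendar (
    cal_date    date not null,
    month_start date not null
);

-- Only a few rows are shown; in practice this is loaded once for all dates.
insert into calendar (cal_date, month_start) values
    ('2014-06-01', '2014-06-01'),
    ('2014-06-02', '2014-06-01'),
    ('2014-07-01', '2014-07-01');

-- Month-wise report: join on the date and group by the month column.
-- The cast is needed only because created_at_date is assumed to be a timestamp.
select c.month_start,
       count(*) as row_count
from sample_table s
join calendar c
  on c.cal_date = cast(s.created_at_date as date)
group by c.month_start
order by c.month_start;
```

The same calendar table can then be reused by every report, so the date-to-month logic lives in one place.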

P.S. Just realized I replied to a very 'ancient' question :p

answered Oct 22 '22 12:10 by SwapSays


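Redshift also has a DATE_CMP function, which compares two dates and returns 0 if they are identical, 1 if the first is greater, and -1 if the second is greater:

http://docs.aws.amazon.com/redshift/latest/dg/r_DATE_CMP.html

select caldate, '2008-01-04', date_cmp(caldate,'2008-01-04') from date;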

answered Oct 22 '22 11:10 by CodingMatters