Is there a sequence number generation function in Redshift? Or a function that takes a combination of values and returns a numeric hash key?
In Redshift, when we need a sequence of dates between two given days, we can create it with the generate_series function and use it as a table in a FROM or JOIN clause. This is useful when we need to display a table of dates and values but don't have a value for every day. Note that generate_series runs only on the leader node, so a query that joins it to regular Redshift tables may fail.
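As a rough sketch of the date-sequence idea, the query below adds integer offsets from generate_series to a start date. The start date and the 31-day range are arbitrary values chosen for illustration, and the query is assumed to run on its own, without touching user tables.

```sql
-- One row per day for 31 days, starting at an assumed start date.
-- generate_series is called with integer offsets (0 through 30),
-- and each offset is added to the start date.
select (date '2024-01-01' + i) as day
from generate_series(0, 30) i
order by day;
```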
Redshift does not support sequences.
The Redshift row_number() function assigns a sequential number to each row within its partition, in the order given by the ORDER BY clause of the window specification.
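A minimal sketch of row_number() with both a partition and an ordering. The sales table and its pricepaid column appear elsewhere on this page; the salesid and sellerid columns are assumed names used only for illustration.

```sql
-- Number each seller's sales from the highest price to the lowest.
-- salesid and sellerid are assumed column names.
select
    salesid,
    sellerid,
    pricepaid,
    row_number() over (partition by sellerid order by pricepaid desc) as rn
from sales;
```

The numbering restarts at 1 for every sellerid, so filtering on rn = 1 would keep one top-priced sale per seller.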
The RANDOM function can be used in an ORDER BY clause. The following query picks 10 rows from sales at random, with pricepaid influencing the ordering:
select * from sales order by log(1 - random()) / pricepaid limit 10;
The SET command can set a SEED value so that RANDOM generates a predictable sequence of numbers.
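The seeding behaviour can be sketched as follows. The seed value .25 is an arbitrary choice; any value the SET command accepts should behave the same way.

```sql
-- Set a seed, then draw a pseudo-random integer between 0 and 100.
set seed to .25;
select cast(random() * 100 as int);

-- Setting the same seed again restarts the sequence, so this query
-- is expected to return the same number as the one above.
set seed to .25;
select cast(random() * 100 as int);
```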
Here is another way to generate 1 million numbers:
-- Digits 0 through 9.
with seq_0_9 as (
    select 0 as num
    union all select 1 as num
    union all select 2 as num
    union all select 3 as num
    union all select 4 as num
    union all select 5 as num
    union all select 6 as num
    union all select 7 as num
    union all select 8 as num
    union all select 9 as num
-- Numbers 0 through 999, built from three digits.
), seq_0_999 as (
    select a.num + b.num * 10 + c.num * 100 as num
    from seq_0_9 a, seq_0_9 b, seq_0_9 c
)
-- Numbers 0 through 999999, built from two three-digit groups.
select a.num + b.num * 1000 as num
from seq_0_999 a, seq_0_999 b
order by num;
Redshift has no concept of sequences (as seen in Oracle) at the moment.
You have a few options:
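One option that Redshift does provide is an IDENTITY column, which assigns a generated value to each inserted row. The sketch below assumes a hypothetical table named event_log. IDENTITY values are unique but are not guaranteed to be consecutive, so this may not suit cases that need gap-free numbering.

```sql
-- event_log is a hypothetical table used only for illustration.
create table event_log (
    id bigint identity(1, 1),  -- seed 1, step 1
    message varchar(255)
);

-- The id column is omitted from the insert; Redshift fills it in.
insert into event_log (message) values ('first'), ('second');

select id, message from event_log order by id;
```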
I am new to Redshift, and I found this page while looking for an ordinary sequence, which Redshift does not support. Here is the solution I found, with a complete example using ROW_NUMBER.
I have two schemas, sta and dim. The sta schema holds staging tables, and the dim schema holds the dimension tables I want to populate with ids. My source data has the fields trk_key and name and contains, for instance, some publishers.
CREATE TABLE sta.publisher (
    trk_key VARCHAR(20),
    name VARCHAR(225)
);
CREATE TABLE dim.publisher (
    id SMALLINT,
    trk_key VARCHAR(20),
    name VARCHAR(255),
    PRIMARY KEY (id)
);
First I truncate the sta.publisher table and load a CSV file into it. Then I run the following query:
-- This query is idempotent:
-- it will insert a publisher found in sta.publisher table only if
-- it is not already in dim.publisher table.
INSERT INTO dim.publisher
SELECT
    -- Generate id using max id found in dim.publisher.
    -- Start with id=1 if dim.publisher is empty.
    (
        SELECT NVL(MAX(id), 0)
        FROM dim.publisher
    ) + ROW_NUMBER() OVER() AS id,
    trk_key,
    name
FROM sta.publisher
-- Only insert record if trk_key is not found in dim.publisher table.
WHERE trk_key NOT IN (
    SELECT trk_key
    FROM dim.publisher
);
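One caveat with NOT IN: trk_key is nullable in dim.publisher, and if the subquery ever returns a NULL, the NOT IN condition matches no rows at all, so nothing gets inserted. A possible variant, sketched here with NOT EXISTS and the table aliases s and d (both introduced for illustration), avoids that:

```sql
-- Same insert as above, but NOT EXISTS is not affected by NULL
-- trk_key values in dim.publisher.
INSERT INTO dim.publisher
SELECT
    (
        SELECT NVL(MAX(id), 0)
        FROM dim.publisher
    ) + ROW_NUMBER() OVER() AS id,
    s.trk_key,
    s.name
FROM sta.publisher s
WHERE NOT EXISTS (
    SELECT 1
    FROM dim.publisher d
    WHERE d.trk_key = s.trk_key
);
```

Rows in sta.publisher whose own trk_key is NULL never match anything in this variant, so they would be inserted on every run; filter them out beforehand if that is not wanted.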