Can Snowflake handle thousands of columns?
Is there a way I can generate test data to test Snowflake's performance when handling, let's say, 2000 columns?
With this script you can create a table with 2000 (or any number of) columns, giving each one a default random value:
CREATE OR REPLACE PROCEDURE create_wide_table()
RETURNS VARCHAR
LANGUAGE javascript
AS
$$
// Build the DDL for a wide table: an id column plus 2000 int columns,
// each with a random default between 1 and 10000.
var ss = 'create or replace table wide2000 (id int';
for (const x of Array(2000).keys()) {
    ss += ', a' + x + ' int default uniform(1, 10000, random())';
}
ss += ');';
// return ss;  // uncomment to inspect the generated DDL without running it
snowflake.execute( { sqlText: ss } );
return 'Done.';
$$;
That will generate and execute a CREATE TABLE statement that looks like:
create or replace table wide2000
(id int
, a0 int default uniform(1, 10000, random())
, a1 int default uniform(1, 10000, random())
, ...
);
You can modify the script so the generated columns better resemble the shape and sparsity of your own data.
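For example, a hypothetical variant of the loop could mix in some string columns (the 1-in-10 ratio and column names are illustrative, and it's worth verifying that RANDSTR is accepted in a DEFAULT clause on your account):
for (const x of Array(2000).keys()) {
    if (x % 10 === 0) {
        // every tenth column: a random 10-character string instead of an int
        ss += ', a' + x + ' varchar default randstr(10, random())';
    } else {
        ss += ', a' + x + ' int default uniform(1, 10000, random())';
    }
}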
What's cool about this table is that every column except id has a default random value, so adding a million rows becomes really easy: you only need to supply the id values.
Create the table first:
call create_wide_table();
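To sanity-check the result before loading data, you can count the columns in the information schema (this assumes the table was created in your current database and schema):
select count(*)
from information_schema.columns
where table_name = 'WIDE2000';  -- expect 2001: id plus a0 through a1999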
Generate a million rows:
insert into wide2000 (id)
select seq8()
from table(generator(rowcount => 1000000));
Run your experiments:
select *
from wide2000
limit 10;
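As one example experiment, a full-table aggregation that touches columns from both ends of the schema gives a rough feel for how column count affects scan time (column names follow the generated schema above):
select avg(a0), avg(a999), avg(a1999)
from wide2000;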
For more complex data generation needs, check https://github.com/RobertFehrmann/fehrminator.