Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the best way to create and populate a numbers table?

I've seen many different ways to create and populate a numbers table. However, what is the best way to create and populate one? With "best" being defined from most to least important:

  • Table created with optimal indexing
  • rows generated fastest
  • simple code used to create and populate

If you don't know what a numbers table is, look here: Why should I consider using an auxiliary numbers table?

like image 370
KM. Avatar asked Sep 08 '09 13:09

KM.


People also ask

What are two methods that can be used to populate tables with data?

After a database has been created, there are two ways of populating the tables – either from existing data or through the use of the user applications developed for the database.

Which helps us to create and populate a new table with the data of an existing table?

Description. You can also use the SQL CREATE TABLE AS statement to create a table from an existing table by copying the existing table's columns. It is important to note that when creating a table in this way, the new table will be populated with the records from the existing table (based on the SELECT Statement).


2 Answers

here are some code examples taken from the web and from answers to this question.

For Each Method, I have modified the original code so each use the same table and column: NumbersTest and Number, with 10,000 rows or as close to that as possible. Also, I have provided links to the place of origin.

METHOD 1 here is a very slow looping method from here
avg 13.01 seconds
ran 3 times removed highest, here are times in seconds: 12.42, 13.60

DROP TABLE NumbersTest DECLARE @RunDate datetime SET @RunDate=GETDATE() CREATE TABLE NumbersTest(Number INT IDENTITY(1,1))  SET NOCOUNT ON WHILE COALESCE(SCOPE_IDENTITY(), 0) < 100000 BEGIN      INSERT dbo.NumbersTest DEFAULT VALUES  END SET NOCOUNT OFF -- Add a primary key/clustered index to the numbers table ALTER TABLE NumbersTest ADD CONSTRAINT PK_NumbersTest PRIMARY KEY CLUSTERED (Number) PRINT CONVERT(varchar(20),datediff(ms,@RunDate,GETDATE())/1000.0)+' seconds' SELECT COUNT(*) FROM NumbersTest 

METHOD 2 here is a much faster looping one from here
avg 1.1658 seconds
ran 11 times removed highest, here are times in seconds: 1.117, 1.140, 1.203, 1.170, 1.173, 1.156, 1.203, 1.153, 1.173, 1.170

DROP TABLE NumbersTest DECLARE @RunDate datetime SET @RunDate=GETDATE() CREATE TABLE NumbersTest (Number INT NOT NULL); DECLARE @i INT; SELECT @i = 1; SET NOCOUNT ON WHILE @i <= 10000 BEGIN     INSERT INTO dbo.NumbersTest(Number) VALUES (@i);     SELECT @i = @i + 1; END; SET NOCOUNT OFF ALTER TABLE NumbersTest ADD CONSTRAINT PK_NumbersTest PRIMARY KEY CLUSTERED (Number) PRINT CONVERT(varchar(20),datediff(ms,@RunDate,GETDATE())/1000.0)+' seconds' SELECT COUNT(*) FROM NumbersTest 

METHOD 3 Here is a single INSERT based on code from here
avg 488.6 milliseconds
ran 11 times removed highest, here are times in milliseconds: 686, 673, 623, 686,343,343,376,360,343,453

DROP TABLE NumbersTest DECLARE @RunDate datetime SET @RunDate=GETDATE() CREATE TABLE NumbersTest (Number  int  not null)   ;WITH Nums(Number) AS (SELECT 1 AS Number  UNION ALL  SELECT Number+1 FROM Nums where Number<10000 ) insert into NumbersTest(Number)     select Number from Nums option(maxrecursion 10000) ALTER TABLE NumbersTest ADD CONSTRAINT PK_NumbersTest PRIMARY KEY CLUSTERED (Number) PRINT CONVERT(varchar(20),datediff(ms,@RunDate,GETDATE()))+' milliseconds' SELECT COUNT(*) FROM NumbersTest 

METHOD 4 here is a "semi-looping" method from here avg 348.3 milliseconds (it was hard to get good timing because of the "GO" in the middle of the code, any suggestions would be appreciated)
ran 11 times removed highest, here are times in milliseconds: 356, 360, 283, 346, 360, 376, 326, 373, 330, 373

DROP TABLE NumbersTest DROP TABLE #RunDate CREATE TABLE #RunDate (RunDate datetime) INSERT INTO #RunDate VALUES(GETDATE()) CREATE TABLE NumbersTest (Number int NOT NULL); INSERT NumbersTest values (1); GO --required INSERT NumbersTest SELECT Number + (SELECT COUNT(*) FROM NumbersTest) FROM NumbersTest GO 14 --will create 16384 total rows ALTER TABLE NumbersTest ADD CONSTRAINT PK_NumbersTest PRIMARY KEY CLUSTERED (Number) SELECT CONVERT(varchar(20),datediff(ms,RunDate,GETDATE()))+' milliseconds' FROM #RunDate SELECT COUNT(*) FROM NumbersTest 

METHOD 5 here is a single INSERT from Philip Kelley's answer
avg 92.7 milliseconds
ran 11 times removed highest, here are times in milliseconds: 80, 96, 96, 93, 110, 110, 80, 76, 93, 93

DROP TABLE NumbersTest DECLARE @RunDate datetime SET @RunDate=GETDATE() CREATE TABLE NumbersTest (Number  int  not null)   ;WITH   Pass0 as (select 1 as C union all select 1), --2 rows   Pass1 as (select 1 as C from Pass0 as A, Pass0 as B),--4 rows   Pass2 as (select 1 as C from Pass1 as A, Pass1 as B),--16 rows   Pass3 as (select 1 as C from Pass2 as A, Pass2 as B),--256 rows   Pass4 as (select 1 as C from Pass3 as A, Pass3 as B),--65536 rows   --I removed Pass5, since I'm only populating the Numbers table to 10,000   Tally as (select row_number() over(order by C) as Number from Pass4) INSERT NumbersTest         (Number)     SELECT Number         FROM Tally         WHERE Number <= 10000 ALTER TABLE NumbersTest ADD CONSTRAINT PK_NumbersTest PRIMARY KEY CLUSTERED (Number) PRINT CONVERT(varchar(20),datediff(ms,@RunDate,GETDATE()))+' milliseconds' SELECT COUNT(*) FROM NumbersTest 

METHOD 6 here is a single INSERT from Mladen Prajdic answer
avg 82.3 milliseconds
ran 11 times removed highest, here are times in milliseconds: 80, 80, 93, 76, 93, 63, 93, 76, 93, 76

DROP TABLE NumbersTest DECLARE @RunDate datetime SET @RunDate=GETDATE() CREATE TABLE NumbersTest (Number  int  not null)   INSERT INTO NumbersTest(Number) SELECT TOP 10000 row_number() over(order by t1.number) as N FROM master..spt_values t1      CROSS JOIN master..spt_values t2 ALTER TABLE NumbersTest ADD CONSTRAINT PK_NumbersTest PRIMARY KEY CLUSTERED (Number); PRINT CONVERT(varchar(20),datediff(ms,@RunDate,GETDATE()))+' milliseconds' SELECT COUNT(*) FROM NumbersTest 

METHOD 7 here is a single INSERT based on the code from here
avg 56.3 milliseconds
ran 11 times removed highest, here are times in milliseconds: 63, 50, 63, 46, 60, 63, 63, 46, 63, 46

DROP TABLE NumbersTest DECLARE @RunDate datetime SET @RunDate=GETDATE() SELECT TOP 10000 IDENTITY(int,1,1) AS Number     INTO NumbersTest     FROM sys.objects s1       --use sys.columns if you don't get enough rows returned to generate all the numbers you need     CROSS JOIN sys.objects s2 --use sys.columns if you don't get enough rows returned to generate all the numbers you need ALTER TABLE NumbersTest ADD CONSTRAINT PK_NumbersTest PRIMARY KEY CLUSTERED (Number) PRINT CONVERT(varchar(20),datediff(ms,@RunDate,GETDATE()))+' milliseconds' SELECT COUNT(*) FROM NumbersTest 

After looking at all these methods, I really like Method 7, which was the fastest and the code is fairly simple too.

like image 173
KM. Avatar answered Oct 19 '22 12:10

KM.


i use this which is fast as hell:

insert into Numbers(N) select top 1000000 row_number() over(order by t1.number) as N from   master..spt_values t1         cross join master..spt_values t2 
like image 28
Mladen Prajdic Avatar answered Oct 19 '22 12:10

Mladen Prajdic