I've written a stored procedure to generate random SMS records/events.
When inserting 1.2 million rows, the query takes hundreds of minutes
exec insert_random_sms 1200000
I've coded the stored procedure in a 'procedural' way. But, from what I see, SQL is not very efficient in this respect.
create proc insert_random_sms
@number_of_records int
as
begin
declare @cnt int = 0; -- loop counter
declare @phone_id int;
declare @dest_id int;
while (@cnt < @number_of_records)
begin
declare @charge int = rand() * 100; -- will generate a random charge value between 0 and 100.
declare @tarrif_plan int = round(rand() * 5, 0);
select top 1 @phone_id = phone_no
from tbl_phone_agenda
order by newid();
select top 1 @dest_id = phone_no
from tbl_phone_agenda
order by newid();
insert into tbl_sms (phone_id, dest_id, charge, tarrif_plan)
values (@phone_id, @dest_id, @charge,
convert(nvarchar(50), @tarrif_plan));
set @cnt += 1;
end
end
go
What is the way to optimize this stored procedure?
The method I like to use for generating x number of records is the stacked CTE method (having read this article by Aaron Bertrand who credits Itzik Ben-Gan for the stacked CTE approach):
WITH N1 (N) AS
( SELECT 1
FROM (VALUES
(1), (1), (1), (1), (1),
(1), (1), (1), (1), (1)
) n (Number)
),
N2 (N) AS (SELECT 1 FROM N1 AS N1 CROSS JOIN N1 AS N2),
N3 (N) AS (SELECT 1 FROM N2 AS N1 CROSS JOIN N2 AS N2),
N4 (N) AS (SELECT 1 FROM N3 AS N1 CROSS JOIN N3 AS N2)
SELECT COUNT(*)
FROM N4
This simply starts of with 10 rows, and keeps cross joining, until in the case of the above there are 100,000,000 rows. This would be how I would generate your rows,
When you are using a set based approach you can no longer use RAND()
on its own because it is a run time constant, in order to get a new evaluation for each row you need to combine RAND()
with NEWID()
which is unique per row, so the following will generate a random number between 0 and 100 that is different for each row:
SELECT CAST(ROUND(RAND(CHECKSUM(NEWID())) * 100, 0) AS INT)
The next thing I would do is put all your phonenumbers into a temp table so that they have a sequential ID (this will be used to allocate randomly):
CREATE TABLE #Phone
(
ID INT IDENTITY NOT NULL PRIMARY KEY,
PhoneNo VARCHAR(50) NOT NULL
);
INSERT #Phone (PhoneNo)
SELECT PhoneNo
FROM tbl_phone_agenda;
So your final query will be
CREATE PROC insert_random_sms @number_of_records IN
AS
BEGIN
CREATE TABLE #Phone
(
ID INT IDENTITY NOT NULL PRIMARY KEY,
PhoneNo VARCHAR(50) NOT NULL
);
INSERT #Phone (PhoneNo)
SELECT PhoneNo
FROM tbl_phone_agenda;
-- NEEDED SO WE KNOW WHAT NUMBER TO GENERATE A RANDOM
-- NUMBER IN THE RIGHT RANGE LATER
DECLARE @PhoneCount INT = (SELECT COUNT(*) FROM #Phone);
WITH N1 (N) AS
( SELECT 1
FROM (VALUES
(1), (1), (1), (1), (1),
(1), (1), (1), (1), (1)
) n (Number)
),
N2 (N) AS (SELECT 1 FROM N1 AS N1 CROSS JOIN N1 AS N2),
N3 (N) AS (SELECT 1 FROM N2 AS N1 CROSS JOIN N2 AS N2),
N4 (N) AS (SELECT 1 FROM N3 AS N1 CROSS JOIN N3 AS N2)
INSERT tbl_sms (phone_id, dest_id, charge, tarrif_plan)
SELECT TOP (@number_of_records)
p.PhoneNo,
d.PhoneNo,
Charge = CAST(ROUND(RAND(CHECKSUM(NEWID())) * 100, 0) AS INT),
tarrif_plan = CAST(ROUND(RAND(CHECKSUM(NEWID())) * 5, 0) AS INT)
FROM N4
INNER JOIN #Phone p
ON p.ID = CAST(CEILING(RAND(CHECKSUM(NEWID())) * @PhoneCount) AS INT)
INNER JOIN #Phone d
ON d.ID = CAST(CEILING(RAND(CHECKSUM(NEWID())) * @PhoneCount) AS INT)
END
In my tests this ran in about 20-30 seconds to generate 1.2m records, looking up against 100,000 phone numbers.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With