Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to generate 1000000 rows with random data?

Tags:

sql

mysql

Background

Im working on some kind of data logger.

I want to try how much storage space I need for 1000000 (1M) rows and how Raspberry Pi deals with such big table. I want to run some queries with grouping, calculating averages and other performance experiments.

My table looks like this:

CREATE TABLE `data` 
(
  `id`         bigint(20) NOT NULL      AUTO_INCREMENT,
  `datetime`   timestamp  NULL          DEFAULT CURRENT_TIMESTAMP,
  `channel`    int(11)                  DEFAULT NULL,
  `value`      float                    DEFAULT NULL,

  PRIMARY KEY (`id`)
)

Question

How can I fill it with 1000000 million rows in MySQL?

Requirements:

  • data.datetime field: random timestamps but only from one year
  • data.value field: random float numbers from given range (0.00-100.00 for example)
  • data.id is autoincrement, no need to care about that
  • data.channel is always 1, no need to care about that too

I know SQL a bit, but I'm not good in PL/SQL, loops in SQL etc.

EDIT:

To make it clear - im using MySQL 5.5.

Mentioned PL/SQL was my mistake, I thought PL/ stands for procedural features in SQL in general, not just Oracle.

like image 562
Kamil Avatar asked Aug 02 '14 19:08

Kamil


People also ask

How can I add 1 million records in SQL Server?

Firstly, I wrote a prc to insert row by row. Then I generate some random data for insert with the NVARCHAR(MAX) column to be a string of 1000 characters. Then use a loop to call the prc to insert the rows.

How do I select a random row in SQL query?

To get a single row randomly, we can use the LIMIT Clause and set to only one row. ORDER BY clause in the query is used to order the row(s) randomly. It is exactly the same as MYSQL. Just replace RAND( ) with RANDOM( ).


1 Answers

Try it with a stored procedure (replace 1000 with desired amount of rows, and 2014 with test year, also see generate random timestamps in mysql)

CREATE TABLE `data` 
(
  `id`         bigint(20) NOT NULL      AUTO_INCREMENT,
  `datetime`   timestamp  NULL          DEFAULT CURRENT_TIMESTAMP,
  `channel`    int(11)                  DEFAULT NULL,
  `value`      float                    DEFAULT NULL,

  PRIMARY KEY (`id`)
);


DELIMITER $$
CREATE PROCEDURE generate_data()
BEGIN
  DECLARE i INT DEFAULT 0;
  WHILE i < 1000 DO
    INSERT INTO `data` (`datetime`,`value`,`channel`) VALUES (
      FROM_UNIXTIME(UNIX_TIMESTAMP('2014-01-01 01:00:00')+FLOOR(RAND()*31536000)),
      ROUND(RAND()*100,2),
      1
    );
    SET i = i + 1;
  END WHILE;
END$$
DELIMITER ;

CALL generate_data();

Modify to your needs. To delete the procedure:

DROP PROCEDURE generate_data;

Maybe this can give you a start!

like image 82
Jonny 5 Avatar answered Sep 27 '22 17:09

Jonny 5