Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do I generate random sample data in my Oracle database?

Does anyone know of a tool that can inspect a specified schema and generate random data based on the tables and columns of that schema?

like image 597
Justin Kredible Avatar asked May 31 '11 14:05

Justin Kredible


People also ask

How do I randomly SELECT rows in Oracle?

SELECT columns FROM table ORDER BY RAND() LIMIT n; The RAND() function generates a random number between 0 and 1 for each row in the table and the ORDER BY clause will order the rows by their random number.

What is sample in Oracle?

SAMPLE SQL Clause Oracle Database provides the SAMPLE clause that can be used with a SELECT statement over a table. In the following query, we're randomly selecting records from the CUSTOMERS table with a 20% probability.

How do I import a sample into Oracle SQL Developer?

Once the file is downloaded, simply right click on the Tables node in SQL Developer's tree, and select 'Import Data…' Then navigate to the file you've downloaded (you'll need to extract it from the Zip after your torrent has finished.) You should be here now.

Which is fastest way to load data into Oracle?

Option 1: Perform a SQL-Loader with direct=TRUE loading the flat file into a temp table first. Then, create an index. After that load the data using Informatica PUSH Transformation into Fact table. Option 2: Using Informatica directly reading the flat file and load the data into Oracle table.


3 Answers

Another alternative is Swingbench Data Generator

It is useful to use the SAMPLE clause (for example generating order lines for a random combination of orders and products)

like image 188
Gary Myers Avatar answered Oct 03 '22 00:10

Gary Myers


This is an interesting question. It is easy enough to generate random values - a simple loop round the data dictionary with calls to DBMS_RANDOM would do the trick.

Except for two things.

One is, as @FrustratedWithForms points out, there is the complication of foreign key constraints. Let's tip lookup values (reference data) into the mix too.

The second is, random isn't very realistic. The main driver for using random data is a need for large volumes of data, probably for performance testing. But real datasets aren't random, they contain skews and clumps, variable string lengths, and of course patterns (especially where dates are concerned).

So, rather than trying to generate random data I suggest you try to get a real dataset. Ideally your user/customer will be able to provide one, preferably anonymized. Otherwise try taking something which is already in the public domain, and massage it to fit your specific requirements. The Info Chimps are the top bananas when it comes to these matters. Check them out.

like image 33
APC Avatar answered Oct 03 '22 01:10

APC


Allround Automation's PL/SQL Developer has a data generator tool. But be warned: it's a bit flaky - it seems to work fine on a single-table basis but gets tripped up when there are dependencies between tables.

I admit that eventually I just started writing my own SQL scripts to generate data. Turned out to be much more stable.

like image 3
FrustratedWithFormsDesigner Avatar answered Oct 03 '22 01:10

FrustratedWithFormsDesigner