
Select a random sample from SQL Server quickly

I have a huge table of more than 10 million rows. I need to efficiently grab a random sample of 5000 rows from it. I have some constraints that reduce the total number of rows I am sampling from to about 9 million.

I tried using ORDER BY NEWID(), but that query takes too long because it has to scan and sort every row in the table.

Is there a faster way to do this?

Byron Whitlock asked Mar 16 '09 20:03


2 Answers

If pseudo-random sampling is acceptable and you're on SQL Server 2005/2008, take a look at TABLESAMPLE. For instance, here is an example from SQL Server 2008 / AdventureWorks 2008 that samples by row count:

USE AdventureWorks2008;
GO

-- TABLESAMPLE is applied to the table first; the WHERE filter then runs on the sampled rows.
SELECT FirstName, LastName
FROM Person.Person
TABLESAMPLE (100 ROWS)
WHERE EmailPromotion = 2;

The catch is that TABLESAMPLE isn't truly random at the row level: it picks data pages at random and returns every row on the pages it picks, so rows stored together tend to be sampled together. The requested row count is also only approximate, so you may not get back exactly 5000 rows unless you sample somewhat more than you need and limit the result with TOP as well. If you're on SQL Server 2000, you'll have to either generate a temporary table of keys that match the primary key or fall back on a method based on NEWID().
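
The TOP suggestion can be sketched roughly as follows for the 5000-row case in the question. This is only a sketch under assumptions: dbo.BigTable, Id, and Status are hypothetical names standing in for the asker's table and constraints, and the 1 percent figure is a guess that would need tuning.

```sql
-- Hypothetical names: dbo.BigTable, Id, Status are placeholders, not from the question.
-- Step 1: TABLESAMPLE cheaply narrows ~10 million rows to roughly 1 percent of the pages.
-- Step 2: the WHERE clause applies the constraints to that small sample.
-- Step 3: ORDER BY NEWID() shuffles only the surviving rows, and TOP keeps exactly 5000.
SELECT TOP (5000) Id
FROM dbo.BigTable TABLESAMPLE (1 PERCENT)
WHERE Status = 1          -- stand-in for the real constraints
ORDER BY NEWID();
```

The sample percentage has to be large enough that at least 5000 rows survive the filter; otherwise the query returns fewer rows than requested. Because the sort by NEWID() now runs over the sampled rows only, it should cost far less than sorting the whole table, though the result still inherits the page-level clustering of TABLESAMPLE.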

K. Brian Kelley answered Oct 18 '22 14:10


Have you looked into using the TABLESAMPLE clause?

For example:

select *
from HumanResources.Department tablesample (5 percent)
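
If the same sample needs to come back on repeated runs, TABLESAMPLE also accepts a REPEATABLE seed. A minimal sketch against the same table; the seed value 42 is arbitrary:

```sql
-- Same query with a fixed seed; 42 is an arbitrary illustrative value.
-- The same rows are returned on each run only while the table's data is unchanged.
SELECT *
FROM HumanResources.Department
TABLESAMPLE (5 PERCENT) REPEATABLE (42);
```

Since the sampling works on whole pages, a table small enough to fit on a single page will tend to return either all of its rows or none, so this is better tried on a large table.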
John Sansom answered Oct 18 '22 13:10