Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using subquery to pull random value reveals same value every time

I want to add a random value to a column in a query using T-SQL for Microsoft SQL Server 2008 R2. To do this, I use a subquery and pull a random record. Every time I run the query, it pulls a new random record, but the value for every row is identical.

How can I get a new random value for each row?

Sample query:

SELECT column1,
    (SELECT TOP 1 column1 as c2 FROM Table2 ORDER BY NewID())
FROM Table1

Whereas table2 looks like

column1
value1
value2
value3

It always returns the same value for column2, instead of a new random record for each table.

First run:

column1, c2
1, value1
2, value1
3, value1

Second run:

column1, c2
1, value2
2, value2
3, value2

How can I get c2 (the subquery) to pull a new random record for each row?

Ideal scenario:

column1, c2
1, value2
2, value1
3, value2
like image 629
Lucas Avatar asked Apr 04 '11 15:04

Lucas


People also ask

How do you handle subquery returning more than one value?

You may use the IN, ANY, or ALL operator in outer query to handle a subquery that returns multiple rows. Contents: Using IN operator with a Multiple Row Subquery. Using NOT IN operator with a Multiple Row Subquery.

How many times does a subquery execute?

A correlated SQL subquery is just a subquery that is executed many times—once for each record (row) returned by the outer (main) query. In other words, the outer query returns a table with multiple rows; the inner query then runs once for each of those rows.

Why subquery is better than join?

The optimizer is more mature for MYSQL for joins than for subqueries, so in many cases a statement that uses a subquery can be executed more efficiently if you rewrite it as join. We cannot modify a table and select from the same table within a subquery in the same SQL statement.


2 Answers

Try:

SELECT column1,
    (SELECT TOP 1 column1 as c2 FROM Table2 WHERE Table1.column1 IS NOT NULL ORDER BY NewID())
FROM Table1
like image 192
Thomas Li Avatar answered Sep 19 '22 05:09

Thomas Li


You could try

SELECT T1.column1,
    (SELECT TOP 1 column1 as c2 FROM Table2 ORDER BY NewID(), LEN(T1.column1))
FROM Table1 T1

To force it to re-evaluate the selection each time. This will be very inefficient though. Also quite fragile (without the LEN it didn't work!) Would it meet your requirement to sort the T2 values randomly once (or possibly more if Table1 is larger than Table2) then join by row_number?

The main difference I can see with the methods is that your current method will allow the same random row to be selected multiple times my proposed method wouldn't.

like image 32
Martin Smith Avatar answered Sep 22 '22 05:09

Martin Smith