Refactoring a tsql view which uses row_number() to return rows with a unique column value

I have a SQL view that I use to retrieve data. Let's say it's a large list of products, which are linked to the customers who have bought them. The view should return only one row per product, no matter how many customers it is linked to. I'm using the row_number function to achieve this. (This example is simplified; the general situation is a query that should return only one row for each unique value of some column X. Which row is returned is not important.)

CREATE VIEW productView AS
SELECT * FROM 
    (SELECT 
        Row_number() OVER(PARTITION BY products.Id ORDER BY products.Id) AS product_numbering,
        customer.Id
        -- various other columns
    FROM products
    LEFT OUTER JOIN customer ON customer.productId = products.Id
    -- various other joins
    ) as temp
WHERE temp.product_numbering = 1

Now let's say that the total number of rows in this view is ~1 million, and running select * from productView takes 10 seconds. Performing a query such as select * from productView where productID = 10 takes the same amount of time. I believe this is because the query gets evaluated as this:

SELECT * FROM 
    (SELECT 
        Row_number() OVER(PARTITION BY products.Id ORDER BY products.Id) AS product_numbering,
        customer.Id
        -- various other columns
    FROM products
    LEFT OUTER JOIN customer ON customer.productId = products.Id
    -- various other joins
    ) as temp
WHERE temp.product_numbering = 1 AND temp.productID = 10

I think this is causing the inner subquery to be evaluated in full each time. Ideally I'd like to use something along the following lines

SELECT 
    Row_number() OVER(PARTITION BY products.Id ORDER BY products.Id) AS product_numbering,
    customer.Id
    -- various other columns
FROM products
    LEFT OUTER JOIN customer ON customer.productId = products.Id
    -- various other joins
WHERE product_numbering = 1

But this doesn't seem to be allowed. Is there any way to do something similar?
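For context on why the query above is rejected: SQL Server only allows windowed functions in the SELECT and ORDER BY clauses, and a column alias defined in the SELECT list is not visible in WHERE. One form that avoids the wrapping subquery is TOP (1) WITH TIES ordered by the row number. The following is a minimal sketch, assuming the products and customer tables from the question; the aliases productId and customerId, and the choice to order each partition by customer.Id, are made up for illustration. It still numbers every joined row, so it is not necessarily any faster than the subquery version.

```sql
-- One row per product without a wrapping subquery.
-- ROW_NUMBER() is allowed in ORDER BY, and WITH TIES keeps every row
-- that ties with the first one, i.e. every row whose row number is 1.
SELECT TOP (1) WITH TIES
    products.Id AS productId,   -- hypothetical alias
    customer.Id AS customerId   -- hypothetical alias
FROM products
LEFT OUTER JOIN customer ON customer.productId = products.Id
ORDER BY ROW_NUMBER() OVER (PARTITION BY products.Id ORDER BY customer.Id);
```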

EDIT -

After much experimentation, I believe the actual problem I am having is how to force a join to return exactly one row. I tried to use outer apply, as suggested below. Here is some sample code:

CREATE TABLE Products (id int not null PRIMARY KEY)
CREATE TABLE Customers (
        id int not null PRIMARY KEY,
        productId int not null,
        value varchar(20) NOT NULL)

declare @count int = 1
while @count <= 150000
begin
        insert into Customers (id, productID, value)
        values (@count,@count/2, 'Value ' + cast(@count/2 as varchar))      
        insert into Products (id) 
        values (@count)
        SET @count = @count + 1
end

CREATE NONCLUSTERED INDEX productId ON Customers (productID ASC)

With the above sample set, the 'get everything' query below

select * from Products
outer apply (select top 1 * 
            from Customers
            where Products.id = Customers.productID) Customers

takes ~1000ms to run. Adding an explicit condition:

select * from Products
outer apply (select top 1 * 
            from Customers
            where Products.id = Customers.productID) Customers
where Customers.value = 'Value 45872'

takes about the same amount of time. This 1000ms for a fairly simple query is already too much, and it gets worse when additional similar joins are added.
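One thing that may be worth checking with this sample set: the productId index holds only productID (plus the clustering key id), not value, so each top 1 * probe may need a key lookup into the clustered index to fetch value. A minimal sketch of a covering index follows, assuming that id, productID and value are the only Customers columns the query needs; the index name IX_Customers_productId_value is made up for illustration.

```sql
-- Covering index: productID is the seek key, value is stored at the leaf
-- level, and the clustering key (id) is carried automatically, so the
-- TOP 1 probe can be answered from this index alone.
CREATE NONCLUSTERED INDEX IX_Customers_productId_value
    ON Customers (productID ASC)
    INCLUDE (value);
```

Whether this noticeably reduces the ~1000ms depends on the plan the optimizer picks, so it would need to be measured against the actual execution plan.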

asked Oct 18 '11 11:10

John

2 Answers

Try the following approach, using a Common Table Expression (CTE). With the test data you provided, it returns specific ProductIds in less than a second.

create view ProductTest as 

with cte as (
select 
    row_number() over (partition by p.id order by p.id) as RN, 
    c.*
from 
    Products p
    inner join Customers c
        on  p.id = c.productid
)

select * 
from cte
where RN = 1
go

select * from ProductTest where ProductId = 25

answered Sep 27 '22 16:09

Derek Kromm

What if you did something like:

SELECT ...
FROM products
OUTER APPLY (SELECT TOP 1 * from customer where customerid = products.buyerid) as customer
...

Then the filter on productId should help. It might be worse without filtering, though.
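As a concrete illustration of this suggestion, here is a sketch against the Products and Customers sample tables from the question's edit. The view name ProductFirstCustomer and the column aliases are hypothetical, and ordering by the customer id is an assumption made only so that the chosen row is deterministic.

```sql
-- Hypothetical view: each product with at most one matching customer row.
CREATE VIEW ProductFirstCustomer AS
SELECT
    p.id    AS productId,
    c.id    AS customerId,
    c.value AS value
FROM Products AS p
OUTER APPLY (
    SELECT TOP (1) cu.id, cu.value
    FROM Customers AS cu
    WHERE cu.productId = p.id
    ORDER BY cu.id              -- assumed tie-breaker
) AS c;
GO

-- A filter on the product key can be applied to Products first,
-- so the APPLY subquery only needs to run for the matching product.
SELECT * FROM ProductFirstCustomer WHERE productId = 25;
```

A filter on a column that comes from the applied side (such as value) cannot be used this way, because the single customer row has to be picked for every product before the filter can be tested.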

answered Sep 27 '22 18:09

GilM

Tags: sql, sql-server, tsql, sql-view
