SQL Server query - not performing as expected, not behaving like I thought it would

Tags:

I have an advanced SQL question for your SQL perf gurus out there :-)

I'm currently trying to understand some behavior in a larger application, but it boils down to a query against these two tables:

Users table - roughly 750 entries, UserId (varchar(50)) as clustered PK
ActionLog table - millions of entries, includes UserId - but no FK relationship

For a grid in my ASP.NET application, I'm trying to get all users plus their last log entry's date.

The SQL statement that is currently being used looks something like this:

SELECT
     UserId, (other columns),
     LastLogDate = (SELECT TOP (1) [Timestamp] FROM dbo.ActionLog a WHERE a.UserId = u.UserId ORDER BY [Timestamp] DESC) 
FROM
     dbo.Users u;

and it returns the rows to display - but it's fairly slow (approx. 20 seconds).

My first thought was to add an index on the ActionLog table on UserId and to include the Timestamp column in it:

CREATE NONCLUSTERED INDEX [IDX_UserId]
ON [dbo].[ActionLog]([UserId] ASC)
INCLUDE ([Timestamp])

The rows are now returned very quickly - under 2 seconds, with 350'000 entries in the ActionLog table, and my index is being used just fine, as the execution plan shows me. All seems fine.

Now, for approximating the production scenario, we loaded roughly 2 million rows into the ActionLog table, 95% or more of which refer to a non-existing user (i.e. these rows have a UserId that doesn't exist in the Users table).

Now suddenly, the query becomes extremely slow (24 minutes!), and the index isn't being used anymore.

I assumed that since the vast majority of the entries in the ActionLog table don't line up with an existing user, I would see performance gains if I use a filtered index - to "weed out" all those messy entries without a corresponding user - so I created this index (replacing the other one that existed before):

CREATE NONCLUSTERED INDEX [IDX_UserId]
ON [dbo].[Log]([UserId] ASC)
INCLUDE ([Timestamp])
WHERE UserId <> 'user'    -- that's the fixed, non-existing "UserId" I wanted to avoid

But to my dismay - the query is still about the same - takes over 20 minutes to complete. I updated the statistics - no change - still extremely slow.

Funny thing (to me) is: when I dropped the index and re-created it -> now the query was really fast again (again less than 3 seconds). WOW!

But as soon as I start adding more entries again, the query "tilts" and becomes really really slow.......

I don't fully understand why this is happening - I was thinking that with a filtered index that eliminates all those "rogue" entries, I would be seeing good performance on trying to find the newest ActionLog entry for existing users - but that doesn't seem to be the case.

WHY NOT?

Any ideas? Thoughts? Things to try??

475

asked Nov 11 '14 20:11

marc_s

2 Answers

First, INCLUDE here is not the best choice. You sort by entry date, but included columns aren't sorted. Better solution would be:

CREATE NONCLUSTERED INDEX [IX_ActionLog_UserIdTimestamp] ON [dbo].[ActionLog]
([UserId], [Timestamp]);

Second, looks like you might need to update statistics on your index more often than an automatic update would. I have seen cases when, in a situation akin to yours, I had to update stats every 10 minutes, due to excessive inserts. That was back on 2005, though.

answered Sep 28 '22 20:09

Roger Wolf

Try this query and see how it performs with your original index or with the modified suggested by @Roger Wolf:

SELECT u.UserId, a.LastLogDate 
FROM dbo.Users u
INNER JOIN (
    SELECT UserId, Max([TimeStamp]) AS LastLogDate
    FROM dbo.ActionLog 
    WHERE userid <> 'user' -- the user to filter out
    GROUP BY UserId
) a ON a.UserId = u.UserId

If it sucks I'll delete the answer :)

answered Sep 28 '22 19:09

jpw

Related questions
                            
                                Create a query to fill the gaps in a table due to bad data
                            
                                row_number reset
                            
                                How to create a Database Diagram in SQL Server involving 2 databases? [closed]
                            
                                How to insert PHP array values using an stored procedure SQL SERVER 2008 R2(mssql)
                            
                                Simultaneous calls
                            
                                Alter Procedure statement in Visual Studio gives 'This statement is not recognized in this context message'
                            
                                Strategy to optimize this large SQL insert via C#?
                            
                                SQL Server full-text search using thesaurus and prefix form together
                            
                                SQL Most effective way to store every word in a document separately
                            
                                Get number of times keyword is present
                            
                                In hibernate how to programmatically set the isolation level of a transaction, or how to create two transactions with different isolation levels
                            
                                SQL Server install bug : Failed to launch local ScenarioEngine.exe
                            
                                Solution for preventing DB deadlock in ColdFusion?
                            
                                MS SQL Invalid Object Name
                            
                                Can't access a network share when executing a package from the SSIS Catalog
                            
                                Difference between processor affinity and I/O affinity
                            
                                Denormalized access: views or materialized tables?
                            
                                Precedence of numeric types in T-SQL
                            
                                How to get the exact "constraint name" from SQLException
                            
                                will trigger for UPDATE on the Table will be recursive

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

SQL Server query - not performing as expected, not behaving like I thought it would

Tags:

performance

sql-server

query-performance

marc_s

People also ask

2 Answers

Roger Wolf

jpw

Recent Activity

Donate For Us