Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does this Sql Statement (with 2 table joins) takes 5 mins to complete?

Updates : 3 updates added below

The following sql statement takes 5 mins to complete. I. Just. Don't. Get. It :( First table has 6861534 rows in it. Second table has a little bit less .. and third table (which contains 4 GEOGRAPHY FIELDS) has the same as the first.

Those GEOGRAPHY fields in the 3rd table .. they shouldn't be messin' with the sql statement ... should it? Could it be because the table is so large (due to the GEOGRAPHY fields) that it has huge page sizes or something .. thus destroying the table scan a COUNT does?

SELECT COUNT(*)
FROM [dbo].[Locations] a
    inner join [dbo].[MyUSALocations] b on a.LocationId = b.LocationId
    inner join [dbo].[GeographyBoundaries] c on a.locationid = c.LocationId

alt text

alt text

alt text

alt text

Update

As requested, here's some more info about the GeographyBoundaries table... alt text

/****** Object:  Index [PK_GeographyBoundaries]    Script Date: 11/16/2010 12:42:36 ******/
ALTER TABLE [dbo].[GeographyBoundaries] ADD  CONSTRAINT [PK_GeographyBoundaries] PRIMARY KEY CLUSTERED 
(
    [LocationId] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO

Update #2 - After adding the Non-Clustered Index

After adding the non-clustered index, it's now dropped down to 4 seconds! Which is awesome. But why ?

alt text

What Zee Frak?

Update 3 - even more interesting and confusing info!

Now, when i just do ONE join and force the INDEX .. it goes back to 5 mins. I did this to

  • Make sure the MyUSALocations table wasn't screwing things around with the Joins.
  • Make sure the PK is doing weird things.

.

SELECT COUNT(*)
FROM [dbo].[Locations] a 
        INNER JOIN [dbo].[GeographyBoundaries] c
            WITH (INDEX(PK_GeographyBoundaries)) ON a.locationid = c.LocationId
like image 813
Pure.Krome Avatar asked Oct 14 '22 21:10

Pure.Krome


1 Answers

This ain't right.

I have two possibilities:

1) Statistics is out of date on the tables. Rebuild indexes and update stats.

2) As you said, Geography table records are big spanning many pages (not that one record spanning multiple pages since it can't but the record is close to the 8K mark). In this case, funny enough, creating another non-clustered index on the clustered index might help.

UPDATE

I am pleased that it has worked. Now some explanation.

First of all, if something is not really right and execution plan looks weird, always looks at statistics and rebuild indexes.

Creating a non-clustered index for the clustered index usually should not provide any benefit but when the table has many records and the record is close to its 8K limit, it is helpful. As you know, SQL when it goes to the disk to load a record, it loads an 8K page. In a similar way going to indexes it will load an 8K page. Now with index being a 4-byte integer this means loading ID for 2000 records while it is going to load handful records if it uses clustered index (bear in mind all we need is the ID for the JOIN bit). Now with this being a binary search, I don't expect it to hugely help only a bit. So perhaps something else is not quite right, but difficult to guess not having seen the system.

like image 61
Aliostad Avatar answered Oct 19 '22 19:10

Aliostad