
SQL Server aggregates for very large tables

We have a table with 17 million rows containing product attributes. Let's say they're:

brandID, sizeID, colorID, price, shapeID

And we need to query for aggregates by brand and size. Currently we query and filter this data by doing something like this:

select brandID, sizeID, count(*) as productCount
from [table]  -- placeholder name, bracketed because TABLE is a reserved word
where colorID in (1,2,3) and price = 10 and shapeID = 17
--"additional complex where clause here"
group by brandID, sizeID
order by brandID, sizeID

And we report this data. The problem is, it takes 10 seconds or so to run this query (and this is a very simple example) in spite of the fact that the actual data returned will be just a few hundred rows.

I think we've reached our capacity for indexing this table, so I don't think any number of additional indexes will get us to near-instant results.

I know very little about OLAP or other analysis services, but what's out there for SQL Server that can pre-filter or pre-aggregate this table so that queries like the above (or similar ones returning equivalent data) can be performed quickly? Or, more generally, what's the best way to handle arbitrary where clauses on a very large table?

Jody Powlette asked Oct 15 '22 13:10


1 Answer

I think this is a perfect candidate for an OLAP cube. I have fact data with hundreds of millions of rows. I was doing the kind of queries you described above, and they were coming back in minutes. I moved the data into an OLAP cube and queries are now almost instantaneous. There is a bit of a learning curve for OLAP. I'd strongly suggest you find a tutorial on simple cube building just to get your head around it. DBA colleagues had been telling me about cubes for years and I never quite got it. Now I don't know why I went so long without one.

In addition to OLAP, you may also want to look into indexed views, but if you are slicing the data in several ways, that may not be feasible.
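
For what it's worth, here is a rough sketch of what the indexed view route could look like for the query in the question. The names dbo.ProductAttributes, dbo.vProductCounts and IX_vProductCounts are made up for illustration, and I'm assuming the five columns are NOT NULL and that price is a precise type (decimal or int, not float). It only pays off if the number of distinct attribute combinations is much smaller than the 17 million base rows.

```sql
-- Hypothetical object names; adjust to your schema.
-- Indexed views need SCHEMABINDING, two-part table names, and COUNT_BIG(*)
-- whenever the view has a GROUP BY.
CREATE VIEW dbo.vProductCounts
WITH SCHEMABINDING
AS
SELECT brandID, sizeID, colorID, price, shapeID,
       COUNT_BIG(*) AS productCount
FROM dbo.ProductAttributes
GROUP BY brandID, sizeID, colorID, price, shapeID;
GO

-- The unique clustered index is what materializes the view.
-- Its key may only use GROUP BY columns; filter columns go first here.
CREATE UNIQUE CLUSTERED INDEX IX_vProductCounts
    ON dbo.vProductCounts (colorID, price, shapeID, brandID, sizeID);
GO

-- Report query against the pre-aggregated rows.
-- NOEXPAND asks the optimizer to use the view's index directly.
SELECT brandID, sizeID, SUM(productCount) AS productCount
FROM dbo.vProductCounts WITH (NOEXPAND)
WHERE colorID IN (1, 2, 3) AND price = 10 AND shapeID = 17
GROUP BY brandID, sizeID
ORDER BY brandID, sizeID;
```

The catch is the one mentioned above: every column you might filter on has to be in the view's GROUP BY, so with many attributes or truly arbitrary where clauses the view can end up nearly as large as the base table, and it adds maintenance cost on every insert and update.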

Matt Wrock answered Oct 19 '22 01:10