The scenario is quite simple: there are about 100M records in a table with 10 columns (kind of analytics data), and I need to be able to perform queries on any combination of those 10 columns. For example, something like this: how many records with a = 3 && b > 100 are there in the past 3 months? Basically, all of the queries are going to be of the form "how many records with attributes X are there in time interval Y", where X can be any combination of those 10 columns.
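To make the query shape concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name `events`, the timestamp column `ts`, and the column names `a`, `b`, `c` are all assumptions for illustration; the real table has 10 columns and a different engine would likely be used at 100M rows.

```python
import sqlite3
from datetime import datetime, timedelta

# Hypothetical whitelist of column names; the real table has 10 columns.
ALLOWED_COLUMNS = {"a", "b", "c"}
ALLOWED_OPS = {"=", "<", ">", "<=", ">=", "!="}


def count_matching(conn, predicates, start, end):
    """Count rows matching every (column, op, value) predicate with start <= ts < end."""
    clauses = ["ts >= ?", "ts < ?"]
    params = [start, end]
    for column, op, value in predicates:
        # Column names and operators cannot be bound as parameters, so validate them.
        if column not in ALLOWED_COLUMNS or op not in ALLOWED_OPS:
            raise ValueError(f"unsupported predicate: {column} {op}")
        clauses.append(f"{column} {op} ?")
        params.append(value)
    sql = "SELECT COUNT(*) FROM events WHERE " + " AND ".join(clauses)
    return conn.execute(sql, params).fetchone()[0]


def demo():
    """Build a tiny in-memory table and run the example query from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (ts TEXT, a INTEGER, b INTEGER, c INTEGER)")
    now = datetime(2024, 6, 1)  # fixed date so the example is reproducible
    rows = [
        ((now - timedelta(days=10)).isoformat(), 3, 150, 0),   # matches
        ((now - timedelta(days=20)).isoformat(), 3, 50, 0),    # b too small
        ((now - timedelta(days=200)).isoformat(), 3, 500, 0),  # too old
        ((now - timedelta(days=5)).isoformat(), 4, 900, 0),    # a differs
    ]
    conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", rows)
    start = (now - timedelta(days=90)).isoformat()
    return count_matching(conn, [("a", "=", 3), ("b", ">", 100)], start, now.isoformat())
```

Calling `demo()` runs the "a = 3 && b > 100 in the past 3 months" example (approximated here as 90 days) against four sample rows. Every query of this kind reduces to a time-range filter plus an arbitrary list of per-column predicates.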
The data will keep coming in: it is not a fixed set of 100M records, but grows over time.
Since the column selection can be completely random, creating indexes for popular combinations is most likely not possible.
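As a rough illustration of why per-combination indexes do not scale, the number of distinct column subsets a filter could touch can be counted directly. This is only an upper bound on distinct filter sets, not on the indexes strictly required, since a composite index can also serve queries on its leading columns.

```python
from math import comb

COLUMNS = 10  # number of columns, from the question

# Number of non-empty column subsets a query could filter on (2**10 - 1).
subsets = sum(comb(COLUMNS, k) for k in range(1, COLUMNS + 1))
print(subsets)  # 1023
```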
The question has two parts:
Without indexes, your options for tuning an RDBMS to support this kind of processing are severely limited. Basically you need massive parallelism and super-fast kit. But clearly you're not storing relational data, so an RDBMS is the wrong fit.
Pursuing the parallel route, the industry standard is Hadoop. You can still use SQL style queries through Hive.
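The idea behind the parallel route can be sketched in a few lines of Python: split the records into partitions, count matches in each partition independently (the map step), then add up the partial counts (the reduce step). This is not Hadoop or Hive code; the function names and sample data are hypothetical, and a thread pool is used only to show the shape of the computation, not to demonstrate real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def count_partition(partition, predicate):
    """Map step: count matching records in one partition."""
    return sum(1 for record in partition if predicate(record))


def parallel_count(partitions, predicate, workers=4):
    """Reduce step: add up the per-partition counts."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = pool.map(lambda p: count_partition(p, predicate), partitions)
        return sum(partial)


# Hypothetical data: records as dicts, split into 4 partitions.
records = [{"a": i % 5, "b": i} for i in range(1000)]
partitions = [records[i::4] for i in range(4)]
result = parallel_count(partitions, lambda r: r["a"] == 3 and r["b"] > 100)
```

Because each partition is scanned independently, the same pattern spreads across machines: no index is needed, only enough workers to scan the data in acceptable time.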
Another NoSQL option would be to consider a columnar database. These are an alternative way of organising data for analytics without using cubes. They are good at loading data fast. Vectorwise is the latest player in the arena. I haven't used it personally, but somebody at last night's LondonData meetup was raving to me about it. Check it out.
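To show what "columnar" means in practice, here is a toy column-oriented table in Python. It is a sketch of the storage idea only, not how Vectorwise or any real product is implemented; the class name `ColumnStore` and the column names are made up for illustration.

```python
from array import array


class ColumnStore:
    """Toy column-oriented table: one typed array per column."""

    def __init__(self, column_names):
        # "q" means signed 64-bit integers, stored contiguously per column.
        self.columns = {name: array("q") for name in column_names}

    def append(self, row):
        """Append one row, given as a dict with a value for every column."""
        for name, col in self.columns.items():
            col.append(row[name])

    def count(self, predicates):
        """Count rows where every (column, test) pair holds.

        Only the columns named in predicates are read; the rest are never touched.
        """
        cols = [(self.columns[name], test) for name, test in predicates]
        n = len(next(iter(self.columns.values())))
        return sum(1 for i in range(n) if all(test(col[i]) for col, test in cols))


store = ColumnStore(["a", "b", "c"])
for i in range(1000):
    store.append({"a": i % 5, "b": i, "c": i * 2})
matches = store.count([("a", lambda v: v == 3), ("b", lambda v: v > 100)])
```

A query on `a` and `b` scans two arrays and ignores `c` entirely, which is why this layout suits ad hoc filters over arbitrary column combinations.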
Of course, moving away from SQL databases - in whatever direction you go - will incur a steep learning curve.