Database design for very large amounts of data

I am working on a project involving a large amount of data from the Delicious website. The available data is "Date, UserId, Url, Tags" (for each bookmark).

I normalized my database to 3NF, and because of the nature of the queries we wanted to use in combination, I ended up with 6 tables. The design looks fine; however, now that a large amount of data is in the database, most queries need to join at least 2 tables to get an answer, sometimes 3 or 4. At first we had no performance issues, because for testing purposes we had not added much data. Now that we have a lot of data, simply joining extremely large tables takes a lot of time, and for our project, which has to run in real time, this is a disaster.
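For illustration only (the actual schema isn't shown in the question), a 3NF design for this data and the kind of multi-table join that gets expensive might look roughly like this; all table and column names are guesses, not the real design:

    -- Hypothetical 3NF-style schema; names are illustrative only
    CREATE TABLE users     (user_id INT PRIMARY KEY, name VARCHAR(64));
    CREATE TABLE urls      (url_id  INT PRIMARY KEY, url  VARCHAR(2048));
    CREATE TABLE tags      (tag_id  INT PRIMARY KEY, tag  VARCHAR(64));
    CREATE TABLE bookmarks (bookmark_id INT PRIMARY KEY,
                            user_id INT, url_id INT, created DATE);
    CREATE TABLE bookmark_tags (bookmark_id INT, tag_id INT);

    -- A typical question ("which URLs did user 42 tag 'linux'?")
    -- already touches four tables, which is where the cost shows up:
    SELECT u.url
    FROM bookmarks b
    JOIN urls u           ON u.url_id       = b.url_id
    JOIN bookmark_tags bt ON bt.bookmark_id = b.bookmark_id
    JOIN tags t           ON t.tag_id       = bt.tag_id
    WHERE b.user_id = 42 AND t.tag = 'linux';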

I was wondering how big companies solve these issues. Normalizing tables seems to just add complexity, so how do big companies handle large amounts of data in their databases? Don't they use normalization?

Thanks.

Hossein asked Apr 09 '10


1 Answer

Since you asked how big companies (generally) approach this:

They usually have a DBA (database administrator) who lives and breathes the database the company uses.

This means they have people who know everything from how to design the tables optimally, to profiling and tuning the queries, indexes, OS, and server, to knowing which firmware revision of the RAID controller can cause problems for the database.

You don't say much about what kind of tuning you've done, e.g.:

  • Are you using MyISAM or InnoDB tables? Their performance (and not least their features) differs radically between workloads.
  • Are the tables properly indexed for the queries you run?
  • Run EXPLAIN on all your queries. It helps you identify keys that could be added or removed, shows whether the proper keys are being selected, and lets you compare queries (SQL gives you many ways to accomplish the same thing). See the sketch after this list.
  • Have you tuned the query cache? For some workloads the query cache (on by default) can cause a considerable slowdown; a sketch follows below.
  • How much memory does your box have, and is MySQL tuned to take advantage of it?
  • Do you use a file system and RAID setup geared towards the database?
  • Sometimes a little denormalization is needed (also sketched below).
  • Different database products have different characteristics; MySQL might be blazingly fast for some workloads and slow for others.
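As a minimal sketch of the EXPLAIN and indexing points, assuming the hypothetical bookmark schema sketched in the question above:

    -- Ask MySQL how it plans to execute a slow join
    EXPLAIN
    SELECT u.url
    FROM bookmarks b
    JOIN bookmark_tags bt ON bt.bookmark_id = b.bookmark_id
    JOIN tags t           ON t.tag_id       = bt.tag_id
    JOIN urls u           ON u.url_id       = b.url_id
    WHERE t.tag = 'linux';

    -- "type: ALL" or huge "rows" estimates usually mean a full scan.
    -- Indexes matching the filter and join columns are the usual fix:
    CREATE INDEX idx_tags_tag  ON tags (tag);
    CREATE INDEX idx_bt_tag_bm ON bookmark_tags (tag_id, bookmark_id);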
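For the query-cache and memory points, a sketch of how you might inspect and experiment (these are standard MySQL server variables of this era, but check your version's documentation):

    -- How much memory is MySQL actually allowed to use?
    SHOW VARIABLES LIKE 'innodb_buffer_pool_size';  -- InnoDB data/index cache
    SHOW VARIABLES LIKE 'key_buffer_size';          -- MyISAM index cache

    -- Query cache settings and hit ratio
    SHOW VARIABLES LIKE 'query_cache%';
    SHOW STATUS    LIKE 'Qcache%';

    -- On write-heavy workloads the cache can hurt, since every write
    -- invalidates cached results for the touched tables; disabling it
    -- is an easy experiment (the buffer sizes above live in my.cnf
    -- and need a server restart to change):
    SET GLOBAL query_cache_size = 0;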
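And one way "a little denormalization" might look here, again purely as a hypothetical sketch: copy the tag text onto the link rows so the hot query loses a join, at the cost of extra storage and of keeping the duplicate column in sync when tags are renamed:

    -- Duplicate the tag text onto the link table
    ALTER TABLE bookmark_tags ADD COLUMN tag VARCHAR(64);
    UPDATE bookmark_tags bt
    JOIN tags t ON t.tag_id = bt.tag_id
    SET bt.tag = t.tag;
    CREATE INDEX idx_bt_tag ON bookmark_tags (tag);

    -- The four-table query from the question now needs one join fewer:
    SELECT u.url
    FROM bookmark_tags bt
    JOIN bookmarks b ON b.bookmark_id = bt.bookmark_id
    JOIN urls u      ON u.url_id      = b.url_id
    WHERE bt.tag = 'linux' AND b.user_id = 42;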
leeeroy answered Oct 14 '22