
Large MySQL table putting too much load on server

I have a MySQL table which consists of:

  1. ~25 million rows (currently)
  2. 3 indexes
  3. Each day, a crawler adds ~3 million rows
  4. I'm not looking too far ahead right now, but a final estimate for the db is ~CONST*e9 rows
  5. Currently 9.5 GB
  6. InnoDB, and it is being read from while inserting

The data itself consists of a text of ~100 chars plus several fields with metadata about it. The indexes are on the unique id, the writer name, and the writer id.

Until now everything went smoothly, but now the server is having a hard time handling the inserts of the new data (~10 seconds for each insert, which adds ~3k rows). I'm trying to find ways to overcome this issue. Things I'm considering:

  1. Maintaining the indexes while inserting is costly. Maybe skip indexing during the inserts and add the indexes only after X inserts.
  2. Partitioning the data into different tables.
  3. Crawling into a small db, and every X minutes/days, moving the data into the big db.
  4. Moving to a different db. I'm not familiar enough with NoSQL; will it help me resolve these problems? Is it a big effort to use it?

Each option has its sub-options and dilemmas, but I think I should first focus on picking a direction. Which route should I take and why? Is there a different road I should think of?

BTW - There is also the option of not keeping all the data, only the parts I actually display, but that would make it impossible to make some functional changes to the processing the data goes through before being displayed.

Noam asked Jun 20 '11 17:06


2 Answers

Is the current engine optimal for this usage?

Have you considered partitioning? See http://dev.mysql.com/doc/refman/5.1/en/partitioning-management.html

Imre L answered Oct 29 '22 07:10


If you're adding 3,000,000 rows a day, and each 3,000-row transaction takes 10 seconds, you're talking about 1,000 transactions a day, which add up to about 10,000 seconds, or roughly 170 minutes a day. That's really not that much.
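
For anyone who wants to check that estimate, here is the arithmetic as a few lines of Python. The figures are the ones from the question; the variable names are just illustrative.

```python
# Figures taken from the question; variable names are illustrative.
rows_per_day = 3_000_000
rows_per_transaction = 3_000
seconds_per_transaction = 10

transactions_per_day = rows_per_day // rows_per_transaction
busy_seconds = transactions_per_day * seconds_per_transaction
busy_minutes = busy_seconds / 60
# Share of a 24-hour day spent inside INSERT transactions.
busy_fraction = busy_seconds / (24 * 60 * 60)

print(transactions_per_day)    # 1000
print(round(busy_minutes))     # 167
print(f"{busy_fraction:.1%}")  # 11.6%
```

So the inserts keep the server busy for a bit under 12% of the day, assuming they run back to back at the reported speed.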

I think I'd first try

  1. reducing the number of INSERT transactions by inserting more rows per transaction
  2. tuning the server

You might find that inserting more rows per transaction actually takes less overall time. And if not, it's easy to revert. If you stash the rows somewhere else first, you can run the INSERT transactions during times of low load.
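
A minimal sketch of what "more rows per transaction" could look like on the client side, assuming the crawler is written in Python and uses a DB-API driver with `%s` placeholders (as MySQL drivers such as PyMySQL do). The table name `posts` and the column names are hypothetical, since the question doesn't show the schema. The code only builds the statements, so the batch size is easy to experiment with.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Sequence, Tuple

# Hypothetical table and column names, for illustration only.
TABLE = "posts"
COLUMNS = ("id", "writer_id", "writer_name", "body")


def batches(rows: Iterable[Sequence], size: int) -> Iterator[List[Sequence]]:
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def multi_row_insert(chunk: List[Sequence]) -> Tuple[str, list]:
    """Build one parameterized multi-row INSERT and its flat parameter list."""
    row_placeholder = "(" + ", ".join(["%s"] * len(COLUMNS)) + ")"
    sql = (
        f"INSERT INTO {TABLE} ({', '.join(COLUMNS)}) VALUES "
        + ", ".join([row_placeholder] * len(chunk))
    )
    # Flatten the rows so the values line up with the placeholders.
    params = [value for row in chunk for value in row]
    return sql, params


# Small demonstration with made-up rows and a tiny batch size.
demo_rows = [(i, 100 + i, f"writer{i}", "some text") for i in range(5)]
for chunk in batches(demo_rows, 2):
    sql, params = multi_row_insert(chunk)
    print(len(chunk), "rows ->", len(params), "parameters")
```

In real use, each `(sql, params)` pair would go to `cursor.execute(sql, params)` followed by a commit on the connection, with a batch size in the thousands rather than 2. Larger statements can run into the server's maximum packet size, so the right batch size is something to measure rather than assume.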

Tuning the server is probably a good idea regardless. For reference, see the MySQL docs on Tuning Server Parameters.

Mike Sherrill 'Cat Recall' answered Oct 29 '22 07:10