Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there a better place to store large amounts of unused data than a the database?

So the application we've got calls the API's of all the major carriers (UPS, FedEx, etc) for tracking data.

We save the most recent version of the XML feed we get from them in a TEXT field in a table in our database.

We really hardly ever (read, never so far) access that data, but have it "just in case."

It adds quite a bit of additional weight to the database. Right now a 200,000 row table is coming in at around 500MB...the large majority of which is compromised of all that XML data.

So is there a more efficient way to store all that XML data? I had thought about saving them as actual text/xml files, but we update the data every couple of hours, so wasn't sure if it would make sense to do that.

like image 999
Shpigford Avatar asked Nov 09 '09 04:11

Shpigford


People also ask

Where should you store database data?

Database storage structure All the information in a database is organized and structured in database tables. These tables are stored on the hard disk of the database server.

What type of data you should not store in a database select all that apply?

Finally, you shouldn't store credit card information in your database unless you absolutely need to. This includes credit card owner names, numbers, CVV numbers, and expiration dates.

What is difference between storage and database?

Generally, the services grouped under storage are more open-ended, while database services focus specifically on managing database software and storage.

Which is the most common approach to store data in a file?

Object-based storage has emerged as a preferred method for data archiving and backing-up today's digital communications—unstructured media and web content like email, videos, image files, web pages, and sensor data produced by the Internet of Things (IoT).


3 Answers

Assuming it's data there's no particular reason not to keep it in your database (unless it's impeding your backups). But it would be a good idea to keep it in a separate table from the actual data that you do need to read on a regular basis — just the XML, a FK back to the original table, and possibly an autonumbered PK column.

like image 114
Larry Lustig Avatar answered Oct 03 '22 23:10

Larry Lustig


It has been my experience that the biggest trouble with TEXT/BLOB columns that are consistently large, is that people are not careful to prevent reading them when scanning many rows. On MyISAM, this will waste your VFS cache, and on InnoDB, it will waste your InnoDB buffer pool.

A secondary problem is that as tables get bigger, they become harder to maintain.. adding a column or index can rebuild the whole table, and a 500MB table rebuilds a lot slower than a 5MB table.

I've had good success moving things like this off into offline key/value storage such as MogileFS, and/or TokyoTyrant.

If you don't need to be crazy scalable, or you must value transactional consistency over performance, then simply moving this column into another table with a 1:1 relationship with the original table will at least require a join to blow out the buffer pool, and allow you to maintain the original table w/o having to tip-toe around the 500MB gorilla.

like image 43
SpamapS Avatar answered Oct 03 '22 22:10

SpamapS


if its really unused, try:

/dev/null

like image 44
SpaghettiMonster Avatar answered Oct 04 '22 00:10

SpaghettiMonster