
Which database should one use for stats tracking and archiving emails sent via PHP?

The question has two sides.

  1. We host a lot of static files for public download: PDFs, ZIPs, and images. People download thousands of each every day. We track the counters in our MySQL database, and the details (such as where each download came from and when) in MongoDB.

  2. We send a lot of emails via PHP. Our application sends out hundreds of thousands of emails every month, many of which are newsletters, notifications, and invitations for projects. These sent emails are saved into our MySQL database with their crucial data serialized (never the body or actual email content, just the headers, recipient, time of sending, etc.).

Is MySQL an OK choice for this? Is Mongo? Should we use something else? Right now both our email archive table and our download stats table are rapidly approaching 2 GB each.

Note: The data we store is accessed regularly, so a store-and-forget solution is out of the question. We use the download stats to notify content authors when their download count reaches X, and we use the email archive to check delivery status and display it to our employees, who track it on a regular basis. (We use SendGrid for delivery metrics.)

Swader asked Oct 23 '22 08:10



2 Answers

I think MySQL can serve your purpose well. It is flexible for web use, and for log tracking you can use MySQL's ARCHIVE storage engine. MySQL has different storage engines for different purposes, and I think ARCHIVE is the best fit for your structure.

I recently managed a 60 GB MySQL database. It was a large-scale database, and performance was good.
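
As a rough illustration of the ARCHIVE suggestion, here is a minimal sketch of what such a table might look like. The table and column names (`email_archive`, `recipient`, `sent_at`, `headers`) are assumptions made up for this example and are not taken from the question. Note that ARCHIVE tables accept INSERT and SELECT but not UPDATE or DELETE, and only an AUTO_INCREMENT column can be indexed, so this would fit append-only log rows better than rows whose status has to be changed later.

```sql
-- Hypothetical append-only archive table; all names are illustrative.
CREATE TABLE email_archive (
    id        BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    recipient VARCHAR(255)    NOT NULL,
    sent_at   DATETIME        NOT NULL,
    headers   TEXT,                      -- serialized header data, never the body
    PRIMARY KEY (id)                     -- ARCHIVE allows an index only on the AUTO_INCREMENT column
) ENGINE=ARCHIVE;

-- Rows are compressed as they are inserted.
INSERT INTO email_archive (recipient, sent_at, headers)
VALUES ('user@example.com', NOW(), 'Subject: Newsletter');

-- Reads work, but any filter other than on id is a full table scan.
SELECT id, recipient, sent_at
FROM email_archive
WHERE sent_at >= '2012-01-01';
```

Because lookups on anything other than `id` scan the whole table, it would be worth testing this against the regular access pattern described in the question before committing to it.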

Nikson Kanti Paul answered Nov 08 '22 10:11



My two cents:

There is a rumor going around that MySQL does not scale very well with the number of rows in a table, and that PostgreSQL handles large tables much better in terms of performance. I would definitely prefer PostgreSQL for an application with huge tables. (However, this article says that how you define and use your database matters more than which system you choose.)

If you are feeling adventurous and want to try something more modern and distributed, perhaps look into Hadoop and Hive, which can also solve the problem of big file storage but require you to learn some new things.

alexg answered Nov 08 '22 10:11
