
Best Database for remote sensor data logging

I need to choose a database for storing data remotely from a large number (thousands to tens of thousands) of sensors, each generating around one entry per minute.

The data needs to be queried in a variety of ways, from counting entries with certain characteristics for statistics to simply outputting values for plotting.

I am looking around for the right tool. I started with MySQL, but I feel like it lacks the scalability needed for this project, and this led me to NoSQL databases, which I don't know much about.

Which database, relational or not, would be a good choice?

Thanks.

Mike asked Jun 29 '12 09:06



2 Answers

There is usually no "best" database since they all involve trade-offs of one kind or another. Your question is also very vague because you don't say anything about your performance needs other than the number of inserts per minute (how much data per insert?) and that you need "scalability".

It also looks like a case of premature optimization because you say you "feel like [MySQL] lacks the scalability needed for this project", but it doesn't sound like you've run any tests to confirm whether this is a real problem. It's always better to get real data rather than base an important architectural decision on "feelings".

Here's a suggestion:

  1. Write a simple test program that inserts 10,000 rows of sample data per minute
  2. Run the program for a decent length of time (a few days or more) to generate a sizable chunk of test data
  3. Run your queries to see if they meet your performance needs (which you haven't specified -- how fast do they need to be? how often will they run? how complex are they?)
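The three steps above can be sketched roughly as follows. This is only a minimal illustration: it uses Python's built-in `sqlite3` as a stand-in so it runs anywhere (swap in a MySQL driver to test MySQL itself), the `readings` table and its columns are hypothetical, and simulated minutes are inserted back to back rather than paced in real time over several days.

```python
import random
import sqlite3
import time


def run_benchmark(num_sensors=10_000, minutes=5, db_path=":memory:"):
    """Insert one reading per sensor per simulated minute, then time two queries."""
    conn = sqlite3.connect(db_path)
    # Hypothetical schema: one row per sensor reading.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS readings (
               sensor_id INTEGER NOT NULL,
               ts        INTEGER NOT NULL,  -- Unix time in seconds
               value     REAL    NOT NULL
           )"""
    )

    rng = random.Random(42)      # fixed seed so runs are repeatable
    start_ts = 1_700_000_000     # arbitrary starting timestamp
    insert_seconds = []
    for minute in range(minutes):
        ts = start_ts + 60 * minute
        batch = [(sid, ts, rng.uniform(0.0, 100.0)) for sid in range(num_sensors)]
        t0 = time.perf_counter()
        with conn:  # one transaction per simulated minute
            conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", batch)
        insert_seconds.append(time.perf_counter() - t0)

    # "Counting data with certain characteristics"
    t0 = time.perf_counter()
    (high_count,) = conn.execute(
        "SELECT COUNT(*) FROM readings WHERE value > 90.0"
    ).fetchone()
    count_seconds = time.perf_counter() - t0

    # "Simple outputting for plotting": one sensor's time series
    t0 = time.perf_counter()
    series = conn.execute(
        "SELECT ts, value FROM readings WHERE sensor_id = ? ORDER BY ts", (0,)
    ).fetchall()
    plot_seconds = time.perf_counter() - t0

    (total_rows,) = conn.execute("SELECT COUNT(*) FROM readings").fetchone()
    conn.close()
    return {
        "total_rows": total_rows,
        "high_count": high_count,
        "series": series,
        "insert_seconds": insert_seconds,
        "count_seconds": count_seconds,
        "plot_seconds": plot_seconds,
    }


if __name__ == "__main__":
    result = run_benchmark()
    print("rows:", result["total_rows"])
    print("slowest insert batch (s):", max(result["insert_seconds"]))
    print("count query (s):", result["count_seconds"])
    print("plot query (s):", result["plot_seconds"])
```

For a realistic test, point `db_path` at a file on the target storage and raise `minutes` until the table is as large as you expect it to be in production.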

You're testing at least two things here: whether your database can handle 10,000 inserts per minute and whether your queries will run quickly enough once you have a huge amount of data. With large datasets these will become competing priorities since you need indexes for fast queries, but indexes will start to slow down your inserts over time. At some point you'll need to think about data archival as well (or purging, if historical data isn't needed) both for performance and for practical reasons (finite storage space).
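To make the index and purging points concrete, here is a small sketch, again using `sqlite3` as a stand-in and a hypothetical `readings` table. It shows how to check whether a query uses an index (via SQLite's `EXPLAIN QUERY PLAN`) and how to purge rows older than a cutoff. The index name and helper functions are assumptions made for illustration; other databases have their own equivalents.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical readings table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE readings ("
        "sensor_id INTEGER NOT NULL, ts INTEGER NOT NULL, value REAL NOT NULL)"
    )
    return conn


def add_index(conn):
    """Speeds up per-sensor time-range queries, at the cost of extra work on every insert."""
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_readings_sensor_ts "
        "ON readings (sensor_id, ts)"
    )


def query_plan(conn, sql, params=()):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


def purge_older_than(conn, cutoff_ts):
    """Delete readings older than cutoff_ts and return how many rows were removed."""
    with conn:
        cur = conn.execute("DELETE FROM readings WHERE ts < ?", (cutoff_ts,))
    return cur.rowcount


if __name__ == "__main__":
    conn = make_db()
    sql = "SELECT ts, value FROM readings WHERE sensor_id = ? ORDER BY ts"
    print("before index:", query_plan(conn, sql, (1,)))
    add_index(conn)
    print("after index: ", query_plan(conn, sql, (1,)))
```

Rerunning the insert benchmark with and without the index is one way to measure how much insert throughput you give up for faster queries.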

These will be concerns no matter what database you select. From what little you've told us about your retrieval needs ("counting data with certain characteristics" and "simple outputting for plotting") it sounds like any type of database will do. It may be that other concerns are more important, such as ease of development (what languages and tools are you using?), deployment, management, code maintainability, etc.

Since this is sensor data we're talking about, you may also want to look at a round-robin database (RRD) such as RRDtool to see if that approach better serves your needs.
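The core idea behind a round-robin database is a fixed-size store in which the newest sample overwrites the oldest, so storage never grows. The sketch below illustrates only that idea with a hypothetical `RoundRobinSeries` class; it is not RRDtool's API and leaves out features such as consolidating samples into coarser archives.

```python
from collections import deque


class RoundRobinSeries:
    """Fixed-capacity time series: once full, each new sample evicts the oldest."""

    def __init__(self, capacity):
        # deque with maxlen discards items from the left when it overflows
        self._slots = deque(maxlen=capacity)

    def add(self, ts, value):
        self._slots.append((ts, float(value)))

    def values(self):
        """Return the retained (ts, value) pairs, oldest first."""
        return list(self._slots)

    def average(self):
        """Mean of the retained values, or None if nothing has been stored."""
        if not self._slots:
            return None
        return sum(v for _, v in self._slots) / len(self._slots)


if __name__ == "__main__":
    series = RoundRobinSeries(capacity=3)
    for minute, value in enumerate([10, 20, 30, 40, 50]):
        series.add(minute * 60, value)
    print(series.values())   # only the three most recent samples remain
    print(series.average())
```

Whether this fits depends on whether you need the raw history: a round-robin store trades old detail for bounded storage.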

John Landahl answered Sep 30 '22 12:09


I found this question while googling for "database for sensor data". One of the most helpful search results (along with this SO question) was this blog:

I actually started a similar project (http://reatha.de) but realized too late that I wasn't using the best technologies available. My approach was similar: MySQL + PHP. Eventually I realized that it would not scale and stopped the project.

Additionally, a good starting point is the list of databases offered on Heroku: if they offer one, it is probably not the worst choice.

I hope this helps.

Valentin H answered Sep 30 '22 13:09