
Log viewing utility database choice

I will be implementing a log viewing utility soon, but I am stuck on the choice of database. My requirements are:

  • Store 5 GB of data daily
  • Total size of 5 TB of data
  • Search this log data in less than 10 seconds

I know that PostgreSQL will work if I partition the tables, but will I be able to get the performance described above? As I understand it, NoSQL is a better choice for storing logs, since logs are not very structured. I saw the example below, which uses hadoop-hbase-lucene and seems promising: http://blog.mgm-tp.com/2010/03/hadoop-log-management-part1/

But before deciding, I wanted to ask whether anybody has made a choice like this before and could give me an idea. Which DBMS fits this task best?

asked Nov 19 '12 08:11 by denizeren



1 Answer

My logs are very structured :)

I would say you don't need a database; you need a search engine:

  • Solr: based on Lucene, and it packages everything you need together
  • ElasticSearch: another Lucene-based search engine
  • Sphinx: the nice thing is that you can use multiple sources per search index, so you can enrich your raw logs with other events
  • Scribe: Facebook's way to collect logs (a log aggregation server, to be paired with one of the search engines above)
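A rough back-of-the-envelope check on the numbers in the question hints at why a search index, rather than a scan over stored rows, is the natural fit. This is only a sketch: it assumes decimal units (1 GB = 10^9 bytes) and a worst case where a query has to read every stored byte, neither of which is stated in the question.

```python
# Figures taken from the question; decimal units are an assumption.
DAILY_BYTES = 5 * 10**9       # 5 GB of new log data per day
TOTAL_BYTES = 5 * 10**12      # 5 TB kept in total
SEARCH_BUDGET_S = 10          # target: answer a search within 10 seconds

# How many days of logs fit into the total size.
retention_days = TOTAL_BYTES // DAILY_BYTES

# Throughput a brute-force scan of everything would need to meet the budget.
scan_rate_gb_per_s = TOTAL_BYTES / SEARCH_BUDGET_S / 10**9

print("retention (days):", retention_days)
print("full-scan rate needed (GB/s):", scan_rate_gb_per_s)
```

Under these assumptions the data covers roughly 1000 days, and a full scan would need hundreds of gigabytes per second, so whatever stores the logs, queries have to go through an index that touches only a small part of the data.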

Update for @JustBob: Most of the solutions mentioned can work with flat files without affecting performance. All of them need an inverted index, which is the hardest part to build and maintain. You can update the index in batch mode or online. The index can be stored in an RDBMS, in NoSQL, or in a custom "flat file" storage format (custom meaning maintained by the search engine application).
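To make the inverted index idea concrete, here is a minimal in-memory sketch in Python. It is not how Solr, ElasticSearch, or Sphinx implement theirs; the class name `InvertedIndex`, the `tokenize` helper, and the sample log lines are all made up for illustration. It maps each term to the set of log lines containing it, and supports both one-at-a-time ("online") and batch updates.

```python
import re
from collections import defaultdict


def tokenize(line):
    """Split a log line into lowercase alphanumeric terms (a simplistic analyzer)."""
    return re.findall(r"[a-z0-9]+", line.lower())


class InvertedIndex:
    """Hypothetical toy index: term -> set of ids of the lines containing it."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.lines = []  # stands in for the flat file holding the raw logs

    def add(self, line):
        """Online update: index a single line as it arrives."""
        line_id = len(self.lines)
        self.lines.append(line)
        for term in tokenize(line):
            self.postings[term].add(line_id)
        return line_id

    def add_batch(self, lines):
        """Batch update: index many lines in one go."""
        for line in lines:
            self.add(line)

    def search(self, query):
        """Return the lines that contain every term of the query, in insertion order."""
        terms = tokenize(query)
        if not terms:
            return []
        matches = set.intersection(*(self.postings.get(t, set()) for t in terms))
        return [self.lines[i] for i in sorted(matches)]


index = InvertedIndex()
index.add_batch([
    "2012-11-19 08:11:02 ERROR db timeout",
    "2012-11-19 08:11:05 INFO user login",
])
index.add("2012-11-19 08:12:00 ERROR disk full")
print(index.search("error disk"))
```

A search only looks up the postings of the query terms and never reads the other log lines, which is what keeps lookups fast as the raw data grows. A real engine additionally has to persist, compress, and merge these postings on disk, which is the hard part mentioned above.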

answered Oct 02 '22 04:10 by mys