
What is your strategy to write logs in your software to deal with possible HUGE amount of log messages?

Tags: c++, linux, logging

Thanks for your time and sorry for this long message!

My work environment

Linux, C/C++ (but I'm new to the Linux platform)

My question in brief

In the software I'm working on, we write a LOT of log messages to local files, which makes the files grow quickly and eventually use up all the disk space (ouch!). We want these log messages for troubleshooting purposes, especially after the software is released to the customer site. It is obviously unacceptable to take up all the disk space on the customer's computer, but I have no good idea how to handle this, so I'm wondering if somebody has a good approach. More info below.

What I am NOT asking

1). I'm NOT asking for a recommended C++ log library. We wrote a logger ourselves.

2). I'm NOT asking about what details (such as time stamp, thread ID, function name, etc.) should be written in a log message. Some suggestions can be found here.

What I have done in my software

I separate the log messages into 3 categories:

  • SYSTEM: Only log the important steps in my software. Example: an outer invocation of an interface method of my software. The idea is that from these messages we can see what is generally happening in the software. There aren't many such messages.

  • ERROR: Only log error situations, such as an ID not being found. There usually aren't many such messages.

  • INFO: Log the detailed steps running inside my software. For example, when an interface method is called, a SYSTEM log message is written as mentioned above, and the entire call path through the internal modules within that interface method is recorded with INFO messages. The idea is that these messages help us reconstruct the detailed call path for troubleshooting or debugging. This is the source of the use-up-disk-space issue: there are always SO MANY INFO messages when the software is running normally. (A minimal sketch of such level-based filtering follows below.)
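For illustration only (this is not the actual logger from the question, and all names here are made up): a minimal sketch of how that kind of severity filtering could look in a hand-rolled C++ logger.

    #include <cstdio>
    #include <ctime>

    // Hypothetical severity levels matching the three categories above.
    enum class LogLevel { System = 0, Error = 1, Info = 2 };

    // Messages with a level above this threshold are dropped.
    static LogLevel g_threshold = LogLevel::Info;

    void log_message(LogLevel level, const char* text)
    {
        if (level > g_threshold)
            return;                                   // filtered out cheaply

        char stamp[32];
        std::time_t now = std::time(nullptr);
        std::strftime(stamp, sizeof stamp, "%Y-%m-%d %H:%M:%S", std::localtime(&now));

        static const char* const names[] = { "SYSTEM", "ERROR", "INFO" };
        std::fprintf(stderr, "[%s] %-6s %s\n", stamp,
                     names[static_cast<int>(level)], text);
    }

    // Usage: log_message(LogLevel::System, "interface method Foo() invoked");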

My tries and thoughts

1). I tried not recording any INFO log messages. This resolves the disk-space issue, but I also lose a lot of information for debugging. Think about this: my customer is in a different city and it's expensive to go there often. Besides, they use an intranet that is 100% inaccessible from outside. Therefore: we can't always send engineers on-site as soon as they hit problems, and we can't start a remote debug session. Log files, I think, are the only thing we can use to figure out the root of the trouble.

2). Maybe I could make the logging strategy configurable at run-time (currently it is fixed before the software runs), that is: at normal run-time, the software only records SYSTEM and ERROR logs; when a problem arises, somebody could change the logging configuration so that INFO messages are logged as well. But still: who would change the configuration at run-time? Maybe we should train the software admin? (One possible mechanism is sketched after this list.)

3). Maybe I could always keep INFO message logging on, but pack the log files into a compressed package periodically? Hmm...
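Regarding 2) above, one possibility on Linux (purely a sketch, under the assumption that a POSIX signal is an acceptable switch) is to flip the INFO threshold from a signal handler, so whoever administers the machine only has to run kill -USR1 <pid> instead of editing a configuration file:

    #include <signal.h>

    // Hypothetical run-time switch: 0 = SYSTEM+ERROR only, non-zero = INFO too.
    static volatile sig_atomic_t g_info_enabled = 0;

    static void toggle_info(int /*signum*/)
    {
        g_info_enabled = g_info_enabled ? 0 : 1;   // only touch a sig_atomic_t here
    }

    void install_log_toggle()
    {
        struct sigaction sa = {};
        sa.sa_handler = toggle_info;
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = 0;
        sigaction(SIGUSR1, &sa, nullptr);          // "kill -USR1 <pid>" flips verbosity
    }

The logger would then check g_info_enabled before writing INFO messages; re-reading a configuration file on SIGHUP is another common convention.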

Finally...

What is your experience in your projects/work? Any thoughts/ideas/comments are welcome!

EDIT

THANKS for all your effort!!! Here is a summary of the key points from all the replies below (and I'll give them a try):

1). Do not use large log files. Use relatively small ones.

2). Deal with the oldest ones periodically (either delete them, or zip them and move them to larger storage).

3). Implement run-time configurable logging strategy.

asked Mar 06 '12 by yaobin



1 Answer

There are two important things to take note of:

  • Extremely large files are unwieldy. They are hard to transmit, hard to investigate, ...
  • Log files are mostly text, and text is compressible

In my experience, a simple way to deal with this is:

  • Only write small files: start a new file for a new session, or when the current file grows past a preset limit (I have found 50 MB to be quite effective). To help locate the file in which a given log was written, make the date and time of creation part of the file name (a rotation sketch follows this list).
  • Compress the logs, either offline (once the file is finished) or online (on the fly).
  • Put a cleaning routine in place: delete all files older than X days, or, whenever you reach more than 10, 20 or 50 files, delete the oldest.
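A sketch of that rotation scheme, assuming a hypothetical write_line() entry point and the 50 MB limit mentioned above (file names are made up for the example):

    #include <cstdio>
    #include <ctime>
    #include <string>

    constexpr long kMaxBytes = 50L * 1024 * 1024;   // ~50 MB per file

    static std::FILE* g_file = nullptr;
    static long g_written = 0;

    // Build e.g. "Log/info.120306.131743.log" from the current time.
    static std::string make_log_name()
    {
        char buf[64];
        std::time_t now = std::time(nullptr);
        std::strftime(buf, sizeof buf, "Log/info.%y%m%d.%H%M%S.log", std::localtime(&now));
        return buf;
    }

    void write_line(const std::string& line)
    {
        if (!g_file || g_written >= kMaxBytes) {
            if (g_file)
                std::fclose(g_file);                // finished file: ready to compress
            g_file = std::fopen(make_log_name().c_str(), "w");
            g_written = 0;
        }
        if (g_file)
            g_written += std::fprintf(g_file, "%s\n", line.c_str());
    }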

If you wish to keep the System and Error logs longer, you might duplicate them in a specific rotating file that only tracks them.

Putting it all together, this gives the following log folder:

 Log/
   info.120229.081643.log.gz // <-- older file (to be purged soon)
   info.120306.080423.log // <-- complete (50 MB) file started at log in
                                 (to be compressed soon)
   info.120306.131743.log // <-- current file

   mon.120102.080417.log.gz // <-- older mon file
   mon.120229.081643.log.gz // <-- older mon file
   mon.120306.080423.log // <-- current mon file (System + Error only)

Depending on whether you can schedule (cron) the cleanup task, you may instead simply spin up a cleanup thread within your application. Whether you go with a purge date or a limit on the number of files is a choice you have to make; either is effective.
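As an illustration of the number-of-files variant, a cleanup pass inside the application could look roughly like this (C++17 std::filesystem; the directory name and keep count are assumptions):

    #include <algorithm>
    #include <cstddef>
    #include <filesystem>
    #include <vector>

    namespace fs = std::filesystem;

    // Keep only the `keep` newest regular files in `dir`; delete the rest.
    void purge_old_logs(const fs::path& dir, std::size_t keep = 20)
    {
        std::vector<fs::directory_entry> files;
        for (const auto& entry : fs::directory_iterator(dir))
            if (entry.is_regular_file())
                files.push_back(entry);

        if (files.size() <= keep)
            return;

        // Oldest first, by last modification time.
        std::sort(files.begin(), files.end(),
                  [](const fs::directory_entry& a, const fs::directory_entry& b) {
                      return a.last_write_time() < b.last_write_time();
                  });

        for (std::size_t i = 0; i + keep < files.size(); ++i)
            fs::remove(files[i].path());
    }

    // Usage: purge_old_logs("Log", 20);  // e.g. from a periodic housekeeping thread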

Note: from experience, a 50 MB file ends up weighing around 10 MB when compressed on the fly and less than 5 MB when compressed offline (on the fly is less efficient).
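If you go for the on-the-fly variant, zlib's gz* API is one common way to do it. A minimal sketch (link with -lz; a real logger would keep the handle open instead of reopening per line):

    #include <zlib.h>

    // Append one line to a gzip-compressed log file (on-the-fly compression).
    bool gz_log_line(const char* path, const char* line)
    {
        gzFile gz = gzopen(path, "ab");        // append, default compression level
        if (!gz)
            return false;
        bool ok = gzprintf(gz, "%s\n", line) > 0;
        return gzclose(gz) == Z_OK && ok;
    }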

answered by Matthieu M.