Handling log and configuration files when load balancing apache

Tags: logging, apache, load-balancing, nfs

I am currently rebuilding my web platform from a single machine into a cluster of machines, and I will be using Apache load balancing to do this, but I have two questions that I need good answers to before proceeding. I have Googled and searched here on SO, but didn't find anything good.

My setup will be one Debian machine running the Apache load-balancing server (i.e. Apache with mod_proxy) and then any number of "slave" machines that are balancer members. All of these are VPSes inside a VMware machine, so setting up new slaves as needed will be trivial.

Log files

The first question is about log files. To troubleshoot my platform, I sometimes need to analyze Apache's access logs and error logs. When the load is evenly distributed (I don't know if I'll even use sticky balancing; any host could probably handle any request at any time), the log entries will be spread just as evenly across the slave Apache instances. Is there a way to consolidate these live, so that my live log analyzer can see the log files from all hosts? I understand that doing so while the files live on several hosts would be difficult, so is there a way to make sure that all log files are kept on one server?

I'm thinking about two things myself, but I would greatly appreciate your input.

syslogd

The first is syslogd, which makes it possible for several hosts to write to one logging host. The problem with this is that in my current setup, each virtual host in Apache has its own log file. That could probably be fixed in some manner, though. My main use for this is troubleshooting, not keeping separate logs for each host (although if both goals could be met, that would certainly be a bonus).

NFS

My next thought was NFS, i.e. having an NFS share on the LAN where each slave can write to the same log file. I assume this will be difficult, since slave 1 would open the log file and then slave 2 wouldn't be able to write to it.

As I said, your input is greatly appreciated, since I feel I'm stuck on how to solve this.

Configuration files

This is another thing altogether. Each slave will respond to each request as if acting as one single server. That is the entire idea. But what about making changes to the Apache configuration files, adding virtual hosts, or setting up other parameters? What if I have ten slaves, or fifty? Is there a way to make sure that all these slaves are always in sync? I am already using an NFS export to make sure they all have the same files, but should I use the same approach with the configuration files? Or should I keep these in some form of repository and then use rsync to copy them out to the slaves? One problem is that I have built an interface in my web platform that edits these configuration files (namely the file with the virtual hosts), and since that action would take place on one of the slaves, the most current copy of this file could end up on only that one slave.

I realize that this was a long and unwieldy post, and I apologize. I just wanted to make sure that all the parameters of my problem were expressed.

I hope someone out there can help me, as you have before! Thank you in advance!


asked Jul 28 '11 08:07

Sandman

2 Answers

I suggest not using NFS for logging, as it can be a real performance killer. Instead, use rsyslog with remote logging enabled. In your apache2.conf you can set up a LogFormat that includes the VirtualHost name and then pipe the log to rsyslog, telling it to forward the output to a remote host.

In apache2.conf:

LogFormat "%v %{X-FORWARDED-FOR}i %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" vhost_combined
CustomLog "|/usr/bin/logger -t apache2 -p local7.info" vhost_combined

In rsyslog.conf on the webserver:

local7.* @<remote host ip>

In rsyslog.conf on the remote host:

local7.*    /var/log/webfrontends.log;precise
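
Since every line in the consolidated file starts its Apache part with the %v virtual host name, per-vhost views can be recovered with a small filter. This is only a sketch: filter_vhost is a made-up helper name, and it assumes the receiving template writes the logger tag as "apache2:" immediately before the message, which may not hold for your "precise" template.

```shell
#!/bin/sh
# filter_vhost VHOST: read consolidated syslog lines on stdin and print
# only those whose Apache message begins with VHOST.
# Assumption: the token right after the "apache2:" tag is the %v field.
filter_vhost() {
    awk -v vhost="$1" '
        {
            # Scan for the logger tag, then compare the next token.
            for (i = 1; i < NF; i++) {
                if ($i == "apache2:" && $(i + 1) == vhost) { print; next }
            }
        }'
}
```

Something like tail -f /var/log/webfrontends.log | filter_vhost www.example.com would then follow a single site live (www.example.com is a placeholder).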

As for the Apache configuration files, we use NFS. apache2.conf is a symlink to a remote file (different files for different machines if needed), and in apache2.conf we use an Include directive to read site-specific configurations (different directories for different machines if needed).

On the NFS server, the exported directory /NFS_EXPORT/etc/apache2/ contains:

 - webserver1_apache2.conf
 - webserver2_apache2.conf
 - webserver1_vhosts (dir)
 - webserver2_vhosts (dir)

Both webserver1_apache2.conf and webserver2_apache2.conf contain Include "/etc/apache2/vhosts"

On WebServer 1:

ln -s /NFS_EXPORT/etc/apache2/webserver1_apache2.conf /etc/apache2/apache2.conf
ln -s /NFS_EXPORT/etc/apache2/webserver1_vhosts/ /etc/apache2/vhosts

On WebServer 2:

ln -s /NFS_EXPORT/etc/apache2/webserver2_apache2.conf /etc/apache2/apache2.conf
ln -s /NFS_EXPORT/etc/apache2/webserver2_vhosts/ /etc/apache2/vhosts

If all your webservers are the same in terms of hardware specs and serve the same sites/applications then there is no need to differentiate the configs.

Of course, you will need a script or some other mechanism to restart Apache on all your servers once you modify a configuration. Also, upgrading your apache2 package can be tricky unless you have root access to your NFS exports, because your package management system will typically complain about not being able to modify some configuration file.
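
A restart script along those lines could look roughly like this. It is a sketch under assumptions: the host names are placeholders, you have key-based ssh to an account allowed to run apache2ctl, and SSH_CMD is a hypothetical override I added so the function can be dry-run.

```shell
#!/bin/sh
# reload_all HOST...: validate the config on each host, then reload Apache.
# SSH_CMD is a hypothetical hook; SSH_CMD=echo prints instead of connecting.
reload_all() {
    status=0
    for host in "$@"; do
        # graceful only runs if configtest succeeds, so a broken config
        # leaves the already running server untouched.
        if ! ${SSH_CMD:-ssh} "$host" 'apache2ctl configtest && apache2ctl graceful'; then
            echo "reload failed on $host" >&2
            status=1
        fi
    done
    return "$status"
}
```

Calling reload_all webserver1 webserver2 after editing a file on the NFS export would check and reload each server in turn, and the non-zero return value tells you if any host needs attention.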


answered Oct 22 '22 05:10

jeremyjr

NFS will not help you with log files, for exactly the reasons you describe above. You should use syslogd (or some other solution like Splunk) to centralize the logging. It's trivial to include information about what host the log entry comes from, so you can still winnow down to per-host data when troubleshooting.

Configuration files: you need to either centralize them (a "master" copy), or have a way of distributing changes made on any server to all the others. I recommend centralization as the simpler approach. NFS will do the job here, or, as you suggest, a repository from which all hosts are periodically updated. There are a lot of options here, running all the way up to version control (SVN, Git, etc.) or even configuration management servers (Chef, etc.).
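
For the repository-plus-rsync variant, a push from the master copy might be sketched as below. The function name, the RSYNC_CMD override, and all paths and host names are illustrative assumptions. Note that --delete removes files on the target that are absent from the master, so only point it at a directory the master fully owns.

```shell
#!/bin/sh
# push_config SRC_DIR DEST_DIR HOST...: copy the master config directory
# to DEST_DIR on each host. RSYNC_CMD is a hypothetical dry-run hook.
push_config() {
    src=$1
    dest=$2
    shift 2
    for host in "$@"; do
        # Trailing slashes make rsync sync directory contents rather
        # than nesting SRC_DIR inside DEST_DIR.
        ${RSYNC_CMD:-rsync} -az --delete "$src/" "$host:$dest/" || return 1
    done
}
```

For example, push_config /srv/apache-master/vhosts /etc/apache2/vhosts web1 web2 would bring both hosts in line with the master, after which each Apache still needs a reload to pick up the change.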

Please note that moving from a single server to a cluster has many implications. In both cases above (logging, config files), there is potential to introduce single points of failure if done naively. Since you have that already (one server), you're not worse off, but you should try to be aware of and plan for the failure scenarios you may need to respond to.

answered Oct 22 '22 06:10

Zac Thompson
