
How to write an efficient hit counter for websites

I want to write a hit counter script to keep track of hits on images on a website and the originating IPs. Impressions are upwards of hundreds of thousands per day, so the counters will be incremented many times a second.

I'm looking for a simple, self-hosted method (PHP, Python scripts, etc.). I was thinking of using MySQL to keep track of this, but I'm guessing there's a more efficient way. What are good methods of keeping counters?

Lin asked Oct 08 '09 02:10




4 Answers

A fascinating subject. Incrementing a counter, simple as it may be, just has to be a transaction... meaning, it can lock out the whole DB for longer than makes sense!-) It can easily be the bottleneck for the whole system.

If you need rigorously exact counts but don't need them to be instantly up-to-date, my favorite approach is to append the countable information to a log (switching logs as often as needed for data freshness purposes). Once a log is closed (with thousands of countable events in it), a script can read it and update all that's needed in a single transaction -- maybe not intuitive, but much faster than thousands of single locks.
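A minimal sketch of that log-then-batch idea, not a definitive implementation. It assumes a tab-separated log file and uses Python's built-in sqlite3 as a stand-in for whatever database you actually run; the `hits` table and the function names `init_db`, `log_hit` and `flush_log` are made up for illustration. The rename in `flush_log` is a simplification: a real setup would need to make sure no writer still holds the old file open.

```python
import os
import sqlite3
from collections import Counter


def init_db(conn):
    """Create the (hypothetical) counters table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hits ("
        "image_id TEXT, ip TEXT, n INTEGER NOT NULL, "
        "PRIMARY KEY (image_id, ip))"
    )


def log_hit(log_path, image_id, ip):
    """Append one countable event; no database work on the request path."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"{image_id}\t{ip}\n")


def flush_log(log_path, conn):
    """Fold a closed log into the counters table in a single transaction."""
    closed_path = log_path + ".closed"
    os.replace(log_path, closed_path)  # switch logs: new hits go to a fresh file

    with open(closed_path, encoding="utf-8") as log:
        counts = Counter(
            tuple(line.rstrip("\n").split("\t")) for line in log if line.strip()
        )

    with conn:  # commits once at the end, rolls back on error
        for (image_id, ip), n in counts.items():
            conn.execute(
                "INSERT OR IGNORE INTO hits (image_id, ip, n) VALUES (?, ?, 0)",
                (image_id, ip),
            )
            conn.execute(
                "UPDATE hits SET n = n + ? WHERE image_id = ? AND ip = ?",
                (n, image_id, ip),
            )

    os.remove(closed_path)
    return sum(counts.values())  # number of events folded in
```

The request handler would only ever call `log_hit`; a cron job or similar would call `flush_log` as often as the data needs to be fresh.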

Then there are extremely fast counters that are only statistically accurate -- but since you don't say that such imprecision is acceptable, I'm not going to explain them in more depth.

Alex Martelli answered Nov 09 '22 00:11


You could take your web server's access log (Apache: access.log) and evaluate it periodically (with a cron job), provided you do not need the data at hand at the exact moment someone visits your site.

Usually, the access.log is generated anyway and contains the requested resource as well as the time, date and the user's IP. This way you do not have to route all traffic through a PHP script. Lean, mean counting machine.
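As a rough sketch of what such a cron job could do, assuming the log is in Apache's Common (or Combined) Log Format. The regex, the list of image suffixes and the choice to count only 200 and 304 responses are my assumptions, not something the log format dictates; `count_image_hits` is a made-up name.

```python
import re
from collections import Counter

# Matches the start of a Common/Combined Log Format line, e.g.
# 127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) '
)

IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif")  # assumption: what counts as an image
COUNTED_STATUSES = ("200", "304")                   # assumption: what counts as a hit


def count_image_hits(lines):
    """Return a Counter keyed by (path, ip) for successful image requests."""
    hits = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if not m or m.group("status") not in COUNTED_STATUSES:
            continue  # skip unparseable lines and failed requests
        path = m.group("path").split("?", 1)[0]  # ignore the query string
        if path.lower().endswith(IMAGE_SUFFIXES):
            hits[(path, m.group("ip"))] += 1
    return hits
```

You would feed it an open file, for example `count_image_hits(open("access.log"))`, and write the resulting totals wherever you keep them.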

middus answered Nov 09 '22 00:11


Without a doubt, Redis is perfect for this problem. It takes about a minute to install and set up, supports atomic increments, is incredibly fast, has client libs for Python and PHP (and many other languages), and is durable (snapshots, journal, replication).

Store each counter under its own key. Then simply:

INCR key
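
From Python, that might look roughly like the sketch below. It assumes the redis-py client library and a reachable Redis server; the key layout (`hits:<image>` plus a per-IP hash) and the name `record_hit` are my own invention for illustration, not part of Redis.

```python
def record_hit(client, image_id, ip):
    """Count one hit for an image, in total and per originating IP.

    `client` is expected to be a redis-py client such as redis.Redis().
    The key names are hypothetical; pick whatever scheme suits you.
    """
    # INCR hits:<image_id>  (atomic on the server, returns the new total)
    total = client.incr(f"hits:{image_id}")
    # HINCRBY hits:<image_id>:by_ip <ip> 1  (atomic per-IP tally in a hash)
    client.hincrby(f"hits:{image_id}:by_ip", ip, 1)
    return total
```

With redis-py installed you would call it as `record_hit(redis.Redis(), "banner.png", "203.0.113.7")`. Both increments are atomic on their own, so concurrent requests do not lose updates.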

z8000 answered Nov 09 '22 01:11


There are two really easy ways:

  1. Parse it out of your web logs in batch.
  2. Run the hits through beanstalkd or gearmand and have a worker do the hard stuff in a controlled way.

Option 1 works with off-the-shelf tools. Option 2 requires just a bit of programming, but gives you something closer to realtime updates without causing you to fall over when the traffic spikes (such as you'll find in your direct MySQL case).
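To show the shape of option 2 without a running queue server, here is a sketch that uses Python's standard-library `queue.Queue` and a thread as a stand-in for beanstalkd or gearmand and their worker process. It is not their client API. The names `worker`, `STOP` and the `flush` callback are hypothetical; in practice `flush` would do the database update.

```python
import queue
import threading
from collections import Counter

STOP = object()  # sentinel telling the worker to finish up


def worker(jobs, flush, batch_size=1000):
    """Drain hit jobs from the queue and hand them to `flush` in batches.

    Each job is an (image_id, ip) tuple. `flush` receives a Counter of
    pending hits; it is where the slow, controlled storage write belongs.
    """
    pending = Counter()
    seen = 0
    while True:
        job = jobs.get()  # blocks until the front end enqueues something
        if job is STOP:
            break
        pending[job] += 1
        seen += 1
        if seen >= batch_size:
            flush(pending)
            pending, seen = Counter(), 0
    if pending:
        flush(pending)  # write whatever is left before exiting


# Usage sketch: the request handler only does jobs.put((image_id, ip)),
# so a traffic spike grows the queue instead of hammering the database.
# jobs = queue.Queue()
# threading.Thread(target=worker, args=(jobs, save_to_db)).start()
```

Here `save_to_db` is a placeholder for your own function. The point of the design is that the front end's cost per hit stays constant while the worker decides how often the database gets touched.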

Dustin answered Nov 09 '22 00:11