
Getting unique hit count in php

I want to add a unique hit counter to my website using PHP. The counter will save each visitor's IP address for each page in a database. My database structure looks like this:

Table hits with two columns:

ip
page_url

My question is: after getting the visitor's IP in a PHP file, which is better for performance?

  1. Check whether the IP address is already in the database, and add it only if it is not.
  2. Add every visitor's IP (without a duplicate check), then select the distinct IPs for the relevant page to get the unique hit count.
John asked Dec 22 '22 03:12


2 Answers

If you are on MySQL you might want to abuse the combination of PRIMARY KEY and ON DUPLICATE KEY UPDATE:

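CREATE TABLE hits (
  ip VARCHAR(15) NOT NULL,        -- fits IPv4 only; use VARCHAR(45) if you also see IPv6
  page_url VARCHAR(200) NOT NULL,
  hitcount INT NOT NULL DEFAULT 0,
  PRIMARY KEY (ip, page_url)      -- one row per visitor per page
);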

Now on a page hit you do

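-- bind :ip and :url as prepared-statement parameters instead of interpolating '$ip' and '$url'
INSERT INTO hits (ip, page_url, hitcount) VALUES (:ip, :url, 1)
ON DUPLICATE KEY UPDATE hitcount = hitcount + 1;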

Why this?

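  • Every additional unique key has to be checked and maintained on each write, which is POISON for a write-heavy table. Reusing the primary key avoids adding one. Really.
  • INSERT ... ON DUPLICATE KEY UPDATE does the duplicate check and the write in a single statement, so it only locks the row once.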

You may also want to record the timestamp of last access:

ALTER TABLE hits ADD COLUMN lastseen TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP;
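
Because the hits table holds one row per (ip, page_url) pair, the unique hit count for a page is just the number of rows for that page. A minimal sketch of reading it back, assuming an open PDO connection in `$pdo` with exceptions enabled; the helper name `uniqueHits()` is hypothetical and not part of the answer above:

```php
<?php
// Sketch only: $pdo is assumed to be an open PDO connection and
// uniqueHits() is a hypothetical helper name.
function uniqueHits(PDO $pdo, string $pageUrl): int
{
    // One row exists per (ip, page_url) pair, so counting the rows for a
    // page counts its distinct visitor IPs.
    $stmt = $pdo->prepare('SELECT COUNT(*) FROM hits WHERE page_url = ?');
    $stmt->execute([$pageUrl]);

    return (int) $stmt->fetchColumn();
}
```

If you want total page views rather than unique visitors, the same query with SUM(hitcount) instead of COUNT(*) should give that figure.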
Eugen Rieck answered Dec 24 '22 02:12


This assumes you're comfortable with simple programming.

NOT TRULY "REAL TIME" BUT ALMOST REAL TIME

I highly recommend writing each hit to a text-file log in your own format (if you're not comfortable with Apache's CustomLog feature).

Then set up a cron job that runs every 5 minutes, or even every minute if you want something close to "live". It imports the text file into a temporary MySQL table in one big gulp with LOAD DATA INFILE and then updates your visitcounts table from that data, grouped by IP.
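
The logging half of this approach could look roughly like the sketch below. The function name `appendHit()` and the tab-separated line format (timestamp, IP, page URL) are assumptions for illustration; the answer does not prescribe a format.

```php
<?php
// Sketch only: appendHit() and the tab-separated line format are
// assumptions for illustration.
function appendHit(string $logFile, string $ip, string $pageUrl): void
{
    // Replace tabs and line breaks so one hit always stays on one line
    // with exactly three fields.
    $clean = static fn (string $s): string => str_replace(["\t", "\r", "\n"], ' ', $s);

    $line = implode("\t", [date('Y-m-d H:i:s'), $clean($ip), $clean($pageUrl)]) . "\n";

    // FILE_APPEND adds to the end of the file; LOCK_EX keeps concurrent
    // requests from interleaving their lines.
    file_put_contents($logFile, $line, FILE_APPEND | LOCK_EX);
}
```

On each page view you would call something like appendHit($path, $_SERVER['REMOTE_ADDR'], $_SERVER['REQUEST_URI']), where $path is whatever log location you choose. Tab-separated fields with newline-terminated lines match the defaults of LOAD DATA INFILE, so the cron import should not need extra FIELDS or LINES clauses.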

FULLY REAL TIME

This can be a huge drag on your server, but if your traffic is light, just create two tables in MySQL. The first is a log table that records the article/page ID being read, the IP, and the time. The second holds each article/page ID and its visit count, which is updated from the first table by grouping on IP.
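
One way to read the two-table idea is sketched below. The table names `hit_log` and `visitcounts`, their columns, and the function `recordVisit()` are all assumptions, and COUNT(DISTINCT ip) is only one interpretation of "updated GROUP BY ip". The sketch also assumes `$pdo` throws exceptions on errors.

```php
<?php
// Sketch only: table names, columns, and recordVisit() are assumptions.
//
// Assumed schema:
//   hit_log(page_id INT, ip VARCHAR(45), seen_at TIMESTAMP)
//   visitcounts(page_id INT PRIMARY KEY, visits INT)
function recordVisit(PDO $pdo, int $pageId, string $ip): void
{
    $pdo->beginTransaction();
    try {
        // 1. Log every hit, duplicates included.
        $pdo->prepare(
            'INSERT INTO hit_log (page_id, ip, seen_at) VALUES (?, ?, CURRENT_TIMESTAMP)'
        )->execute([$pageId, $ip]);

        // 2. Recompute the page's unique-visitor count from the log and
        //    overwrite the stored value.
        $pdo->prepare(
            'REPLACE INTO visitcounts (page_id, visits)
             SELECT ?, COUNT(DISTINCT ip) FROM hit_log WHERE page_id = ?'
        )->execute([$pageId, $pageId]);

        $pdo->commit();
    } catch (Throwable $e) {
        $pdo->rollBack();
        throw $e;
    }
}
```

Recounting the log on every hit is where the "huge drag" comes from: the cost of step 2 grows with the size of the log, which is why this variant only suits light traffic.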

PKHunter answered Dec 24 '22 01:12