Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to Monitor Uptime of 20 Websites (Ping or HTTP) in Node.js/RoR

What's the best way to ping a list of 20 websites every 5 minutes (for example) in order to know if the site responds with HTTP 202 or not?

The no brainer idea is to save the 20 URLS in a database and just run the database and ping each one. However, what happen when one doesn't answers? What happens to the ones after that?

Also, is there better but no-brainer solution for this? I'm afraid the list can grow to 20000 websites and then there's not enough time to ping them all in the 5 minutes I need to be pinging.

Basically, I'm describing how PingDom, UptimeRobot, and the likes work.

I'm building this system using node.js and Ruby on Rails. I'm also inclined to use MongoDB to save the history of all the pings and monitoring results.

Suggestions?

Thanks a bunch!

like image 458
donald Avatar asked Jan 17 '11 01:01

donald


2 Answers

Github

I really like node.js and I would like to tackle this problem and hopefully soon share some code on github to achieve this. Keep in mind that I only have a veryy basic setup right now hosted at https://github.com/alfredwesterveld/freakinping

What's the best way to ping a list of 20 websites every 5 minutes (for example) in order to know if the site responds with HTTP 202 or not?

PING(ICMP)

First I would like to know if you want to really do a ping(ICMP) or if you just want to know if the website returns with code 200(OK) and measure the time it takes. I believe from the context that you don't really want to do a ping, but just an http request and measure the time. I ask this because(I believe) pinging from node.js/ruby/python can't be done from normal user because we need raw sockets(root user) to do the pinging(ICMP) from programming language. I for example found this ping script in python(I also believe I saw a simple ruby script somewhere although I am not a really big ruby programmer) but requires root access. I don't believe there is even yet a ping module out there for node.js.

Message Queue

Also, is there better but no-brainer solution for this? I'm afraid the list can grow to 20000 websites and then there's not enough time to ping them all in the 5 minutes I need to be pinging.

Basically, I'm describing how PingDom, UptimeRobot, and the likes work.

What you need to achieve this kind of scale is to use a message queue like for example redis, beanstalkd or gearmand. At the scale of PingDom one worker process is not going to cut it, but in your case it(I assume) one worker will do. I think(assume) redis will be the fastest message queue because of the C(node.js) extension but then again I should benchmark it against beanstalkd, which is another popular message queue(but does not yet have a C extension).

I'm afraid the list can grow to 20000 websites

If you get at that scale you might have to have host multiple boxes(a lot of worker threads/processes) to handle the load but you aren't at that scale yet and node.js is insane fast. It might even be able to handle that load with even one single box, although I don't know for sure(you need to do/run some benchmarks).

Datastore/Redis

I think this could be achieved pretty easily in node.js(I really like node.js). The way I would do this is use redis as my datastore because it is INSANE FAST!

PING: 20000 ops 46189.38 ops/sec 1/4/1.082
SET: 20000 ops 41237.11 ops/sec 0/6/1.210
GET: 20000 ops 39682.54 ops/sec 1/7/1.257
INCR: 20000 ops 40080.16 ops/sec 0/8/1.242
LPUSH: 20000 ops 41152.26 ops/sec 0/3/1.212
LRANGE (10 elements): 20000 ops 36563.07 ops/sec 1/8/1.363
LRANGE (100 elements): 20000 ops 21834.06 ops/sec 0/9/2.287

using node_redis(with hredis(node.js) c library). I would Add the URLs to redis using sadd.

Run tasks every 5 minutes

This could be achieved without barely any effort. I would use the setInterval(callback, delay, [arg], [...]) to repeatedly test response time of servers. Get all URLs on callback from redis using smembers. I would put all the URLs(messages) on the message queue using rpush.

Checking Response (Time)

However, what happen when one doesn't answers? What happens to the ones after that?

I might not completely understand this sentence but here it goes. If one fails it just fails. You could try to check response(time) again in 5 seconds or something to see if it is online. A precise algorithm for this should be devised. The ones after that should not have anything to do with previous URLs unless the are to the same server. Also something you clearly think about I guess because then you should not ping all those URLs to the same server at the same time but queue them up or something.

Processing URL

From the worker process(for now just one would be suffice) fetch message(URL) from redis using brpop command. check response time for URL(message) and fetch next URL(message) from the list. I would probably do a couple of request simultaneous to speed up the process.

like image 50
Alfred Avatar answered Nov 12 '22 18:11

Alfred


There is no "basic way", since you must handle a lot of use cases:

  • http redirects,
  • https pages,
  • request timeouts,
  • the cpu load of the server you use for pinging,
  • the type of report you need (availability? Uptime? Responsiveness? Downtime?)
  • how to aggregate qos measurements by time
  • lifetime of the data you collect (pinging dozens of targets every five minutes quickly produces a lot of data)
  • realtime alerts
  • etc.

Pingdom and the like are not "basic" tools, and if you want something similar you may want to pay for it or rely on an existing open-source alternative. I know it for sure because I built a remote monitoring application myself. It's called Uptime, it's written in Node.js and MongoDB, and it's hosted on GitHub (https://github.com/fzaninotto/uptime). It took several weeks of hard work to develop it, so believe me: it is NOT a no-brainer.

like image 44
François Zaninotto Avatar answered Nov 12 '22 20:11

François Zaninotto