How to communicate with thousands of sockets simultaneously in Java?

Tags: java, sockets

This is the problem description: we have thousands of devices (approximately 4,000 to 5,000) from which we have to read data continuously, every 2 minutes or every 30 seconds. Each device has its own unique IP address. The data will be collected and then stored in a database. The devices are at hundreds of locations around the country. The data will not be read 24x7, but for at least 12 hours.

There is a web application that will, at some point, request the data being collected from these devices in order to display it. We will know which device's data is being requested.

This is how we think we can implement it in Java:

Solution A: In each location, designate one machine that acts as a server and reads data from x devices. This data is pushed to the central server every hour. On the designated machine, data is pulled and stored locally (in a flat file or an in-memory database).

In this case we will have as many servers as there are locations. For example, we might end up with 1,500 servers/machines, and managing them becomes a nightmare.

Solution B:

We have 8-10 central servers, and each server reads data from a group of devices. The data gets queued up and is picked up in the order in which it arrived.

The servers push the data to the database.

How does the client get the data?

In solution B, the client gets it from the database, assuming the data has already been pushed into the database and is no longer queued up.

Which do you think would work better?

Is there any alternative design/solution?

Should we think about programming the server with Unix/Perl? We do not want to use C++ for other reasons.

vsingh asked Oct 06 '10 20:10


3 Answers

The requirements stated in your question do not imply thousands of concurrent connections, as you can easily establish each connection anew every 30 seconds. Assuming a connection can be disposed of within 500 ms, that leaves 5000 / 30 * 0.5 ≈ 83, or roughly 100, concurrent connections. Any decent OS should be able to handle that many. With such low concurrency, you can even get away with using a single server where each connection is handled by a dedicated thread.
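The estimate above is an application of Little's law (average concurrency = arrival rate * time each connection is held). A minimal sketch of that arithmetic follows; the class and method names are made up for illustration, and the 0.5 s hold time is the assumption stated above, not a measured value.

```java
/** Back-of-the-envelope estimate of concurrent connections (hypothetical helper). */
class ConcurrencyEstimate {

    /**
     * @param devices         number of devices polled once per interval
     * @param intervalSeconds polling interval in seconds
     * @param holdSeconds     assumed time one connection stays open, in seconds
     * @return average number of connections open at the same time
     */
    static double averageConcurrent(int devices, double intervalSeconds, double holdSeconds) {
        double connectionsPerSecond = devices / intervalSeconds;
        return connectionsPerSecond * holdSeconds;
    }

    public static void main(String[] args) {
        // 5000 devices, 0.5 s per connection (assumed), at both polling intervals
        System.out.println("30 s interval:  " + averageConcurrent(5000, 30, 0.5));
        System.out.println("120 s interval: " + averageConcurrent(5000, 120, 0.5));
    }
}
```

With the 2-minute interval the same formula gives about a quarter of the 30-second figure, so the 30-second case is the one to size for.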

Your design should therefore focus on your other requirements. A few ideas:

  • Are the devices firewalled? With solution A you will have outgoing connections from each location; with solution B you will have incoming ones.
  • What kind of reliability do you need? For instance, do you need to record measurements if a location's internet connection is down? That would imply a local server buffering the measurements.
meriton answered Oct 31 '22 17:10


If you maintain the connections, you should be able to poll each connection in under 20 microseconds. This means you could poll all 5000 connections in under 100 ms using just one non-blocking thread. (This is perhaps the least efficient way to do it.)

Using a Selector is a better approach, as it gives you a Set of the connections that are ready.
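As a rough sketch of the Selector approach, assuming the devices accept a plain TCP connection and simply send bytes: the class name `SelectorPoller`, the device ids, and the buffer size are all hypothetical, and real code would parse the device protocol where the comment indicates.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Keeps one long-lived connection per device and reads from whichever are ready. */
class SelectorPoller implements AutoCloseable {
    private final Selector selector;
    private final ByteBuffer buffer = ByteBuffer.allocate(4096);

    SelectorPoller() throws IOException {
        selector = Selector.open();
    }

    /** Connects to one device and registers the channel for read events. */
    void addDevice(String deviceId, InetSocketAddress address) throws IOException {
        SocketChannel channel = SocketChannel.open(address); // blocking connect, for brevity
        channel.configureBlocking(false);
        channel.register(selector, SelectionKey.OP_READ, deviceId);
    }

    /** Waits up to timeoutMillis, then returns bytes read per ready device. */
    Map<String, Integer> pollOnce(long timeoutMillis) throws IOException {
        Map<String, Integer> bytesByDevice = new HashMap<>();
        if (selector.select(timeoutMillis) == 0) {
            return bytesByDevice; // nothing became ready within the timeout
        }
        Iterator<SelectionKey> it = selector.selectedKeys().iterator();
        while (it.hasNext()) {
            SelectionKey key = it.next();
            it.remove(); // the selected-key set must be cleared by hand
            if (!key.isValid() || !key.isReadable()) {
                continue;
            }
            SocketChannel channel = (SocketChannel) key.channel();
            buffer.clear();
            int n = channel.read(buffer);
            if (n < 0) { // device closed the connection
                key.cancel();
                channel.close();
                continue;
            }
            // A real implementation would parse the bytes in 'buffer' here.
            bytesByDevice.merge((String) key.attachment(), n, Integer::sum);
        }
        return bytesByDevice;
    }

    @Override
    public void close() throws IOException {
        for (SelectionKey key : new ArrayList<>(selector.keys())) {
            key.channel().close();
        }
        selector.close();
    }
}
```

A single thread would call `pollOnce` in a loop; only the channels that actually have data are touched on each pass.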

If you create a new connection each time, this is far more expensive and can take 20 milliseconds (or longer, depending on the latency of your network). To poll 5000 connections in 30 seconds at 20 ms each (100 seconds of connection time in total), you would need to keep 3-4 connections active at any time. (Most of the time would be spent establishing and destroying the connections.) You can do all this with one thread, but using a small thread pool might be simpler.
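A minimal sketch of the small-thread-pool variant, assuming each device sends its reading as soon as a TCP connection is opened: `PooledPoller`, the timeout, and the buffer size are illustrative assumptions, not part of any device protocol.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Polls devices with short-lived connections worked by a fixed thread pool. */
class PooledPoller {

    /** Connects, performs a single read of up to maxBytes, and closes the socket. */
    static byte[] readOnce(InetSocketAddress address, int timeoutMillis, int maxBytes)
            throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(address, timeoutMillis);
            socket.setSoTimeout(timeoutMillis); // bound the wait for data as well
            InputStream in = socket.getInputStream();
            byte[] buf = new byte[maxBytes];
            int n = in.read(buf);
            return n <= 0 ? new byte[0] : Arrays.copyOf(buf, n);
        }
    }

    /**
     * Polls every device once. The result list is in the same order as 'devices';
     * an entry is null if that device could not be read.
     */
    static List<byte[]> pollAll(List<InetSocketAddress> devices, int threads)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<byte[]>> futures = new ArrayList<>();
            for (InetSocketAddress device : devices) {
                Callable<byte[]> task = () -> readOnce(device, 2000, 1024);
                futures.add(pool.submit(task));
            }
            List<byte[]> results = new ArrayList<>();
            for (Future<byte[]> future : futures) {
                try {
                    results.add(future.get());
                } catch (ExecutionException e) {
                    results.add(null); // unreachable device or timeout
                }
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

With the figures above, a pool of about 4 threads would cover the estimate; a slightly larger pool leaves headroom for slow or unreachable devices, since each failed connection can hold a thread for the full timeout.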

Peter Lawrey answered Oct 31 '22 17:10


Try Netty.

duffymo answered Oct 31 '22 19:10