 

RESTful web services vs Socket Programming for a Data Intensive Application

I'm building a web application with Ruby on Rails which needs to be highly scalable. In this application, a mobile client produces approximately 20 bytes of data every second. All of this data must be transferred to a server at some point, preferably as soon as possible.

To accomplish this task, I want the server to act as a RESTful service. The client could buffer locations (say, for 5 to 30 seconds) and then send them off in an HTTP PUT request, and the server would then store them. I believe this model is simpler to implement and handles high-volume traffic better, as the clients could keep buffering data until they hear a response from the server.
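
To make the buffering idea concrete, here is a rough sketch of what I have in mind, written in Python for brevity rather than in the actual client language. Everything in it is an assumption for illustration: `BufferedUploader`, `send_batch` and `max_batch` are hypothetical names, and `send_batch` stands in for whatever performs the HTTP PUT and reports whether the server acknowledged it.

```python
from collections import deque
from typing import Callable, Deque, List


class BufferedUploader:
    """Buffers samples and drops them only after the server confirms receipt."""

    def __init__(self, send_batch: Callable[[List[bytes]], bool], max_batch: int = 30):
        # send_batch is a hypothetical transport hook (e.g. an HTTP PUT).
        # It should return True only when the server acknowledged the batch.
        self._send_batch = send_batch
        self._max_batch = max_batch
        self._pending: Deque[bytes] = deque()

    def add(self, sample: bytes) -> None:
        """Queue one sample (about 20 bytes in this scenario)."""
        self._pending.append(sample)

    def flush(self) -> int:
        """Try to upload up to max_batch samples; return how many were acknowledged."""
        batch = list(self._pending)[: self._max_batch]
        if not batch:
            return 0
        try:
            ok = self._send_batch(batch)
        except OSError:
            ok = False  # network failure: keep the samples for the next attempt
        if not ok:
            return 0
        # Only now is it safe to forget the uploaded samples.
        for _ in batch:
            self._pending.popleft()
        return len(batch)

    def pending(self) -> int:
        """Number of samples still waiting to be acknowledged."""
        return len(self._pending)
```

The client would call `add` once a second and `flush` on a timer. If a flush fails, nothing is lost on the client and the next flush simply retries.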

My boss, on the other hand, wants to implement the server using socket programming. He believes socket programming will result in less data being transferred, which will increase the total efficiency of the system. I can't disagree on this point, but I think that, given modern bandwidth, the extra overhead of HTTP is worth it. Plus, I think trying to maintain thousands (or millions) of simultaneous connections with users will cause its own problems and greatly increase the complexity of the server.

Honestly, I don't know the right approach to this problem, so I thought I'd post it here and get the opinions of much smarter people than myself. I'd appreciate it if any answers included the pros and cons of the proposed solution.

Thanks.

Update

We now have a few additional requirements fleshed out. First, the mobile client cannot upload more than 5 GB of data per month. In this case, we're talking about one message a second, eight hours a day, over the course of a month. Second, we want to combine messages as little as possible. This is to ensure that if something happens to the mobile client (say, a car crash), we lose as little data as possible.

LandonSchropp asked May 20 '11 04:05





3 Answers

Your boss appears to be optimizing prematurely, which is not really a good idea.

Instead of trying to fight an imaginary performance bogeyman before you've even started writing your code, you should examine your application's requirements and design to them. Don't let perceived problems drive your design.

If it comes to it, have your boss outline exactly how he'd marshal data across his socket connection and then do some quick calculations to see if you could match or beat them with HTTP. Will he use something like Google's Protocol Buffers, or write his own marshaling protocol? If so, will it be self-describing? How about application "verbs" like what you'd get for free in HTTP? Will his connections be persistent? There's a lot more to "sockets" than just opening a connection and spewing bytes down it.
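
As a starting point for those quick calculations, here is a back-of-the-envelope sketch in Python. The 20-byte message size, the one-message-per-second rate, the eight hours a day and the 5 GB cap come from the question. Everything else is an assumption you should replace with measured values: a 30-day month, roughly 400 bytes of HTTP request and response headers per request, and a 4-byte frame header for a hypothetical custom socket protocol. TCP/IP and TLS overhead are ignored for both options.

```python
MESSAGE_BYTES = 20                 # from the question
SECONDS_PER_DAY = 8 * 3600         # one message a second, eight hours a day
DAYS_PER_MONTH = 30                # assumption
HTTP_OVERHEAD_BYTES = 400          # assumption: headers per request/response pair
SOCKET_OVERHEAD_BYTES = 4          # assumption: length prefix of a custom protocol
CAP_BYTES = 5 * 10**9              # the 5 GB monthly limit, in decimal gigabytes


def monthly_bytes(overhead_per_request: int, messages_per_request: int = 1) -> int:
    """Estimate bytes uploaded per month for a given per-request overhead."""
    messages = SECONDS_PER_DAY * DAYS_PER_MONTH
    requests = -(-messages // messages_per_request)  # ceiling division
    return messages * MESSAGE_BYTES + requests * overhead_per_request


scenarios = {
    "HTTP, one message per request": monthly_bytes(HTTP_OVERHEAD_BYTES),
    "HTTP, 30 messages per request": monthly_bytes(HTTP_OVERHEAD_BYTES, 30),
    "custom socket framing": monthly_bytes(SOCKET_OVERHEAD_BYTES),
}

for name, total in scenarios.items():
    share = 100 * total / CAP_BYTES
    print(f"{name}: {total / 1e6:.1f} MB per month ({share:.2f}% of the cap)")
```

Under these assumptions, even one HTTP request per message comes to roughly 363 MB a month, well below the 5 GB cap. The socket approach does transfer less, but the difference may not matter for this workload. Real header sizes vary a lot, so measure before drawing conclusions.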

You've also correctly noted that your boss seems to be favoring the raw speed of sockets over everything else: scalability, maintainability, availability of development and testing tools, protocol sniffers, the helpful semantics of the HTTP verbs, and so on. HTTP is well understood by load balancers, firewalls and the like. Your proprietary socket protocol will not be so lucky.

What I'd suggest is you look into all the options out there and evaluate them from a performance perspective through testing, prototyping and benchmarking. Then weigh those numbers against the difficulty of building and maintaining the application with that technology.

Brian Kelly answered Oct 19 '22 02:10



Stick to HTTP.

It's far easier to create a farm of HTTP servers and put them behind a load balancer than to try to do the same thing with your own protocol. Why? Everything already exists for HTTP.

Update

What you need to reimplement yourself:

  • Buffer management (important if your load is high)
  • Making sure that you've received an entire message (A simple Receive/BeginReceive is not enough)
  • Asynchronous socket handling
  • Authentication
  • A load balancer (this part is tricky and you need to design it carefully)
  • Your own protocol (you need a way to identify when you've received an entire message)
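
To give a feel for just one of those items, here is a minimal sketch of message framing, in Python for brevity. It assumes a hypothetical wire format where every message is prefixed with a 4-byte big-endian length. The names `encode_frame` and `FrameDecoder` are made up for illustration, and a real protocol would also need size limits, versioning and error handling.

```python
import struct
from typing import List

# Assumed wire format: 4-byte big-endian length prefix, then the payload.
HEADER = struct.Struct("!I")


def encode_frame(payload: bytes) -> bytes:
    """Prefix a payload with its length so the receiver can find its end."""
    return HEADER.pack(len(payload)) + payload


class FrameDecoder:
    """Reassembles complete messages from arbitrarily split socket reads."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> List[bytes]:
        """Add received bytes and return every message completed so far."""
        self._buffer.extend(chunk)
        messages: List[bytes] = []
        while True:
            if len(self._buffer) < HEADER.size:
                break  # the length prefix itself has not fully arrived
            (length,) = HEADER.unpack_from(self._buffer, 0)
            end = HEADER.size + length
            if len(self._buffer) < end:
                break  # the payload is still incomplete
            messages.append(bytes(self._buffer[HEADER.size:end]))
            del self._buffer[:end]  # drop the consumed frame
        return messages
```

A single receive call can return half a message or several messages at once, which is why the decoder keeps leftover bytes between calls. With HTTP, the server framework already does this for you.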

If you use ASP.NET MVC + JSON (the steps for Merb or Rails are similar):

  1. Create a new website
  2. Activate digest authentication in IIS
  3. Create a new controller, tag it with the [Authorize] attribute
  4. Add an action

Which is cheaper: another server, or having you spend a month on something that has already been done?

jgauffin answered Oct 19 '22 02:10



HTTP was designed to scale based on the assumption that the vast majority of requests are GETs. It sounds like most of your interactions are the client sending data to the server. I think it is quite probable that there exists a better architectural style than REST to achieve what you are trying to do.

The question is: can you afford the time to start from scratch, or is HTTP good enough for your needs? Without knowing more details about your app, I think it is difficult to give good advice.

Darrel Miller answered Oct 19 '22 02:10
