
How can I create a queue with multiple workers?

Tags:

firebase

I want to create a queue where clients can put in requests, then server worker threads can pull them out as they have resources available.

I'm exploring how I could do this with a Firebase repository, rather than an external queue service that would then have to inject data back into Firebase.

With security and validation in mind, here is a simple example of the flow I'm considering:

  • user pushes a request into a "queue" bucket
  • a server pulls out the request and deletes it (how do I ensure only one server gets the request?)
  • server validates data and retrieves from a private bucket (or injects new data)
  • server pushes data and/or errors back to the user's bucket
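
The handoff above can be modeled in plain JavaScript to make it concrete (no Firebase dependency; the bucket names, the password check, and the token format are all illustrative assumptions, not a real security scheme):

```javascript
// In-memory model of the queue flow: a user pushes a request into a public
// queue, a worker claims and deletes it, validates against private data only
// the server can see, and writes the result back to the user's bucket.
const queue = [];                                // public "queue" bucket
const privateBuckets = {                         // server-only private data
  alice: { password: 's3cret', inbox: [] }
};

function pushRequest(user, payload) {
  queue.push({ user, payload });                 // step 1: user pushes a request
}

function workerStep() {
  const req = queue.shift();                     // step 2: claim + delete in one step
  if (!req) return false;
  const bucket = privateBuckets[req.user];       // step 3: validate against private bucket
  const ok = bucket && req.payload.password === bucket.password;
  bucket.inbox.push(ok                           // step 4: push data or errors back
    ? { token: 'tok-' + req.user }
    : { error: 'invalid credentials' });
  return true;
}

pushRequest('alice', { password: 's3cret' });
workerStep();
```

In a real Firebase app, `queue` and `privateBuckets` would be locations in the tree with security rules restricting who can read and write each one; the single-worker claim problem (step 2) is exactly the open question below.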


A simplified example of where this might be useful would be authentication:

  • user puts authentication request into the public queue
  • his login/password goes into his private bucket (a place only he can read/write into)
  • a server picks up the authentication request, retrieves login/password, and validates against the private bucket only the server can access
  • the server pushes a token into user's private bucket

(certainly there are still some security loopholes in a public queue; I'm just exploring at this point)

Some other examples for usage:

  • read-only status queue (user status is communicated via the private bucket; the server writes it to a public bucket which is read-only for the public)
  • message queue (messages are sent via user, server decides which discussion buckets they get dropped into)

So the questions are:

  1. Is this a good design that will integrate well into the upcoming security plans? What are some alternative approaches being explored?
  2. How do I get all the servers to listen to the queue, but only one to pick up each request?
Asked Jun 28 '12 by Kato

2 Answers

Wow, great question. This is a usage pattern that we've discussed internally so we'd love to hear about your experience implementing it ([email protected]). Here are some thoughts on your questions:

Authentication

If your primary goal is actually authentication, just wait for our security features. :-) In particular, we're intending to support auth backed by your own backend server, by a Firebase user store, or by 3rd-party providers (Facebook, Twitter, etc.).

Load-balanced Work Queue

Regardless of auth, there's still an interesting use case for using Firebase as the backbone for some sort of workload balancing system like you describe. For that, there are a couple approaches you could take:

  1. As you describe, have a single work queue that all of your servers watch and remove items from. You can accomplish this using transaction() to remove items; transaction() deals with conflicts so that only one server's transaction will succeed. If one server beats a second server to a work item, the second server can abort its transaction and try again on the next item in the queue. This approach is nice because it scales automatically as you add and remove servers, but each transaction attempt costs a round-trip to the Firebase servers to make sure nobody else has already grabbed the item. If the time it takes to process a work item is much greater than the time for that round-trip, the overhead probably isn't a big deal; if you have lots of servers (i.e., more contention) and/or lots of small work items, the overhead may be a killer.
  2. Push the load-balancing to the client by having them choose randomly among a number of work queues. (e.g. have /queue/0, /queue/1, /queue/2, /queue/3, and have the client randomly choose one). Then each server can monitor one work queue and own all of the processing. In general, this will have the least overhead, but it doesn't scale as seamlessly when you add/remove servers (you'll probably need to keep a separate list of work queues that servers update when they come online, and then have clients monitor the list so they know how many queues there are to choose from, etc.).

Personally, I'd lean toward option #2 if you want optimal performance. But #1 might be easier for prototyping and be fine at least initially.
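
A minimal sketch of the client-side choice in approach #2 (the `/queue/N` path format is an assumption, and `numQueues` would come from the list of live queues the servers maintain):

```javascript
// Client-side load balancing: each client picks one of N queue paths at
// random; each server watches exactly one path and owns all its items.
function pickQueuePath(numQueues) {
  const i = Math.floor(Math.random() * numQueues);
  return '/queue/' + i;
}
```

Since no two servers watch the same queue, no transaction is needed to claim items, which is where the overhead savings come from.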

In general, your design is definitely on the right track. If you experiment with implementation and run into problems or have suggestions for our API, let us know ([email protected] :-)!

Answered Nov 10 '22 by Michael Lehenbauer


This question is pretty old but in case someone makes it here anyway...

Since mid-2015, Firebase has offered something called Firebase Queue, a fault-tolerant multi-worker job pipeline built on Firebase.

Q: Is this a good design that will integrate well into the upcoming security plans?

A: Your design suggestion fits perfectly with Firebase Queue.

Q: How do I get all the servers to listen to the queue, but only one to pick up each request?

A: Well, that is pretty much what Firebase Queue does for you!
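
As a rough sketch of what a worker looks like: firebase-queue's documented shape is `new Queue(ref, function(data, progress, resolve, reject) { ... })`, where the library handles claiming each task for exactly one worker. The handler below is illustrative (the `request` payload field is an assumption) and kept pure so it can run without a Firebase connection:

```javascript
// Hypothetical task handler in firebase-queue's handler signature.
function handleTask(data, progress, resolve, reject) {
  if (!data || typeof data.request !== 'string') {
    reject('malformed task');            // task is moved to the error state
    return;
  }
  progress(50);                          // optional progress reporting
  resolve({ handled: data.request });    // task completes; result written back
}

// Exercising the handler directly with stub callbacks:
handleTask({ request: 'login' }, function () {},
  function (result) { console.log('resolved:', result.handled); },
  function (err) { console.log('rejected:', err); });

// Wiring it up for real (requires the firebase and firebase-queue packages):
// var Queue = require('firebase-queue');
// var queue = new Queue(ref, handleTask);
```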

References:

  • Introducing Firebase Queue (blog entry)
  • Firebase Queue (official GitHub-repo)

Answered Nov 10 '22 by wassgren