Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there a way to simulate python random.shuffle of a queue using a pseudorandom sequence or hash function?

I'm building an application based around a task queue: it serves a series of tasks to multiple, asynchronously connected clients. The twist is that the tasks must be served in a random order.

My problem is that the algorithm I'm using now is computationally expensive, because it relies on many large queries and transfers from the database. I have a strong hunch that there's a cheaper way to achieve the same result, but I can't quite see the solution. Can you think of a clever fix for this problem?

Here's the (computationally expensive) algorithm I'm using now:

When the client queries for a new task...

  1. Query the database for "unfinished" tasks
  2. Put all tasks in a list
  3. Shuffle the list (using random.shuffle)
  4. Flag the first task as "in progress"
  5. Send the task parameters to the client for completion

When the client finishes the task...

6a. Record the result and flag the task as "finished."

If the client fails to finish the task by some deadline...

6b. Re-flag the task as "unfinished."

Seems like we could do better by replacing steps 1, 2, and 3, with pseudorandom sequences or hash functions. But I can't quite figure out the whole solution. Ideas?

Other considerations:

  • In case it's important, I'm using python and mongodb for all of this. (Mongodb doesn't have some clever "use find_one to efficiently return a random matching entry" usage, does it?)
  • The term "queue" is a little misleading. All the tasks are stored in subfields of a single collection within the mongodb. The length (total number of tasks) in the collection is known and fixed at the outset.
  • If it's necessary, it might be okay to let the same task be assigned multiple times, as long as the occurrence is rare. But instances of this kind would need to be very rare, because completing each task is costly.
  • I have identifying information on each client, so we know exactly who originates each task request.
like image 922
Abe Avatar asked Nov 13 '22 01:11

Abe


1 Answers

There is an easy way to get a random document from MongoDB!

See Random record from MongoDB

If you don't want a task to be picked twice, you could mark the task as active and not select it.

like image 69
iblue Avatar answered Dec 06 '22 21:12

iblue