Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Producer/consumer system using database (MySql), is this feasible?

I need to use something to coordinate my system with several consumers/producers each running on different machines with different operating systems. I have been researching on using MySql to do this, but it seems ridiculously difficult.

My requirements are simple: I want to be able to add or remove consumers/producers at any time and thus they should not depend on each other at all. Naturally a database would separate the two nicely.

I have been looking at Q4M message queuing plugin for MySql but it seems complicated to use.

I really need some input on how to construct my system best possible.

like image 452
johnrl Avatar asked Mar 15 '10 17:03

johnrl


3 Answers

I need to use something to coordinate my system with several consumers/producers each running on different machines with different operating systems

That's a message queue. Do not pursue other alternatives. Everything else (i.e., using a database with insert and deletes) is dreadfully slow and cumbersome.

Building a large, slow message queue with a database often turns out badly in practice because (1) databases are slow, (2) databases are huge and complex, (3) you have locking and contention issues that make each transaction potentially slow, (4) it's a lot more overhead than the problem deserves.

There are numerous message queue solutions.

If you can't make Q4M work, you should move on to another.

http://en.wikipedia.org/wiki/Message_queue

http://linux.die.net/man/7/mq_overview

http://qpid.apache.org/

http://code.google.com/p/httpsqs/

like image 55
S.Lott Avatar answered Nov 06 '22 14:11

S.Lott


It's actually (fairly) complicated to build such system. (I say fairly, because it's of course doable).

If you have multiple producer and one consumer, it's easy. All producer write concurrently, and the single consumer read data as soon as they are visible (committed).

But if you want scalability with several consumer, you will need to create a locking scheme that is not trivial. (You must ensure that no row gets dispatched to two consumers. This is not easy to achieve with database transactions and locks. Naive solutions lead to the serialization of all message delivery, like you had only one consumer, which we don't want.).

I would suggest to use a built-in solution. You can also read this question about a similar question.

like image 2
ewernli Avatar answered Nov 06 '22 14:11

ewernli


I think it's feasible without third-party software.

My first design would look like this:

  • The producer writes data to the database
  • To guarantee consistency it has to use transactions
  • The consumer process the data (read and delete) also using transactions.

Because of the transactions requirement InnoDB is the logical choice of the storage engine. Also you have to carefully choose the isolation level. My first guess is "serializable" to avoid phantom reads, but perhaps a weaker level also is possible.

If performance and scalability is an issue, you should consider using a "real" messaging solution. Rolling out your one will most likely lead to performance and/or scalability issues.

like image 1
Daniel Rikowski Avatar answered Nov 06 '22 14:11

Daniel Rikowski