I have a number of MongoDB collections that receive JSON documents from various streaming sources. In other words, there are a number of processes continually inserting data into a set of MongoDB collections.
I need a way to stream the data out of MongoDB into downstream applications, so I want a system that conceptually looks like this:
App Stream1 -->
App Stream2 --> MONGODB ---> Aggregated Stream
App Stream3 -->
OR this:
App Stream1 -->         ---> MongoD Stream1
App Stream2 --> MONGODB ---> MongoD Stream2
App Stream3 -->         ---> MongoD Stream3
The question is how do I stream data out of Mongo without having to continually poll/query the database?
The obvious answer would be "why don't you change those app streaming processes to send messages to a queue like RabbitMQ, ZeroMQ, or ActiveMQ, which then sends them to both your Mongo streaming processes and to Mongo at once, like this":
                  MONGODB
                    /|\
                     |
App Stream1 -->      |      ---> MongoD Stream1
App Stream2 --> SomeMQqueue ---> MongoD Stream2
App Stream3 -->             ---> MongoD Stream3
In an ideal world, yes, that would be good, but we need Mongo to ensure that messages are saved first, to avoid duplicates and to ensure that all IDs are generated, etc. Mongo has to sit in the middle as the persistence layer.
So how do I stream messages out of a Mongo collection (not using GridFS, etc.) into these downstream apps? The basic school of thought has been to simply poll for new documents and, for each document collected, update it by adding another field to the JSON document stored in the database, much like a processed flag in a SQL table that stores a processed timestamp. I.e. every second, poll for documents where processed == null, add processed = now(), and update the document.
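In Java that baseline looks roughly like this (a minimal sketch, assuming the modern MongoDB sync Java driver; the connection string, database, and collection names are placeholders):

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import org.bson.Document;
import java.util.Date;

import static com.mongodb.client.model.Filters.eq;
import static com.mongodb.client.model.Updates.set;

public class PollingConsumer {
    public static void main(String[] args) throws InterruptedException {
        MongoClient client = MongoClients.create("mongodb://localhost:27017");
        MongoCollection<Document> coll =
                client.getDatabase("mydb").getCollection("events");

        while (true) {
            // Grab everything that has not been handed downstream yet;
            // {processed: null} also matches documents missing the field
            for (Document doc : coll.find(eq("processed", null))) {
                System.out.println("streaming out: " + doc.toJson());
                // Stamp the document so the next poll skips it
                coll.updateOne(eq("_id", doc.get("_id")),
                               set("processed", new Date()));
            }
            Thread.sleep(1000); // poll once per second
        }
    }
}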
Is there a neater/more computationally efficient method?
FYI - These are all Java processes.
Cheers!
MongoDB Change Streams track real-time data changes across a database, a collection, or an entire deployment, allowing you to react to those changes immediately. They give you the power to track changes without having to continuously monitor the operations log (oplog).
Change streams turn a MongoDB database into a real-time database by taking advantage of MongoDB's replication process. They monitor replication, providing an API for external applications that require real-time data without the risk involved in tailing the oplog or the overhead that comes with polling.
Starting in MongoDB 4.0, you can open a change stream cursor for a deployment (either a replica set or a sharded cluster) to watch for changes to all non-system collections across all databases except for admin, local, and config.
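A rough sketch of consuming a change stream from Java via the sync driver's watch() API (the connection string, database, and collection names are placeholders, and the deployment is assumed to be a replica set):

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoCursor;
import com.mongodb.client.model.changestream.ChangeStreamDocument;
import org.bson.Document;

public class ChangeStreamConsumer {
    public static void main(String[] args) {
        // Change streams require a replica set or sharded cluster;
        // even a single-node replica set will do
        MongoClient client = MongoClients.create("mongodb://localhost:27017");
        MongoCollection<Document> coll =
                client.getDatabase("mydb").getCollection("events");

        // watch() blocks on the cursor and delivers each change as it is
        // committed - no polling and no manual oplog tailing
        try (MongoCursor<ChangeStreamDocument<Document>> cursor = coll.watch().iterator()) {
            while (cursor.hasNext()) {
                ChangeStreamDocument<Document> change = cursor.next();
                System.out.println(change.getOperationType() + ": " + change.getFullDocument());
            }
        }
    }
}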
If you are writing to a capped collection (or collections), you can use a tailable cursor to push new data onto the stream, or onto a message queue from which it can be streamed out. However, this will not work for a non-capped collection.
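A minimal sketch of tailing a capped collection from Java, assuming a capped collection named events_capped already exists (all names here are placeholders):

import com.mongodb.CursorType;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoCursor;
import org.bson.Document;

public class TailableConsumer {
    public static void main(String[] args) {
        MongoClient client = MongoClients.create("mongodb://localhost:27017");
        // The collection must be capped, e.g. created with
        // db.createCollection("events_capped", {capped: true, size: 1048576})
        MongoCollection<Document> coll =
                client.getDatabase("mydb").getCollection("events_capped");

        // TailableAwait keeps the cursor open; the server waits briefly for
        // new documents instead of closing the cursor at end of data
        try (MongoCursor<Document> cursor =
                 coll.find().cursorType(CursorType.TailableAwait).iterator()) {
            while (cursor.hasNext()) {
                System.out.println("new document: " + cursor.next().toJson());
            }
        }
    }
}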