I'd like to do something like:
SELECT e1.sender
FROM email as e1, email as e2
WHERE e1.sender = e2.receiver;
but in MongoDB. I found many forum threads about JOINs, which can be implemented via MapReduce in MongoDB, but I don't understand how to do it in this example with a self-join.
I was thinking about something like this:
var map1 = function() {
    var output = {
        sender: db.collectionSender.email,
        receiver: db.collectionReceiver.findOne({email: db.collectionSender.email}).email
    };
    emit(this.email, output);
};
var reduce1 = function(key, values) {
    var outs = {sender: null, receiver: null};
    values.forEach(function(v) {
        if (outs.sender == null) {
            outs.sender = v.sender;
        }
        if (outs.receiver == null) {
            outs.receiver = v.receiver;
        }
    });
    return outs;
};
db.email.mapReduce(map1, reduce1, {out: 'rec_send_email'});
(after first creating 2 new collections: collectionReceiver, containing only receiver emails, and collectionSender, containing only sender emails)
OR
var map2 = function() {
    var output = {
        sender: this.sender,
        receiver: db.email.findOne({receiver: this.sender})
    };
    emit(this.sender, output);
};
var reduce2 = function(key, values) {
    var outs = {sender: null, receiver: null};
    values.forEach(function(v) {
        if (outs.sender == null) {
            outs.sender = v.sender;
        }
        if (outs.receiver == null) {
            outs.receiver = v.receiver;
        }
    });
    return outs;
};
db.email.mapReduce(map2, reduce2, {out: 'rec_send_email'});
but neither of them works, and I don't understand MapReduce well. Could somebody explain it to me, please? I was inspired by this article: http://tebros.com/2011/07/using-mongodb-mapreduce-to-join-2-collections/ .
Additionally, I need to write it in Java. Is there a way to do this?
If you need to implement a "self-join" when using MongoDB then you may have structured your schema incorrectly (or sub-optimally).
In MongoDB (and NoSQL in general) the schema structure should reflect the queries you will need to run against it.
It looks like you are assuming a collection of emails where each document has one sender and one receiver, and now you want to find all senders who also happen to be receivers of email? The straightforward way to do this is with two simple queries rather than map/reduce, which would be far more complex and unnecessary here. The map functions as you've written them also wouldn't work, because you can't query the database from within a map function.
You are writing in Java, so why not make two queries: the first to get all unique senders, and the second to find all unique receivers who are also in the list of senders?
In the shell it would be:
// All unique sender addresses
var senderList = db.email.distinct("sender");
// Unique receiver addresses that also appear in the sender list
var receiverList = db.email.distinct("receiver", {"receiver": {$in: senderList}});
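To make it clearer what those two queries compute, here is a minimal sketch in plain JavaScript that mimics them on an in-memory array, with no MongoDB involved. The sample documents and the `distinct` helper are hypothetical stand-ins for the `email` collection and the shell's `distinct` call; only the names `senderList` and `receiverList` come from the shell example.

```javascript
// Hypothetical sample documents standing in for the `email` collection.
const emails = [
  { sender: "alice@example.com", receiver: "bob@example.com" },
  { sender: "bob@example.com",   receiver: "carol@example.com" },
  { sender: "carol@example.com", receiver: "alice@example.com" },
  { sender: "dave@example.com",  receiver: "alice@example.com" },
  { sender: "alice@example.com", receiver: "erin@example.com" }
];

// Hypothetical helper: unique values of `field` among the documents
// that pass the optional `filter` predicate.
function distinct(docs, field, filter) {
  const matching = filter ? docs.filter(filter) : docs;
  const values = matching.map(function (doc) { return doc[field]; });
  return Array.from(new Set(values));
}

// Step 1: every address that has sent at least one email.
const senderList = distinct(emails, "sender");

// Step 2: every receiver address that is also in senderList
// (this plays the role of the {$in: senderList} filter).
const senderSet = new Set(senderList);
const receiverList = distinct(emails, "receiver", function (doc) {
  return senderSet.has(doc.receiver);
});

console.log(senderList);
console.log(receiverList);
```

With this sample data, dave@example.com sends but never receives, and erin@example.com receives but never sends, so neither ends up in `receiverList`. Note that unlike the SQL self-join, which returns one row per matching pair, this approach returns each address once.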