I have a collection called 'my_emails' that stores email addresses:
[
{ email:"[email protected]"},
{ email:"[email protected]"},
{ email:"[email protected]"},
]
and I am trying to get the top 10 hostnames used, like this:
[
{host: "gmail.com", count: 1000},
{host: "yahoo.com", count: 989}, ...
]
In MySQL, I would run this query:
SELECT substr(email,locate('@',email)+1,255) AS host,count(1) AS count
FROM my_emails
WHERE email like '%@%'
GROUP BY substr(email,locate('@',email)+1,255)
ORDER BY count(1) DESC
LIMIT 10
How can I do this with MongoDB? I tried something like this, without success:
db.my_emails.aggregate([ { $group : {_id : "$host", count : { $sum : 1 }}}]);
I don't know how to compute the $host value without adding a new property to my records.
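Older versions of MongoDB's aggregation framework have no operator like MySQL's locate (string operators such as $indexOfCP and $split arrived in 3.4), but you can use mapReduce to do this: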
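db.my_emails.mapReduce(
    // map: emit the part of the address after '@' with a count of 1
    function() {
        emit(this.email.substr(this.email.indexOf('@') + 1), 1);
    },
    // reduce: add up the counts emitted for each host
    function(host, counts) {
        return Array.sum(counts);
    },
    {
        query: { email: /@/ },  // skip values without '@', like the SQL WHERE clause
        out: "hosts"
    }
)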
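The same map and reduce logic can be checked outside MongoDB with plain JavaScript. The following is only a sketch: the sample addresses are made up, and hostOf and topHosts are hypothetical helper names, not part of MongoDB.

```javascript
// Hypothetical sample documents; these addresses are made up for illustration.
const docs = [
  { email: "user1@gmail.com" },
  { email: "user2@yahoo.com" },
  { email: "user3@yahoo.com" },
  { email: "not-an-address" }, // no '@', skipped like the SQL WHERE clause
];

// "map" step: the host is everything after the first '@'.
function hostOf(email) {
  return email.substring(email.indexOf("@") + 1);
}

// "reduce" step: add 1 per document to the running count of its host,
// then sort by count (descending) and keep the first `limit` entries.
function topHosts(documents, limit) {
  const counts = new Map();
  for (const doc of documents) {
    if (typeof doc.email !== "string" || !doc.email.includes("@")) continue;
    const host = hostOf(doc.email);
    counts.set(host, (counts.get(host) || 0) + 1);
  }
  return [...counts.entries()]
    .map(([host, count]) => ({ host, count }))
    .sort((a, b) => b.count - a.count)
    .slice(0, limit);
}

console.log(topHosts(docs, 10));
```

This is convenient for trying edge cases (missing '@', non-string values) before running the job against the real collection.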
After the mapReduce call has written its output, db.hosts.find().sort({ value: -1 }).limit(10) returns the top 10 hostnames:
{ "_id" : "yahoo.com", "value" : 2 }
{ "_id" : "gmail.com", "value" : 1 }
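On MongoDB 3.4 or later, where the $split and $arrayElemAt operators are available, an aggregation pipeline along these lines should give the same counts without mapReduce (which is deprecated as of MongoDB 5.0). Treat it as a sketch: it assumes the email field is always a string containing a single '@', and the variable name pipeline is only for illustration.

```javascript
// Assumes MongoDB 3.4+ and that `email` is a string with a single '@'.
const pipeline = [
  // Keep only values that contain '@', like the SQL WHERE clause.
  { $match: { email: /@/ } },
  // Split on '@' and take the second piece as the host.
  { $project: { _id: 0, host: { $arrayElemAt: [{ $split: ["$email", "@"] }, 1] } } },
  // Count documents per host.
  { $group: { _id: "$host", count: { $sum: 1 } } },
  // Most frequent hosts first, top 10 only.
  { $sort: { count: -1 } },
  { $limit: 10 },
];

// In the mongo shell: db.my_emails.aggregate(pipeline)
```

Each result document has the shape { _id: <host>, count: <n> }, so the host ends up in _id rather than in a host field.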