Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Map reduce in RavenDb over 2 collections with child collection

I have 2 different object types stored in RavenDb, which are a parent/child type relationship, like this in JSON:

Account/1
{        
    "Name": "Acc1",
}

Items/1
{
    "Account": "Account/1",
    "Value" : "100",
    "Tags": [
       "tag1",
       "tag2"]
}

Items/2
{
    "Account": "Account/1",
    "Value" : "50",
    "Tags": [
       "tag2"]
}

Note that I don't want to store these in the same document, as an account may have thousands of items.

I am trying to write a map/reduce index that will return me something like:

{
    "Account": "Acc1",
    "TagInfo": [
        { "TagName" : "tag1",
          "Count" : "1",  //Count of all the "tag1" occurrences for acc1
          "Value" : "100" //Sum of all the Values for acc1 which are tagged 'tag1'
        },
        { "TagName" : "tag2",
          "Count" : "2",  //Two items are tagged "tag2"
          "Value" : "150"
        }]
}

i.e. a list of all the distinct tag names along with the number of each and their value.

I think I need to use a multi-map to map the Account and Items collections together, but I can't figure out the reduce part to create the "TagInfo" part of the result.

Is this possible, or am I modelling this all wrong in Raven?

EDIT:

The class I want to retrieve from this query would look something like this:

public class QueryResult
{
    public string AccountId {get;set;}
    public TagInfo Tags {get;set;} 
}

public class TagInfo
{
    public string TagName {get;set;}
    public int Count {get;set;}
    public int TotalSum {get;set;}
}
like image 264
Simon Avatar asked May 02 '12 17:05

Simon


2 Answers

You can't use a Multi Map/Reduce index for that because you want one map on the tags and the other on the account. They don't have a common property, so you can't have a multi maps/reduce here.

However, you can use TransformResult instead. Here's how to do it:

public class Account
{
    public string Id { get; set; }
    public string Name { get; set; }
}

public class Item
{
    public string Id { get; set; }
    public string AccountId { get; set; }
    public int Value { get; set; }
    public List<string> Tags { get; set; }
}

public class TagsWithCountAndValues : AbstractIndexCreationTask<Item, TagsWithCountAndValues.ReduceResult>
{
    public class ReduceResult
    {
        public string AccountId { get; set; }
        public string AccountName { get; set; }
        public string Tag { get; set; }
        public int Count { get; set; }
        public int TotalSum { get; set; }
    }

    public TagsWithCountAndValues()
    {
        Map = items => from item in items
                        from tag in item.Tags
                        select new
                        {
                            AccountId = item.AccountId,
                            Tag = tag,
                            Count = 1,
                            TotalSum = item.Value
                        };
        Reduce = results => from result in results
                            group result by result.Tag
                            into g
                            select new
                            {
                                AccountId = g.Select(x => x.AccountId).FirstOrDefault(),
                                Tag = g.Key,
                                Count = g.Sum(x => x.Count),
                                TotalSum = g.Sum(x => x.TotalSum)
                            };
        TransformResults = (database, results) => from result in results
                                                    let account = database.Load<Account>(result.AccountId)
                                                    select new
                                                    {
                                                        AccountId = result.AccountId,
                                                        AccountName = account.Name,
                                                        Tag = result.Tag,
                                                        Count = result.Count,
                                                        TotalSum = result.TotalSum
                                                    };
    }
}

Later then, you can query like this:

var results = session.Query<TagsWithCountAndValues.ReduceResult, TagsWithCountAndValues>()
    .Where(x => x.AccountId == "accounts/1")                        
    .ToList();
like image 185
Daniel Lang Avatar answered Sep 21 '22 02:09

Daniel Lang


OK, so I figured out a way to do this in an acceptable manner that builds on Daniel's answer, so I'll record it here for any future travellers (probably myself!).

I changed from trying to return one result per account, to one result per account/tag combination, so the index had to change as follows (note the group by in the reduce is on 2 properties):

public class TagsWithCountAndValues : AbstractIndexCreationTask<Item, TagsWithCountAndValues.ReduceResult>
{
    public class ReduceResult
    {
        public string AccountId { get; set; }
        public string AccountName { get; set; }
        public string TagName { get; set; }
        public int TagCount { get; set; }
        public int TagValue { get; set; }
    }

    public TagsWithCountAndValues()
    {
        Map = items => from item in items
                       from tag in item.Tags
                       select new ReduceResult
                       {
                           AccountId = item.AccountId,
                           TagName = tag,
                           TagCount = 1,
                           TagValue = item.Value
                       };

        Reduce = results => from result in results
                            where result.TagName != null
                            group result by new {result.AccountId, result.TagName}
                            into g
                            select new ReduceResult
                                       {
                                           AccountId = g.Key.AccountId,
                                           TagName = g.Key.TagName,
                                           TagCount = g.Sum(x => x.TagCount),
                                           TagValue = g.Sum(x => x.TagValue),
                                       };

        TransformResults = (database, results) => from result in results
                                                  let account = database.Load<Account>(result.AccountId)
                                                  select new ReduceResult
                                                             {
                                                                 AccountId = result.AccountId,
                                                                 AccountName = account.Name,
                                                                 TagName = result.TagName,
                                                                 TagCount = result.TagCount,
                                                                 TagValue = result.TagValue,
                                                             };
    }
}

As before, querying this is just:

var results = session
    .Query<TagsWithCountAndValues.ReduceResult, TagsWithCountAndValues>()
    .ToList();

The result of this can then be transformed into the object I originally wanted by an in-memory LINQ query. At this point the number of results that could be returned would be relatively small, so performing this at the client end is easily acceptable. The LINQ statement is:

var hierachicalResult = from result in results
                        group new {result.TagName, result.TagValue} by result.AccountName
                        into g
                        select new
                        {
                            Account = g.Key,
                            TagInfo = g.Select(x => new { x.TagName, x.TagValue, x.TagCount })
                        };

Which gives us one object per account, with a child list of TagInfo objects - one for each unique tag.

like image 22
Simon Avatar answered Sep 20 '22 02:09

Simon