
How do I retrieve multiple types of entities using a single query to Azure Table Storage?

I'm trying to understand how Azure Table Storage works so I can build Facebook-style feeds, and I'm stuck on how to retrieve the entries.

(My question is almost the same as https://stackoverflow.com/questions/6843689/retrieve-multiple-type-of-entities-from-azure-table-storage, but the link in the answer is broken.)

This is my intended approach:

  1. Create a personal feed for every user in my application. A feed can contain different types of entries (notification, status update, etc.). My idea is to store them in an Azure table, grouped under one partition key per user.

  2. Retrieve all entries within the same partition key and pass them to different views depending on entry type.

How do I query the table storage for all types of the same base type while keeping their unique properties?

CloudTableQuery<TElement> requires a typed entity. If I specify EntryBase as the generic argument, I don't get the entry-specific properties (NotificationSpecificProperty, StatusUpdateSpecificProperty), and if I specify one of the derived types, I only get entries of that type.

My entities:

public class EntryBase : TableServiceEntity
{
    public EntryBase()
    {
    }

    public EntryBase(string partitionKey, string rowKey)
    {
        this.PartitionKey = partitionKey;
        this.RowKey = rowKey;
    }
}


public class NotificationEntry : EntryBase
{
    public string NotificationSpecificProperty { get; set; }
}

public class StatusUpdateEntry : EntryBase
{
    public string StatusUpdateSpecificProperty { get; set; }
}

My query for a feed:

List<EntryBase> entries = // how do I fetch all entries?

foreach (var item in entries)
{
    if (item.GetType() == typeof(NotificationEntry))
    {
        // handle notification
    }
    else if (item.GetType() == typeof(StatusUpdateEntry))
    {
        // handle status update
    }
}
asked May 22 '12 07:05 by Jonas Stensved


3 Answers

Finally there's an official way! :)

See the NoSQL sample in this post on the Azure Storage Team Blog, which does exactly this:

Windows Azure Storage Client Library 2.0 Tables Deep Dive
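
As far as I recall, the 2.0 client library lets a query take a resolver callback that receives the raw keys and the property dictionary and returns whatever type you choose. The sketch below only mimics that idea in plain C#, so it runs without the SDK. The `EntryType` discriminator property, the `FeedEntryResolver` class, and the string-valued dictionary are assumptions made for illustration, not the library's API, and `EntryBase` here is a plain class rather than a `TableServiceEntity`.

```csharp
using System;
using System.Collections.Generic;

// Stand-ins for the question's entities; no Azure SDK types are used here.
public abstract class EntryBase
{
    public string PartitionKey { get; set; }
    public string RowKey { get; set; }
}

public class NotificationEntry : EntryBase
{
    public string NotificationSpecificProperty { get; set; }
}

public class StatusUpdateEntry : EntryBase
{
    public string StatusUpdateSpecificProperty { get; set; }
}

public static class FeedEntryResolver
{
    // Picks the concrete type from an "EntryType" property (an assumed
    // discriminator stored with each entity) and copies the specific property.
    public static EntryBase Resolve(string partitionKey, string rowKey,
                                    IDictionary<string, string> properties)
    {
        string entryType;
        if (!properties.TryGetValue("EntryType", out entryType))
            throw new InvalidOperationException("Entity has no EntryType property.");

        EntryBase entry;
        switch (entryType)
        {
            case "Notification":
                entry = new NotificationEntry
                {
                    NotificationSpecificProperty = GetOrNull(properties, "NotificationSpecificProperty")
                };
                break;
            case "StatusUpdate":
                entry = new StatusUpdateEntry
                {
                    StatusUpdateSpecificProperty = GetOrNull(properties, "StatusUpdateSpecificProperty")
                };
                break;
            default:
                throw new NotSupportedException("Unknown entry type " + entryType);
        }

        entry.PartitionKey = partitionKey;
        entry.RowKey = rowKey;
        return entry;
    }

    private static string GetOrNull(IDictionary<string, string> properties, string name)
    {
        string value;
        return properties.TryGetValue(name, out value) ? value : null;
    }
}

public static class Program
{
    public static void Main()
    {
        var raw = new Dictionary<string, string>
        {
            { "EntryType", "Notification" },
            { "NotificationSpecificProperty", "You have a new follower" }
        };

        EntryBase entry = FeedEntryResolver.Resolve("user-1", "0001", raw);
        Console.WriteLine(entry.GetType().Name);
    }
}
```

With the real library you would plug the same switch into its resolver callback, and the feed loop from the question can then branch on the concrete type of each returned entry.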

answered Oct 22 '22 15:10 by Jonas Stensved


There are a few ways to go about this. Which one you choose depends partly on personal preference and partly on your performance goals.

  • Create an amalgamated class that represents all queried types. If I had a StatusUpdateEntry and a NotificationEntry, I would simply merge the properties of both into a single class. The serializer will automatically fill in the properties that are present and leave the others null (or default). If you also put a 'type' property on the entity (calculated or set in storage), you can easily switch on that type. Since I always recommend mapping from the table entity to your own type in the app, this works fine as well (the class is then used only as a DTO).

Example:

[DataServiceKey("PartitionKey", "RowKey")]
public class NoticeStatusUpdateEntry
{
    public string PartitionKey { get; set; }   
    public string RowKey { get; set; }
    public string NoticeProperty { get; set; }
    public string StatusUpdateProperty { get; set; }
    public string Type
    {
       get 
       {
           return String.IsNullOrEmpty(this.StatusUpdateProperty) ? "Notice" : "StatusUpdate";
       }
    }
}
  • Override the serialization process. You can do this yourself by hooking the ReadingEntity event. It gives you the raw XML, and you can deserialize it however you want. Jai Haridas and Pablo Castro gave some example code for reading an entity when you don't know the type (included below), and you can adapt that to read specific types that you do know about.

The downside to both approaches is that in some cases you end up pulling more data than you need. Weigh that against how often you really need to query one type rather than all of them. Keep in mind that Table storage now supports projection, which reduces the size of the wire format and can really speed things up when you have large entities or many of them to return. If I ever needed to query only a single type, I would probably use part of the RowKey or PartitionKey to specify the type, which would let me query a single type at a time (you could use a property instead, but that is not as efficient for queries as the PK or RK).
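
To make the RowKey idea concrete, here is a minimal sketch that assumes a `<type>_<sequence>` key layout and assumes keys sort in ordinal string order. `TypedRowKeys` and its methods are hypothetical helpers, not part of any Azure library. The filter string uses the usual OData comparison operators and does not escape quotes in its inputs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TypedRowKeys
{
    // Builds a RowKey of the form "<type>_<16-digit sequence>".
    // This layout is an assumption made for the sketch.
    public static string Create(string entryType, long sequence)
    {
        return entryType + "_" + sequence.ToString("D16");
    }

    // Returns an exclusive upper bound for all keys that start with the prefix,
    // by incrementing the last character (ordinal comparison assumed).
    public static string UpperBound(string prefix)
    {
        if (string.IsNullOrEmpty(prefix))
            throw new ArgumentException("Prefix must not be empty.", "prefix");
        char last = prefix[prefix.Length - 1];
        return prefix.Substring(0, prefix.Length - 1) + (char)(last + 1);
    }

    // Builds a range filter that matches one entry type within one partition.
    public static string BuildFilter(string partitionKey, string entryType)
    {
        string lower = entryType + "_";
        return string.Format(
            "(PartitionKey eq '{0}') and (RowKey ge '{1}') and (RowKey lt '{2}')",
            partitionKey, lower, UpperBound(lower));
    }

    // In-memory equivalent of the range filter, handy for checking the bounds.
    public static List<string> SelectByType(IEnumerable<string> rowKeys, string entryType)
    {
        string lower = entryType + "_";
        string upper = UpperBound(lower);
        return rowKeys
            .Where(k => string.CompareOrdinal(k, lower) >= 0
                     && string.CompareOrdinal(k, upper) < 0)
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var keys = new[]
        {
            TypedRowKeys.Create("notification", 1),
            TypedRowKeys.Create("status", 2),
            TypedRowKeys.Create("notification", 3)
        };

        Console.WriteLine(TypedRowKeys.BuildFilter("user-1", "notification"));
        foreach (string key in TypedRowKeys.SelectByType(keys, "notification"))
        {
            Console.WriteLine(key);
        }
    }
}
```

Because the type sits at the front of the RowKey, a single-type query becomes a key range scan inside one partition rather than a filter on a non-key property.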

Edit: As noted by Lucifure, another great option is to design around it. Use multiple tables, query in parallel, etc. You need to trade that off with complexity around timeouts and error handling of course, but it is a viable and often good option as well depending on your needs.

Reading a Generic Entity:

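// Namespaces used by this sample:
// using System;
// using System.Collections.Generic;
// using System.Data.Services.Client;
// using System.Data.Services.Common;
// using System.Linq;
// using System.Xml;
// using System.Xml.Linq;

[DataServiceKey("PartitionKey", "RowKey")]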
public class GenericEntity   
{   
    public string PartitionKey { get; set; }   
    public string RowKey { get; set; } 

    Dictionary<string, object> properties = new Dictionary<string, object>();   

    internal object this[string key]   
    {   
        get   
        {   
            return this.properties[key];   
        }   

        set   
        {   
            this.properties[key] = value;   
        }   
    }   

    public override string ToString()   
    {   
        // TODO: append each property   
        return "";   
    }   
}   


    void TestGenericTable()   
    {   
        var ctx = CustomerDataContext.GetDataServiceContext();   
        ctx.IgnoreMissingProperties = true;   
        ctx.ReadingEntity += new EventHandler<ReadingWritingEntityEventArgs>(OnReadingEntity);   
        var customers = from o in ctx.CreateQuery<GenericEntity>(CustomerDataContext.CustomersTableName) select o;

        Console.WriteLine("Rows from '{0}'", CustomerDataContext.CustomersTableName);   
        foreach (GenericEntity entity in customers)   
        {   
            Console.WriteLine(entity.ToString());   
        }   
    }  

    // Credit goes to Pablo from ADO.NET Data Service team 
    public void OnReadingEntity(object sender, ReadingWritingEntityEventArgs args)   
    {   
        // TODO: Make these statics   
        XNamespace AtomNamespace = "http://www.w3.org/2005/Atom";   
        XNamespace AstoriaDataNamespace = "http://schemas.microsoft.com/ado/2007/08/dataservices";   
        XNamespace AstoriaMetadataNamespace = "http://schemas.microsoft.com/ado/2007/08/dataservices/metadata";   

        GenericEntity entity = args.Entity as GenericEntity;   
        if (entity == null)   
        {   
            return;   
        }   

        // read each property, type and value in the payload   
        var properties = args.Entity.GetType().GetProperties();   
        var q = from p in args.Data.Element(AtomNamespace + "content")   
                                .Element(AstoriaMetadataNamespace + "properties")   
                                .Elements()   
                where properties.All(pp => pp.Name != p.Name.LocalName)   
                select new   
                {   
                    Name = p.Name.LocalName,   
                    IsNull = string.Equals("true", p.Attribute(AstoriaMetadataNamespace + "null") == null ? null : p.Attribute(AstoriaMetadataNamespace + "null").Value, StringComparison.OrdinalIgnoreCase),   
                    TypeName = p.Attribute(AstoriaMetadataNamespace + "type") == null ? null : p.Attribute(AstoriaMetadataNamespace + "type").Value,   
                    p.Value   
                };   

        foreach (var dp in q)   
        {   
            entity[dp.Name] = GetTypedEdmValue(dp.TypeName, dp.Value, dp.IsNull);   
        }   
    }   


    private static object GetTypedEdmValue(string type, string value, bool isnull)   
    {   
        if (isnull) return null;   

        if (string.IsNullOrEmpty(type)) return value;   

        switch (type)   
        {   
            case "Edm.String": return value;   
            case "Edm.Byte": return Convert.ChangeType(value, typeof(byte));   
            case "Edm.SByte": return Convert.ChangeType(value, typeof(sbyte));   
            case "Edm.Int16": return Convert.ChangeType(value, typeof(short));   
            case "Edm.Int32": return Convert.ChangeType(value, typeof(int));   
            case "Edm.Int64": return Convert.ChangeType(value, typeof(long));   
            case "Edm.Double": return Convert.ChangeType(value, typeof(double));   
            case "Edm.Single": return Convert.ChangeType(value, typeof(float));   
            case "Edm.Boolean": return Convert.ChangeType(value, typeof(bool));   
            case "Edm.Decimal": return Convert.ChangeType(value, typeof(decimal));   
            case "Edm.DateTime": return XmlConvert.ToDateTime(value, XmlDateTimeSerializationMode.RoundtripKind);   
            case "Edm.Binary": return Convert.FromBase64String(value);   
            case "Edm.Guid": return new Guid(value);   

            default: throw new NotSupportedException("Not supported type " + type);   
        }   
    }
answered Oct 22 '22 14:10 by dunnry


Another option, of course, is to have only a single entity type per table, query the tables in parallel, and merge the results sorted by timestamp. In the long run this may prove to be the more prudent choice for scalability and maintainability.

Alternatively, you would need to use some flavor of generic entity, as outlined by ‘dunnry’, where the non-common data is not explicitly typed and is instead persisted via a dictionary.

I have written an alternative Azure Table Storage client, Lucifure Stash, which supports additional abstractions over Azure Table Storage, including persisting to and from a dictionary. It may work in your situation if that is the direction you want to pursue.
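
A minimal sketch of that query-in-parallel-and-merge idea, with the table queries replaced by in-memory stand-ins so it runs on its own. `FeedItem`, `FeedMerger`, and the newest-first ordering are assumptions for illustration. In a real application each delegate would wrap a query against one table, with its own timeout and error handling.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical common shape that every per-table result is mapped to.
public class FeedItem
{
    public string Kind { get; set; }
    public string Text { get; set; }
    public DateTimeOffset Timestamp { get; set; }
}

public static class FeedMerger
{
    // Starts all queries first, then awaits them together and merges the
    // results into one list ordered newest first.
    public static async Task<List<FeedItem>> LoadFeedAsync(
        params Func<Task<IEnumerable<FeedItem>>>[] queries)
    {
        Task<IEnumerable<FeedItem>>[] running = queries.Select(q => q()).ToArray();
        IEnumerable<FeedItem>[] results = await Task.WhenAll(running);

        return results
            .SelectMany(r => r)
            .OrderByDescending(item => item.Timestamp)
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        DateTimeOffset start = new DateTimeOffset(2012, 5, 22, 7, 0, 0, TimeSpan.Zero);

        // Stand-ins for "one query per table".
        Func<Task<IEnumerable<FeedItem>>> notifications = () =>
            Task.FromResult<IEnumerable<FeedItem>>(new[]
            {
                new FeedItem { Kind = "Notification", Text = "New follower", Timestamp = start.AddMinutes(1) },
                new FeedItem { Kind = "Notification", Text = "New comment", Timestamp = start.AddMinutes(3) }
            });
        Func<Task<IEnumerable<FeedItem>>> statusUpdates = () =>
            Task.FromResult<IEnumerable<FeedItem>>(new[]
            {
                new FeedItem { Kind = "StatusUpdate", Text = "Hello world", Timestamp = start.AddMinutes(2) }
            });

        List<FeedItem> feed = FeedMerger.LoadFeedAsync(notifications, statusUpdates)
            .GetAwaiter().GetResult();

        foreach (FeedItem item in feed)
        {
            Console.WriteLine("{0:HH:mm} {1}: {2}", item.Timestamp, item.Kind, item.Text);
        }
    }
}
```

Each table then keeps a single, strongly typed entity class, at the cost of one round trip per table and a merge step in the application.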

Lucifure Stash supports large data columns > 64K, arrays & lists, enumerations, composite keys, out of the box serialization, user defined morphing, public and private properties and fields and more. It is available free for personal use at http://www.lucifure.com or via NuGet.com.

Edit: Now open sourced at CodePlex

answered Oct 22 '22 15:10 by hocho