
Mass filtering with protobuf-net

I have serialized a list of objects with protobuf-net.

Theoretically, the .bin file can contain millions of objects.

Let's assume the objects are of a class containing the following:

public string EventName;

I have to take a query and create a list containing the objects matching the query. What is the correct way to extract the matching objects from the serialized file using LINQ?

Gilad Naaman asked Feb 14 '11 19:02


2 Answers

The protobuf format is a linear sequence of items; any indexing etc. you want can only be applied separately. However, IEnumerable<T> is available; you might find that:

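// PrefixStyle.Base128 with field number 1 assumes the items were written with a matching length prefix
var match = Serializer.DeserializeItems<YourType>(source, PrefixStyle.Base128, 1)
       .First(item => item.Id == id);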

does the job nicely; this:

  • is lazily spooled; each item is yielded individually, so you don't need a glut of memory
  • is short-circuited; if the item is found near the start, it'll exit promptly

Or for multiple items:

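// PrefixStyle.Base128 with field number 1 assumes the items were written with a matching length prefix
var matches = Serializer.DeserializeItems<YourType>(source, PrefixStyle.Base128, 1)
    .Where(item => item.Foo == foo);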

(add a ToList to the end of the above if you want to buffer the matching items in memory, or use it without a ToList if you just want to parse the data once in a forwards-only way)
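Putting it together for the question's class, a rough end-to-end sketch might look like the following. The `LogEvent` type, the `FilterDemo` helpers and the sample event names are all made up for illustration. It also assumes the file was written item by item with `SerializeWithLengthPrefix` using `PrefixStyle.Base128` and field number 1; if your file was produced differently, the arguments to `DeserializeItems` may need to change.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using ProtoBuf;

// Hypothetical type modelled on the question's class.
[ProtoContract]
public class LogEvent
{
    [ProtoMember(1)]
    public string EventName;
}

public static class FilterDemo
{
    // Writes each item with its own length prefix (assumed layout: Base128, field 1).
    public static void Write(Stream dest, IEnumerable<LogEvent> events)
    {
        foreach (var e in events)
        {
            Serializer.SerializeWithLengthPrefix(dest, e, PrefixStyle.Base128, 1);
        }
    }

    // Streams items back one at a time and buffers only the matches.
    public static List<LogEvent> Find(Stream source, string eventName)
    {
        return Serializer.DeserializeItems<LogEvent>(source, PrefixStyle.Base128, 1)
            .Where(e => e.EventName == eventName)
            .ToList();
    }

    public static void Main()
    {
        using (var stream = new MemoryStream())
        {
            Write(stream, new[]
            {
                new LogEvent { EventName = "login" },
                new LogEvent { EventName = "logout" },
                new LogEvent { EventName = "login" },
            });

            stream.Position = 0; // rewind before reading
            List<LogEvent> hits = Find(stream, "login");
            Console.WriteLine(hits.Count + " matching events");
        }
    }
}
```

With a real file you would pass a FileStream instead of the MemoryStream. Drop the ToList in `Find` (and return IEnumerable<LogEvent>) if you only need a single forward pass, but then keep the stream open until you have finished enumerating.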

Marc Gravell answered Sep 20 '22 22:09


If you want to add a projection over the selected elements, you could try a library of mine, https://github.com/Scooletz/protobuf-linq. It is available on NuGet as well. The library greatly lowers the overhead of deserialization; in some cases the cost can drop to 50% of that of the original query.

Scooletz answered Sep 17 '22 22:09