
How to append data in a serialized file on disk

Tags:

c#

I have a program written in C# that serializes data to binary and writes it to disk. To add more data to this file, I first have to deserialize the whole file and then append the new serialized data to it. Is it possible to append data to this serialized file without deserializing the existing data, so that I can save some time during the whole process?

Dee asked Jul 16 '12 06:07


2 Answers

You don't have to read all the data in the file to append data.

You can open it in append mode and write the data.

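using (var fileStream = File.Open(fileName, FileMode.Append, FileAccess.Write, FileShare.Read))
using (var binaryWriter = new BinaryWriter(fileStream))
{
    // FileMode.Append positions the stream at the end, so existing bytes are untouched
    binaryWriter.Write(data);
}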
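Appending raw bytes only helps if the file can later be split back into its parts. One possible approach, sketched here under the assumption that each piece of serialized data is available as a byte[], is to write a 4-byte length before every record and read records back until the end of the file. The class name RecordFile and its method names are hypothetical, chosen only for illustration.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: stores records as [int32 length][payload] pairs.
static class RecordFile
{
    // Appends one record at the end of the file without reading existing data.
    public static void Append(string fileName, byte[] record)
    {
        using (var stream = File.Open(fileName, FileMode.Append, FileAccess.Write, FileShare.Read))
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write(record.Length); // 4-byte length prefix
            writer.Write(record);        // payload bytes
        }
    }

    // Reads every record back, in the order it was appended.
    public static List<byte[]> ReadAll(string fileName)
    {
        var records = new List<byte[]>();
        using (var stream = File.OpenRead(fileName))
        using (var reader = new BinaryReader(stream))
        {
            while (stream.Position < stream.Length) // stop at end of file
            {
                int length = reader.ReadInt32();
                records.Add(reader.ReadBytes(length));
            }
        }
        return records;
    }
}
```

The length prefix is a design choice of this sketch, not something the answer above prescribes; without some framing like this, a reader cannot tell where one appended record ends and the next begins.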
nunespascal answered Oct 25 '22 13:10


Now that we know (from the comments) that we're talking about a DataTable/DataSet via BinaryFormatter, it becomes clearer. If your intention is for the appended data to appear as extra rows in the existing table, then no: that isn't going to work. What you could do is append each batch as a separate serialized table, then on reading deserialize each table in turn and manually merge the contents. That is probably your best bet with what you describe. Here's an example using just two batches, but obviously you'd repeat the deserialize/merge until EOF:

var dt = new DataTable();
dt.Columns.Add("foo", typeof (int));
dt.Columns.Add("bar", typeof(string));

dt.RemotingFormat = SerializationFormat.Binary;
var ser = new BinaryFormatter();
using(var ms = new MemoryStream())
{
    dt.Rows.Add(123, "abc");
    ser.Serialize(ms, dt); // batch 1
    dt.Rows.Clear();
    dt.Rows.Add(456, "def");
    ser.Serialize(ms, dt); // batch 2

    ms.Position = 0;

    var table1 = (DataTable) ser.Deserialize(ms);

    // the following is the merge loop that you'd repeat until EOF
    var table2 = (DataTable) ser.Deserialize(ms);
    foreach(DataRow row in table2.Rows) {
        table1.ImportRow(row);
    }

    // show the results
    foreach(DataRow row in table1.Rows)
    {
        Console.WriteLine("{0}, {1}", row[0], row[1]);
    }
}
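The "repeat until EOF" step could look roughly like the following when the batches live in a file rather than a MemoryStream. This is only a sketch: the class and method names are hypothetical, it assumes every batch has the same columns, and it assumes a runtime where BinaryFormatter is still available (it is obsolete in .NET 5+ and was removed in .NET 9, so this targets .NET Framework).

```csharp
using System.Data;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical helper illustrating append-then-merge with BinaryFormatter.
static class TableAppendSketch
{
    // Appends one serialized batch to the end of the file without reading it.
    public static void AppendBatch(string path, DataTable batch)
    {
        batch.RemotingFormat = SerializationFormat.Binary;
        using (var fs = new FileStream(path, FileMode.Append, FileAccess.Write))
        {
            new BinaryFormatter().Serialize(fs, batch);
        }
    }

    // Deserializes each batch in turn and merges its rows into the first table.
    // Returns null if the file contains no batches.
    public static DataTable ReadAll(string path)
    {
        var ser = new BinaryFormatter();
        using (var fs = File.OpenRead(path))
        {
            DataTable merged = null;
            while (fs.Position < fs.Length) // the merge loop, repeated until EOF
            {
                var next = (DataTable) ser.Deserialize(fs);
                if (merged == null) { merged = next; continue; }
                foreach (DataRow row in next.Rows)
                {
                    merged.ImportRow(row);
                }
            }
            return merged;
        }
    }
}
```

Writing stays cheap because AppendBatch never touches the existing bytes; the cost of merging is paid only when the file is read.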

However! Personally I have misgivings about both DataTable and BinaryFormatter. If you know what your data is, there are other techniques. For example, this could be done very simply with "protobuf", since protobuf is inherently appendable: concatenating two serialized messages is equivalent to merging them, so items in a list are combined. In fact, you need to do extra work to not append (although that is simple enough too):

[ProtoContract]
class Foo
{
    [ProtoMember(1)]
    public int X { get; set; }
    [ProtoMember(2)]
    public string Y { get; set; }

}
[ProtoContract]
class MyData
{
    private readonly List<Foo> items = new List<Foo>();
    [ProtoMember(1)]
    public List<Foo> Items { get { return items; } }
}

then:

var batch1 = new MyData { Items = { new Foo { X = 123, Y = "abc" } } };
var batch2 = new MyData { Items = { new Foo { X = 456, Y = "def" } } };
using(var ms = new MemoryStream())
{
    Serializer.Serialize(ms, batch1);
    Serializer.Serialize(ms, batch2);
    ms.Position = 0;
    var merged = Serializer.Deserialize<MyData>(ms);
    foreach(var row in merged.Items) {
        Console.WriteLine("{0}, {1}", row.X, row.Y);
    }
}
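Applied to a file on disk, the same idea might be sketched as follows, assuming the protobuf-net library (the ProtoBuf namespace). The Foo and MyData types repeat the definitions above so the block is self-contained; the ProtoAppendSketch class and its method names are hypothetical. The second pair of methods illustrates one way to do the "extra" needed to not append, by writing a length prefix so that each batch stays separate.

```csharp
using System.Collections.Generic;
using System.IO;
using ProtoBuf;

[ProtoContract]
class Foo
{
    [ProtoMember(1)] public int X { get; set; }
    [ProtoMember(2)] public string Y { get; set; }
}

[ProtoContract]
class MyData
{
    private readonly List<Foo> items = new List<Foo>();
    [ProtoMember(1)] public List<Foo> Items { get { return items; } }
}

// Hypothetical helper; names are for illustration only.
static class ProtoAppendSketch
{
    // Writes one batch at the end of the file; existing bytes are not read.
    public static void AppendBatch(string path, MyData batch)
    {
        using (var fs = new FileStream(path, FileMode.Append, FileAccess.Write))
        {
            Serializer.Serialize(fs, batch);
        }
    }

    // Reads the whole file as one message; the Items of every appended
    // batch end up in a single list.
    public static MyData ReadMerged(string path)
    {
        using (var fs = File.OpenRead(path))
        {
            return Serializer.Deserialize<MyData>(fs);
        }
    }

    // Variant that does NOT merge: each batch is framed with a length prefix.
    public static void AppendFramed(string path, MyData batch)
    {
        using (var fs = new FileStream(path, FileMode.Append, FileAccess.Write))
        {
            Serializer.SerializeWithLengthPrefix(fs, batch, PrefixStyle.Base128, 1);
        }
    }

    // Reads framed batches back as separate objects.
    public static List<MyData> ReadFramed(string path)
    {
        using (var fs = File.OpenRead(path))
        {
            // DeserializeItems is lazy, so materialize before the stream closes.
            return new List<MyData>(
                Serializer.DeserializeItems<MyData>(fs, PrefixStyle.Base128, 1));
        }
    }
}
```

A file written with AppendBatch should be read with ReadMerged, and one written with AppendFramed with ReadFramed; the two layouts are not interchangeable.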
Marc Gravell answered Oct 25 '22 14:10