
What is the best approach when working with on-disk data structures

Tags: c, struct, on-disk

I would like to know how best to work with on-disk data structures given that the storage layout needs to exactly match the logical design. I find that structure alignment & packing do not really help much when you need to have a certain layout for your storage.

My approach to this problem is to define the width of the structure using a preprocessor directive and to use that width when allocating the character (byte) arrays that I write to disk after filling them with data that follows the logical structure model.

e.g.:

typedef struct __attribute__((packed, aligned(1))) foo {
   uint64_t some_stuff;
   uint8_t flag;
} foo;

If I persist foo on disk, the "flag" value will come at the very end of the data. With this approach I can easily read the data back by calling fread on a pointer to a foo and then use the struct normally, without any further byte fiddling.
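For reference, a minimal sketch of this first approach might look like the following. The names foo_rec, foo_rec_write and foo_rec_read are hypothetical and not from the question. It assumes a GCC- or Clang-compatible compiler (for `__attribute__((packed))`) and C11 (for `static_assert`). Note that the bytes of some_stuff land on disk in the host's byte order, so a file written this way is only portable between machines with the same endianness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Same layout as the struct in the question; renamed to foo_rec
   (a hypothetical name) for this sketch. */
typedef struct __attribute__((packed, aligned(1))) foo_rec {
    uint64_t some_stuff;
    uint8_t  flag;
} foo_rec;

/* Fail the build if the compiler's layout is not the expected 9 bytes. */
static_assert(sizeof(foo_rec) == 9, "foo_rec must occupy 9 bytes");
static_assert(offsetof(foo_rec, flag) == 8, "flag must follow some_stuff");

/* Write one record as-is; returns 0 on success, -1 on failure. */
int foo_rec_write(FILE *fp, const foo_rec *rec)
{
    return fwrite(rec, sizeof *rec, 1, fp) == 1 ? 0 : -1;
}

/* Read one record straight into the struct; returns 0 on success, -1 on failure. */
int foo_rec_read(FILE *fp, foo_rec *rec)
{
    return fread(rec, sizeof *rec, 1, fp) == 1 ? 0 : -1;
}
```

The static assertions are the useful part here: if the compiler ever lays the struct out differently, the build breaks instead of the file format silently changing.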

Instead, I prefer to do this:

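/* Parenthesized so the macro expands safely inside larger expressions. */
#define foo_width (sizeof(uint64_t) + sizeof(uint8_t))

uint8_t *foo = calloc(1, foo_width);

/* The flag byte goes first, followed by the encoded 64-bit value. */
foo[0] = flag_value;
memcpy(foo + 1, encode_int64(some_value), sizeof(uint64_t));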

Then I just use fwrite and fread to write and read the bytes, and unpack them later in order to use the data stored in the various logical fields.
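A rough sketch of this second approach, with the encoding spelled out, could look like this. The question does not show encode_int64, so the helpers put_u64_le and get_u64_le below are hypothetical stand-ins, and the little-endian byte order is an assumption made for the example, not something the question specifies. The flag byte is placed first, as in the byte-array code above.

```c
#include <assert.h>
#include <stdint.h>

/* On-disk width: one flag byte followed by a 64-bit value. */
enum { FOO_WIDTH = 1 + 8 };
static_assert(FOO_WIDTH == 9, "record is 9 bytes on disk");

/* Hypothetical helper: store v at dst as 8 little-endian bytes (assumed order). */
static void put_u64_le(uint8_t *dst, uint64_t v)
{
    for (int i = 0; i < 8; i++)
        dst[i] = (uint8_t)(v >> (8 * i));
}

/* Hypothetical helper: load 8 little-endian bytes from src. */
static uint64_t get_u64_le(const uint8_t *src)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v |= (uint64_t)src[i] << (8 * i);
    return v;
}

/* Encode one record into buf: flag first, then the value. */
void foo_encode(uint8_t buf[FOO_WIDTH], uint8_t flag, uint64_t some_stuff)
{
    buf[0] = flag;
    put_u64_le(buf + 1, some_stuff);
}

/* Decode one record from buf into its logical fields. */
void foo_decode(const uint8_t buf[FOO_WIDTH], uint8_t *flag, uint64_t *some_stuff)
{
    *flag = buf[0];
    *some_stuff = get_u64_le(buf + 1);
}
```

Because the byte order is fixed by the shifts rather than by the host CPU, the same file reads back identically on big- and little-endian machines, and the buffer can still be passed to fwrite and fread unchanged.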

I wonder which approach is better, given that I want the on-disk storage layout to match the logical layout. The code above is just an example.

If anyone knows how efficient each method is, with respect to decoding/unpacking bytes versus copying the structure directly from its on-disk representation, please share. I personally prefer the second approach since it gives me full control over the storage layout, but I am not ready to sacrifice looks for performance, since this approach requires a lot of loop logic to unpack the bytes and traverse them to the various boundaries in the data.

Thanks.

asked Nov 19 '14 11:11 by DeLorean


1 Answer

Based on your requirements (considering looks and performance), the first approach is better because the compiler will do the hard work for you. In other words, if a tool (the compiler in this case) provides a certain feature, you do not want to implement it on your own because, in most cases, the tool's implementation will be more efficient than yours.

answered Oct 26 '22 22:10 by RcnRcf