What's the simplest way to write a struct to a file and read it back in C++ without a serialization library?

Tags:

c++

fstream

I am writing a program that regularly stores and reads structs of the form below.

struct Node {
    int leftChild = 0;
    int rightChild = 0;
    std::string value;
    int count = 1;
    int balanceFactor = 0;
};

How would I read and write nodes to a file? I would like to use the fstream class with seekg and seekp to do the serialization manually, but I can't work out how they behave from the documentation and am struggling to find decent examples.

[edit] Specified that I do not want to use a serialization library.

David Carek asked Apr 12 '16 14:04


4 Answers

This problem is known as serialization. Use a serialization library such as Google's Protocol Buffers or FlatBuffers.

Claudio answered Nov 16 '22 22:11


To serialize objects, stick to the principle that each object writes its own members to the stream and reads them back from the stream. Member objects should likewise write and read themselves.

I implemented a scheme using three member functions, and a buffer:

void load_from_buffer(uint8_t * & buffer_pointer);  
void store_to_buffer(uint8_t * & buffer_pointer) const;  
unsigned int size_on_stream() const;  

The size_on_stream function is called first to determine how much buffer space the object needs (or how much space it occupies in the buffer).

The load_from_buffer function loads the object's members from a buffer using the given pointer. The function also increments the pointer appropriately.

The store_to_buffer function stores the object's members to a buffer using the given pointer. The function also increments the pointer appropriately.

This can be applied to POD types by using templates and template specializations.

These functions also allow you to pack the output into the buffer, and load from a packed format.

The reason for doing the I/O through a buffer is that you can then use the more efficient block stream methods, such as write and read.
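
As a rough illustration of the scheme (not the original production code), here is a minimal sketch for a small fixed-size record. The Counters struct and the save, load and roundTrip helpers are hypothetical names invented for this example, and the bytes are stored in the machine's native byte order, so the result is not portable across architectures.

```cpp
#include <cstdint>
#include <cstring>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical fixed-size record used only to illustrate the three functions.
struct Counters {
    std::int32_t count = 1;
    std::int32_t balanceFactor = 0;

    // Number of bytes this object occupies in the buffer.
    unsigned int size_on_stream() const {
        return sizeof(count) + sizeof(balanceFactor);
    }

    // Copy each member into the buffer, advancing the pointer as we go.
    void store_to_buffer(std::uint8_t*& p) const {
        std::memcpy(p, &count, sizeof(count));
        p += sizeof(count);
        std::memcpy(p, &balanceFactor, sizeof(balanceFactor));
        p += sizeof(balanceFactor);
    }

    // Copy each member back out of the buffer, advancing the pointer as we go.
    void load_from_buffer(std::uint8_t*& p) {
        std::memcpy(&count, p, sizeof(count));
        p += sizeof(count);
        std::memcpy(&balanceFactor, p, sizeof(balanceFactor));
        p += sizeof(balanceFactor);
    }
};

// Pack the object into a buffer, then emit the buffer with one block write.
void save(std::ostream& out, const Counters& c) {
    std::vector<std::uint8_t> buffer(c.size_on_stream());
    std::uint8_t* p = buffer.data();
    c.store_to_buffer(p);
    out.write(reinterpret_cast<const char*>(buffer.data()),
              static_cast<std::streamsize>(buffer.size()));
}

// Fill a buffer with one block read, then unpack the object from it.
bool load(std::istream& in, Counters& c) {
    std::vector<std::uint8_t> buffer(c.size_on_stream());
    if (!in.read(reinterpret_cast<char*>(buffer.data()),
                 static_cast<std::streamsize>(buffer.size()))) {
        return false;  // not enough bytes available
    }
    std::uint8_t* p = buffer.data();
    c.load_from_buffer(p);
    return true;
}

// In-memory round trip for quick checks.
Counters roundTrip(const Counters& c) {
    std::stringstream ss;
    save(ss, c);
    Counters result;
    load(ss, result);
    return result;
}
```

The same save and load calls should work with a std::fstream, provided it is opened with std::ios::binary so that no newline translation is applied to the bytes.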

Edit 1: Writing a node to a stream
The problem with writing or serializing a node (such as a linked list or tree node) is that pointers don't translate to a file. There is no guarantee that the OS will place your program in the same memory location or give you the same area of memory each time.

You have two options: 1) Only store the data. 2) Convert the pointers to file offsets. Option 2) is very complicated as it may require repositioning the file pointer because file offsets may not be known ahead of time.

Also, be aware of variable-length fields such as strings. You can't write a string object directly to a file, because the object typically holds a pointer to its characters rather than the characters themselves. Unless you use a fixed string width, the stored size will vary from record to record. You will either need to prefix the string with its length (preferred) or use some kind of terminating character, such as '\0'. Prefixing the length is preferred because you don't have to search for the end of the string; you can use a block read to read in the text.
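
The length-prefix idea could look roughly like the sketch below. The function names write_string, read_string and round_trip are made up for this example, the 64-bit length is an arbitrary choice, and the length is written in native byte order with no sanity check, so a corrupted file could request a very large allocation.

```cpp
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Write the length first, then the characters as one block.
void write_string(std::ostream& out, const std::string& s) {
    const std::uint64_t length = s.size();
    out.write(reinterpret_cast<const char*>(&length), sizeof(length));
    out.write(s.data(), static_cast<std::streamsize>(s.size()));
}

// Read the length, size the string, then block-read the characters.
// Returns false if the stream runs out of data.
bool read_string(std::istream& in, std::string& s) {
    std::uint64_t length = 0;
    if (!in.read(reinterpret_cast<char*>(&length), sizeof(length))) {
        return false;
    }
    s.resize(static_cast<std::size_t>(length));
    if (length == 0) {
        return true;  // empty string: nothing more to read
    }
    return static_cast<bool>(
        in.read(&s[0], static_cast<std::streamsize>(length)));
}

// In-memory round trip for quick checks.
std::string round_trip(const std::string& s) {
    std::stringstream ss;
    write_string(ss, s);
    std::string result;
    read_string(ss, result);
    return result;
}
```

Because the length is stored explicitly, strings containing embedded '\0' characters survive the round trip, which a terminator-based format cannot guarantee.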

Thomas Matthews answered Nov 16 '22 20:11


Another approach would be to overload operator<< and operator>> for the structure so that it knows how to save and load itself. That would reduce the problem to knowing where to read or write each node. In theory, your left and right child fields could be seek addresses to where the nodes actually reside, while a new field could hold the seek location of the current node.
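
A minimal sketch of that idea follows, under several assumptions: the text layout, the append and readAt helpers and the demo function are all invented for this example, and child offsets are assumed to fit in an int. It uses tellp to learn where each node lands and seekg to jump back to it.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

struct Node {
    int leftChild = 0;
    int rightChild = 0;
    std::string value;
    int count = 1;
    int balanceFactor = 0;
};

// Assumed text layout: "left right count balance length value\n".
// The value is length-prefixed so that values containing spaces survive.
std::ostream& operator<<(std::ostream& out, const Node& n) {
    return out << n.leftChild << ' ' << n.rightChild << ' ' << n.count << ' '
               << n.balanceFactor << ' ' << n.value.size() << ' '
               << n.value << '\n';
}

std::istream& operator>>(std::istream& in, Node& n) {
    std::size_t length = 0;
    if (in >> n.leftChild >> n.rightChild >> n.count >> n.balanceFactor >> length) {
        in.ignore(1);  // skip the single space before the value
        n.value.resize(length);
        if (length > 0) {
            in.read(&n.value[0], static_cast<std::streamsize>(length));
        }
    }
    return in;
}

// Append a node at the end and return the offset where it starts.
std::streamoff append(std::iostream& file, const Node& n) {
    file.seekp(0, std::ios::end);
    const std::streamoff where = file.tellp();
    file << n;
    return where;
}

// Jump to a previously recorded offset and read the node stored there.
bool readAt(std::iostream& file, std::streamoff where, Node& n) {
    file.clear();  // an earlier read may have left eofbit or failbit set
    file.seekg(where, std::ios::beg);
    return static_cast<bool>(file >> n);
}

// In-memory demo: children are written first so their offsets are known
// by the time the parent is written.
std::string demoRightChildValue() {
    std::stringstream file;
    Node left;
    left.value = "left leaf";
    Node right;
    right.value = "right leaf";
    Node root;
    root.value = "root";
    root.leftChild = static_cast<int>(append(file, left));
    root.rightChild = static_cast<int>(append(file, right));
    const std::streamoff rootAt = append(file, root);

    Node r, child;
    readAt(file, rootAt, r);
    readAt(file, r.rightChild, child);
    return child.value;  // reached by following the stored offset
}
```

Note that offset 0 is a valid position (the first node lives there), so a real format would need some other marker, such as -1, for "no child". With a std::fstream instead of a std::stringstream, the file should be opened in binary mode so that offsets are plain byte counts.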

EvilTeach answered Nov 16 '22 20:11


When implementing your own serialization method, the first decision you'll have to make is whether you want the data on disk to be in binary format or textual format.

I find it easier to implement the ability to save to a binary format. The number of functions needed to implement that is small. You need to implement functions that can write the fundamental types, arrays of known size at compile time, dynamic arrays and strings. Everything else can be built on top of those.

Here's something very close to what I recently put into production code.

#include <cstddef>
#include <cstdint>    // uint64_t
#include <cstring>    // std::strlen
#include <fstream>
#include <stdexcept>  // std::runtime_error
#include <string>

// Class to write to a stream
struct Writer
{
   std::ostream& out_;

   Writer(std::ostream& out) : out_(out) {}

   // Write the fundamental types
   template <typename T>
      void write(T number)
      {
         out_.write(reinterpret_cast<char const*>(&number), sizeof(number));
         if (!out_ )
         {
            throw std::runtime_error("Unable to write a number");
         }
      }

   // Write arrays whose size is known at compile time
   template <typename T, uint64_t N>
      void write(T (&array)[N])
      {
         for(uint64_t i = 0; i < N; ++i )
         {
            write(array[i]);
         }
      }

   // Write dynamic arrays
   template <typename T>
      void write(T array[], uint64_t size)
      {
         write(size);
         for(uint64_t i = 0; i < size; ++i )
         {
            write(array[i]);
         }
      }

   // Write strings
   void write(std::string const& str)
   {
      write(str.c_str(), str.size());
   }

   void write(char const* str)
   {
      write(str, std::strlen(str));
   }
};

// Class to read from a stream
struct Reader
{
   // Accept any input stream, mirroring Writer's std::ostream&.
   std::istream& in_;
   Reader(std::istream& in) : in_(in) {}

   template <typename T>
      void read(T& number)
      {
         in_.read(reinterpret_cast<char*>(&number), sizeof(number));
         if (!in_ )
         {
            throw std::runtime_error("Unable to read a number.");
         }
      }

   template <typename T, uint64_t N>
      void read(T (&array)[N])
      {
         for(uint64_t i = 0; i < N; ++i )
         {
            read(array[i]);
         }
      }

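   // Read a dynamic array: a 64-bit element count followed by the elements.
   // The array is allocated with new[]; the caller owns it and must delete[] it.
   // Note that the element count is not returned to the caller.
   template <typename T>
      void read(T*& array)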
      {
         uint64_t size;
         read(size);
         array = new T[size];
         for(uint64_t i = 0; i < size; ++i )
         {
            read(array[i]);
         }
      }

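   void read(std::string& str)
   {
      // Strings are stored as a 64-bit length followed by the raw characters,
      // with no terminating '\0', so read the length and then the characters.
      uint64_t size;
      read(size);
      str.resize(size);
      in_.read(&str[0], static_cast<std::streamsize>(size));
      if (!in_ )
      {
         throw std::runtime_error("Unable to read a string.");
      }
   }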
};

// Test the code.

#include <iostream>

void writeData(std::string const& file)
{
   std::ofstream out(file, std::ios::binary);  // binary mode: no newline translation
   Writer w(out);
   w.write(10);
   w.write(20.f);
   w.write(200.456);
   w.write("Test String");
}

void readData(std::string const& file)
{
   std::ifstream in(file, std::ios::binary);  // must match the mode used for writing
   Reader r(in);

   int i;
   r.read(i);
   std::cout << "i: " << i << std::endl;

   float f;
   r.read(f);
   std::cout << "f: " << f << std::endl;

   double d;
   r.read(d);
   std::cout << "d: " << d << std::endl;

   std::string s;
   r.read(s);
   std::cout << "s: " << s << std::endl;
}

void testWriteAndRead(std::string const& file)
{
   writeData(file);
   readData(file);
}

int main()
{
   testWriteAndRead("test.bin");
   return 0;
}

Output:

i: 10
f: 20
d: 200.456
s: Test String

With those pieces in place, writing and reading a Node is straightforward.

void write(Writer& w, Node const& n)
{
    w.write(n.leftChild);
    w.write(n.rightChild);
    w.write(n.value);
    w.write(n.count);
    w.write(n.balanceFactor);
}

void read(Reader& r, Node& n)
{
    r.read(n.leftChild);
    r.read(n.rightChild);
    r.read(n.value);
    r.read(n.count);
    r.read(n.balanceFactor);
}
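
Since the question asks about seekg and seekp: the length-prefixed format above produces records of varying size, so you cannot compute where the n-th node starts. One possible workaround, sketched below, is a fixed-size on-disk record. The DiskNode layout, the 32-byte value limit and the helper names are all assumptions made for this example; they are not part of the Writer/Reader code.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical on-disk layout: every record has the same size, so record i
// starts at byte i * sizeof(DiskNode). Stored in native byte order.
struct DiskNode {
    std::int32_t leftChild;
    std::int32_t rightChild;
    std::int32_t count;
    std::int32_t balanceFactor;
    char value[32];  // fixed width, zero padded; longer values are truncated
};

DiskNode makeDiskNode(const std::string& value, int left, int right) {
    DiskNode d{};  // value-initialization zeroes every field, including value[]
    d.leftChild = left;
    d.rightChild = right;
    d.count = 1;
    const std::size_t n = std::min(value.size(), sizeof(d.value) - 1);
    std::memcpy(d.value, value.data(), n);  // always leaves a trailing '\0'
    return d;
}

// Position the put pointer at record `index` and write one record.
void writeRecord(std::iostream& file, std::size_t index, const DiskNode& d) {
    file.clear();
    file.seekp(static_cast<std::streamoff>(index * sizeof(DiskNode)), std::ios::beg);
    file.write(reinterpret_cast<const char*>(&d), sizeof(d));
}

// Position the get pointer at record `index` and read one record.
bool readRecord(std::iostream& file, std::size_t index, DiskNode& d) {
    file.clear();
    file.seekg(static_cast<std::streamoff>(index * sizeof(DiskNode)), std::ios::beg);
    return static_cast<bool>(file.read(reinterpret_cast<char*>(&d), sizeof(d)));
}

// In-memory demo: write three records, overwrite the middle one in place,
// then read it back.
std::string demoOverwrite() {
    std::stringstream file;
    writeRecord(file, 0, makeDiskNode("root", 1, 2));
    writeRecord(file, 1, makeDiskNode("apple", 0, 0));
    writeRecord(file, 2, makeDiskNode("pear", 0, 0));
    writeRecord(file, 1, makeDiskNode("apricot", 0, 0));
    DiskNode d{};
    readRecord(file, 1, d);
    return d.value;
}
```

Here leftChild and rightChild are record indices rather than byte offsets. With a real file you would pass a std::fstream opened with std::ios::in | std::ios::out | std::ios::binary. The sketch writes records in index order; the trade-off against the length-prefixed format is wasted space for short values and a hard limit on long ones.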

R Sahu answered Nov 16 '22 21:11