I've a class that consists basically of a matrix of vectors: vector< MyFeatVector<T> > m_vCells
, where the outer vector represents the matrix. Each element in this matrix is then a vector
(I extended the stl vector
class and named it MyFeatVector<T>
).
I'm trying to code an efficient method to store objects of this class in binary files. Up to now, I require three nested loops:
foutput.write( reinterpret_cast<char*>( &(this->at(dy,dx,dz)) ), sizeof(T) );
where this->at(dy,dx,dz)
retrieves the dz
element of the vector at position [dy,dx]
.
Is there any possibility to store the m_vCells
private member without using loops? I tried something like: foutput.write(reinterpret_cast<char*>(&(this->m_vCells[0])), (this->m_vCells.size())*sizeof(CFeatureVector<T>));
which seems not to work correctly. We can assume that all the vectors in this matrix have the same size, although a more general solution is also welcomed :-)
Furthermore, following my nested-loop implementation, storing objects of this class in binary files seem to require more physical space than storing the same objects in plain-text files. Which is a bit weird.
I was trying to follow the suggestion under http://forum.allaboutcircuits.com/showthread.php?t=16465 but couldn't arrive into a proper solution.
Thanks!
Below a simplified example of my serialization
and unserialization
methods.
template < typename T >
bool MyFeatMatrix<T>::writeBinary( const string & ofile ){
ofstream foutput(ofile.c_str(), ios::out|ios::binary);
foutput.write(reinterpret_cast<char*>(&this->m_nHeight), sizeof(int));
foutput.write(reinterpret_cast<char*>(&this->m_nWidth), sizeof(int));
foutput.write(reinterpret_cast<char*>(&this->m_nDepth), sizeof(int));
//foutput.write(reinterpret_cast<char*>(&(this->m_vCells[0])), nSze*sizeof(CFeatureVector<T>));
for(register int dy=0; dy < this->m_nHeight; dy++){
for(register int dx=0; dx < this->m_nWidth; dx++){
for(register int dz=0; dz < this->m_nDepth; dz++){
foutput.write( reinterpret_cast<char*>( &(this->at(dy,dx,dz)) ), sizeof(T) );
}
}
}
foutput.close();
return true;
}
template < typename T >
bool MyFeatMatrix<T>::readBinary( const string & ifile ){
ifstream finput(ifile.c_str(), ios::in|ios::binary);
int nHeight, nWidth, nDepth;
finput.read(reinterpret_cast<char*>(&nHeight), sizeof(int));
finput.read(reinterpret_cast<char*>(&nWidth), sizeof(int));
finput.read(reinterpret_cast<char*>(&nDepth), sizeof(int));
this->resize(nHeight, nWidth, nDepth);
for(register int dy=0; dy < this->m_nHeight; dy++){
for(register int dx=0; dx < this->m_nWidth; dx++){
for(register int dz=0; dz < this->m_nDepth; dz++){
finput.read( reinterpret_cast<char*>( &(this->at(dy,dx,dz)) ), sizeof(T) );
}
}
}
finput.close();
return true;
}
A most efficient method is to store the objects into an array (or contiguous space), then blast the buffer to the file. An advantage is that the disk platters don't have waste time ramping up and also the writing can be performed contiguously instead of in random locations.
If this is your performance bottleneck, you may want to consider using multiple threads, one extra thread to handle the output. Dump the objects into a buffer, set a flag, then the writing thread will handle the output, releaving your main task to perform more important tasks.
Edit 1: Serializing Example
The following code has not been compiled and is for illustrative purposes only.
#include <fstream>
#include <algorithm>
using std::ofstream;
using std::fill;
class binary_stream_interface
{
virtual void load_from_buffer(const unsigned char *& buf_ptr) = 0;
virtual size_t size_on_stream(void) const = 0;
virtual void store_to_buffer(unsigned char *& buf_ptr) const = 0;
};
struct Pet
: public binary_stream_interface,
max_name_length(32)
{
std::string name;
unsigned int age;
const unsigned int max_name_length;
void load_from_buffer(const unsigned char *& buf_ptr)
{
age = *((unsigned int *) buf_ptr);
buf_ptr += sizeof(unsigned int);
name = std::string((char *) buf_ptr);
buf_ptr += max_name_length;
return;
}
size_t size_on_stream(void) const
{
return sizeof(unsigned int) + max_name_length;
}
void store_to_buffer(unsigned char *& buf_ptr) const
{
*((unsigned int *) buf_ptr) = age;
buf_ptr += sizeof(unsigned int);
std::fill(buf_ptr, 0, max_name_length);
strncpy((char *) buf_ptr, name.c_str(), max_name_length);
buf_ptr += max_name_length;
return;
}
};
int main(void)
{
Pet dog;
dog.name = "Fido";
dog.age = 5;
ofstream data_file("pet_data.bin", std::ios::binary);
// Determine size of buffer
size_t buffer_size = dog.size_on_stream();
// Allocate the buffer
unsigned char * buffer = new unsigned char [buffer_size];
unsigned char * buf_ptr = buffer;
// Write / store the object into the buffer.
dog.store_to_buffer(buf_ptr);
// Write the buffer to the file / stream.
data_file.write((char *) buffer, buffer_size);
data_file.close();
delete [] buffer;
return 0;
}
class Many_Strings
: public binary_stream_interface
{
enum {MAX_STRING_SIZE = 32};
size_t size_on_stream(void) const
{
return m_string_container.size() * MAX_STRING_SIZE // Total size of strings.
+ sizeof(size_t); // with room for the quantity variable.
}
void store_to_buffer(unsigned char *& buf_ptr) const
{
// Treat the vector<string> as a variable length field.
// Store the quantity of strings into the buffer,
// followed by the content.
size_t string_quantity = m_string_container.size();
*((size_t *) buf_ptr) = string_quantity;
buf_ptr += sizeof(size_t);
for (size_t i = 0; i < string_quantity; ++i)
{
// Each string is a fixed length field.
// Pad with '\0' first, then copy the data.
std::fill((char *)buf_ptr, 0, MAX_STRING_SIZE);
strncpy(buf_ptr, m_string_container[i].c_str(), MAX_STRING_SIZE);
buf_ptr += MAX_STRING_SIZE;
}
}
void load_from_buffer(const unsigned char *& buf_ptr)
{
// The actual coding is left as an exercise for the reader.
// Psuedo code:
// Clear / empty the string container.
// load the quantity variable.
// increment the buffer variable by the size of the quantity variable.
// for each new string (up to the quantity just read)
// load a temporary string from the buffer via buffer pointer.
// push the temporary string into the vector
// increment the buffer pointer by the MAX_STRING_SIZE.
// end-for
}
std::vector<std::string> m_string_container;
};
I'd suggest you to read C++ FAQ on Serialization and you can choose what best fits for your
When you're working with structures and classes, you've to take care of two things
Both of these could make some notorious results in your output. IMO, the object must implement to serialize and de-serialize the object. The object can know well about the structures, pointers data etc. So it can decide which format can be implemented efficiently.
You will have to iterate anyway or has to wrap it somewhere. Once you finished implementing the serialization and de-serialization function (either you can write using operators or functions). Especially when you're working with stream objects, overloading << and >> operators would be easy to pass the object.
Regarding your question about using underlying pointers of vector, it might work if it's a single vector. But it's not a good idea in the other way.
Update according to the question update.
There are few things you should mind before overriding STL members. They're not really a good candidate for inheritance because it doesn't have any virtual destructors. If you're using basic data types and POD like structures it wont make much issues. But if you use it truly object oriented way, you may face some unpleasant behavior.
Regarding your code
For example the following code is for simple read and write operation( in text format).
fstream fr("test.txt", ios_base::out | ios_base::binary );
for( int i =0;i <_countof(arr);i++)
fr << arr[i] << ' ';
fr.close();
fstream fw("test.txt", ios_base::in| ios_base::binary);
int j = 0;
while( fw.eof() || j < _countof(arrout))
{
fw >> arrout[j++];
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With