Allocate a struct containing a string in a single allocation

I'm working on a program that stores a vital data structure as an unstructured string with program-defined delimiters, so we have to walk the string and extract the information we need as we go. We'd like to convert it to a more structured data type.

In essence, this will require a struct with one field describing what kind of data the struct contains and another field that's a string holding the data itself. The length of the string will always be known at allocation time. We've determined through testing that doubling the number of allocations required for each of these data types is an unacceptable cost. Is there any way to allocate the memory for the struct and the std::string contained in the struct in a single allocation? If we were using C strings I'd just have a char * in the struct and point it to the end of the struct after allocating a block big enough for the struct and string, but we'd prefer std::string if possible.

Most of my experience is with C, so please forgive any C++ ignorance displayed here.

Shea Levy asked Jun 08 '12 12:06


4 Answers

If you have such rigorous memory needs, then you're going to have to abandon std::string.

The best alternative is to find or write an implementation of basic_string_ref (a proposal for the next C++ standard library), which is really just a char* coupled with a size but still has all of the (non-mutating) functions of std::basic_string. Then you use a factory function to allocate the memory you need (your struct size + string data), and use placement new to construct your struct, with its basic_string_ref member referring to the string data in that same block.

Of course, you'll also need a custom deletion function, since you can't just pass the pointer to "delete".


Given an implementation of basic_string_ref such as the one linked above (and its associated typedefs, such as string_ref), here's a factory constructor/destructor pair for some type T that needs to carry a string:

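#include <cstring>  // memcpy
#include <new>      // placement new

// "..." stands for whatever other constructor arguments T needs.
template<typename T> T *Create(..., const char *theString, size_t lenstr)
{
  // One block: the T object first, then the characters and a terminating '\0'.
  char *memory = new char[sizeof(T) + lenstr + 1];
  char *text = memory + sizeof(T);
  memcpy(text, theString, lenstr);
  text[lenstr] = '\0';

  try
  {
    // The string_ref must refer to the copy inside the block, not the caller's buffer.
    return new(memory) T(..., string_ref(text, lenstr));
  }
  catch(...)
  {
    delete[] memory;
    throw;
  }
}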

// T cannot be deduced from the arguments, so these overloads name it explicitly.
template<typename T> T *Create(..., const std::string & theString)
{
  return Create<T>(..., theString.c_str(), theString.length());
}

template<typename T> T *Create(..., const string_ref &theString)
{
  return Create<T>(..., theString.data(), theString.length());
}

template<typename T> void Destroy(T *pValue)
{
  pValue->~T();

  char *memory = reinterpret_cast<char*>(pValue);
  delete[] memory;
}

Obviously, you'll need to fill in the other constructor parameters yourself. And your type's constructor will need to take a string_ref that refers to the string.
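
As a concrete illustration of the factory above, here is a minimal sketch that assumes a C++17 compiler and uses std::string_view as a stand-in for string_ref. The names Record, CreateRecord and DestroyRecord are hypothetical and chosen only for this example. It is one way to fill in the elided constructor arguments, not the only one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>
#include <string_view>

// Hypothetical record type: a kind tag plus a view into the trailing bytes.
struct Record {
    int kind;
    std::string_view text;  // refers to characters stored right after this object

    Record(int k, std::string_view t) : kind(k), text(t) {}

    // A copy would still point into the original block, so copying is disabled.
    Record(const Record&) = delete;
    Record& operator=(const Record&) = delete;
};

// A single allocation holds the Record, then the characters, then a '\0'.
inline Record* CreateRecord(int kind, std::string_view src) {
    void* raw = ::operator new(sizeof(Record) + src.size() + 1);
    char* chars = static_cast<char*>(raw) + sizeof(Record);
    if (!src.empty()) std::memcpy(chars, src.data(), src.size());
    chars[src.size()] = '\0';
    // Record's constructor does not throw, so no try/catch is needed here.
    return new (raw) Record(kind, std::string_view(chars, src.size()));
}

// Runs the destructor, then releases the whole block in one call.
inline void DestroyRecord(Record* r) {
    if (r == nullptr) return;
    // Sanity check: the view should still point into this record's own block.
    assert(r->text.data() == reinterpret_cast<const char*>(r) + sizeof(Record));
    r->~Record();
    ::operator delete(r);
}
```

A caller would write something like `Record* r = CreateRecord(7, "hello");`, read `r->text` like any other string view, and finish with `DestroyRecord(r);`. Wrapping the pointer in a std::unique_ptr with a custom deleter would be a natural next step.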

Nicol Bolas answered Nov 15 '22 13:11


If you are using std::string, you can't really do one allocation for both the structure and the string, because std::string manages its own character buffer and you can't make it live in the same block as the structure. If you are using old C-style strings it's possible, though.
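
The C-style variant can be sketched roughly as follows. This assumes a trivially destructible header struct, so a single std::free releases everything. The names Node and MakeNode are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical layout: header fields first, characters in the same block.
struct Node {
    int kind;
    std::size_t len;
    char* str;  // points just past this Node, into the same malloc'd block
};

// Returns nullptr if malloc fails. The caller releases the node with std::free.
inline Node* MakeNode(int kind, const char* s, std::size_t len) {
    assert(s != nullptr || len == 0);
    void* raw = std::malloc(sizeof(Node) + len + 1);
    if (raw == nullptr) return nullptr;
    char* chars = static_cast<char*>(raw) + sizeof(Node);
    if (len != 0) std::memcpy(chars, s, len);
    chars[len] = '\0';  // keep the data usable as an ordinary C string
    return new (raw) Node{kind, len, chars};
}
```

Because Node has no destructor to run, cleanup is just `std::free(node);`, which is what makes this layout a single allocation and a single deallocation.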

Some programmer dude answered Nov 15 '22 13:11


If I understand you correctly, you are saying that through profiling you have determined that having to allocate a string in addition to the other data member in your data structure imposes an unacceptable cost on your application.

If that's indeed the case, I can think of a couple of solutions.

  1. You could pre-allocate all of these structures up front, before your program starts. Keep them in some kind of fixed collection so they aren't copy-constructed, and reserve enough buffer in your strings to hold your data.
  2. Controversial as it may seem, you could use old C-style char arrays. It seems like you are forgoing much of the reason to use strings in the first place, which is the memory management. However, in your case, since you know the needed buffer sizes at startup, you could handle this yourself. If you like the other facilities that string provides, bear in mind that much of that is still available in <algorithm>.
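
The first option might look something like the sketch below. It assumes the number of records and an upper bound on the string length are both known before the hot path starts. Item, ItemPool and Acquire are hypothetical names for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record: a kind tag plus an ordinary std::string.
struct Item {
    int kind = 0;
    std::string text;
};

// Fixed-size pool: every allocation happens in the constructor.
class ItemPool {
public:
    ItemPool(std::size_t count, std::size_t max_len) : items_(count) {
        // Pay for each string buffer once, up front.
        for (Item& it : items_) it.text.reserve(max_len);
    }

    // Hands out the next unused slot, or nullptr when the pool is exhausted.
    Item* Acquire(int kind, const char* data, std::size_t len) {
        assert(data != nullptr || len == 0);
        if (next_ == items_.size()) return nullptr;
        Item& it = items_[next_++];
        it.kind = kind;
        // Expected to reuse the reserved buffer as long as len fits in it.
        it.text.assign(data, len);
        return &it;
    }

private:
    std::vector<Item> items_;  // never resized, so pointers to slots stay valid
    std::size_t next_ = 0;
};
```

The trade-off is that memory is sized for the worst case, and slots are never returned in this sketch. A real pool would probably need a free list as well.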

John Dibling answered Nov 15 '22 11:11


Take a look at Variable Sized Struct C++ - the short answer is that there's no way to do it in vanilla C++.

Do you really need to allocate the container structs on the heap? It might be more efficient to have those on the stack, so they don't need to be allocated at all.

ecatmur answered Nov 15 '22 13:11