I've taken over some code, and came across a weird reallocation of an array. This is a function from within an Array class (used by the JsonValue)
void reserve( uint32_t newCapacity ) {
if ( newCapacity > length + additionalCapacity ) {
newCapacity = std::min( newCapacity, length + std::numeric_limits<decltype( additionalCapacity )>::max() );
JsonValue *newPtr = new JsonValue[newCapacity];
if ( length > 0 ) {
memcpy( newPtr, values, length * sizeof( JsonValue ) );
memset( values, 0, length * sizeof( JsonValue ) );
}
delete[] values;
values = newPtr;
additionalCapacity = uint16_t( newCapacity - length );
}
}
I get the point of this; it is just allocating a new array, and doing a copy of the memory contents from the old array into the new array, then zero-ing out the old array's contents. I also know this was done in order to prevent calling destructors, and moves.
The JsonValue
is a class with functions, and some data which is stored in a union (string, array, number, etc.).
My concern is whether this is actually defined behaviour or not. I know it works, and has not had a problem since we began using it a few months ago; but if its undefined then it doesn't mean it is going to keep working.
EDIT:
JsonValue
looks something like this:
struct JsonValue {
// …
~JsonValue() {
switch ( details.type ) {
case Type::Array:
case Type::Object:
array.destroy();
break;
case Type::String:
delete[] string.buffer;
break;
default: break;
}
}
private:
struct Details {
Key key = Key::Unknown;
Type type = Type::Null; // (0)
};
union {
Array array;
String string;
EmbedString embedString;
Number number;
Details details;
};
};
Where Array
is a wrapper around an array of JsonValue
s, String
is a char*
, EmbedString
is char[14]
, Number
is a union of int
, unsigned int
, and double
, Details
contains the type of value it holds. All values have 16-bits of unused data at the beginning, which is used for Details
. Example:
struct EmbedString {
uint16_t : 16;
char buffer[14] = { 0 };
};
memcpy() copies from one place to another. memset() just sets all pieces of memory to the same. For example here sets string length of the string str to * (or whatever second argument of the memset).
memmove() is similar to memcpy() as it also copies data from a source to destination.
memcpy is usually naive - certainly not the slowest way to copy memory around, but usually quite easy to beat with some loop unrolling, and you can go even further with assembler.
Re: Do I need to release memory set by memset? memset does not allocate memory. It only fills a region of memory with a certain value. You do not have to free the buffer (unless the buffer was allocated).
Whether this code has well-defined behavior basically depends on two things: 1) is JsonValue
trivially-copyable and, 2) if so, are a bunch of all-zero Bytes a valid object representation for a JsonValue
.
If JsonValue
is trivially-copyable, then the memcpy
from one array of JsonValue
s to another will indeed be equivalent to copying all the elements over [basic.types]/3. If all-zeroes is a valid object representation for a JsonValue
, then the memset
should be ok (I believe this actually falls into a bit of a grey-area with the current wording of the standard, but I believe at least the intention would be that this is fine).
I'm not sure why you'd need to "prevent calling destructors and moves", but overwriting objects with zeroes does not prevent destructors from running. delete[] values
will call the destructurs of the array members. And moving the elements of an array of trivially-copyable type should compile down to just copying over the bytes anyways.
Furthermore, I would suggest to get rid of these String
and EmbedString
classes and simply use std::string
. At least, it would seem to me that the sole purpose of EmbedString
is to manually perform small string optimization. Any std::string
implementation worth its salt is already going to do exactly that under the hood. Note that std::string
is not guaranteed (and will often not be) trivially-copyable. Thus, you cannot simply replace String
and EmbedString
with std::string
while keeping the rest of this current implementation.
If you can use C++17, I would suggest to simply use std::variant
instead of or at least inside this custom JsonValue
implementation as that seems to be exactly what it's trying to do. If you need some common information stored in front of whatever the variant value may be, just have a suitable member holding that information in front of the member that holds the variant value rather than relying on every member of the union starting with the same couple of members (which would only be well-defined if all union members are standard-layout types that keep this information in their common initial sequence [class.mem]/23).
The sole purpose of Array
would seem to be to serve as a vector that zeroes memory before deallocating it for security reasons. If this is the case, I would suggest to just use an std::vector
with an allocator that zeros memory before deallocating instead. For example:
template <typename T>
struct ZeroingAllocator
{
using value_type = T;
T* allocate(std::size_t N)
{
return reinterpret_cast<T*>(new unsigned char[N * sizeof(T)]);
}
void deallocate(T* buffer, std::size_t N) noexcept
{
auto ptr = reinterpret_cast<volatile unsigned char*>(buffer);
std::fill(ptr, ptr + N, 0);
delete[] reinterpret_cast<unsigned char*>(buffer);
}
};
template <typename A, typename B>
bool operator ==(const ZeroingAllocator<A>&, const ZeroingAllocator<B>&) noexcept { return true; }
template <typename A, typename B>
bool operator !=(const ZeroingAllocator<A>&, const ZeroingAllocator<B>&) noexcept { return false; }
and then
using Array = std::vector<JsonValue, ZeroingAllocator<JsonValue>>;
Note: I fill the memory via volatile unsigned char*
to prevent the compiler from optimizing away the zeroing. If you need to support overaligned types, you can replace the new[]
and delete[]
with direct calls to ::operator new
and ::operator delete
(doing this will prevent the compiler from optimizing away allocations). Pre C++17, you will have to allocate a sufficiently large buffer and then manually align the pointer, e.g., using std::align
…
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With