
Null bytes in char* in QByteArray with QDataStream

Tags: c++, qt

I discovered that the char* data in a QByteArray contains null bytes. Code:

QByteArray arr;
QDataStream stream(&arr, QIODevice::WriteOnly);
stream << "hello";

Here is the debugger's variable view:

[Screenshot: char* in QByteArray]

I don't understand why there are three empty bytes at the beginning. I know that byte [3] is the string length. Can I remove the last byte? I know it's a null-terminated string, but my application needs raw bytes (with one byte at the beginning to store the length).

What seems even stranger to me is what happens when I use a QString:

QString str = "hello";
[rest of code same as above]
stream << str;

[Screenshot: QString in QByteArray]

It doesn't have a null at the end, so I think maybe the null byte before each char indicates that the next byte is a char?

Just two questions:

  1. Why are there so many null bytes?
  2. How can I remove them, including the last null byte?
Asked by aso, Oct 01 '22 15:10


1 Answer

I don't understand why I have three empty bytes at the beginning.

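It's a fixed-size, 4-byte (uint32_t) length header. QDataStream writes it in big-endian byte order by default, so for a short string the three most significant bytes are zero and come first. For a char* string, the length stored in the header includes the trailing NUL byte. The header is four bytes wide so that it can specify data lengths as long as (2^32-1) bytes. If it were only a single byte, it would only be able to describe strings up to 255 bytes long, because that's the largest integer value that can fit into a single byte.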
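To make the layout concrete, here is a minimal sketch in plain standard C++ (no Qt needed) that builds the same bytes `stream << "hello"` is described as producing. The helper names `appendBigEndian32` and `encodeCString` are hypothetical and not part of Qt; the sketch assumes the default big-endian byte order and a length field that counts the NUL terminator.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not a Qt API): appends a 32-bit value,
// most significant byte first (big-endian).
void appendBigEndian32(std::vector<std::uint8_t>& out, std::uint32_t value) {
    out.push_back(static_cast<std::uint8_t>(value >> 24));
    out.push_back(static_cast<std::uint8_t>(value >> 16));
    out.push_back(static_cast<std::uint8_t>(value >> 8));
    out.push_back(static_cast<std::uint8_t>(value));
}

// Hypothetical helper: mimics the byte layout of a char* string in the stream,
// assuming the length header counts the NUL terminator.
std::vector<std::uint8_t> encodeCString(const char* s) {
    const std::size_t lenWithNul = std::strlen(s) + 1;
    assert(lenWithNul <= UINT32_MAX);  // must fit in the 4-byte header
    std::vector<std::uint8_t> out;
    appendBigEndian32(out, static_cast<std::uint32_t>(lenWithNul));
    out.insert(out.end(), s, s + lenWithNul);  // string bytes plus the NUL
    return out;
}
```

For "hello" this yields ten bytes: 00 00 00 06, then 'h' 'e' 'l' 'l' 'o', then 00. That matches the three leading zero bytes and the length at index [3] seen in the debugger.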

Can I remove the last byte? I know it's a null-terminated string, but my application needs raw bytes (with one byte at the beginning to store the length).

Sure, as long as the code that will later parse the data array is not depending on the presence of a trailing NUL byte to work correctly.

What seems even stranger to me is what happens when I use a QString [...] it doesn't have a null at the end, so I think maybe the null byte before each char indicates that the next byte is a char?

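Per the Qt serialization documentation page, a QString is serialized as:

- If the string is null: 0xFFFFFFFF (quint32)
- Otherwise: the string length in bytes (quint32) followed by the data in UTF-16.

So the null bytes are not markers. In UTF-16 each ASCII character occupies two bytes, and in the default big-endian byte order the zero high byte comes first. No NUL terminator is written, because the length header already says where the data ends.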
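The documented QString layout can be sketched without Qt as follows. This is only an illustration under stated assumptions: `encodeUtf16String` is a hypothetical helper, the input is already a sequence of UTF-16 code units, the string is non-null, and the byte order is big-endian.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not a Qt API): 4-byte big-endian length in bytes,
// followed by each UTF-16 code unit written high byte first.
std::vector<std::uint8_t> encodeUtf16String(const std::u16string& s) {
    assert(s.size() <= 0x7FFFFFFFu);  // byte length must fit in 32 bits
    const std::uint32_t byteLen = static_cast<std::uint32_t>(s.size() * 2);
    std::vector<std::uint8_t> out;
    for (int shift = 24; shift >= 0; shift -= 8) {
        out.push_back(static_cast<std::uint8_t>(byteLen >> shift));
    }
    for (char16_t unit : s) {
        out.push_back(static_cast<std::uint8_t>(unit >> 8));    // high byte (0 for ASCII)
        out.push_back(static_cast<std::uint8_t>(unit & 0xFF));  // low byte
    }
    return out;
}
```

For u"hello" the result is 14 bytes: 00 00 00 0A, then 00 'h' 00 'e' 00 'l' 00 'l' 00 'o', with no trailing NUL.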

If you don't like that format, then instead of serializing the QString directly you could do something like

stream << str.toUtf8();

That way the string data in your QByteArray would be in a simpler format (UTF-8), although the serialized QByteArray still begins with a quint32 length header.

Why are there so many null bytes?

They appear in fixed-size header fields when the length value being encoded is small, as the high bytes of ASCII characters stored in UTF-16, or as the terminator of a NUL-terminated C string.

How can I remove them, including the last null byte?

You could add the string in your preferred format (no NUL terminator but with a single length header-byte) like this:

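const char * hello = "hello";
const size_t len = strlen(hello);                    // needs <cstring>; must be <= 255 to fit in one byte
const char slen = static_cast<char>(len);            // single length-header byte
stream.writeRawData(&slen, 1);                       // write the length byte
stream.writeRawData(hello, static_cast<int>(len));   // write the string bytes, no NUL terminator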
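The one-byte-length format only works if the writer rejects strings longer than 255 bytes and the reader checks that the promised bytes are really there. Here is a hedged sketch of both sides in plain standard C++. The function names `appendShortString` and `readShortString` are hypothetical, and a `std::vector` stands in for the QByteArray.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical writer: one length byte, then the raw bytes, no NUL terminator.
// Returns false (and writes nothing) if the string does not fit in one byte.
bool appendShortString(std::vector<std::uint8_t>& out, const std::string& s) {
    if (s.size() > 255) {
        return false;
    }
    out.push_back(static_cast<std::uint8_t>(s.size()));
    out.insert(out.end(), s.begin(), s.end());
    return true;
}

// Hypothetical reader: parses one record starting at pos and advances pos.
// Returns false if the buffer is too short for the declared length.
bool readShortString(const std::vector<std::uint8_t>& in, std::size_t& pos,
                     std::string& result) {
    if (pos >= in.size()) {
        return false;
    }
    const std::size_t len = in[pos];
    if (in.size() - pos - 1 < len) {
        return false;
    }
    result.assign(in.begin() + pos + 1, in.begin() + pos + 1 + len);
    pos += 1 + len;
    return true;
}
```

Writing "hello" this way produces six bytes (05 followed by the five characters). Reading it back returns the same string and leaves `pos` at the end of the record.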

... but if you have the choice, I highly recommend just keeping the NUL-terminator bytes at the end of the strings, for these reasons:

  1. A single preceding length-byte will limit your strings to 255 bytes long (or less), which is an unnecessary restriction that will likely haunt you in the future.

  2. Avoiding the NUL-terminator byte doesn't actually save any space, because you've added a string-length byte to compensate.

  3. If the NUL-terminator byte is there, you can simply pass a pointer to the first byte of the string directly to any code that expects a C-style string, and it will be able to use the string immediately (without any data-conversion steps). If you rely on a different convention instead, you'll end up having to make a copy of the entire string before you can pass it to that code, just so that you can append a NUL byte to the end of the string so that the C-string-expecting code can use it. That will be CPU-inefficient and error-prone.

Answered by Jeremy Friesner, Oct 19 '22 00:10