I have a lot of functions that expect a string as an argument, for which I use char*, but all my functions that expect a byte array also use char*.
The problem is that I can easily make the mistake of passing a byte-array in a string-function, causing all kinds of overflows, because the null-terminator cannot be found.
How is this usually dealt with? I can imagine changing all my byte-array functions to take a uint8_t*, and then the compiler will warn about signedness when I pass a string. Or what is the right approach here?
Strings are basically a standardized array of bytes, so you can use them easily in your code, not worrying about encoding. An array of bytes is binary, and can contain any encoding.
A string is a sequence of characters; these are an abstract concept, and can't be directly stored on disk. A byte string is a sequence of bytes - things that can be stored on disk.
I generally make an array type something like the following:
typedef struct {
    unsigned char* data;
    unsigned long length;
    unsigned long max_length;
} array_t;
then pass array_t* around and create array functions that take array_t*:
void array_create(array_t* a, unsigned long length); // allocates memory, sets max_length, sets length to zero
void array_add(array_t* a, unsigned char byte); // adds a byte
etc.
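A minimal sketch of how such an array type might be implemented is shown here. The int return values used to report failure, the bounds check in array_add, and the array_destroy helper are assumptions added for illustration; the answer above only names array_create and array_add and declares them as returning void.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    unsigned char *data;
    unsigned long length;      /* bytes currently stored */
    unsigned long max_length;  /* bytes allocated */
} array_t;

/* Allocates room for max_length bytes and starts with zero length.
   Returns 0 on success, -1 if allocation fails (assumed convention). */
int array_create(array_t *a, unsigned long max_length)
{
    assert(a != NULL);
    /* Request at least one byte so a zero capacity still yields a valid pointer. */
    a->data = malloc(max_length ? max_length : 1);
    if (a->data == NULL)
        return -1;
    a->length = 0;
    a->max_length = max_length;
    return 0;
}

/* Appends one byte. Returns 0 on success, -1 if the array is already full. */
int array_add(array_t *a, unsigned char byte)
{
    assert(a != NULL && a->data != NULL);
    if (a->length >= a->max_length)
        return -1;
    a->data[a->length++] = byte;
    return 0;
}

/* Hypothetical helper: releases the storage and resets the fields. */
void array_destroy(array_t *a)
{
    assert(a != NULL);
    free(a->data);
    a->data = NULL;
    a->length = 0;
    a->max_length = 0;
}
```

Because the length travels with the data, a function taking array_t* never needs to search for a null terminator, and passing a plain char* string where an array_t* is expected is a type error rather than a silent overflow.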
The problem is more general in C than you are thinking. Since char* and char[] are equivalent for function parameters, such a parameter may refer to three different semantic concepts:
- a pointer to one char object (this is the "official" definition of pointer types)
- a char array
- a string
In most cases where it is possible, the modern interfaces in the C standard use void* for an untyped byte array, and you should probably adhere to that convention and use char* only for strings.
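As a rough illustration of that convention, the sketch below pairs a string function taking const char* with a byte-array function taking const void* plus an explicit length. The names count_spaces and checksum are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* String function: relies on the terminating '\0'. */
size_t count_spaces(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s)
        if (*s == ' ')
            ++n;
    return n;
}

/* Byte-array function: no terminator is assumed, so the length
   is passed explicitly and the data arrives as untyped bytes. */
unsigned checksum(const void *buf, size_t len)
{
    const unsigned char *p = buf;  /* view the bytes as unsigned char */
    unsigned sum = 0;
    for (size_t i = 0; i < len; ++i)
        sum += p[i];
    return sum;
}
```

Note that void* converts implicitly to and from other object pointer types in C, so the convention mostly documents intent; the explicit length is what keeps checksum from running off the end. If the buffers themselves are declared as unsigned char arrays, passing one to a const char* parameter is a type mismatch that compilers commonly warn about (GCC does so with -Wall, for example).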
char[] arrays by themselves are probably rarely used as such; I can't imagine many use cases for them. If you think of the elements as numbers, you should use the signed or unsigned variant; if you see them just as bit patterns, unsigned char should be your choice.
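A small sketch of that distinction follows, using two hypothetical helpers: high_nibble treats its argument as a bit pattern, and midpoint treats its arguments as small signed numbers. Plain char is avoided in both because its signedness is implementation-defined.

```c
#include <assert.h>

/* Bit pattern: unsigned char, so the shift never involves a negative value. */
unsigned high_nibble(unsigned char b)
{
    return b >> 4;
}

/* Small number: spell out signed char instead of relying on plain char. */
int midpoint(signed char a, signed char b)
{
    /* Both operands are promoted to int before the addition,
       so the sum cannot overflow the char range. */
    return (a + b) / 2;
}
```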
If you really mean an array as a function parameter (char or not), you can mark that fact for the casual reader of your code by clearly indicating it:
void toto(size_t n, char A[const n]);
This is equivalent to
void toto(size_t n, char *const A);
but makes your intention clearer. And in the future there might even be tools that do the bounds checking for you.
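For instance, a function written in this style might look like the sketch below. The function name fill and its value parameter are assumptions made for illustration. This array-parameter syntax is C99 and later; C++ compilers do not accept it.

```c
#include <assert.h>
#include <stddef.h>

/* The declarator documents that A is meant to be an array of n chars.
   The const applies to the pointer A itself, not to the elements. */
void fill(size_t n, char A[const n], char value)
{
    for (size_t i = 0; i < n; ++i)
        A[i] = value;
}
```

A caller would pass the element count first, as in fill(sizeof buf, buf, 0) for a local char buf[16]. The compiler still treats A as a plain pointer, so nothing stops a caller from passing a wrong n; the notation is documentation for the reader.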