
Distinguish between string and byte array?

I have a lot of functions that expect a string as argument, for which I use char*, but all my functions that expect a byte-array, also use char*.

The problem is that I can easily make the mistake of passing a byte array to a string function, causing all kinds of overflows, because no null terminator is found.

How is this usually dealt with? I can imagine changing all my byte-array functions to take a uint8_t*, and then the compiler will warn about signedness when I pass a string. Or what is the right approach here?

Maestro asked Feb 02 '13 20:02



2 Answers

I generally define an array type along these lines:

typedef struct {
   unsigned char* data;
   unsigned long length;
   unsigned long max_length;
} array_t;

Then I pass array_t* around and write array functions that take array_t*:

void array_create(array_t* a, unsigned long length);  // allocates memory, sets max_length, sets length to zero

void array_add(array_t* a, unsigned char byte);  // adds a byte

etc
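A minimal sketch of how those two functions might be filled in. The int return values (0 on success, -1 on allocation failure), the doubling growth policy, and the array_free helper are assumptions added for illustration; they are not part of the answer above.

```c
#include <stdlib.h>

typedef struct {
    unsigned char *data;
    unsigned long length;      /* bytes currently stored */
    unsigned long max_length;  /* bytes allocated */
} array_t;

/* Allocates room for `length` bytes; the array itself starts empty.
   Returns 0 on success, -1 on allocation failure (assumed convention). */
int array_create(array_t *a, unsigned long length)
{
    a->data = malloc(length ? length : 1);  /* avoid a zero-sized request */
    if (a->data == NULL)
        return -1;
    a->length = 0;
    a->max_length = length;
    return 0;
}

/* Appends one byte, doubling the buffer when it is full (assumed policy). */
int array_add(array_t *a, unsigned char byte)
{
    if (a->length == a->max_length) {
        unsigned long new_max = a->max_length ? a->max_length * 2 : 16;
        unsigned char *p = realloc(a->data, new_max);
        if (p == NULL)
            return -1;          /* the old buffer is still valid */
        a->data = p;
        a->max_length = new_max;
    }
    a->data[a->length++] = byte;
    return 0;
}

/* Hypothetical cleanup helper, not mentioned in the answer. */
void array_free(array_t *a)
{
    free(a->data);
    a->data = NULL;
    a->length = 0;
    a->max_length = 0;
}
```

Because array_t* is a distinct type from char*, passing a string where a byte array is expected (or the other way round) becomes a type error rather than a silent overflow.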

Keith Nicholas answered Oct 02 '22 16:10


The problem is more general in C than you are thinking. Since char* and char[] are equivalent for function parameters, such a parameter may refer to three different semantic concepts:

  • a pointer to a single char object (this is the "official" definition of pointer types)
  • a char array
  • a string

Wherever possible, the modern interfaces in the C standard use void* for an untyped byte array. You should probably adhere to that convention and use char* only for strings.
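As a rough illustration of that convention, here is one byte-array function and one string function side by side. The names count_byte and count_char are hypothetical, chosen only for this sketch.

```c
#include <stddef.h>

/* Byte-array function: void* plus an explicit length.
   It never looks for a '\0' terminator. */
size_t count_byte(const void *buf, size_t len, unsigned char value)
{
    const unsigned char *p = buf;  /* void* converts implicitly in C */
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        if (p[i] == value)
            n++;
    return n;
}

/* String function: char* only, relies on the null terminator. */
size_t count_char(const char *s, char c)
{
    size_t n = 0;
    for (; *s != '\0'; s++)
        if (*s == c)
            n++;
    return n;
}
```

If the binary buffers themselves are declared as unsigned char arrays, handing one to count_char is typically diagnosed by compilers such as GCC and Clang as a pointer-signedness mismatch. The reverse mistake is not caught, since void* accepts any object pointer, but it is harmless here because the length is always explicit.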

Plain char arrays that are not strings are probably rare; I can't imagine many use cases for them. If you think of the elements as numbers, you should use the signed or unsigned variant explicitly; if you see them just as bit patterns, unsigned char should be your choice.
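A small sketch of why unsigned char suits bit patterns. The helper name to_hex is hypothetical. On platforms where plain char is signed, a byte such as 0xff read through a char* would be negative, and converting it to unsigned for printing would give something like ffffffff instead of ff.

```c
#include <stddef.h>
#include <stdio.h>

/* Writes `len` bytes as lowercase hex into `out`,
   which must have room for 2 * len + 1 chars. */
void to_hex(const void *buf, size_t len, char *out)
{
    const unsigned char *p = buf;  /* each element is in 0..255 */
    for (size_t i = 0; i < len; i++)
        sprintf(out + 2 * i, "%02x", (unsigned)p[i]);
    out[2 * len] = '\0';
}
```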

If you really mean an array as a function parameter (char or not), you can signal that to the casual reader of your code by declaring it as one:

void toto(size_t n, char A[const n]);

This is equivalent to

void toto(size_t n, char *const A);

but makes your intention clearer. And in the future there might even be tools that do the bounds checking for you.
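A short sketch of a function written in that style. The name fill and its behaviour are made up for illustration. The size parameter has to come first so that n is in scope inside the array declarator, and the syntax assumes a compiler that supports C99 variably modified parameter types.

```c
#include <stddef.h>

/* Sets the first n elements of A to `value`.
   `const` inside the brackets makes the pointer A itself const,
   exactly as in `char *const A`. */
void fill(size_t n, char A[const n], char value)
{
    for (size_t i = 0; i < n; i++)
        A[i] = value;
}
```

A call such as fill(sizeof buf, buf, 0) then reads as "an array of this many elements", even though the compiler still treats A as a plain pointer.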

Jens Gustedt answered Oct 02 '22 17:10