Why doesn't C terminate strings with a special escaped string-termination character?

In C, strings are terminated with a null character ( \0 ), which causes problems when you want to put a null inside a string. Why not use a special escaped character such as \$ or something similar?

I am fully aware of how dumb this question is, but I was curious.

akway asked Jul 19 '09 22:07


1 Answer

Terminating with a 0 has many performance niceties, which were very much relevant back in the late 60s and early 70s, when C and its predecessors were designed.

CPUs have instructions for conditional jump on test for 0. In fact, some CPUs even have instructions which will iterate/copy a sequence of bytes up to the 0.
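To make the zero test concrete, here is a minimal sketch of a copy loop in C. The name copy_until_nul is hypothetical (the standard library's strcpy does the same job), and how a given compiler or CPU actually translates the loop is an assumption that varies by target.

```c
#include <stddef.h>

/* Copy a NUL-terminated string from src to dst and return the number of
   bytes copied, not counting the terminator. The caller must make sure
   dst is large enough. copy_until_nul is a hypothetical name. */
size_t copy_until_nul(char *dst, const char *src)
{
    size_t n = 0;

    /* The assignment yields the byte that was just copied, so the loop
       condition is a plain "is this byte zero?" test on a single byte. */
    while ((dst[n] = src[n]) != '\0') {
        n++;
    }
    return n;
}
```

Each iteration looks at exactly one byte and needs no state beyond the current position.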

If you used an escaped character instead, you would have to test TWO different bytes to detect the end of the string. Not only is that slower, but you also lose the ability to iterate one byte at a time, as you need a look-ahead or the ability to backtrack.
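As a rough illustration of that look-ahead, the sketch below measures a string under a made-up scheme: the two bytes \ and $ end the string, and a doubled backslash stands for one literal backslash. Both the scheme and the name escaped_len are assumptions for this example only; C has no such convention.

```c
#include <stddef.h>

/* Hypothetical scheme: the byte pair '\' '$' terminates the string, and
   any other pair starting with '\' (such as "\\") counts as one character.
   Returns the number of decoded characters before the terminator. */
size_t escaped_len(const char *s)
{
    size_t i = 0;   /* position in the raw bytes */
    size_t len = 0; /* decoded characters seen so far */

    for (;;) {
        if (s[i] == '\\') {
            /* Look-ahead: the backslash alone does not tell us whether
               this is the end, so a second byte has to be inspected. */
            if (s[i + 1] == '$') {
                return len;
            }
            i += 2; /* escaped pair, one decoded character */
        } else {
            i += 1; /* ordinary byte, which may even be a zero byte */
        }
        len++;
    }
}
```

A zero byte is now ordinary data, but every backslash costs a second test, and the raw byte count no longer matches the character count.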

Now, other languages (cough, Pascal, cough) use strings in a count/value style. For them, any character is valid, but they always keep a counter with the size of the string. The advantage is clear, but there are disadvantages to this technique too.

For one thing, the string size is limited by the number of bytes the count takes. One byte gives you 255 characters, two bytes give you 65535, etc. It might be almost irrelevant today, but adding two bytes to every string was once quite expensive.
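A counted string with a one-byte length could look roughly like the sketch below. The struct pstring and the function pstring_set are hypothetical names, and the fixed 255-byte buffer is a simplification rather than how any particular Pascal implementation lays out its strings.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical Pascal-style string: a one-byte count followed by the data.
   No terminator is stored, so any byte value is valid content. */
struct pstring {
    unsigned char len; /* 0..255, the limit imposed by a one-byte count */
    char data[255];
};

/* Store n bytes in p. Returns 0 on success, or -1 if n does not fit in
   the one-byte count. */
int pstring_set(struct pstring *p, const char *bytes, size_t n)
{
    if (n > 255) {
        return -1;
    }
    p->len = (unsigned char)n;
    memcpy(p->data, bytes, n);
    return 0;
}
```

Calling pstring_set(&p, "a\0b", 3) stores all three bytes, embedded zero included, while any input longer than 255 bytes is rejected because the count cannot represent it.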

Edit:

I do not think the question is dumb. In these days of high-level languages with memory management, incredible CPU power and obscene amounts of memory, such decisions from the past can well seem senseless. And, indeed, they MIGHT be senseless nowadays, so it's a fine thing to question them.

Daniel C. Sobral answered Oct 21 '22 08:10