
Java string replace and the NUL (NULL, ASCII 0) character?

Testing out someone else's code, I noticed a few JSP pages printing funky non-ASCII characters. Digging into the source, I found this tidbit:

// remove any periods from first name e.g. Mr. John --> Mr John
firstName = firstName.trim().replace('.','\0');

Does replacing a character in a String with a null character even work in Java? I know that '\0' will terminate a C-string. Would this be the culprit to the funky characters?

praspa asked Mar 26 '10 12:03



1 Answer

Does replacing a character in a String with a null character even work in Java? I know that '\0' will terminate a c-string.

That depends on what you mean by "work". Does it replace all occurrences of the target character with '\0'? Absolutely!

String s = "food".replace('o', '\0');
System.out.println(s.indexOf('\0')); // "1"
System.out.println(s.indexOf('d')); // "3"
System.out.println(s.length()); // "4"
System.out.println(s.hashCode() == 'f'*31*31*31 + 'd'); // "true"

Everything seems to work fine to me! indexOf can find it, it counts as part of the length, and its value for hash code calculation is 0; everything is as specified by the JLS/API.

It DOESN'T work if you expect that replacing a character with the null character will somehow remove that character from the string. It doesn't work like that: a null character is still a character!

String s = Character.toString('\0');
System.out.println(s.length()); // "1"
assert s.charAt(0) == 0;

It also DOESN'T work if you expect the null character to terminate a string. That is evident from the snippets above, but it is also clearly specified in the JLS (10.9. An Array of Characters is Not a String):

In the Java programming language, unlike C, an array of char is not a String, and neither a String nor an array of char is terminated by '\u0000' (the NUL character).
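The JLS statement can be checked directly. The following is a minimal sketch; the class name NulInCharArray and the helper build() are made up for illustration and are not part of the original question or answer.

```java
class NulInCharArray {
    // Hypothetical helper: builds a String from a char array with a NUL in the middle.
    static String build() {
        char[] chars = {'a', '\0', 'b'};
        return new String(chars);
    }

    public static void main(String[] args) {
        String s = build();
        // The NUL does not cut the string short; all three chars survive.
        System.out.println(s.length());        // 3
        System.out.println(s.charAt(2));       // b
        // '\0' and '\u0000' are two spellings of the same char value.
        System.out.println('\0' == '\u0000');  // true
    }
}
```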


Would this be the culprit to the funky characters?

Now we're talking about an entirely different thing, i.e. how the string is rendered on screen. Truth is, even "Hello world!" will look funky if you use a dingbats font. A Unicode string may look funky in one locale but not in another. Even a properly rendered Unicode string containing, say, Chinese characters, may still look funky to someone from, say, Greenland.

That said, the null character will probably look funky regardless; it's usually not a character that you want to display. Since the null character is not the string terminator in Java, Java itself handles it without trouble; how it appears depends on whatever renders the output.
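To confirm whether NUL characters are what ends up on the page, it can help to make them visible or strip them before output. This is only a sketch: the class NulDisplay and the helpers showNul and stripNul are hypothetical names, and the input mimics what the questioner's replace('.','\0') call would produce.

```java
class NulDisplay {
    // Hypothetical helper: turns each NUL into the visible two-character text \0.
    static String showNul(String s) {
        return s.replace("\0", "\\0");
    }

    // Hypothetical helper: drops NUL characters entirely.
    static String stripNul(String s) {
        return s.replace("\0", "");
    }

    public static void main(String[] args) {
        // Same effect as the code in the question: the period becomes a NUL.
        String firstName = "Mr. John".replace('.', '\0');
        System.out.println(firstName.length());   // 8 (the NUL still counts)
        System.out.println(showNul(firstName));   // Mr\0 John
        System.out.println(stripNul(firstName));  // Mr John
    }
}
```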


Now to address what we assume is the intended effect, i.e. removing all periods from a string: the simplest solution is to use the replace(CharSequence, CharSequence) overload.

System.out.println("A.E.I.O.U".replace(".", "")); // AEIOU 

The replaceAll solution is mentioned here too, but it works with regular expressions, which is why you need to escape the dot metacharacter, and it is likely to be slower.
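A side-by-side sketch of the two approaches, including what happens when the dot is left unescaped. The class and method names here are illustrative only.

```java
class PeriodRemoval {
    // Literal replacement: "." is treated as a plain period.
    static String viaReplace(String s) {
        return s.replace(".", "");
    }

    // Regex replacement: the dot must be escaped to match only a period.
    static String viaReplaceAll(String s) {
        return s.replaceAll("\\.", "");
    }

    // Regex replacement with an unescaped dot, which matches any character
    // except line terminators.
    static String viaUnescapedReplaceAll(String s) {
        return s.replaceAll(".", "");
    }

    public static void main(String[] args) {
        String input = "A.E.I.O.U";
        System.out.println(viaReplace(input));                       // AEIOU
        System.out.println(viaReplaceAll(input));                    // AEIOU
        System.out.println(viaUnescapedReplaceAll(input).length());  // 0
    }
}
```

Applied to the snippet in the question, the corrected line would presumably be firstName = firstName.trim().replace(".", "");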

polygenelubricants answered Sep 20 '22 17:09