
Strange character for empty line in TextWrangler and cat -v

I have a text file that I open with TextWrangler on my Mac. I enabled invisible characters to see the line endings, and every empty line has a red, upside-down question mark in it. Which character is this?

When I run cat -v file.txt in the terminal, it shows these characters as ^@ (and the line endings themselves as ^M). What I need to know is the regex for that specific character, like \n for the end of a line.

In the hex dump, I see the following:

0000000: 312e 300d 0a00 0d0a 2231 3130 3030 3030  1.0....."1100000
0000010: 3030 3222 3b22 3922 3b22 5354 4422 3b3b  002";"9";"STD";;
0000020: 3b0d 0a22 3131 3030 3030 3030 3639 223b  ;.."1100000069";

If I manually remove the strange characters, and make a new hex dump, I see:

0000000: 312e 300d 0a0d 0a22 3131 3030 3030 3030  1.0...."11000000
0000010: 3032 223b 2239 223b 2253 5444 223b 3b3b  02";"9";"STD";;;
0000020: 0d0a 2231 3130 3030 3030 3036 3922 3b22  .."1100000069";"

The difference is a byte sequence 00. Is there an encoding in which this 00 is required for empty lines?

physicalattraction asked May 27 '15 14:05


1 Answer

The red inverted question mark you are looking at is apparently a NUL (null) character, which matches the ^@ shown by cat -v and the 00 byte in your hex dump. Whether it makes any difference depends on the application writing or reading the file in question, so it is most likely not a general encoding issue. (Compare: Wikipedia.)
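
To illustrate why a single 00 on empty lines does not look like an encoding artifact, here is a minimal sketch in Python 3 (my choice for illustration; nothing in the question uses it). It assumes the first line of the file is the text "1.0" followed by CRLF and an empty CRLF line, as the hex dump suggests, and compares how many 00 bytes UTF-8 and UTF-16 would produce for that text.

```python
# Text that the start of the hex dump appears to represent (an assumption).
text = "1.0\r\n\r\n"

utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")

# First 8 bytes copied from the hex dump in the question.
dump_prefix = bytes.fromhex("312e300d0a000d0a")

# Count the 00 bytes in each variant.
utf8_nuls = utf8_bytes.count(0)
utf16_nuls = utf16_bytes.count(0)
dump_nuls = dump_prefix.count(0)

print(utf8_nuls, utf16_nuls, dump_nuls)
```

In UTF-16 every one of these ASCII characters would carry a 00 byte, and in UTF-8 none would, so one stray 00 per empty line fits neither pattern.
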
Once you have made the hidden characters visible in TextWrangler, you can select that character (or any character or character sequence, for that matter) and copy it to the Find input field using CMD + E. The NUL character shows up as \x{00} on my machine.
Alternatively, you can use Text -> Zap Gremlins... with (at least) Null (ASCII 0) characters checked and Replace with code selected, which gives you \x00. Either form should work when searching for these characters, whether or not grep is enabled. I am not sure whether \s should also find it in grep mode; it does not on my machine. But \W does match it.
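
If you would rather check or clean the file outside the editor, the same idea can be sketched in Python 3. This is only an illustration under stated assumptions: Python's re module is not the engine TextWrangler uses, so its \s and \W behavior may differ from the editor's, and the bytes below are just the first 16 bytes of the hex dump from the question.

```python
import re

# First 16 bytes of the original hex dump (copied from the question).
raw = bytes.fromhex("312e300d0a000d0a2231313030303030")

# \x00 as a regex matches the NUL byte; collect the offsets where it occurs.
nul_offsets = [m.start() for m in re.finditer(rb"\x00", raw)]

# In Python's re, NUL is not whitespace but is a non-word character.
matches_whitespace = re.search(rb"\s", b"\x00") is not None
matches_nonword = re.search(rb"\W", b"\x00") is not None

# Remove every NUL byte, leaving the CRLF line endings untouched.
cleaned = raw.replace(b"\x00", b"")

print(nul_offsets, matches_whitespace, matches_nonword)
print(cleaned.hex())
```

For a real file you would read it in binary mode (open(path, "rb")) so that no decoding step hides or alters the 00 bytes. The cleaned bytes here line up with the start of the second hex dump in the question.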

Please comment if this needs adjustment or further detail.

Abecee answered Sep 18 '22 07:09