Removing NUL characters from bytes

Tags: go

To teach myself Go, I'm building a simple server that takes some input, does some processing, and sends output (which includes the original input) back to the client.

The input varies in length from roughly 5 to 13 characters, plus line endings and whatever other guff the client sends.

The input is read into a byte array and then converted to a string for some processing. Another string is appended to this string and the whole thing is converted back into a byte array to get sent back to the client.

The problem is that the input is padded with a bunch of NUL characters, and I'm not sure how to get rid of them.

So I could loop through the array and when I come to a nul character, note the length (n), create a new byte array of that length, and copy the first n characters over to the new byte array and use that. Is that the best way, or is there something to make this easier for me?

Some stripped down code:

data := make([]byte, 16)
c.Read(data)

s := strings.Replace(string(data[:]), "\n", "", -1)
s = strings.Replace(s, "\r", "", -1)
s += "some other string"
response := []byte(s)
c.Write(response)
c.Close()

Also if I'm doing anything else obviously stupid here it would be nice to know.

Tom Carrick asked Mar 15 '13 11:03



3 Answers

In package "bytes", func Trim(s []byte, cutset string) []byte is your friend:

Trim returns a subslice of s by slicing off all leading and trailing UTF-8-encoded Unicode code points contained in cutset.

// Remove leading and trailing NUL bytes from b (NUL bytes in the middle are kept)
b = bytes.Trim(b, "\x00")
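
To make the behaviour of the call above concrete, here is a minimal sketch. The helper names trimNUL and trimTrailingNUL are made up for illustration; bytes.Trim and bytes.TrimRight are the real library functions. It assumes the padding is only at the end of the buffer, as in the question, in which case bytes.TrimRight is enough.

```go
package main

import (
	"bytes"
	"fmt"
)

// trimNUL (hypothetical helper) returns a subslice of b with leading and
// trailing NUL bytes removed. Nothing is copied: the result shares memory
// with b.
func trimNUL(b []byte) []byte {
	return bytes.Trim(b, "\x00")
}

// trimTrailingNUL (hypothetical helper) removes only trailing NUL bytes,
// which matches a zero-filled read buffer.
func trimTrailingNUL(b []byte) []byte {
	return bytes.TrimRight(b, "\x00")
}

func main() {
	// Simulate a 16-byte buffer that was only partly filled.
	buf := make([]byte, 16)
	copy(buf, "hello\r\n")

	fmt.Printf("%q\n", trimNUL(buf)) // prints "hello\r\n"

	// NUL bytes in the middle survive; only the ends are trimmed.
	fmt.Printf("%q\n", trimTrailingNUL([]byte("\x00a\x00b\x00\x00"))) // prints "\x00a\x00b"
}
```

Note that the line endings are still present after trimming, so they need to be removed separately.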
zzzz answered Oct 16 '22 20:10


Your approach sounds basically right. Some remarks:

  1. When you have found the index of the first nul byte in data, you don't need to copy, just truncate the slice: data[:idx].

  2. bytes.Index should be able to find that index for you.

  3. There is also bytes.Replace so you don't need to convert to string.
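
The three remarks above can be sketched roughly as follows. The function names truncateAtNUL and stripLineEndings are invented for this example, and it uses bytes.IndexByte (the single-byte variant of bytes.Index) because the search target is one byte.

```go
package main

import (
	"bytes"
	"fmt"
)

// truncateAtNUL (hypothetical helper) returns data up to, but not including,
// the first NUL byte. It reslices instead of copying.
func truncateAtNUL(data []byte) []byte {
	if idx := bytes.IndexByte(data, 0); idx >= 0 {
		return data[:idx]
	}
	return data // no NUL byte found
}

// stripLineEndings (hypothetical helper) removes every "\r" and "\n" while
// staying in []byte. bytes.Replace returns a new slice each time.
func stripLineEndings(data []byte) []byte {
	data = bytes.Replace(data, []byte("\r"), nil, -1)
	return bytes.Replace(data, []byte("\n"), nil, -1)
}

func main() {
	// Simulate the padded 16-byte buffer from the question.
	buf := make([]byte, 16)
	copy(buf, "12345\r\n")

	line := stripLineEndings(truncateAtNUL(buf))
	response := append(line, "some other string"...)
	fmt.Printf("%q\n", response) // prints "12345some other string"
}
```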

Thomas Kappler answered Oct 16 '22 19:10


The io.Reader documentation says:

Read reads up to len(p) bytes into p. It returns the number of bytes read (0 <= n <= len(p)) and any error encountered.

If the call to Read in the application does not read 16 bytes, then data will have trailing zero bytes. Use the number of bytes read to trim the zero bytes from the buffer.

data := make([]byte, 16)
n, err := c.Read(data)
if err != nil {
   // handle error
}
data = data[:n]

There's another issue. There's no guarantee that Read slurps up all of the "message" sent by the peer. The application may need to call Read more than once to get the complete message.

You mention endlines in the question. If the message from the client is terminated by a newline, then use bufio.Scanner to read lines from the connection:

 s := bufio.NewScanner(c)
 if s.Scan() {
     data = s.Bytes() // data is the next line with the trailing "\n" (and any "\r" before it) removed
 }
 if s.Err() != nil {
     // handle error
 } 
user13631587 answered Oct 16 '22 21:10