
Convert ASCII byte[] to String

I am trying to pass a byte[] containing ASCII characters to log4j, to be logged to a file using the obvious representation. When I simply pass in the byte[], it is of course treated as an object and the logs are pretty useless. When I try to convert the arrays to strings using new String(byte[] data), the performance of my application is halved.

How can I efficiently pass them in without incurring the approximately 30 µs time penalty of converting them to strings?

Also, why does it take so long to convert them?

Thanks.

Edit

I should add that I am optimising for latency here, and yes, 30 µs does make a difference! Also, these arrays vary in size from ~100 bytes all the way up to a few thousand bytes.

asked Feb 04 '10 17:02 by jwoolard



2 Answers

ASCII is one of the few encodings that can be converted to/from UTF-16 with no arithmetic or table lookups, so it's possible to convert manually:

String convert(byte[] data) {
    StringBuilder sb = new StringBuilder(data.length);
    for (int i = 0; i < data.length; ++i) {
        // Java bytes are signed, so any value above 0x7F shows up as negative.
        if (data[i] < 0) throw new IllegalArgumentException("non-ASCII byte at index " + i);
        sb.append((char) data[i]);
    }
    return sb.toString();
}

But make sure it really is ASCII, or you'll end up with garbage.
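For comparison, here is a minimal sketch that puts a manual loop next to the library constructor that takes an explicit charset. The class and method names are hypothetical, and StandardCharsets assumes Java 7 or later. The two differ in how they treat non-ASCII input: the loop throws, while the constructor silently substitutes U+FFFD. Which one is faster depends on the JVM, so measure before relying on either.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class AsciiDecodeSketch {
    // Manual widening of each byte to a char; rejects bytes outside 0..127.
    static String manual(byte[] data) {
        char[] chars = new char[data.length];
        for (int i = 0; i < data.length; i++) {
            if (data[i] < 0) {
                throw new IllegalArgumentException("non-ASCII byte at index " + i);
            }
            chars[i] = (char) data[i];
        }
        return new String(chars);
    }

    // Library decoding with an explicit charset; bytes outside ASCII are
    // replaced with U+FFFD instead of raising an exception.
    static String viaCharset(byte[] data) {
        return new String(data, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] data = "GET /index.html".getBytes(StandardCharsets.US_ASCII);
        System.out.println(manual(data).equals(viaCharset(data)));
    }
}
```

Naming the charset also removes the dependence on the platform default encoding that new String(byte[]) has.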

answered Oct 05 '22 18:10 by finnw


What you want to do is delay processing of the byte[] array until log4j decides that it actually wants to log the message. This way you could log it at DEBUG level, for example, while testing and then disable it during production. For example, you could:

final byte[] myArray = ...;
Logger.getLogger(MyClass.class).debug(new Object() {
    @Override public String toString() {
        return new String(myArray);
    }
});

Now you don't pay the speed penalty unless you actually log the data, because the toString method isn't called until log4j decides it'll actually log the message!
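The same idea can be packaged as a small reusable wrapper. The sketch below is not log4j code: LazyAscii and logIfEnabled are hypothetical names, and logIfEnabled only stands in for a logger that formats its message when the level is enabled. The counter is there purely to show when the conversion runs.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical wrapper that defers byte[] -> String conversion to toString().
final class LazyAscii {
    private final byte[] data;
    private int conversions = 0; // demo only: counts how often we converted

    LazyAscii(byte[] data) {
        this.data = data;
    }

    int conversions() {
        return conversions;
    }

    @Override
    public String toString() {
        conversions++;
        return new String(data, StandardCharsets.US_ASCII);
    }

    // Stand-in for a logger: formats the message only when enabled.
    static String logIfEnabled(boolean enabled, Object message) {
        return enabled ? String.valueOf(message) : null;
    }

    public static void main(String[] args) {
        LazyAscii msg = new LazyAscii("hello".getBytes(StandardCharsets.US_ASCII));
        logIfEnabled(false, msg);
        System.out.println(msg.conversions());       // prints 0
        System.out.println(logIfEnabled(true, msg)); // prints hello
        System.out.println(msg.conversions());       // prints 1
    }
}
```

One caveat with this design: the wrapper keeps a reference to the array rather than a copy, so if the array is modified before the message is formatted, the logged text reflects the modified contents.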

I'm not sure what you mean by "the obvious representation", so I've assumed that you mean converting to a String by decoding the bytes with the platform's default character encoding. If you are dealing with binary data, this is obviously worthless. In that case I'd suggest using Arrays.toString(byte[]) to create a formatted string along the lines of

[54, 23, 65, ...]
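A short sketch of that call, with a hypothetical class name. Note that Java bytes are signed, so values above 0x7F print as negative numbers.

```java
import java.util.Arrays;

// Hypothetical helper class, for illustration only.
final class ByteDumpSketch {
    // Formats the array as a bracketed list of signed decimal values.
    static String dump(byte[] data) {
        return Arrays.toString(data);
    }

    public static void main(String[] args) {
        System.out.println(dump(new byte[] {54, 23, 65}));  // prints [54, 23, 65]
        System.out.println(dump(new byte[] {(byte) 0xFF})); // prints [-1]
    }
}
```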
answered Oct 05 '22 17:10 by Steven Schlansker