
Java Strings storing byte arrays

I want to store a byte array wrapped in a String object. Here's the scenario

  1. The user enters a password.
  2. The bytes of that password are obtained using the getBytes() String method.
  3. The bytes are encrypted using Java's crypto package.
  4. Those bytes are then converted into a String using the constructor new String(byte[])
  5. That String is stored, or otherwise passed around (NOT changed)
  6. The bytes of that String are obtained, and they are different from the encrypted bytes.

Here's a snippet of code that describes what I'm talking about.

String s = "test123";
byte[] a = s.getBytes();
byte[] b = env.encrypt(a);
String t = new String(b);
byte[] c = t.getBytes();
byte[] d = env.decrypt(c);

Where env.encrypt() and env.decrypt() do the encryption and decryption. The problem I'm having is that the b array is of length 8 and the c array is of length 16. I would think that they would be equal. What's going on here? I tried to modify the code as below

String s = "test123";
Charset charset = Charset.defaultCharset();
byte[] a = s.getBytes(charset);
byte[] b = env.encrypt(a);
String t = new String(b, charset);
byte[] c = t.getBytes(charset);
byte[] d = env.decrypt(c);

but that didn't help.

Any ideas?

asked Nov 28 '22 00:11 by Jon


2 Answers

It's not a good idea to store binary data in a String object. You'd be better off using something like Base64 encoding, which is intended to make binary data into a printable string, and is completely reversible.

In fact, I just found a public domain base64 encoder for Java: http://iharder.sourceforge.net/current/java/base64/
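
As a rough sketch of the Base64 approach: on Java 8 or later the standard library's java.util.Base64 can be used instead of a third-party encoder. The class name, method names, and sample bytes below are made up for illustration; the byte array stands in for whatever env.encrypt() returns.

```java
import java.util.Arrays;
import java.util.Base64;

public class Base64RoundTrip {
    // Encode arbitrary bytes as printable ASCII text.
    static String toText(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Recover exactly the bytes that were encoded.
    static byte[] fromText(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for encrypted output: bytes that are not valid text.
        byte[] cipherBytes = {(byte) 0xFF, (byte) 0x80, 0x00, 0x7F,
                              (byte) 0xC3, 0x28, (byte) 0xA0, (byte) 0xE2};
        String stored = toText(cipherBytes);   // safe to store or pass around
        byte[] restored = fromText(stored);    // same bytes as cipherBytes
        System.out.println(stored);
        System.out.println(Arrays.equals(cipherBytes, restored)); // prints true
    }
}
```

The String here only ever holds Base64 characters, so no character encoding can alter the underlying bytes.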

answered Dec 04 '22 00:12 by Jonathan


Several people have pointed out that this is not a proper use of the String(byte[]) constructor. Remember that a Java String is made up of chars, which are 16 bits wide, not 8 bits like a byte. You are also forgetting about character encoding: a character is often not a single byte.

Let's break it down bit by bit:

String s = "test123";
byte[] a = s.getBytes();

At this point your byte array most likely contains 7 bytes, one per character of "test123", if your system's default character encoding is Windows-1252, ISO-8859-1, or UTF-8.

byte[] b = env.encrypt(a);

Now b contains some seemingly random data depending on your encryption, and isn't even guaranteed to be a certain length. Many encryption engines pad the input data so that the output matches a certain block size.
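
To illustrate the padding point, here is a small sketch using AES with PKCS5 padding from javax.crypto. This is an assumption for demonstration only: the question does not say which cipher env.encrypt() uses (an 8-byte result suggests a cipher with an 8-byte block), and the class and method names are invented.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

public class PaddingDemo {
    // Encrypts with a throwaway AES key and reports the ciphertext length.
    static int encryptedLength(byte[] plain) {
        try {
            KeyGenerator keyGen = KeyGenerator.getInstance("AES");
            keyGen.init(128);
            SecretKey key = keyGen.generateKey();
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            // No IV supplied, so the provider generates a random one.
            cipher.init(Cipher.ENCRYPT_MODE, key);
            return cipher.doFinal(plain).length;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] a = "test123".getBytes(StandardCharsets.UTF_8);
        System.out.println(a.length);           // 7 plaintext bytes
        System.out.println(encryptedLength(a)); // 16: padded up to one AES block
    }
}
```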

String t = new String(b);

This is taking your random bytes and asking Java to interpret them as character data. These characters may appear as gibberish and some sequences of bits are not valid characters for every encoding. Java dutifully does its best and creates a sequence of 16-bit chars.

byte[] c = t.getBytes();

This may or may not give you the same byte array as b, depending on the encoding. You say that c is 16 bytes long; this is probably because the garbage in t doesn't convert cleanly in the default character encoding: byte sequences that were invalid when decoding are typically replaced with a substitute character, which can encode back to a different number of bytes.
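
The effect can be reproduced without any encryption. The sketch below assumes UTF-8 in both directions and uses a few hand-picked bytes that are not valid UTF-8; the class and method names are hypothetical, and the bytes your cipher produces, as well as your default encoding, will differ.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class LossyRoundTrip {
    // Mimics new String(b) followed by t.getBytes(), with the charset made explicit.
    static byte[] throughString(byte[] data) {
        String asText = new String(data, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0xFF, 0xFE and a lone 0x80 are not valid UTF-8; 0x41 is the letter 'A'.
        byte[] b = {(byte) 0xFF, (byte) 0xFE, 0x41, (byte) 0x80};
        byte[] c = throughString(b);
        System.out.println(b.length + " bytes in, " + c.length + " bytes out");
        System.out.println(Arrays.equals(b, c)); // false: the original bytes are lost
    }
}
```

Each invalid byte is decoded to the replacement character U+FFFD, which takes several bytes when encoded back to UTF-8, so the array grows and the original values cannot be recovered.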

byte[] d = env.decrypt(c);

This won't work because c is not the data you expect it to be but rather is corrupt.

Solutions:

  1. Just store the byte array directly in the database or wherever. However, you are still forgetting about the character encoding problem; more on that in a moment.
  2. Take the byte array data and encode it using Base64 or as hexadecimal digits and store that string:

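    // Encode the plaintext with an explicit charset, then encrypt
    byte[] cypherBytes = env.encrypt(plainText.getBytes("UTF-8"));
    // Two hex digits per byte, so preallocate twice the length
    StringBuilder cypherText = new StringBuilder(cypherBytes.length * 2);
    for (byte b : cypherBytes) {
      // %02X prints the byte as two uppercase, zero-padded hex digits
      cypherText.append(String.format("%02X", b)); //$NON-NLS-1$
    }
    return cypherText.toString();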
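
The hex loop above only covers encoding. To decrypt later you also need to turn the hex string back into bytes; one possible way is sketched here, with the class and method names invented for the example.

```java
import java.util.Arrays;

public class HexCodec {
    // Same idea as the loop above: two uppercase hex digits per byte.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    // Reverse direction: parse each pair of hex digits back into one byte.
    static byte[] fromHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string must have even length");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] cipherBytes = {0x00, 0x0F, (byte) 0xFF}; // stand-in for encrypted data
        String stored = toHex(cipherBytes);
        System.out.println(stored); // prints 000FFF
        System.out.println(Arrays.equals(cipherBytes, fromHex(stored))); // prints true
    }
}
```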
    

Character encoding:

A user's password may not be ASCII and thus your system is susceptible to problems because you don't specify the encoding.

Compare:

String s = "tést123";
byte[] a = s.getBytes();
byte[] b = env.encrypt(a);

with

String s = "tést123";
byte[] a = s.getBytes("UTF-8");
byte[] b = env.encrypt(a);

The byte array a won't have the same value with the UTF-8 encoding as with the system default (unless your system default is UTF-8). It doesn't matter what encoding you use as long as A) you're consistent and B) your encoding can represent all the allowable characters for your data. You probably can't store Chinese text in the system default encoding. If your application is ever deployed on more than one computer, and one of those has a different system-default encoding, passwords encrypted on one system will become gibberish on the other system.
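
A quick way to see the difference is to encode the same string with two explicit charsets. This sketch uses ISO-8859-1 as a stand-in for a typical single-byte system default, which is an assumption; the class and method names are illustrative.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingCompare {
    // Number of bytes the string occupies in the given charset.
    static int encodedLength(String s, Charset charset) {
        return s.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String s = "t\u00e9st123"; // "tést123", written with an escape to avoid source-encoding issues
        System.out.println(encodedLength(s, StandardCharsets.ISO_8859_1)); // 7: é is one byte
        System.out.println(encodedLength(s, StandardCharsets.UTF_8));      // 8: é is two bytes
    }
}
```

Because the two byte arrays differ, encrypting them gives different results, which is why the charset should be named explicitly on every getBytes() and new String() call.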

Moral of the story: Characters are not bytes and bytes are not characters. You have to remember which you are dealing with and how to convert back and forth between them.

answered Dec 04 '22 00:12 by Mr. Shiny and New 安宇