Java String encoding (UTF-8)

Tags:

java

string

encoding

I have come across this line of legacy code, which I am trying to figure out:

String newString = new String(oldString.getBytes("UTF-8"), "UTF-8");

As far as I can understand, it is encoding and decoding using the same charset.

How is this different from the following?

String newString = oldString;

Is there any scenario in which the two lines will have different outputs?

p.s.: Just to clarify, yes I am aware of the excellent article on encoding by Joel Spolsky !


asked Jan 13 '12 16:01

OceanBlue

2 Answers

This could be a complicated way of doing

String newString = new String(oldString);

This shortens the String if the underlying char[] is much longer than the String itself (for example on older JVMs, where substring() shared its parent's array).

More specifically, though, the round trip only preserves characters that can be encoded as UTF-8.

There are some "characters" you can have in a String that cannot be encoded, and these are turned into '?'.

Any unpaired surrogate, i.e. a char between \uD800 and \uDFFF that is not part of a valid surrogate pair, cannot be encoded and will be turned into '?':

String oldString = "\uD800";
String newString = new String(oldString.getBytes("UTF-8"), "UTF-8");
System.out.println(newString.equals(oldString));

prints

false
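As a self-contained illustration of the point above, here is a minimal sketch. The class name Utf8RoundTrip and the helper roundTrip are made up for this example, and it uses StandardCharsets.UTF_8 (Java 7+) instead of the "UTF-8" string so that no checked UnsupportedEncodingException has to be handled. It also shows that a properly paired surrogate survives the round trip, and that CharsetEncoder.canEncode can detect the problem without silently replacing anything.

```java
import java.nio.charset.StandardCharsets;

public class Utf8RoundTrip {
    // Hypothetical helper: encodes to UTF-8 bytes and decodes them again,
    // which is what the legacy line in the question does.
    static String roundTrip(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String loneSurrogate = "\uD800";    // high surrogate with no partner
        String validPair = "\uD83D\uDE00";  // surrogate pair for U+1F600

        // The lone surrogate is replaced by '?' during encoding.
        System.out.println(roundTrip(loneSurrogate).equals(loneSurrogate)); // false
        System.out.println(roundTrip(loneSurrogate).equals("?"));           // true

        // A valid surrogate pair encodes and decodes without loss.
        System.out.println(roundTrip(validPair).equals(validPair));         // true

        // canEncode reports un-encodable input instead of replacing it.
        System.out.println(StandardCharsets.UTF_8.newEncoder().canEncode(loneSurrogate)); // false
        System.out.println(StandardCharsets.UTF_8.newEncoder().canEncode(validPair));     // true
    }
}
```

If the legacy code was meant as a validity check, an explicit canEncode call is probably easier for the next reader to understand than the round trip.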

answered Sep 17 '22 22:09

Peter Lawrey

"How is this different from the following?"

This line of code here:

String newString = new String(oldString.getBytes("UTF-8"), "UTF-8");

constructs a new String object (i.e. a copy of oldString), while this line of code:

String newString = oldString;

declares a new variable of type java.lang.String and initializes it to refer to the same String object as the variable oldString.

"Is there any scenario in which the two lines will have different outputs?"

Absolutely:

String newString = oldString;
boolean isSameInstance = newString == oldString; // isSameInstance == true

vs.

String newString = new String(oldString.getBytes("UTF-8"), "UTF-8");
boolean isSameInstance = newString == oldString; // isSameInstance == false: new always creates a distinct object

a_horse_with_no_name (see comment) is right of course. The equivalent of

String newString = new String(oldString.getBytes("UTF-8"), "UTF-8");

is

String newString = new String(oldString);

apart from the subtle encoding difference that Peter Lawrey explains in his answer.
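To see the identity-versus-contents distinction in one place, here is a small sketch. The class name StringIdentityDemo and the helper copyViaUtf8 are invented for illustration, and StandardCharsets.UTF_8 is used in place of the "UTF-8" string to avoid the checked exception. It assumes a string whose characters are all encodable, so the contents stay equal.

```java
import java.nio.charset.StandardCharsets;

public class StringIdentityDemo {
    // Hypothetical helper mirroring the legacy line from the question.
    static String copyViaUtf8(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String oldString = "hello";

        String alias = oldString;                 // second reference, same object
        String copy = new String(oldString);      // new object, same contents
        String viaUtf8 = copyViaUtf8(oldString);  // new object, same contents here

        System.out.println(alias == oldString);        // true
        System.out.println(copy == oldString);         // false
        System.out.println(viaUtf8 == oldString);      // false

        // equals compares contents, so all three match the original.
        System.out.println(alias.equals(oldString));   // true
        System.out.println(copy.equals(oldString));    // true
        System.out.println(viaUtf8.equals(oldString)); // true
    }
}
```

Code that compares strings with equals, as it normally should, will not notice the difference unless the original contains an un-encodable character.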

answered Sep 16 '22 22:09

afrischke
