
Java String wrong order concatenation of different languages

[Screenshot of the strange Java String behavior]

As you can see from the image, I concatenated a, c and b and got the result I expected. But in the second println, when I concatenated a, e and b, e ended up at the end, not where I expected it to be. I want to know the reason for this behavior and how to fix it. Thank you in advance.

import java.util.*;
public class prob 
{
    public static void main(String... args)
    {
        String a="الف",b="1/2",c="ب",e="B";

        System.out.println(a+" : "+c+" : "+b);
        System.out.println(a+" : "+e+" : "+b);
    }
}

EDIT (to explain why my question is not a duplicate): my question is about converting L2R languages to R2L.

Zia Ul Rehman Mughal asked Oct 19 '15 15:10




1 Answer

This happens because the first character is R2L (right-to-left, as in Arabic or Hebrew script), so the next character is placed at the beginning, to its left (the correct orientation for R2L text):

First char:

الف 
// actual orientation ←

Second char added at the left:

// add ←
B : الف 
// actual orientation →

After this, B is L2R (as usual for European scripts), so the next string (1/2) is added in the L2R direction, AFTER B:

// → add in this direction
B : 1/2 : الف 
// actual orientation → (still)

You can easily test this by copy-pasting one character and typing another manually: you will see how the orientation changes depending on the character you inserted.
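To see which direction each character carries, here is a minimal sketch using `Character.getDirectionality`. The class name `DirectionalityDemo` and the helper `bidiClass` are made up for illustration, and the Arabic letters are written as Unicode escapes so the source itself is not reordered by the editor.

```java
public class DirectionalityDemo {
    // Returns a short label for the Unicode bidi class of one character.
    // Only the classes that occur in the example string are named.
    public static String bidiClass(char ch) {
        switch (Character.getDirectionality(ch)) {
            case Character.DIRECTIONALITY_LEFT_TO_RIGHT:
                return "L";   // strong left-to-right, e.g. Latin letters
            case Character.DIRECTIONALITY_RIGHT_TO_LEFT:
                return "R";   // strong right-to-left, e.g. Hebrew letters
            case Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC:
                return "AL";  // strong right-to-left, Arabic letters
            case Character.DIRECTIONALITY_EUROPEAN_NUMBER:
                return "EN";  // weak, digits 0-9
            case Character.DIRECTIONALITY_COMMON_NUMBER_SEPARATOR:
                return "CS";  // weak, e.g. ':' and '/'
            case Character.DIRECTIONALITY_WHITESPACE:
                return "WS";  // neutral
            default:
                return "other";
        }
    }

    public static void main(String[] args) {
        // Same content as a + " : " + e + " : " + b from the question.
        String text = "\u0627\u0644\u0641 : B : 1/2";
        for (char ch : text.toCharArray()) {
            System.out.printf("U+%04X %s%n", (int) ch, bidiClass(ch));
        }
    }
}
```

Only the letters have a strong direction; spaces, colons and digits take their direction from their neighbours, which is why " : " and "1/2" move around depending on what stands next to them.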


UPDATE:

What is my solution for this issue? I made this example only to show the issue I was facing while building some big reports, where the data is mixed: sometimes it is an L2R string and sometimes R2L. And I want to build the string in strictly this format.

From this answer:

  • Left-to-right embedding (U+202A)
  • Right-to-left embedding (U+202B)
  • Pop directional formatting (U+202C)

So in Java, to embed an RTL language like Arabic in an LTR language like English, you would do

myEnglishString + "\u202B" + myArabicString + "\u202C" + moreEnglish

and to do the reverse

myArabicString + "\u202A" + myEnglishString + "\u202C" + moreArabic

See (for the source material)

  • Bidirectional General Formatting for more details,
  • the Unicode specification chapter on "Directional Formatting Codes"
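For a report with many fields, the two patterns above could be wrapped in small helpers. This is only a sketch: the class `BidiWrap` and the methods `embedLtr` and `embedRtl` are hypothetical names, not part of any library.

```java
public class BidiWrap {
    // Directional formatting characters from the list above.
    static final char LRE = '\u202A'; // left-to-right embedding
    static final char RLE = '\u202B'; // right-to-left embedding
    static final char PDF = '\u202C'; // pop directional formatting

    // Wraps a fragment so it is laid out left-to-right inside any context.
    public static String embedLtr(String fragment) {
        return LRE + fragment + PDF;
    }

    // Wraps a fragment so it is laid out right-to-left inside any context.
    public static String embedRtl(String fragment) {
        return RLE + fragment + PDF;
    }

    public static void main(String[] args) {
        String arabic = "\u0627\u0644\u0641"; // same letters as a in the question
        // English sentence with an embedded Arabic fragment.
        System.out.println("Name: " + embedRtl(arabic) + " (checked)");
        // A whole report line forced to left-to-right field order.
        System.out.println(embedLtr(arabic + " : " + "B" + " : " + "1/2"));
    }
}
```

Each helper closes its own embedding with the pop character, so a wrapped field does not affect the text that follows it.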

ADD ON 2:

char l2R = '\u202A';
System.out.println(l2R + a + " : " + e +" : "+b);

OUTPUT:

‪الف : B : 1/2
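Note that the prefix changes only how the line is displayed; the characters are still stored in the order they were concatenated. A small sketch with `java.text.Bidi` shows this, assuming the display picks the base direction from the first strong character (the class and method names here are made up for the example):

```java
import java.text.Bidi;

public class BaseDirectionCheck {
    // Builds the line from the question; Arabic letters written as escapes.
    public static String plainLine() {
        String a = "\u0627\u0644\u0641", e = "B", b = "1/2";
        return a + " : " + e + " : " + b;
    }

    // True when the base direction, auto-detected from the first strong
    // character, comes out right-to-left.
    public static boolean autoDetectsRtl(String text) {
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        return !bidi.baseIsLeftToRight();
    }

    public static void main(String[] args) {
        String plain = plainLine();
        String fixed = '\u202A' + plain; // same prefix as l2R above

        // Stored order is a, e, b in both strings: B comes before 1/2.
        System.out.println(plain.indexOf("B") < plain.indexOf("1/2"));
        // The fixed string is the plain one plus a single control character.
        System.out.println(fixed.substring(1).equals(plain));
        // The line starts with an Arabic letter, so auto-detection picks RTL.
        System.out.println(autoDetectsRtl(plain));
    }
}
```

If the report text is later parsed or compared, keep in mind that the control character is part of the string and may need to be removed first.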
Jordi Castilla answered Oct 24 '22 16:10