Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

String.split by semicolon

Tags:

java

string

split

I want to split a string by semicolon(";"):

String phrase = "‫;‪14/May/2015‬‬ ‫‪FC‬‬ ‫‪Barcelona‬‬ ‫‪VS.‬‬ ‫‪Real‬‬ ‫‪Madrid";
String[] dateSplit = phrase.split(";");
System.out.println("dateSplit[0]:" + dateSplit[0]);
System.out.println("dateSplit[1]:" + dateSplit[1]);

But it removes the ";" from string and puts all string to 'datesplit1' so the output is:

dateSplit[0]:‫
dateSplit[1]:‪14/May/2015‬‬ ‫‪FC‬‬ ‫‪Barcelona‬‬ ‫‪VS.‬‬ ‫‪Real‬‬ ‫‪Madrid`

Demo

and on doing

System.out.println("Real String :"+phrase);

string printed is

Real String :‫;‪14/May/2015‬‬ ‫‪FC‬‬ ‫‪Barcelona‬‬ ‫‪VS.‬‬ ‫‪Real‬‬ ‫‪Madrid
like image 322
s_puria Avatar asked Apr 21 '15 12:04

s_puria


People also ask

How do you split a string with a semicolon in Python?

split() method to split a string on the semicolons, e.g. my_list = my_str. split(';') . The str. split method will split the string on each occurrence of a semicolon and will return a list containing the results.

How do you split a string with a comma?

To split a string with comma, use the split() method in Java. str. split("[,]", 0);

How do you split a string with a delimiter?

You can use the split() method of String class from JDK to split a String based on a delimiter e.g. splitting a comma-separated String on a comma, breaking a pipe-delimited String on a pipe, or splitting a pipe-delimited String on a pipe.


1 Answers

The phrase contains bi-directional characters like right-to-left embedding. It's why some editors don't manage to display correctly the string.

This piece of code shows the actual characters in the String (for some people the phrase won't display here the right way, but it compiles and looks fine in Eclipse). I just translate left-right with ->, right-to-left with <- and pop directions with ^:

public static void main(String[]args) {
    String phrase = "‫;‪14/May/2015‬‬ ‫‪FC‬‬ ‫‪Barcelona‬‬ ‫‪VS.‬‬ ‫‪Real‬‬ ‫‪Madrid";
    String[] dateSplit = phrase.split(";");
    for (String d : dateSplit) {
        System.out.println(d);
    }
    char[] c = phrase.toCharArray();
    StringBuilder p = new StringBuilder();
    for (int i = 0; i < c.length;i++) {
        int code = Character.codePointAt(c, i);
        switch (code) {
        case 8234:
            p.append(" -> ");
            break;
        case 8235:
            p.append(" <- ");
            break;
        case 8236:
            p.append(" ^ ");
            break;
        default:
            p.append(c[i]);
        }
    }
    System.out.println(p.toString());
}

Prints:

<- ; -> 14/May/2015 ^ ^ <- -> FC ^ ^ <- -> Barcelona ^ ^ <- -> VS. ^ ^ <- -> Real ^ ^ <- -> Madrid

The String#split() will work on the actual character string and not on what the editor displays, hence you can see the ; is the second character after a right-to-left, which gives (beware of display again: the ; is not part of the string in dateSplit[1]):

dateSplit[0] = "";
dateSplit[1] = "14/May/2015‬‬ ‫‪FC‬‬ ‫‪Barcelona‬‬ ‫‪VS.‬‬ ‫‪Real‬‬ ‫‪Madrid";

I guess you are processing data from a language writing/reading from right-to-left and there is some mixing with the football team names which are left-to-right. The solution is certainly to get rid of directional characters and put the ; at the right place, i.e as a separator for the token.

like image 167
T.Gounelle Avatar answered Oct 01 '22 09:10

T.Gounelle