
Splitting strings based on a delimiter

Tags: java, regex

I am trying to break apart a very simple collection of strings that come in forms like

0|0
10|15
30|55

and so on. Essentially, numbers that are separated by pipes.

When I use Java's String.split with .split("|"), I get somewhat unpredictable results: whitespace in the first slot, and sometimes the number itself isn't where I thought it would be.

Can anybody please help and advise me on how to use a regular expression to keep ONLY the integers?

I was asked to post the code that does the actual split, so here it is, in the hope that it clarifies my problem :)

String temp = "0|0";
String[] splitString = temp.split("|");

results

\n
0
| 
0

I am trying to get

0
0

only. Forever grateful for any help ahead of time :)

Selcuk Bor asked Dec 17 '11 21:12



2 Answers

I still suggest using split(); by default it discards trailing empty strings. If you want to get rid of the non-numeric characters and keep only the pipes and numbers, you can then easily use split() to get what you want. Alternatively, you can pass several delimiters to split() at once (as a regex character class), and this should work:

String[] splited = yourString.split("[\\|\\s]+");

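Or match the numbers directly with a regex:

import java.util.regex.*;

// Match each run of digits. A lookahead that requires a trailing pipe or
// whitespace would miss the last number when the string ends right after it.
Pattern pattern = Pattern.compile("\\d+");
Matcher matcher = pattern.matcher(yourString);
while (matcher.find()) {
    System.out.println(matcher.group()); // one number per line
}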
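
Since the goal is to end up with the integers themselves, the split can be followed by a parse step. The following is only a rough sketch: the class name PipeSplitDemo and the helper parsePipeSeparated are made up for illustration, and it assumes every piece between pipes is a valid integer, optionally surrounded by whitespace.

```java
import java.util.Arrays;

class PipeSplitDemo {
    // Hypothetical helper: splits on a literal pipe (with optional
    // surrounding whitespace) and parses each piece as an int.
    static int[] parsePipeSeparated(String line) {
        String[] parts = line.trim().split("\\s*\\|\\s*");
        int[] values = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            // Throws NumberFormatException if a piece is not an integer.
            values[i] = Integer.parseInt(parts[i]);
        }
        return values;
    }

    public static void main(String[] args) {
        for (String line : new String[] {"0|0", "10|15", " 30 | 55 "}) {
            System.out.println(Arrays.toString(parsePipeSeparated(line)));
        }
    }
}
```

Integer.parseInt rejects empty or non-numeric pieces, so malformed input fails loudly instead of silently producing odd tokens.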
fardjad answered Oct 19 '22 15:10


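The pipe symbol is special in a regexp (it marks alternatives), so you need to escape it. An unescaped "|" is an alternation of two empty patterns, which matches the empty string at every position, so split() cuts between every character. Depending on the Java version you are using (before Java 8 the result also starts with an empty string), this could well explain your unpredictable results.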

class t {
    public static void main(String[] args)
    {
        String temp = "0|0";
        // "\\|" in Java source is the regex \| : a literal pipe
        String[] splitString = temp.split("\\|");

        for (int i = 0; i < splitString.length; i++)
            System.out.println("splitString[" + i + "] is " + splitString[i]);
    }
}

outputs

splitString[0] is 0
splitString[1] is 0

Note that a single backslash is the regexp escape character, but because a backslash is also the escape character in Java source, you need two of them to get one backslash into the regexp.
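
If the doubled backslash is hard to read, there are other ways to make the pipe literal. The sketch below compares three of them; the class name PipeEscapeDemo and its method names are invented for this example, while Pattern.quote is the standard java.util.regex method that wraps a string so it is treated literally.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class PipeEscapeDemo {
    // Backslash escape: the regex is \| (written "\\|" in Java source).
    static String[] splitEscaped(String s) {
        return s.split("\\|");
    }

    // Pattern.quote builds a regex that matches the given text literally.
    static String[] splitQuoted(String s) {
        return s.split(Pattern.quote("|"));
    }

    // Inside a character class the pipe has no special meaning.
    static String[] splitCharClass(String s) {
        return s.split("[|]");
    }

    public static void main(String[] args) {
        String temp = "30|55";
        System.out.println(Arrays.toString(splitEscaped(temp)));   // [30, 55]
        System.out.println(Arrays.toString(splitQuoted(temp)));    // [30, 55]
        System.out.println(Arrays.toString(splitCharClass(temp))); // [30, 55]
    }
}
```

All three give the same result here; Pattern.quote is probably the most convenient when the delimiter comes from a variable and might contain other regex metacharacters.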

crazyscot answered Oct 19 '22 17:10