
Java regular expression to split string

I've been trying to build a pattern in Java to split the following string by dashes AND by tab characters. The exception is that if a dash appears after a tab has already been encountered in the string, even once, we stop splitting on the dash and only split on tabs. For example:

Input string (those big spaces are tab characters):

"4852174--r-watch   7   47  2   0   80-B    20  5"

Expected output: ["4852174", "r", "watch", "7", "47", "2", "0", "80-B", "20", "5"]

I'm using the following regular expression so far: "(?<!\\d)(\\-+)(?!\t)|\t"

The first set of brackets is a negative lookbehind saying that no digit may precede the delimiter, the second matches one or more dashes, and the last is a negative lookahead saying that no tab may follow. The OR at the end is for splitting on single tab characters.

The result that I'm getting is the following: ["4852174-", "r", "watch", "7", "47", "2", "0", "80-B", "20", "5"]

Notice the extra dash in "4852174-" that should not be there. I've spent a long time trying to figure this out, but any small change I make breaks the splitting elsewhere.
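Here is a minimal snippet that reproduces the result above. The class name SplitRepro and the split helper are just placeholders for this post; the pattern and the input are the ones quoted above, with the big spaces written as \t.

```java
import java.util.Arrays;

class SplitRepro {
    // The pattern from the question, unchanged. Inside a Java string literal,
    // "\t" is a real tab character, which the regex engine matches literally.
    static final String PATTERN = "(?<!\\d)(\\-+)(?!\t)|\t";

    // Placeholder helper so the behaviour can be checked in isolation.
    static String[] split(String input) {
        return input.split(PATTERN);
    }

    public static void main(String[] args) {
        String input = "4852174--r-watch\t7\t47\t2\t0\t80-B\t20\t5";
        System.out.println(Arrays.toString(split(input)));
        // prints [4852174-, r, watch, 7, 47, 2, 0, 80-B, 20, 5]
    }
}
```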

Any help to solve this problem would be much appreciated. Thank you in advance!

HikeTakerByRequest asked Nov 19 '13 10:11



2 Answers

The regex

\t|-+(?!\w\t)

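will split the string into your desired array, but without further clarification of what you want to do I cannot tell you whether it will work for other strings. The lookahead only protects a dash that is followed by exactly one word character and then a tab (as in 80-B).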

You can test regexes at www.regexpal.com (this is with your regex).

Please note that you have to escape each backslash inside a Java string literal, so in Java it will be

\\t|-+(?!\\w\\t)
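
For a quick check, here is a sketch of how the pattern could be used from Java. The class name DashTabSplit, the variant input, and the splitTwoStage helper are illustrations of mine and not something from the question. The helper is a different technique (plain indexOf plus two split calls) that applies the stated rule literally, and it is shown only for comparison.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DashTabSplit {
    // Pattern from this answer, escaped for a Java string literal.
    static final String ANSWER_PATTERN = "\\t|-+(?!\\w\\t)";

    // Hypothetical helper, not part of the answer: dashes split only the text
    // before the first tab; everything after the first tab is split on tabs.
    static String[] splitTwoStage(String input) {
        int firstTab = input.indexOf('\t');
        if (firstTab < 0) {
            return input.split("-+"); // no tab at all: dashes are the only delimiter
        }
        String head = input.substring(0, firstTab);
        String tail = input.substring(firstTab + 1);
        List<String> parts = new ArrayList<>(Arrays.asList(head.split("-+")));
        parts.addAll(Arrays.asList(tail.split("\t")));
        return parts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String sample = "4852174--r-watch\t7\t47\t2\t0\t80-B\t20\t5";
        System.out.println(Arrays.toString(sample.split(ANSWER_PATTERN)));
        // prints [4852174, r, watch, 7, 47, 2, 0, 80-B, 20, 5]

        // Assumed variant: two characters follow the dash after a tab.
        String variant = "4852174--r-watch\t7\t80-BX\t5";
        System.out.println(Arrays.toString(variant.split(ANSWER_PATTERN)));
        // prints [4852174, r, watch, 7, 80, BX, 5]
        System.out.println(Arrays.toString(splitTwoStage(variant)));
        // prints [4852174, r, watch, 7, 80-BX, 5]
    }
}
```

On the sample input both approaches give the expected array. They diverge once the dashed field after a tab has more than one character after the dash, because the lookahead no longer blocks the match there.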
Cv4 answered Oct 30 '22 06:10


The regex for matching your string is: ^(([^-\s]+?)[-\s]*)+$

The above regex will match your string even if hyphens (-) are repeated more than twice. You can get the expected output by obtaining matches from group 2 (\2).

group 1 matching: (([^-\s]+?)[-\s]*)

group 2 matching: ([^-\s]+?) => this is the grouping you will need for constructing your output.
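
One caveat when trying this from Java: java.util.regex keeps only the most recent capture of a repeated group, so after a full match group 2 holds just the capture from the final iteration. A sketch of one way to collect every token is to apply the inner token pattern repeatedly with find(). The class and method names below are placeholders, and the TOKEN pattern is my simplification of group 2, not the regex above. Note also that this treats every hyphen as a delimiter, so 80-B comes out as two tokens, which differs from the expected output in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenFind {
    // The full pattern from this answer, escaped for a Java string literal.
    static final Pattern FULL = Pattern.compile("^(([^-\\s]+?)[-\\s]*)+$");
    // Assumed simplification: just the token part, applied with find().
    static final Pattern TOKEN = Pattern.compile("[^-\\s]+");

    // Returns what group 2 holds after a full match, or null if there is no match.
    static String lastGroup2(String input) {
        Matcher m = FULL.matcher(input);
        return m.matches() ? m.group(2) : null;
    }

    // Collects every run of characters that is neither a hyphen nor whitespace.
    static List<String> tokens(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String sample = "4852174--r-watch\t7\t47\t2\t0\t80-B\t20\t5";
        System.out.println(lastGroup2(sample));
        // prints 5
        System.out.println(tokens(sample));
        // prints [4852174, r, watch, 7, 47, 2, 0, 80, B, 20, 5]
    }
}
```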

Santhosh Gutta answered Oct 30 '22 07:10