I've been trying to build a pattern in Java to split the following string by dashes AND by tab characters. The exception is that if a dash appears after a tab has already been encountered in the string, even once, we stop splitting on the dash and only split on tabs. For example:
Input string (tab characters are written as \t here):
"4852174--r-watch\t7\t47\t2\t0\t80-B\t20\t5"
Expected output: ["4852174", "r", "watch", "7", "47", "2", "0", "80-B", "20", "5"]
I'm using the following regular expression so far: "(?<!\\d)(\\-+)(?!\t)|\t"
The first group is a negative lookbehind saying that no digit may precede the delimiter, the second matches one or more dashes, and the last is a negative lookahead saying that no tab may follow. The OR at the end is for splitting on single tab characters.
The result that I'm getting is the following:
["4852174-", "r", "watch", "7", "47", "2", "0", "80-B", "20", "5"]
Notice the extra dash in "4852174-" that should not be there. I've spent a long time trying to figure this out, but any small change I make breaks the splitting elsewhere.
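For reference, here is a minimal reproduction of the behaviour described above. It assumes the tabs in the input are written as \t escapes; the class name SplitRepro and its split helper are only for illustration.

```java
import java.util.Arrays;

class SplitRepro {
    // The attempted delimiter from the question, as a Java string literal.
    static final String ATTEMPT = "(?<!\\d)(\\-+)(?!\t)|\t";

    static String[] split(String input) {
        return input.split(ATTEMPT);
    }

    public static void main(String[] args) {
        String input = "4852174--r-watch\t7\t47\t2\t0\t80-B\t20\t5";
        // Prints [4852174-, r, watch, 7, 47, 2, 0, 80-B, 20, 5]
        System.out.println(Arrays.toString(split(input)));
    }
}
```

The lookbehind rejects the first dash because it follows the digit 4, so only the second dash is consumed as a delimiter and the first one stays attached to "4852174". The same lookbehind is what keeps "80-B" together.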
Any help to solve this problem would be much appreciated. Thank you in advance!
To split a string by a regular expression, pass the regex to the split() method. In JavaScript that looks like str.split(/[,. \s]/); in Java the pattern is passed as a string instead. The split method takes a separator and splits the string on it, returning an array of substrings.
You are not limited to literal strings as separators. A regex lets a single split call break the string on any of several characters or on longer patterns.
In Java, split() splits a String into multiple Strings around the delimiter that separates them. The returned object is an array containing the resulting Strings.
The regex
\t|-+(?!\w\t)
will split the string into your desired array, but without further clarification of what you want to do, I cannot tell you whether it will work for other strings.
You can test regex at www.regexpal.com (This is with your regex.)
Please note that you have to escape the backslash in Java. So in Java it will be
\\t|-+(?!\\w\\t)
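As a quick check, the escaped pattern can be dropped straight into String.split. This is only a sketch against the sample input from the question; the class name RegexSplitDemo and its split helper are made up for illustration.

```java
import java.util.Arrays;

class RegexSplitDemo {
    // The pattern from this answer, escaped for a Java string literal:
    // a tab, or a run of dashes that is not followed by "one word character, then a tab".
    static final String DELIMITER = "\\t|-+(?!\\w\\t)";

    static String[] split(String input) {
        return input.split(DELIMITER);
    }

    public static void main(String[] args) {
        String input = "4852174--r-watch\t7\t47\t2\t0\t80-B\t20\t5";
        // Prints [4852174, r, watch, 7, 47, 2, 0, 80-B, 20, 5]
        System.out.println(Arrays.toString(split(input)));
    }
}
```

Keep in mind that the lookahead only protects a dash that is followed by exactly one word character and then a tab, as in "80-B". A field such as "80-BC" would still be split, which is why the pattern may not carry over to other strings.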
The regex for matching your string is: ^(([^-\s]+?)[-\s]*)+$
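The above regex will match your string even if hyphens (-) are repeated more than twice. The tokens are what group 2 (\2) matches. Note that in Java a repeated capturing group keeps only its last match, so to collect every token you would apply the inner pattern [^-\s]+ with Matcher.find() in a loop rather than read group 2 once. Also note that [^-\s] treats every dash as a separator, so "80-B" would come out as "80" and "B".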
group 1 matching: (([^-\s]+?)[-\s]*)
group 2 matching: ([^-\s]+?) => this is the grouping you will need for constructing your output.
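If a single regex turns out to be too fragile, the rule from the question (split on dashes only until the first tab, then on tabs only) can also be written in two steps. This is a rough sketch rather than a definitive solution; the class name DashTabSplitter and its split method are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DashTabSplitter {
    // Splits on runs of dashes before the first tab, and on single tabs after it.
    static List<String> split(String input) {
        int firstTab = input.indexOf('\t');
        // Everything before the first tab (or the whole string if there is no tab).
        String head = firstTab < 0 ? input : input.substring(0, firstTab);
        List<String> parts = new ArrayList<>(Arrays.asList(head.split("-+")));
        if (firstTab >= 0) {
            // After the first tab, dashes are ordinary characters.
            String tail = input.substring(firstTab + 1);
            parts.addAll(Arrays.asList(tail.split("\t")));
        }
        return parts;
    }

    public static void main(String[] args) {
        String input = "4852174--r-watch\t7\t47\t2\t0\t80-B\t20\t5";
        // Prints [4852174, r, watch, 7, 47, 2, 0, 80-B, 20, 5]
        System.out.println(split(input));
    }
}
```

Because the dash rule is never applied after the first tab, a field like "80-BC" stays intact here, at the cost of a few more lines than a one-line split call.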