I have a String of the format "[(1, 2), (2, 3), (3, 4)]", with an arbitrary number of elements. I'm trying to split it on the commas separating the coordinates, that is, to retrieve (1, 2), (2, 3), and (3, 4).
Can I do it in Java regex? I'm a complete noob but hoping Java regex is powerful enough for it. If it isn't, could you suggest an alternative?
From Java 5 onwards you can use java.util.Scanner:
Scanner sc = new Scanner(coords); // coords is the input string
sc.useDelimiter("\\D+"); // skip everything that is not a digit
List<Coord> result = new ArrayList<Coord>();
while (sc.hasNextInt()) {
    // arguments are evaluated left to right, so x is read before y
    result.add(new Coord(sc.nextInt(), sc.nextInt()));
}
return result;
EDIT: We don't know how many coordinates are passed in the string coords, so the loop keeps reading pairs until no integers are left.
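For anyone who wants to run the snippet above as is, here is a minimal self-contained sketch. The Coord class is not shown in the answer, so the one below is a hypothetical stand-in, and the names ScannerCoords and parse are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class ScannerCoords {
    // Hypothetical value class; the answer above does not show Coord.
    static final class Coord {
        final int x;
        final int y;

        Coord(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public String toString() {
            return "(" + x + ", " + y + ")";
        }
    }

    // Reads consecutive integers from the string and groups them in pairs.
    static List<Coord> parse(String coords) {
        List<Coord> result = new ArrayList<>();
        try (Scanner sc = new Scanner(coords)) {
            sc.useDelimiter("\\D+"); // every run of non-digits is a delimiter
            while (sc.hasNextInt()) {
                int x = sc.nextInt();
                int y = sc.nextInt(); // throws NoSuchElementException on an odd count
                result.add(new Coord(x, y));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("[(1, 2), (2, 3), (3, 4)]"));
    }
}
```

Note that treating every non-digit as a delimiter also throws away minus signs, so this only suits non-negative coordinates.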
You can use String#split() for this.
String string = "[(1, 2), (2, 3), (3, 4)]";
string = string.substring(1, string.length() - 1); // Get rid of the square brackets.
String[] parts = string.split("(?<=\\))(,\\s*)(?=\\()");
for (String part : parts) {
part = part.substring(1, part.length() - 1); // Get rid of parentheses.
String[] coords = part.split(",\\s*");
int x = Integer.parseInt(coords[0]);
int y = Integer.parseInt(coords[1]);
System.out.printf("x=%d, y=%d\n", x, y);
}
The (?<=\\)) positive lookbehind means that the match must be preceded by ). The (?=\\() positive lookahead means that it must be followed by (. The (,\\s*) means that the string is split on the , and any whitespace after it. The \\ is there to escape regex-specific characters; it is doubled because the backslash itself has to be escaped inside a Java string literal.
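To see what the lookarounds buy you, here is a small sketch that compares the lookaround split with a plain split on the comma. The class and method names (LookaroundSplitDemo, splitPairs, splitNaive) are made up for this illustration.

```java
public class LookaroundSplitDemo {
    // Splits only on commas that sit between ")" and "(".
    static String[] splitPairs(String input) {
        String inner = input.substring(1, input.length() - 1); // drop [ and ]
        return inner.split("(?<=\\)),\\s*(?=\\()");
    }

    // Splits on every comma, including the ones inside the parentheses.
    static String[] splitNaive(String input) {
        String inner = input.substring(1, input.length() - 1);
        return inner.split(",\\s*");
    }

    public static void main(String[] args) {
        String input = "[(1, 2), (2, 3), (3, 4)]";
        for (String part : splitPairs(input)) {
            System.out.println(part); // one whole pair per line
        }
        System.out.println(splitNaive(input).length); // pairs are cut in half
    }
}
```

With the lookarounds you get the three pairs intact. Without them the same input falls apart into six fragments such as "(1" and "2)".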
That said, the particular String is recognizable as the outcome of List#toString(). Are you sure you're doing things the right way? ;)
Update: as per the comments, you can indeed also do it the other way round and get rid of the non-digits:
String string = "[(1, 2), (2, 3), (3, 4)]";
String[] parts = string.split("\\D.");
for (int i = 1; i < parts.length; i += 3) {
int x = Integer.parseInt(parts[i]);
int y = Integer.parseInt(parts[i + 1]);
System.out.printf("x=%d, y=%d\n", x, y);
}
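Here the \\D means that the string is split on any non-digit (\\d stands for a digit). The . after it matches any single character, so each delimiter swallows two characters; with this exact input format that leaves only one blank match in front of each pair, which is why the loop starts at index 1 and advances in steps of 3. I must however admit that I'm not sure how to eliminate the blank matches before the digits. I'm not a trained regex guru yet. Hey, Bart K, can you do it better?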
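One possible way to avoid the blank matches altogether, as a sketch rather than a definitive answer: strip the leading non-digits first and then split on whole runs of non-digits with \\D+. String#split() already drops trailing empty strings, so only the leading one needs handling. The names DigitSplit and parse are made up here, and negative numbers are assumed not to occur.

```java
import java.util.ArrayList;
import java.util.List;

public class DigitSplit {
    // Returns each coordinate as a two-element int array {x, y}.
    static List<int[]> parse(String input) {
        List<int[]> pairs = new ArrayList<>();
        // Remove the leading "[(" so split() does not yield a blank first element.
        String trimmed = input.replaceFirst("^\\D+", "");
        if (trimmed.isEmpty()) {
            return pairs; // no digits at all, e.g. "[]"
        }
        // A run of non-digits is one delimiter, so no blanks appear in between.
        String[] parts = trimmed.split("\\D+");
        for (int i = 0; i + 1 < parts.length; i += 2) {
            int x = Integer.parseInt(parts[i]);
            int y = Integer.parseInt(parts[i + 1]);
            pairs.add(new int[] { x, y });
        }
        return pairs;
    }

    public static void main(String[] args) {
        for (int[] pair : parse("[(1, 2), (2, 3), (3, 4)]")) {
            System.out.printf("x=%d, y=%d%n", pair[0], pair[1]);
        }
    }
}
```

Because the array now holds only numbers, the loop can start at index 0 and step by 2, and it no longer depends on the exact spacing between the pairs.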
After all, it's ultimately better to use a parser for this. See Hubert's answer on this topic.