Given a string entered by a user, I'm trying to split it on whitespace and collect each token.
But I'm having trouble when a token is enclosed in quotation marks. Here are some examples to clarify:
User input: that is cool
Expected Output:
that
is
cool
User input: The book "Harry Potter" is cool
Expected Output:
The
book
"Harry Potter"
is
cool
User input: Here " is one final " example
Expected Output:
Here
" is one final "
example
This is what I have so far:
public static void main(String[] args) {
    String input;
    Scanner in = new Scanner(System.in);
    System.out.print("User input: ");
    input = in.nextLine();
    input = input.trim();
    input = input.replaceAll("\\s+", " ");
    String[] a = input.split(" ");
    for (String c : a) {
        System.out.println(c);
    }
}
It works for the first example, but for the examples with quotation marks it also splits on the spaces inside the quoted tokens. Example 3 output:
Here
"
is
one
final
"
example
Don't focus on what you want to split on. It is easier to describe the tokens you want to find in the result:
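import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Either a quoted run ("[^"]+") or, failing that, a run of non-whitespace (\S+).
private static final Pattern p = Pattern.compile("\"[^\"]+\"|\\S+");

public static List<String> splitTokensAndQuotes(String text) {
    List<String> result = new ArrayList<>();
    Matcher m = p.matcher(text);
    while (m.find()) {
        // Each match is one complete token; quoted tokens keep their quotes.
        result.add(m.group());
    }
    return result;
}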
Demo:
public static void main(String[] args) {
    splitTokensAndQuotes("that is cool")
        .forEach(System.out::println);
    System.out.println("------");
    splitTokensAndQuotes("the book \"Harry Potter\" is cool")
        .forEach(System.out::println);
    System.out.println("------");
    splitTokensAndQuotes("Here \" is one final \" example")
        .forEach(System.out::println);
    System.out.println("------");
}
Result:
that
is
cool
------
the
book
"Harry Potter"
is
cool
------
Here
" is one final "
example
------
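One thing worth knowing about this pattern is how it behaves when the quotes are not balanced or do not start a token. The sketch below reuses the same regular expression to show those cases; the class name QuoteTokenizerSketch and the method name tokenize are made up for illustration and are not part of the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class, named here only for illustration.
class QuoteTokenizerSketch {
    // Same pattern as above: try a quoted run first, otherwise a run of non-whitespace.
    private static final Pattern TOKEN = Pattern.compile("\"[^\"]+\"|\\S+");

    static List<String> tokenize(String text) {
        List<String> result = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // Balanced quotes: the quoted part stays one token.
        System.out.println(tokenize("The book \"Harry Potter\" is cool"));
        // Unbalanced quote: the quoted branch cannot match, so \S+ takes over.
        System.out.println(tokenize("say \"hello world"));
        // Quote in the middle of a word: \S+ has already started matching at 'f'.
        System.out.println(tokenize("foo\"bar baz\""));
    }
}
```

With an unbalanced quote the input falls back to plain whitespace splitting ("hello and world become separate tokens), and a quote that begins mid-word does not open a quoted token (foo"bar and baz" are separate). Whether that is acceptable depends on what input you expect from the user.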