I want to create a Java regular expression that matches every word starting with a capital letter followed by any number of capital or lowercase letters, where any of those letters may be accented.
Examples:
Where
Àdónde
Rápido
Àste
Can you please help me with that?
Regex:
\b\p{Lu}\p{L}*\b
Java string:
"(?U)\\b\\p{Lu}\\p{L}*\\b"
Explanation:
\b # Match at a word boundary (start of word)
\p{Lu} # Match an uppercase letter
\p{L}* # Match any number of letters (any case)
\b # Match at a word boundary (end of word)
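To make the explanation above concrete, here is a minimal sketch of applying the pattern to a whole text with Matcher.find(). The class name CapitalizedWords and the method find are made up for illustration, and the sketch assumes the accented letters are precomposed (NFC) characters; the accented words are written with Unicode escapes so the source file's encoding does not matter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CapitalizedWords {
    // (?U) switches on UNICODE_CHARACTER_CLASS (Java 7+), so the word
    // boundaries treat accented letters as word characters.
    private static final Pattern CAPITALIZED =
            Pattern.compile("(?U)\\b\\p{Lu}\\p{L}*\\b");

    // Hypothetical helper: collects every capitalized word found in the text.
    public static List<String> find(String text) {
        List<String> words = new ArrayList<>();
        Matcher m = CAPITALIZED.matcher(text);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }

    public static void main(String[] args) {
        // The escapes spell the accented sample words from the question.
        System.out.println(find("Where is \u00C0d\u00F3nde, and how R\u00E1pido is it?"));
    }
}
```

Run as is, this should print [Where, Àdónde, Rápido]. Because find() scans the text instead of testing whole tokens, punctuation next to a word does not prevent a match.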
Caveat: This only works correctly in Java 7 and later, where the (?U) flag (Pattern.UNICODE_CHARACTER_CLASS) is available. On older versions you may need to use the following (kudos to @tchrist)
(?:(?<=[\pL\pM\p{Nd}\p{Nl}\p{Pc}[\p{InEnclosedAlphanumerics}&&\p{So}]])(?![\pL\pM\p{Nd}\p{Nl}\p{Pc}[\p{InEnclosedAlphanumerics}&&\p{So}]])|(?<![\pL\pM\p{Nd}\p{Nl}\p{Pc}[\p{InEnclosedAlphanumerics}&&\p{So}]])(?=[\pL\pM\p{Nd}\p{Nl}\p{Pc}[\p{InEnclosedAlphanumerics}&&\p{So}]]))
in place of each \b, so the Java string would look like this:
"(?:(?<=[\\pL\\pM\\p{Nd}\\p{Nl}\\p{Pc}[\\p{InEnclosedAlphanumerics}&&\\p{So}]])(?![\\pL\\pM\\p{Nd}\\p{Nl}\\p{Pc}[\\p{InEnclosedAlphanumerics}&&\\p{So}]])|(?<![\\pL\\pM\\p{Nd}\\p{Nl}\\p{Pc}[\\p{InEnclosedAlphanumerics}&&\\p{So}]])(?=[\\pL\\pM\\p{Nd}\\p{Nl}\\p{Pc}[\\p{InEnclosedAlphanumerics}&&\\p{So}]]))\\p{Lu}\\p{L}*(?:(?<=[\\pL\\pM\\p{Nd}\\p{Nl}\\p{Pc}[\\p{InEnclosedAlphanumerics}&&\\p{So}]])(?![\\pL\\pM\\p{Nd}\\p{Nl}\\p{Pc}[\\p{InEnclosedAlphanumerics}&&\\p{So}]])|(?<![\\pL\\pM\\p{Nd}\\p{Nl}\\p{Pc}[\\p{InEnclosedAlphanumerics}&&\\p{So}]])(?=[\\pL\\pM\\p{Nd}\\p{Nl}\\p{Pc}[\\p{InEnclosedAlphanumerics}&&\\p{So}]]))"
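Code to detect the capitalized words in a given paragraph; in this case the input is read from the console. Each whitespace-separated token must match the pattern in full, so a word with punctuation attached (such as "Program.") is not printed.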
import java.util.Scanner;

public class problem9 {
    public static void main(String[] args) {
        // (?U) makes \b Unicode-aware (Java 7+), so accented letters count as word characters.
        String pattern = "(?U)\\b\\p{Lu}\\p{L}*\\b";

        // Read one line of text from the console.
        Scanner in = new Scanner(System.in);
        String line1 = in.nextLine();
        in.close();

        // Split on runs of whitespace and print each token that matches the pattern in full.
        String[] words1 = line1.split("\\s+");
        for (int i = 0; i < words1.length; i++) {
            if (words1[i].matches(pattern)) {
                System.out.println(words1[i]);
            }
        }
    }
}
For example, given this input:
Input: This is my First Program
Output:
This
First
Program