I want to remove all the leading and trailing punctuation in a string. How can I do this?
Basically, I want to preserve punctuation in between words, and I need to remove all leading and trailing punctuation.
.
, @
, _
, &
, /
, -
are allowed if surrounded by letters
or digits\'
is allowed if preceded by a letter or digitI tried
Pattern p = Pattern.compile("(^\\p{Punct})|(\\p{Punct}$)");
Matcher m = p.matcher(term);
boolean a = m.find();
if(a)
term=term.replaceAll("(^\\p{Punct})", "");
but it didn't work!!
Ok. So basically you want to find some pattern in your string and act if the pattern in matched.
Doing this the naiive way would be tedious. The naiive solution could involve something like
while(myString.StartsWith("." || "," || ";" || ...)
myString = myString.Substring(1);
If you wanted to do a bit more complex task, it could be even impossible to do the way i mentioned.
Thats why we use regular expressions. Its a "language" with which you can define a pattern. the computer will be able to say, if a string matches that pattern. To learn about regular expressions, just type it into google. One of the first links: http://www.codeproject.com/Articles/9099/The-30-Minute-Regex-Tutorial
As for your problem, you could try this:
myString.replaceFirst("^[^a-zA-Z]+", "")
The meaning of the regex:
the first ^ means that in this pattern, what comes next has to be at the start of the string.
The [] define the chars. In this case, those are things that are NOT (the second ^) letters (a-zA-Z).
You can use a similar regex to remove trailing chars.
myString.replaceAll("[^a-zA-Z]+$", "");
the $ means "at the end of the string"
You could use a regular expression:
private static final Pattern PATTERN =
Pattern.compile("^\\p{Punct}*(.*?)\\p{Punct}*$");
public static String trimPunctuation(String s) {
Matcher m = PATTERN.matcher(s);
m.find();
return m.group(1);
}
The boundary matchers ^
and $
ensure the whole input is matched.
A dot .
matches any single character.
A star *
means "match the preceding thing zero or more times".
The parentheses ()
define a capturing group whose value is retrieved by calling Matcher.group(1)
.
The ?
in (.*?)
means you want the match to be non-greedy, otherwise the trailing punctuation would be included in the group.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With