I'm trying to build a regex that removes all non-alphabetical characters from a string, with one exception: if the string contains single quotes, I want to keep them.
So for example when I enter
car's34
as a result I want to get
car's
when I enter
*&* Lisa's car 0)*
I want to get
Lisa's
at the moment I use this:
string.replaceAll("[^A-Za-z]", "")
However, it keeps only the letters and also removes the single quotes I want to keep.
This will also remove apostrophes that are not "part of words":
string = string.replaceAll("[^A-Za-z' ]+|(?<=^|\\W)'|'(?=\\W|$)", "")
.replaceAll(" +", " ").trim();
The first alternative simply adds the apostrophe (and the space) to the set of characters you want to keep; the other two use lookarounds to match apostrophes that are not within words, so
I'm a ' 123 & 'test'
would become
I'm a test
Note how the solitary apostrophe was removed, as well as the apostrophes wrapping test, but the apostrophe in I'm was preserved.
The subsequent replaceAll() replaces runs of spaces with a single space; such runs are left behind when a solitary apostrophe (or any other removed text) had spaces on both sides. A further call to trim() was added in case the leftover space is at the start or end of the input.
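To make the three stages easier to follow, here is a minimal sketch that prints the intermediate string after each call. The class name ApostropheSteps and the helper names stripUnwanted and collapseSpaces are made up for illustration; the patterns are the ones shown above.

```java
class ApostropheSteps {
    // Same pattern as above: unwanted characters, or an apostrophe
    // that has no word character on its left or on its right.
    static final String STRIP = "[^A-Za-z' ]+|(?<=^|\\W)'|'(?=\\W|$)";

    // Stage 1: delete everything the pattern matches.
    static String stripUnwanted(String s) {
        return s.replaceAll(STRIP, "");
    }

    // Stage 2: squeeze runs of spaces down to one space.
    static String collapseSpaces(String s) {
        return s.replaceAll(" +", " ");
    }

    public static void main(String[] args) {
        String input = "I'm a ' 123 & 'test'";
        String step1 = stripUnwanted(input);   // gaps of several spaces remain
        String step2 = collapseSpaces(step1);  // gaps reduced to single spaces
        String step3 = step2.trim();           // stage 3: drop leading/trailing space

        // Brackets make leading, trailing, and repeated spaces visible.
        System.out.println("[" + step1 + "]");
        System.out.println("[" + step2 + "]");
        System.out.println("[" + step3 + "]");
    }
}
```

Printing step1 shows why the second replaceAll() is needed: the removed apostrophe, digits, and ampersand each leave their neighbouring spaces behind.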
Here's a test:
String string = "I'm a ' 123 & 'test'";
string = string.replaceAll("[^A-Za-z' ]+|(?<=^|\\W)'|'(?=\\W|$)", "").replaceAll(" +", " ").trim();
System.out.println(string);
Output:
I'm a test
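If this is needed in more than one place, the calls could be wrapped in a small helper. The following is only a sketch: the class name ApostropheFilter and the method name clean are hypothetical, and the patterns are precompiled purely as a convenience. It runs the two inputs from the question alongside the example above.

```java
import java.util.regex.Pattern;

class ApostropheFilter {
    // Precompiled versions of the two patterns used in the answer.
    private static final Pattern STRIP =
            Pattern.compile("[^A-Za-z' ]+|(?<=^|\\W)'|'(?=\\W|$)");
    private static final Pattern SPACES = Pattern.compile(" +");

    // Keeps letters, spaces, and apostrophes that sit between word characters.
    static String clean(String input) {
        String stripped = STRIP.matcher(input).replaceAll("");
        return SPACES.matcher(stripped).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        System.out.println(clean("car's34"));              // car's
        System.out.println(clean("*&* Lisa's car 0)*"));   // Lisa's car
        System.out.println(clean("I'm a ' 123 & 'test'")); // I'm a test
    }
}
```

Because letters and spaces are always kept, the second input comes out as Lisa's car rather than just Lisa's. One caveat worth knowing: \W treats digits and underscores as word characters, so an apostrophe directly followed by a digit is not seen as being at a word edge. For example, clean("cars'34") returns cars' with the apostrophe still attached.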