
Delete text within quotes

I want to remove strings enclosed in double quotes, single quotes, or backticks, along with the enclosing characters.

Input is:

Lorem ipsum "'dolor sit amet consectetur'" adipiscing "elite"  ellentesque 

scelerisque 'tortor' tortor in `vestibulum` dolor

Expected output:

Lorem ipsum adipiscing ellentesque scelerisque tortor in dolor

I have this code, but there is no change in the result. Could anyone tell me what is wrong with my code?

line.replaceAll("[\'\"\\`].*[\'\"\\`]$", "");
mysticfalls asked Feb 13 '23 05:02


2 Answers

There are three problems with your regex.

  1. It matches text from any one of "'` to any one of "'`, not necessarily the same one that started the match.
  2. .* is greedy, meaning it will match text from the first ", ', or ` to the very last one in the line.
  3. Because your regex ends with $, it only matches when the closing quote character is the last character of the string.
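
The three problems can be seen in a small sketch. The class name, the helper method, and the sample string below are made up for illustration; ORIGINAL is the pattern from the question written without its redundant escapes, and NO_ANCHOR is the same pattern with the trailing $ dropped.

```java
// Hypothetical demo class; names and the sample string are illustrative only.
class GreedyQuoteDemo {
    // The pattern from the question, without the redundant backslashes.
    static final String ORIGINAL = "['\"`].*['\"`]$";

    // The same pattern with the trailing $ anchor removed.
    static final String NO_ANCHOR = "['\"`].*['\"`]";

    // Returns a new string with every match of regex removed from text.
    static String strip(String regex, String text) {
        return text.replaceAll(regex, "");
    }

    public static void main(String[] args) {
        String text = "a 'b' c \"d\" e";

        // Problem 3: the text does not end with a quote character,
        // so the $ anchor prevents any match at all.
        System.out.println(strip(ORIGINAL, text));   // a 'b' c "d" e

        // Problems 1 and 2: the match opens at ' and greedily runs
        // to the last ", swallowing the unquoted " c " in between.
        System.out.println(strip(NO_ANCHOR, text));  // a  e
    }
}
```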

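You can try it this way:

sb.append(line.replaceAll("(?:([\"'`]).*?\\1)\\s*|\r?\n", ""));

Here ([\"'`]) captures the opening quote, the reluctant .*? stops at the nearest \\1 (the same quote character), and \\s* also drops any whitespace after the closing quote. The quoted body is matched with .*? because \1 is not a backreference inside a character class.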

Input:

Lorem ipsum "'dolor sit amet consectetur'" adipiscing "elite"  ellentesque 

scelerisque 'tortor' tortor in `vestibulum` dolor

Output:

Lorem ipsum adipiscing ellentesque scelerisque tortor in dolor

There is an explanation and demonstration of that regex here: http://regex101.com/r/iK3fQ8
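
For anyone who wants to run this locally, here is a minimal sketch of the backreference approach applied to the input from the question. The class and method names are hypothetical. It assumes the whole text is passed in as one string, matches the quoted body with a reluctant .*? (since \1 is not a backreference inside a character class), and uses \\s* so that a quoted string at the very end of the text is removed too.

```java
// Hypothetical helper; the class and method names are illustrative only.
class QuoteStripper {
    // ([\"'`])  capture the opening quote character
    // .*?       reluctantly match up to the nearest ...
    // \\1       ... occurrence of that same quote character
    // \\s*      also drop whitespace that follows the closing quote
    // |\r?\n    or match a line break, so the lines are joined
    static final String QUOTED_OR_NEWLINE = "([\"'`]).*?\\1\\s*|\r?\n";

    // Returns a new string with quoted segments and line breaks removed.
    static String strip(String text) {
        return text.replaceAll(QUOTED_OR_NEWLINE, "");
    }

    public static void main(String[] args) {
        String input =
            "Lorem ipsum \"'dolor sit amet consectetur'\" adipiscing \"elite\"  ellentesque \n"
            + "\n"
            + "scelerisque 'tortor' tortor in `vestibulum` dolor";

        // Prints: Lorem ipsum adipiscing ellentesque scelerisque tortor in dolor
        System.out.println(strip(input));
    }
}
```

Note that the outer double quotes win over the nested single quotes in the first segment, because the match starts at the first quote character encountered and ends at the next occurrence of that same character.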

The Guy with The Hat answered Feb 16 '23 02:02


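Change your greedy matcher .* to the non-greedy .+?.

Also assign the result back to the variable. Strings are immutable in Java, so replaceAll returns a new string and leaves line untouched, which is why you see no change.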

Full code:

line = line.replaceAll("(['\"`]).+?\\1", "");

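Thanks to tobias_k for pointing out that I could use a backreference.

Also check Java's string escaping rules and escape accordingly: in a Java string literal the backreference has to be written as \\1, because a bare \1 is read as an octal character escape.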
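
To make the two points concrete (assigning the result and escaping the backreference), here is a small sketch. The class name and the sample string are made up for illustration, and the pattern is the non-greedy one from this answer with the backslash doubled.

```java
// Hypothetical demo class; names and the sample string are illustrative only.
class AssignResultDemo {
    // "\\1" in Java source is the regex backreference \1.
    static final String QUOTED = "(['\"`]).+?\\1";

    // replaceAll never modifies its receiver; it returns a new String.
    static String strip(String line) {
        return line.replaceAll(QUOTED, "");
    }

    public static void main(String[] args) {
        String line = "say \"hi\" and 'bye' now";

        line.replaceAll(QUOTED, "");         // result discarded, line is unchanged
        System.out.println(line);            // say "hi" and 'bye' now

        line = line.replaceAll(QUOTED, "");  // result assigned back
        System.out.println(line);            // say  and  now
    }
}
```

The doubled spaces in the second output are left over from where the quoted words used to be; this pattern removes only the quoted text itself.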

Amit Joki answered Feb 16 '23 03:02