Regular expression to remove quotations

Tags:

java

regex

How do I write a regular expression that satisfies these requirements? I can only use the String.replaceAll function.

a) For a ” at the end of a paragraph that begins with a single “ (but not “ “): remove the ”

b) For a “ at the beginning of a paragraph: remove the “ [NOTE: if the paragraph begins with “ “, it should now begin with a single “]

c) For a ” at the end of a paragraph with no matching “ at the beginning of the paragraph: remove the ”

EDIT:

Rule a)
Transform:
String input1 ="“remove quotes”" 
String output1 ="“remove quotes"

Don't change anything:
String input1 ="““remove quotes”" 
String output1 ="““remove quotes”"

Rule b)
Transform:
String input1 ="“remove quotes”" 
String output1 ="remove quotes”"

Replace with single ldquo:
String input1 ="““remove quotes”" 
String output1 ="“remove quotes”"

Rule c)
Do nothing (there is a matching ldquo):
String input1 ="“do not remove quotes”" 
String output1 ="“do not remove quotes”"

Transform (no matching ldquo, hence remove rdquo):
String input1 ="remove quotes”" 
String output1 ="remove quotes"

I think I am going to run all three rules separately on the string. What would the three regexes and replacement expressions be?
Phoenix asked Mar 23 '23 14:03


1 Answer

Description

This regex does the following:

  1. If there are 2 initial “ and an ending ”, remove one “
  2. If there is 1 initial “ and an ending ”, remove nothing
  3. If there are 0 initial “ and an ending ”, remove the ending ”

regex: ^(?=.*?”)“\s*(“)|^(?=.*?”)(“.*?”)|^(?!“)(.*?)”

replace with: $1$2$3

Input text

“ DO NOTHING  ”
“ “ REMOVE INITIAL LD  ”
REMOVE RD  ”

Output text, respectively

“ DO NOTHING  ”
“ REMOVE INITIAL LD  ”
REMOVE RD
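
As a rough check, the combined regex can be run through String.replaceAll in Java. The sketch below is only an illustration: the class and method names (CombinedQuoteRegex, clean, cleanAll) are made up for this example, the quotes are written as Unicode escapes (\u201C for “, \u201D for ”) to sidestep source-encoding problems, and cleanAll assumes paragraphs are separated by newlines, which is why it prepends the (?m) flag so that ^ matches at the start of every line.

```java
public class CombinedQuoteRegex {
    // U+201C is the left double quote (LD), U+201D the right double quote (RD).
    // Same pattern as the regex above, written with Unicode escapes.
    static final String COMBINED =
          "^(?=.*?\u201D)\u201C\\s*(\u201C)"   // two LDs and an RD: keep only the second LD
        + "|^(?=.*?\u201D)(\u201C.*?\u201D)"   // one LD and an RD: keep the whole match
        + "|^(?!\u201C)(.*?)\u201D";           // no LD: keep the text before the RD

    // Applies the regex to a single paragraph. Groups that did not take part
    // in the match contribute nothing to the replacement string.
    static String clean(String paragraph) {
        return paragraph.replaceAll(COMBINED, "$1$2$3");
    }

    // Assumption: one paragraph per line, so MULTILINE mode is switched on inline.
    static String cleanAll(String text) {
        return text.replaceAll("(?m)" + COMBINED, "$1$2$3");
    }

    public static void main(String[] args) {
        System.out.println(clean("\u201C DO NOTHING  \u201D"));
        System.out.println(clean("\u201C \u201C REMOVE INITIAL LD  \u201D"));
        System.out.println(clean("REMOVE RD  \u201D"));
    }
}
```

Without (?m), ^ only matches at the very start of the string, so each paragraph has to be passed in on its own, as clean does.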

These expressions were hashed out in a chat session and written to be executed one at a time, in A, B, C order. However, because they are separate, they can be executed in whatever order the developer likes, and the output will change depending on the order chosen. Below, LD means the left double quote “ and RD the right double quote ”.

A

  • 1 LD and 1 RD, remove the RD
  • 2 LD and 1 RD, do nothing
  • regex: ^(“(?!\s*“).*?)”
  • replace with $1

B

  • 1 LD, remove 1 LD
  • 2 LD, remove 1 LD
  • regex: ^“(\s*(?:“)?)
  • replace with $1

C

  • 1 LD and 1 RD, do nothing
  • 0 LD and 1 RD, remove the RD
  • regex: ^(?!“)(.*?)”
  • replace with $1
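
The three expressions above can be tried one at a time with String.replaceAll. This is a minimal sketch, not a definitive implementation: the class name SeparateQuoteRules and the method names ruleA, ruleB and ruleC are hypothetical, the quotes are written as Unicode escapes (\u201C for LD, \u201D for RD), and each input is assumed to be a single paragraph.

```java
public class SeparateQuoteRules {
    // A: exactly one leading LD followed later by an RD -> drop that RD.
    static String ruleA(String s) {
        return s.replaceAll("^(\u201C(?!\\s*\u201C).*?)\u201D", "$1");
    }

    // B: drop one leading LD; a second LD, if present, is captured and kept.
    static String ruleB(String s) {
        return s.replaceAll("^\u201C(\\s*(?:\u201C)?)", "$1");
    }

    // C: no leading LD -> drop the first RD.
    static String ruleC(String s) {
        return s.replaceAll("^(?!\u201C)(.*?)\u201D", "$1");
    }

    public static void main(String[] args) {
        String oneLd = "\u201Cremove quotes\u201D";
        String twoLd = "\u201C\u201Cremove quotes\u201D";

        System.out.println(ruleA(oneLd));                    // RD removed
        System.out.println(ruleA(twoLd));                    // unchanged
        System.out.println(ruleB(oneLd));                    // LD removed
        System.out.println(ruleB(twoLd));                    // one LD left
        System.out.println(ruleC("remove quotes\u201D"));    // RD removed

        // Order matters: A then B keeps the RD, B then A removes it.
        System.out.println(ruleB(ruleA(twoLd)));
        System.out.println(ruleA(ruleB(twoLd)));
    }
}
```

The last two lines show why the order has to be chosen deliberately: on a paragraph with two LDs, running B first turns it into a one-LD paragraph, which A then matches.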
Ro Yo Mi answered Apr 10 '23 04:04