I have some poorly formatted text that I need to filter. In plenty of cases, a quote begins on one line, is cut off, and finishes on the next line. In such cases, I would like to remove the partial quotes completely, but I want to preserve regular full quotes. I know that this can be done iteratively with a counter, but I would really prefer to do it with regular expressions.
Take, for example:
"This is a quote"
This is an end "partial-
quote" Here is more text.
This is an end "partial-
quote w/o more text"
This is an "embedded" quote
Here is an example with my current attempt:
(\"[^\"\n]+?|^[^\"\n]+?\")(\n|$)
Note that it fails in two circumstances:
I figured that I could set up an if statement and run each line through it, checking whether the line has fewer than two quotes and then parsing the partial quotes, but I thought the minds at SO would have a much cleaner solution.
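For comparison, the iterative approach mentioned above could look roughly like the sketch below. It is only an illustration, not code from the question: the function name drop_partial_quotes is hypothetical, and it assumes that a double quote is always a delimiter (no escaping) and that a quote counts as partial exactly when it contains a line break.

```python
def drop_partial_quotes(text: str) -> str:
    """Remove quotes that span a line break; keep quotes that fit on one line."""
    out = []        # pieces of the cleaned text
    quoted = []     # characters of the quote currently being read
    in_quote = False
    for ch in text:
        if ch == '"':
            if in_quote:
                quoted.append(ch)
                piece = "".join(quoted)
                # Keep the quote only if it stayed on a single line;
                # otherwise leave just the line break it swallowed.
                out.append(piece if "\n" not in piece else "\n")
                quoted = []
            else:
                quoted = [ch]
            in_quote = not in_quote
        elif in_quote:
            quoted.append(ch)
        else:
            out.append(ch)
    # Assumption: an unterminated quote at the end of the text is kept as-is.
    out.extend(quoted)
    return "".join(out)


sample = (
    '"This is a quote"\n'
    'This is an end "partial-\n'
    'quote" Here is more text.\n'
    'This is an end "partial-\n'
    'quote w/o more text"\n'
    'This is an "embedded" quote'
)
result = drop_partial_quotes(sample)
print(result)
```

This sketch can leave stray spaces and an empty line where a partial quote was removed, so it still relies on a later whitespace cleanup.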
NOTE: The desired output is:
"This is a quote"
This is an end
Here is more text.
This is an end
This is an "embedded" quote
(I handle the whitespace later on.)
Here you go,
^((?:[^"\n]*"[^"\n]*")*[^"\n]*)"[^"\n]*\n[^"\n]*"(\n|)
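Replace the matched characters with \1\n. The ^ anchor has to match at the start of every line, so enable multiline mode (the (?m) flag in the Python demo).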
DEMO
>>> import re
>>> s = '''"This is a quote"
This is an end "partial-
quote" Here is more text.
This is an end "partial-
quote w/o more text"
This is an "embedded" quote'''
>>> m = re.sub(r'(?m)^((?:[^"\n]*"[^"\n]*")*[^"\n]*)"[^"\n]*\n[^"\n]*"(\n|)', r'\1\n', s)
>>> print(m)
"This is a quote"
This is an end
Here is more text.
This is an end
This is an "embedded" quote
Use this regex if a quoted passage can run across more than two lines.
^((?:[^"\n]*"[^"\n]*")*[^"\n]*)"(?:[^"\n]*\n)+[^"\n]*"(\n|)
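As a quick check of the multi-line variant, here is a minimal Python sketch. The sample text (a partial quote spread over three lines) and the name MULTILINE_PARTIAL are made up for illustration; the pattern and the \1\n replacement are the ones given above, with (?m) added as in the earlier demo.

```python
import re

# Pattern from the answer, with (?m) so that ^ matches at every line start.
MULTILINE_PARTIAL = re.compile(
    r'(?m)^((?:[^"\n]*"[^"\n]*")*[^"\n]*)"(?:[^"\n]*\n)+[^"\n]*"(\n|)'
)

# Hypothetical sample: the partial quote spans three lines.
sample = (
    '"This is a quote"\n'
    'This is an end "partial-\n'
    'middle line\n'
    'quote" Here is more text.\n'
    'This is an "embedded" quote'
)

# Keep everything before the partial quote (group 1) and restore the line break.
cleaned = MULTILINE_PARTIAL.sub(r'\1\n', sample)
print(cleaned)
```

The full quotes on the first and last lines are left alone, while the three-line partial quote is dropped. Leading and trailing spaces around the removed quote remain and would need the same later cleanup as before.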
("[^"\n]*")|"[^"]*(\n)[^"]*"(?![^\n]*")|"[^"]*\n.*?(?=\n[^"]*"[^\n"]*")
You can try this. It will take care of an odd number of quotes as well. See the demo.
https://regex101.com/r/dL7oF8/6