I am trying to wrap quotes around a certain section of content in a CSV file. The current layout is something like this:
###element1,element2,element3,element4,element5,element6,element7,element8, "element9,
element9,""element9"",element9,
element9,element9,""element9",element10,
###
The ### symbols mark a new line, and each new line should have one. The problem is that I need to get all of element9 into one set of double quotes. However, there are multiple double quotes within that area, which break the element up into new fields and make my table expand beyond the fields I initially set. So I believe I need to remove all the " marks between the start and end of element9 and then reintroduce one set to enclose the whole section.
I first approached this by trying to select everything between the 8th comma from the start and the 2nd comma from the end:
^((?:[^,]+,){8})(.+)((?:,[^,]*){2})$
and replacing with
$1"$2"$3
I also tried to target the starting ### and ending ### to select those two elements, but with no success.
Any suggestions on how I can do this?
UPDATE
###BLAHBLAH,BLAHBLAH,BLAHBLAH,BLAHBLAH,BLAHBLAH,BLAHBLAH,BLAHBLAH,BLAHBLAH,BLAHBLAH,
BLAHBLAH,
BLAHBLAH,
BLAHBLAH, BLAHBLAH,
BLAHBLAH, BLAHBLAH,
BLAHBLAH,
"BLAHBLAH""",E,
###
The last field always seems to contain a capital letter. The fields before it vary in quotation placement, so to really target that whole section I need to work out how many commas along and how many back I need to go, remove the quotes, and then reinstate them in the correct positions.
What a great opportunity to explore how useful backreferences can be! Basically, we can use a capturing group's backreference to tell the Regex engine that a string should end with the same quote character that started it.
Next we want to match any string until we encounter an unescaped quote, but it must be the SAME kind of quote (single vs. double) that was matched at the beginning. This is where backreferences come in: we need to reference what was matched at the start in order to tell the engine what to look for.
1. Match a single or double quote, as long as it's not preceded by \
2. Store that match in a way that I can reference later (with \1).
3. Continue matching ANY characters...
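The steps above can be sketched in Python roughly as follows. This is only an illustration: the pattern and the helper name `quoted_strings` are assumptions, not something taken from the original tester page. The lookbehind `(?<!\\)` skips quotes preceded by a backslash, and `\1` forces the closing quote to be the same character as the opening one.

```python
import re

# Group 1 captures the opening quote (single or double), provided it is
# not preceded by a backslash. Group 2 lazily captures the content.
# \1 requires the closing quote to match the opening one, and the second
# lookbehind skips backslash-escaped quotes inside the string.
QUOTED = re.compile(r"""(?<!\\)(["'])(.*?)(?<!\\)\1""", re.DOTALL)


def quoted_strings(text):
    """Return the contents of every quoted string found in text."""
    return [m.group(2) for m in QUOTED.finditer(text)]


print(quoted_strings('say "it\'s fine" and \'ok\''))
```

Because the closing quote is tied to group 1, the apostrophe inside a double-quoted string does not end the match early.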
###(?:[^,]*,){8}\K([\s\S]*?)(?=,[^,]*,[^,]*?###)
Try this. Replace by "\1" or "$1". See demo.
https://regex101.com/r/tD0dU9/13
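If the cleanup is scripted rather than done in an editor, a rough Python equivalent might look like the sketch below. Python's built-in `re` module does not support `\K`, so this version captures the leading part in a group and writes it back instead. The function name `requote_field9` is hypothetical, and the sketch assumes every record starts with ### and is followed by another ### marker, as in the question's sample. It also strips the inner quotes, which is what the question asks for.

```python
import re

# Group 1: the ### marker plus the first eight comma-terminated fields.
# Group 2: field 9, matched lazily up to the point where exactly two
# more comma-separated fields remain before the next ### marker.
FIELD9 = re.compile(r'(###(?:[^,]*,){8})([\s\S]*?)(?=,[^,]*,[^,]*?###)')


def requote_field9(text):
    """Strip stray double quotes from field 9 and wrap it in one pair."""
    def fix(match):
        cleaned = match.group(2).replace('"', '')
        return match.group(1) + '"' + cleaned + '"'
    return FIELD9.sub(fix, text)


sample = (
    '###a1,a2,a3,a4,a5,a6,a7,a8,"x,\ny,""z""",a10,\n'
    '###b1,b2,b3,b4,b5,b6,b7,b8,"p,""q",b10,\n'
    '###'
)
print(requote_field9(sample))
```

The lookahead does not consume the trailing ###, so the same marker can serve as the start of the next record.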
/^(?:[^,]*,){8}([^#]*),[^,]*,[^,]*$/s
https://regex101.com/r/hU8yO6/1
I think the regexp you had is about right, except for needing the /s modifier.
For Notepad++, get the s modifier by ticking ". matches newline":
^(?:[^,]*,){8}([^#]*),[^,]*,[^,]*$
This looks like a good reference: http://docs.notepad-plus-plus.org/index.php/Regular_Expressions
You'll probably also want to add parentheses around the leading and trailing parts so that they become capture groups you can reuse in the replacement.
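To see the effect of the s modifier outside Notepad++, here is a small Python sketch in which `re.DOTALL` plays the role of /s. It reuses the pattern from the question unchanged. The record below is made-up sample data with the ### markers already removed, and `requote` is a hypothetical helper. Note that the question's pattern uses `[^,]+`, so it assumes the first eight fields are non-empty.

```python
import re

# The pattern from the question, unchanged.
PATTERN = r'^((?:[^,]+,){8})(.+)((?:,[^,]*){2})$'

# One record whose ninth field spans two lines and contains stray quotes.
record = 'e1,e2,e3,e4,e5,e6,e7,e8,"a,\nb,""c""",e10,'

# Without DOTALL, "." cannot cross the newline inside field 9.
no_flag = re.match(PATTERN, record)

# With DOTALL (the equivalent of /s), "." also matches newlines.
with_flag = re.match(PATTERN, record, re.DOTALL)


def requote(match):
    """Rebuild the record with field 9 stripped of quotes and rewrapped."""
    cleaned = match.group(2).replace('"', '')
    return match.group(1) + '"' + cleaned + '"' + match.group(3)


fixed = requote(with_flag)
print(no_flag is None)
print(fixed)
```

The only difference between the two calls is the flag, which matches the point made above that the original expression was close to working.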