
Regex Wrapping Quotes

I am trying to wrap quotes around a certain section of content in a CSV file. The current layout is something like this:

 ###element1,element2,element3,element4,element5,element6,element7,element8, "element9,
element9,""element9"",element9,
element9,element9,""element9",element10,
###

The ### symbols mark a new line, and each new line should have one. The problem is that I need to get all of element9 into one set of double quotes. However, there are multiple double quotes within that area, which break the element up into new fields and make my table expand beyond the fields I initially set. So I believe I need to remove all the " marks between the start and end of element9 and then reintroduce one pair around the whole section.

I first approached this by trying to select everything between the 8th comma from the start and the 2nd comma from the end:

 ^((?:[^,]+,){8})(.+)((?:,[^,]*){2})$

and replacing with

$1"$2"$3

I also tried to target the starting ### and the ending ### to select those two elements, but with no success.

Any suggestions on how I can do this?

UPDATE

    ###BLAHBLAH,BLAHBLAH,BLAHBLAH,BLAHBLAH,BLAHBLAH,BLAHBLAH,BLAHBLAH,BLAHBLAH,BLAHBLAH,
BLAHBLAH,
BLAHBLAH,
BLAHBLAH, BLAHBLAH,
BLAHBLAH, BLAHBLAH,
BLAHBLAH,
"BLAHBLAH""",E,
###

The last field always seems to contain a capital letter. The fields before it vary in quotation placement, so to really target that whole section I need to work out how many commas along and how many back I need to go, remove the quotes, and then reinstate them in the correct positions.
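The comma-counting idea described above can be sketched without a regex at all. This is only an illustration in Python: the function name `requote_by_counting` and the `head`/`tail` parameters are hypothetical, and it assumes that exactly 8 fields precede the problem field and that exactly 2 commas follow it within one record.

```python
def requote_by_counting(record, head=8, tail=2):
    """Wrap everything between the first `head` fields and the last
    `tail` comma-separated pieces in a single pair of double quotes."""
    parts = record.split(',')
    # Not enough pieces to contain a middle section: leave untouched.
    if len(parts) < head + tail + 1:
        return record
    # Rejoin the middle pieces, which together form the broken field.
    middle = ','.join(parts[head:len(parts) - tail])
    # Drop every stray quote (and surrounding spaces), then re-wrap once.
    cleaned = middle.replace('"', '').strip()
    return ','.join(parts[:head] + ['"' + cleaned + '"'] + parts[-tail:])


sample = '###a,b,c,d,e,f,g,h, "i,\ni,""i"",i",J,\n###'
print(requote_by_counting(sample))
```

Note that this discards any quotes that were meant to be part of the data, which matches the stated goal of removing them before re-wrapping.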

Dan W asked Oct 15 '15 10:10
2 Answers

###(?:[^,]*,){8}\K([\s\S]*?)(?=,[^,]*,[^,]*?###)

Try this. Replace with "\1" or "$1". See the demo:

https://regex101.com/r/tD0dU9/13
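For anyone scripting this rather than using an editor, here is a rough Python adaptation of the pattern above. It is a sketch under assumptions, not the answer's exact method: Python's standard `re` module does not support `\K`, so the eight-field prefix is captured in a group and written back instead, and the inner quotes are stripped in a callback before the single pair is added. The names `RECORD` and `requote` are made up for the example.

```python
import re

# Group 1: "###" plus the first eight fields (stands in for \K).
# Group 2: the broken field, matched lazily up to the last two commas
#          that precede the next "###".
RECORD = re.compile(r'(###(?:[^,]*,){8})([\s\S]*?)(?=,[^,]*,[^,]*?###)')


def requote(text):
    def fix(match):
        prefix, body = match.group(1), match.group(2)
        # Remove all existing quotes, trim spaces, add one pair back.
        cleaned = body.replace('"', '').strip()
        return prefix + '"' + cleaned + '"'
    return RECORD.sub(fix, text)


sample = '###a,b,c,d,e,f,g,h, "i,\ni,""i"",i",J,\n###'
print(requote(sample))
```

Because the lookahead does not consume the trailing `###`, the same marker can also serve as the start of the next record, so a whole file can be passed to `requote` in one call.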

vks answered Oct 24 '22 19:10


/^(?:[^,]*,){8}([^#]*),[^,]*,[^,]*$/s

https://regex101.com/r/hU8yO6/1

I think the regexp you had is about right, except that it needs the /s modifier.
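The effect of the modifier can be checked with a small Python sketch, where `re.DOTALL` plays the role of /s. It applies the pattern from the question to a single record; the function name `wrap_with_dotall` and the sample record are assumptions made for the illustration.

```python
import re

# The pattern from the question, unchanged.
PATTERN = r'^((?:[^,]+,){8})(.+)((?:,[^,]*){2})$'

sample = '###a,b,c,d,e,f,g,h, "i,\ni,""i"",i",J,\n###'


def wrap_with_dotall(record):
    # re.DOTALL lets "." match newlines, so (.+) can span several lines.
    match = re.search(PATTERN, record, re.DOTALL)
    if match is None:
        return record
    # Strip the stray quotes from the middle group, then re-wrap once.
    cleaned = match.group(2).replace('"', '').strip()
    return match.group(1) + '"' + cleaned + '"' + match.group(3)


# Without the flag, (.+) stops at the first newline and nothing matches.
print(re.search(PATTERN, sample) is None)
print(wrap_with_dotall(sample))
```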

For Notepad++, you get the s modifier by ticking ". matches newline":

^(?:[^,]*,){8}([^#]*),[^,]*,[^,]*$

This looks like a good reference: http://docs.notepad-plus-plus.org/index.php/Regular_Expressions

You'll probably also want to add parentheses where appropriate to create capture groups.

Jeff Y answered Oct 24 '22 20:10