
Regex to escape double quotes inside double quotes with preg_replace

Tags:

php

I've been trying to escape all double quotes inside double quotes (yeah, crazy) all day long and I'm finally giving up. I have this kind of data:

{ "test": "testing with "data" like this", "subject": "trying the "special" chars" }

I've been trying to use preg_replace to turn every " into \" inside something like /"(.*)+, "/, which is meant to match everything inside double quotes, followed by a comma and a space.

I need a way to turn this:

{ "test": "testing with "data" like this", "subject": "trying the "special" chars" }

Into this:

{ "test": "testing with \"data\" like this", "subject": "trying the \"special\" chars" }

Using preg_replace.

vinnylinux asked Sep 05 '12 22:09


1 Answer

Looking at your regex, I would suggest reading up on regex greediness. If you select everything from a quote up to a comma with a greedy .*, you will run into problems. The first capture would be roughly test": "testing with "data" like this, so replacing every " with \" would give test\": \"testing with \"data\" like this, which is obviously not what you want. I would recommend something like this instead:

/"((?:.|\n)*?)"\s*[:,}]\s*/

Explanation

  • "((?:.|\n)*?)" - captures any characters (including newlines) between two quotation marks; the *? makes it lazy, so it takes the minimum amount that still lets the whole pattern match
  • \s* - matches 0 or more whitespace characters
  • [:,}] - matches a colon, a comma, or a closing brace
  • \s* - matches 0 or more whitespace characters

Using this regex on your data, the first capture is test. The next capture is testing with "data" like this, so after replacing the quotes you get testing with \"data\" like this.
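
To see the greediness issue in isolation, here is a minimal sketch that compares a greedy and a lazy quantifier on the same data. Both patterns are simplified stand-ins chosen for illustration (the greedy one drops the extra + from the question's pattern); neither is the recommended regex itself.

```php
<?php
// The asker's sample data (not valid JSON: the inner quotes are unescaped).
$data = '{ "test": "testing with "data" like this", "subject": "trying the "special" chars" }';

// Greedy: .* grabs as much as possible, then backtracks to the LAST ', "'.
preg_match('/"(.*), "/', $data, $greedy);

// Lazy: .*? grabs as little as possible and stops at the very next quote.
preg_match('/"(.*?)"/', $data, $lazy);

echo $greedy[1], PHP_EOL; // test": "testing with "data" like this"
echo $lazy[1], PHP_EOL;   // test
```

The greedy capture swallows the key, the colon, and the whole value in one go, which is why a blanket replace inside it escapes the structural quotes as well.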


UPDATE
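$test = '{ "test": "testing with "data" like this", "subject": "trying the "special" chars" }';
$pattern = '/"((?:.|\n)*?)"\s*[:,}]\s*/';
// Collect every key and value; group 1 holds the text between the outer quotes.
preg_match_all($pattern, $test, $matches);
$answers = []; // initialise so the array exists even when nothing matches
foreach ($matches[1] as $match) {
    // Escape only the quotes that sit inside a captured key or value.
    $answers[] = str_replace('"', '\\"', $match);
}
print_r($answers);
// Outputs
// Array ( [0] => test [1] => testing with \"data\" like this [2] => subject [3] => trying the \"special\" chars )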
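
The snippet above only collects the escaped pieces. If the goal is the repaired string itself, one possible sketch is to do the same escaping inside preg_replace_callback. The second capture group (for the trailing delimiter) and the json_decode check are additions for illustration and are not part of the original pattern.

```php
<?php
$test = '{ "test": "testing with "data" like this", "subject": "trying the "special" chars" }';

// Same pattern as above, plus a second group that keeps the trailing
// delimiter (whitespace, then : , or }) so it can be put back unchanged.
$pattern = '/"((?:.|\n)*?)"(\s*[:,}]\s*)/';

$fixed = preg_replace_callback($pattern, function (array $m): string {
    // Escape only the quotes found inside the captured key or value.
    return '"' . str_replace('"', '\\"', $m[1]) . '"' . $m[2];
}, $test);

echo $fixed, PHP_EOL;
// { "test": "testing with \"data\" like this", "subject": "trying the \"special\" chars" }

// The repaired string should now parse as JSON.
$decoded = json_decode($fixed, true);
echo $decoded['test'], PHP_EOL; // testing with "data" like this
```

Note that this assumes the input contains no quotes that are already escaped; those would pick up a second backslash.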


UPDATE 2

I think using preg_match_all and then str_replace is the better way to solve your problem, because that regex is simpler and much more robust. But if you insist on using preg_replace, you can use this code:

$string = '{ "test": "testing with "data" like this", "subject": "trying the "special" chars" }';
$pattern = '/(?<!:|: )"(?=[^"]*?"(( [^:])|([,}])))/';
$string = preg_replace($pattern, '\\"', $string);
print_r($string);
//Outputs
//{ "test": "testing with \"data\" like this", "subject": "trying the \"special\" chars" }

Explanation

  • (?<! - starts a negative lookbehind
  • :|: ) - matches a colon or a colon with a space and ends the lookbehind
  • " - matches a quotation
  • (?= - starts a positive lookahead
  • [^"]*? - match anything except a quotation; the minimum amount while still having the pattern be true
  • "(( [^:])|([,}])) - matches a quotation mark followed by a space and any character other than a colon, OR a quotation mark followed by either a comma or a closing brace
  • ) - ends the lookahead

Lookaheads and lookbehinds are worth reading up on if they are new to you. I think this regex is messy, although technically it works. I was going to keep playing with it to make it better, but I'm tired so I'm going to bed now. This regex also lets the spacing in your data vary. Both of these work, as does any combination of them:

{ "test" : "testing with "data" like this" , "subject" : "trying the "special" chars" }
{"test":"testing with "data" like this","subject":"trying the "special" chars"}
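
As a quick check of the claim above, this sketch runs the preg_replace pattern over the original data and both spacing variants, then confirms that each result decodes as JSON. The json_decode check is an addition for illustration.

```php
<?php
// Pattern from the preg_replace approach above.
$pattern = '/(?<!:|: )"(?=[^"]*?"(( [^:])|([,}])))/';

$inputs = [
    '{ "test": "testing with "data" like this", "subject": "trying the "special" chars" }',
    '{ "test" : "testing with "data" like this" , "subject" : "trying the "special" chars" }',
    '{"test":"testing with "data" like this","subject":"trying the "special" chars"}',
];

$results = [];
foreach ($inputs as $input) {
    $fixed = preg_replace($pattern, '\\"', $input);
    // json_decode returns null when the string is still not valid JSON.
    $results[] = json_decode($fixed, true);
    echo $fixed, PHP_EOL;
}
```

All three inputs decode to the same two-element array. The pattern still depends on the shape of the data: because of the lookbehind, an inner quote that directly follows a colon (or a colon and a space) would be left unescaped.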
Aust answered Oct 29 '22 12:10