Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find JSON strings in a string

Tags:

I am looking for a way to find JSON data in a string. Think about it like wordpress shortcodes. I figure the best way to do it would be a regular Expression. I do not want to parse the JSON, just find all occurences.

Is there a way in regex to have matching numbers of parentheses? Currently I run into that problem when having nested objects.

Quick example for demonstration:

This is a funny text about stuff, look at this product {"action":"product","options":{...}}. More Text is to come and another JSON string {"action":"review","options":{...}} 

As a result i would like to have the two JSON strings. Thanks!

like image 658
rootman Avatar asked Feb 24 '14 17:02

rootman


Video Answer


2 Answers

Extracting the JSON string from given text

Since you're looking for a simplistic solution, you can use the following regular expression that makes use of recursion to solve the problem of matching set of parentheses. It matches everything between { and } recursively.

Although, you should note that this isn't guaranteed to work with all possible cases. It only serves as a quick JSON-string extraction method.

$pattern = ' / \{              # { character     (?:         # non-capturing group         [^{}]   # anything that is not a { or }         |       # OR         (?R)    # recurses the entire pattern     )*          # previous group zero or more times \}              # } character /x ';  preg_match_all($pattern, $text, $matches); print_r($matches[0]); 

Output:

Array (     [0] => {"action":"product","options":{...}}     [1] => {"action":"review","options":{...}} ) 

Regex101 Demo


Validating the JSON strings

In PHP, the only way to know if a JSON-string is valid is by applying json_decode(). If the parser understands the JSON-string and is according to the defined standards, json_decode() will create an object / array representation of the JSON-string.

If you'd like to filter out those that aren't valid JSON, then you can use array_filter() with a callback function:

function isValidJSON($string) {     json_decode($string);     return (json_last_error() == JSON_ERROR_NONE); }  $valid_jsons_arr = array_filter($matches[0], 'isValidJSON'); 

Online demo

like image 180
Amal Murali Avatar answered Sep 19 '22 15:09

Amal Murali


Javascript folks looking for similar regex. The (?R) which is recursive regex pattern is not supported by javascript, python, and other languages as such.

Note: It's not 1 on 1 replacement.

 \{(?:[^{}]|(?R))*\} # PCRE Supported Regex 

Steps:

  1. Copy the whole regex and replace ?R which copied string example
  • level 1 json => \{(?:[^{}]|(?R))*\} => \{(?:[^{}]|())*\}
  • level 2 json => \{(?:[^{}]|(\{(?:[^{}]|(?R))*\}))*\} => \{(?:[^{}]|(\{(?:[^{}]|())*\}))*\}
  • level n json => \{(?:[^{}]|(?<n times>))*\}
  1. when decided to stop at some level replace ?R with blank string.

Done.

like image 37
Krishna Avatar answered Sep 22 '22 15:09

Krishna