I am looking for a way to find JSON data in a string. Think about it like wordpress shortcodes. I figure the best way to do it would be a regular Expression. I do not want to parse the JSON, just find all occurences.
Is there a way in regex to have matching numbers of parentheses? Currently I run into that problem when having nested objects.
Quick example for demonstration:
This is a funny text about stuff, look at this product {"action":"product","options":{...}}. More Text is to come and another JSON string {"action":"review","options":{...}}
As a result i would like to have the two JSON strings. Thanks!
Since you're looking for a simplistic solution, you can use the following regular expression that makes use of recursion to solve the problem of matching set of parentheses. It matches everything between {
and }
recursively.
Although, you should note that this isn't guaranteed to work with all possible cases. It only serves as a quick JSON-string extraction method.
$pattern = ' / \{ # { character (?: # non-capturing group [^{}] # anything that is not a { or } | # OR (?R) # recurses the entire pattern )* # previous group zero or more times \} # } character /x '; preg_match_all($pattern, $text, $matches); print_r($matches[0]);
Output:
Array ( [0] => {"action":"product","options":{...}} [1] => {"action":"review","options":{...}} )
Regex101 Demo
In PHP, the only way to know if a JSON-string is valid is by applying json_decode()
. If the parser understands the JSON-string and is according to the defined standards, json_decode()
will create an object / array representation of the JSON-string.
If you'd like to filter out those that aren't valid JSON, then you can use array_filter()
with a callback function:
function isValidJSON($string) { json_decode($string); return (json_last_error() == JSON_ERROR_NONE); } $valid_jsons_arr = array_filter($matches[0], 'isValidJSON');
Online demo
Javascript folks looking for similar regex. The (?R) which is recursive regex pattern is not supported by javascript, python, and other languages as such.
Note: It's not 1 on 1 replacement.
\{(?:[^{}]|(?R))*\} # PCRE Supported Regex
Steps:
?R
which copied string example\{(?:[^{}]|(?R))*\}
=> \{(?:[^{}]|())*\}
\{(?:[^{}]|(\{(?:[^{}]|(?R))*\}))*\}
=> \{(?:[^{}]|(\{(?:[^{}]|())*\}))*\}
\{(?:[^{}]|(?<n times>))*\}
?R
with blank string.Done.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With