I need to parse certain invalid JSON in Ruby.
Something like:
json_str = '{name:"Javier"}'
ActiveSupport::JSON.decode json_str
As you can see, it's invalid because the hash key is not quoted. It should be:
json_str = '{"name":"Javier"}'
But the input can't be changed, so I have to parse it with the keys unquoted.
I could parse it with ActiveSupport 2.x, but ActiveSupport 3 won't accept it. It raises:
Yajl::ParseError: lexical error: invalid string in json text.
{name:"Javier"}
(right here) ------^
By the way, it's a Ruby application that uses some Rails libraries, but it's not a Rails application.
Thanks in advance
I would use a regular expression to fix this invalid JSON:
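require 'yajl'  # provided by the yajl-ruby gem
json_str = '{name:"Javier"}'
# Wrap every bare or single-quoted key in double quotes.
json_str.gsub!(/(['"])?([a-zA-Z0-9_]+)(['"])?:/, '"\2":')
hash = Yajl::Parser.parse(json_str)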
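To try this substitution without the yajl-ruby gem, here is a minimal sketch that uses Ruby's standard json library instead (an assumption; the answer itself uses Yajl). The helper name quote_keys_loose is made up for illustration. The sketch also shows one caveat of this pattern: it has no notion of where a key may appear, so a word followed by a colon inside a string value gets rewritten too.

```ruby
require 'json'

# Hypothetical helper (not from the answer): same regex, non-mutating gsub.
def quote_keys_loose(str)
  str.gsub(/(['"])?([a-zA-Z0-9_]+)(['"])?:/, '"\2":')
end

fixed = quote_keys_loose('{name:"Javier"}')
puts fixed
puts JSON.parse(fixed).inspect

# Caveat: "http:" inside the value also looks like a key to this regex,
# so the rewritten string is no longer valid JSON.
broken = quote_keys_loose('{url:"http://example.com"}')
puts broken
begin
  JSON.parse(broken)
rescue JSON::ParserError
  puts 'still invalid JSON'
end
```

If the values can contain colons (URLs, timestamps), a pattern that anchors on the character before the key is probably a safer choice.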
Here's a somewhat robust regex you can use. It's not perfect: it fails in some corner cases where the values themselves contain JSON-like text, but it works in most common cases:
quoted_json = unquoted_json.gsub(/([{,]\s*)(\w+)(\s*:\s*["\d])/, '\1"\2"\3')
First it looks for either a { or a , which are the possible characters preceding a key name (it also allows any amount of whitespace with \s*). It captures this as a group:
([{,]\s*)
Then it captures the key itself, which is composed of letters, digits, and underscores (for which regex conveniently supplies the \w character class):
(\w+)
Finally, it matches what must follow a key name, i.e. a colon followed by either an opening quote (for a string value) or a digit (for a numeric value). It also allows extra whitespace and captures the whole thing in a group:
(\s*:\s*["\d])
For each match, it just puts the three pieces back together, but with quotes around the key (so quotes around capture group #2):
'\1"\2"\3'
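Putting the three groups together, a minimal end-to-end sketch might look like the following. It assumes Ruby's standard json library as the parser, and the names KEY_PATTERN and quote_keys are made up for illustration. The last part reproduces the corner case mentioned above, where a value contains JSON-like text.

```ruby
require 'json'

# Hypothetical names; the pattern itself is the one from the answer.
KEY_PATTERN = /([{,]\s*)(\w+)(\s*:\s*["\d])/

def quote_keys(unquoted_json)
  # Reassemble the three captures with quotes around the key (group 2).
  unquoted_json.gsub(KEY_PATTERN, '\1"\2"\3')
end

quoted = quote_keys('{name:"Javier", age: 30}')
puts quoted
data = JSON.parse(quoted)
puts data['name']

# A colon inside a value is left alone, because no { or , precedes it.
url = JSON.parse(quote_keys('{url:"http://example.com"}'))
puts url['url']

# Corner case: ",b:1" inside the string looks like a key/value pair,
# so the value gets corrupted and the result does not parse.
mangled = quote_keys('{note:"a,b:1"}')
puts mangled
begin
  JSON.parse(mangled)
rescue JSON::ParserError
  puts 'value contained JSON-like text; result is invalid'
end
```

Note that, as written, the pattern only quotes keys whose value starts with a double quote or a digit. Keys followed by a nested object, an array, true, false, null, or a negative number are left unquoted, so the final character class would need extending for such input.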