How to properly escape reserved regex characters in JSON?

Tags: python, json, regex

I have a JSON file containing regular expressions that I want to use in my Python code. The problem arises when I try to escape reserved regex characters in the JSON file: when I run the Python code, it cannot parse the JSON file and throws an exception.

I have already debugged the code and concluded that it fails when calling json.loads(ruleFile.read()). Apparently only certain characters can be escaped in JSON, and the dot is not one of them, which causes a syntax error.

try:
    with open(args.rules, "r") as ruleFile:
        rules = json.loads(ruleFile.read())
        for rule in rules:
            rules[rule] = re.compile(rules[rule])
except (IOError, ValueError) as e:
    raise Exception("Error reading rules file")

{
    "Rule 1": "www\.[a-z]{3,10}\.com"
}

Traceback (most recent call last):
  File "foo.py", line 375, in <module>
    main()
  File "foo.py", line 67, in main
    raise Exception("Error reading rules file")
Exception: Error reading rules file

How do I work around this JSON syntax problem?

Vuudi asked Sep 01 '25 01:09


1 Answer

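The backslash itself needs to be escaped in JSON. The sequence \. is not a valid JSON escape, so the parser rejects the file; write each backslash as \\ instead. After parsing, the string contains a single backslash, which is what the regex engine expects.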

{
    "Rule 1": "www\\.[a-z]{3,10}\\.com"
}
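
To see the difference in practice, here is a minimal sketch that parses both variants of the rule from in-memory strings rather than from a file. The variable names (bad_text, good_text) and the sample hostname are made up for illustration, and it assumes Python 3.5 or later, where json.JSONDecodeError exists as a subclass of ValueError.

```python
import json
import re

# Raw strings keep the backslashes exactly as they would appear in the file.
bad_text = r'{"Rule 1": "www\.[a-z]{3,10}\.com"}'     # single backslash: invalid JSON
good_text = r'{"Rule 1": "www\\.[a-z]{3,10}\\.com"}'  # doubled backslash: valid JSON

# The unescaped variant is rejected by the JSON parser.
try:
    json.loads(bad_text)
    bad_error = None
except json.JSONDecodeError as exc:  # a subclass of ValueError
    bad_error = exc

# The escaped variant parses; the decoded string holds single backslashes.
rules = json.loads(good_text)
pattern = rules["Rule 1"]
compiled = re.compile(pattern)

print("parse error:", bad_error)
print("decoded pattern:", pattern)
print("matches www.example.com:", bool(compiled.fullmatch("www.example.com")))
```

Because the question's except clause replaces the original error with a generic message, printing or re-raising the caught exception, as done here, is what reveals that the invalid escape is the actual cause.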

From here:

The following characters are reserved in JSON and must be properly escaped to be used in strings:

  • Backspace is replaced with \b
  • Form feed is replaced with \f
  • Newline is replaced with \n
  • Carriage return is replaced with \r
  • Tab is replaced with \t
  • Double quote is replaced with \"
  • Backslash is replaced with \\
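
If the rules file is generated by a script rather than written by hand, one way to avoid escaping mistakes is to let json.dumps apply the replacements listed above. The following is only a sketch under that assumption; the rules dictionary simply reuses the pattern from the question.

```python
import json

# Write the regex once as a Python raw string; no manual JSON escaping needed.
rules = {"Rule 1": r"www\.[a-z]{3,10}\.com"}

# json.dumps doubles the backslashes (and escapes quotes, tabs, newlines, ...).
text = json.dumps(rules, indent=4)
print(text)

# Loading the generated text gives back the original pattern unchanged.
round_tripped = json.loads(text)
print(round_tripped == rules)
```

The printed JSON should contain the doubled backslashes, and loading it again returns the pattern with single backslashes, ready for re.compile.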

Robby Cornelissen answered Sep 02 '25 13:09