
Regex for key-value-pair including unescaped whitespace

I need a regex for parsing key-value pairs from a properties file so that I can write them into a database. The application is written in Java. As I need to store information about comment lines and empty lines, Properties.load does not work for me.

The key is everything up to the first unescaped whitespace character or equals sign (escaped whitespace belongs to the key). The value is everything after that up to the end of the line, and it can also be empty.

It has to match the following cases:

  • key=value
  • key value
  • key=value value
  • key
  • key value value
  • key\ key\ key=value
  • key\ key\ key value

I tried the following regex, but it does not separate the last two cases correctly:

^(\\\s|[^\s=]+)+[\s|=](.*)?$

For the last two examples I get on Rubular:

1. key\
2. key\ key value

instead of

1. key\ key\ key
2. value

I also tried this, but it does not work for me either.

Thanks in advance for help!

Sebastian asked Nov 05 '22 17:11


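1 Answer

You want to use a negative lookbehind, (?<!\\\\)\\s, when checking for the whitespace separator, so that whitespace preceded by a backslash is not treated as the end of the key. Written as a Java string literal, the full pattern is: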

^((.*?)((?<!\\\\)\\s|=)(.*?)|(\\w+))$
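To see the lookbehind on its own, here is a minimal Java sketch. The class name LookbehindDemo and the helper firstUnescapedWhitespace are made up for illustration and are not part of the original answer. It only finds the position of the first whitespace character that is not directly preceded by a backslash.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookbehindDemo {
    // The regex (?<!\\)\s written as a Java string literal:
    // a whitespace character that is not directly preceded by a backslash.
    private static final Pattern UNESCAPED_WS = Pattern.compile("(?<!\\\\)\\s");

    /** Returns the index of the first unescaped whitespace, or -1 if there is none. */
    static int firstUnescapedWhitespace(String line) {
        Matcher m = UNESCAPED_WS.matcher(line);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        // In Java source, "\\ " is the two characters backslash + space.
        String line = "key\\ key\\ key value";
        int idx = firstUnescapedWhitespace(line);
        System.out.println(line.substring(0, idx));   // everything before the separator
        System.out.println(line.substring(idx + 1));  // everything after the separator
    }
}
```

The two escaped spaces are skipped because the lookbehind sees a backslash immediately before them, so the split happens at the last space only.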

Breaking it down

(.*?)             Match everything, non-greedily, up to the separator (the key)
((?<!\\\\)\\s|=)  Match a whitespace character not preceded by a backslash, or an equals sign
(.*?)             Again match everything, non-greedily, up to the end of the line (the value)
|(\\w+)           Or match a line of word characters only - this captures the bare key case with no value
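As a rough sketch of how this pattern could be used from Java, the example below runs it against the seven cases from the question. The class PropertyLineParser and its parse method are hypothetical names chosen for this illustration. The group numbers follow the parentheses in the pattern above: group 2 is the key, group 4 is the value, and group 5 is a key that stands alone.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PropertyLineParser {
    // The pattern from the answer, as a Java string literal.
    private static final Pattern LINE =
            Pattern.compile("^((.*?)((?<!\\\\)\\s|=)(.*?)|(\\w+))$");

    /** Returns {key, value}, or null if the line does not match the pattern. */
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        if (m.group(5) != null) {
            // Second alternative: a bare key with no separator and no value.
            return new String[] { m.group(5), "" };
        }
        return new String[] { m.group(2), m.group(4) };
    }

    public static void main(String[] args) {
        String[] cases = {
            "key=value", "key value", "key=value value", "key",
            "key value value", "key\\ key\\ key=value", "key\\ key\\ key value"
        };
        for (String line : cases) {
            String[] kv = parse(line);
            System.out.println("[" + kv[0] + "] -> [" + kv[1] + "]");
        }
    }
}
```

Note that the pattern does not treat comment lines or empty lines specially, so those would still need a separate check before calling parse.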

Each case was tested with the tool here: http://www.cis.upenn.edu/~matuszek/General/RegexTester/regex-tester.html

cordsen answered Nov 09 '22 16:11