I'm creating a CSS editor and am trying to create a regular expression that can get data from a CSS document. This regex works if I have one property but I can't get it to work for all properties. I'm using preg/perl syntax in PHP.
(?<selector>[A-Za-z]+[\s]*)[\s]*{[\s]*((?<properties>[A-Za-z0-9-_]+)[\s]*:[\s]*(?<values>[A-Za-z0-9#, ]+);[\s]*)*[\s]*}
body { background: #f00; font: 12px Arial; }
Array(
[0] => Array(
[0] => body { background: #f00; font: 12px Arial; }
[selector] => Array(
[0] => body
)
[1] => Array(
[0] => body
)
[2] => font: 12px Arial;
[properties] => Array(
[0] => font
)
[3] => Array(
[0] => font
)
[values] => Array(
[0] => 12px Arial
[1] => background: #f00
)
[4] => Array(
[0] => 12px Arial
[1] => background: #f00
)
)
)
Array(
[0] => Array
(
[0] => body { background: #f00; font: 12px Arial; }
[selector] => body
[1] => body
[2] => font: 12px Arial;
[properties] => font
[3] => font
[values] => 12px Arial
[4] => 12px Arial
)
)
Thanks in advance for any help - this has been confusing me all afternoon!
You are trying to pull structure out of the data, and not just individual values. Regular expressions might could be painfully stretched to do the job, but you are really entering parser territory, and should be pulling out the big guns, namely parsers.
I have never used the PHP parser generating tools, but they look okay after a light scan of the docs. Check out LexerGenerator and ParserGenerator. LexerGenerator will take a bunch of regular expressions describing the different types of tokens in a language (in this case, CSS) and spit out some code that recognizes the individual tokens. ParserGenerator will take a grammar, a description of what things in a language are made up of what other things, and spit out a parser, code that takes a bunch of tokens and returns a syntax tree (the data structure that you are after.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With