
Regular Expression to Match a Single CSS Property

Tags: .net, regex, css

I currently have a large batch of HTML text and I have several CSS properties that resemble the following:

font:16px/normal Consolas;
font:16px/normal Arial;
font:12px/normal Courier;

which are also bundled with several other CSS properties and other associated HTML values and tags.

I've been trying to write a regular expression that will only grab these "font styles", so if I had the following two paragraphs:

<p style='font:16px/normal Arial; font-weight: x; color: y;'>Stack</p>
<span style='color: z; font:16px/normal Courier;'>Overflow</span>
<br />
<div style='font-family: Segoe UI; font-size: xx-large;'>Really large</div>

it would only match the properties beginning with font: and ending with a semicolon ;.

I've played around using RegexHero and the closest I have gotten was:

\b(?:font[\s*\\]*:[\s*\\]*?(\b.*\b);)

which yielded the following results:

font:bold;                   //Match
font:12pt/normal Arial;      //Match
font:16px/normal Consolas;   //Match
font:12pt/normal Arial;      //Match
property: value;             //Not a Match
property: value value value; //Not a Match

but when I dropped in a large block of HTML, things got muddled: large blocks were selected rather than just the text within the bounds specified above.

I'll be glad to provide any additional info and test data that I can.

Rion Williams asked Jun 13 '12 15:06



1 Answer

Try this. It uses [^;]*? in place of .*, so a match cannot run past the first semicolon:

\b((?:font:[^;]*?)(?:;|'))

Explanation

\b             # Assert position at a word boundary
(              # Match the regular expression below and capture its match into backreference number 1
   (?:            # Match the regular expression below
      font:          # Match the characters “font:” literally
      [^;]           # Match any character that is NOT a “;”
         *?             # Between zero and unlimited times, as few times as possible, expanding as needed (lazy)
   )
   (?:            # Match the regular expression below
                     # Match either the regular expression below (attempting the next alternative only if this one fails)
         ;              # Match the character “;” literally
      |              # Or match regular expression number 2 below (the entire group fails if this one fails to match)
         '              # Match the character “'” literally
   )
)
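
As a quick check, here is a minimal sketch that runs both patterns over the sample markup from the question. It is written in Python rather than .NET, on the assumption that the two engines treat these particular constructs the same way; the names html, answer_pattern, and question_pattern are made up for illustration.

```python
import re

# Sample markup copied from the question.
html = """<p style='font:16px/normal Arial; font-weight: x; color: y;'>Stack</p>
<span style='color: z; font:16px/normal Courier;'>Overflow</span>
<br />
<div style='font-family: Segoe UI; font-size: xx-large;'>Really large</div>"""

# Pattern from this answer: the lazy [^;]*? stops at the first ';' or "'".
answer_pattern = re.compile(r"\b((?:font:[^;]*?)(?:;|'))")

# Pattern from the question: the greedy .* runs to the last ';' on the line.
question_pattern = re.compile(r"\b(?:font[\s*\\]*:[\s*\\]*?(\b.*\b);)")

answer_matches = [m.group(1) for m in answer_pattern.finditer(html)]
question_matches = [m.group(0) for m in question_pattern.finditer(html)]

print(answer_matches)
# ['font:16px/normal Arial;', 'font:16px/normal Courier;']

print(question_matches[0])
# font:16px/normal Arial; font-weight: x; color: y;
```

The font-family and font-size declarations are skipped by both patterns because neither has a colon directly after "font". The ' alternative is there so that a final font: value with no trailing semicolon still ends at the closing single quote; a style attribute delimited by double quotes would need that character adjusted.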
Cylian answered Nov 08 '22 04:11