
Get URL of images in CSS file using Java?

I'm trying to get the URLs for images (all MIME types) in a remote CSS file using Java.

I am using jsoup to get the URL of the CSS file.

After countless hours of looking at CSS Parser I couldn't figure it out due to the lack of documentation.

I also looked at some other threads, but they have just confused me even more:

  • Parsing a css file with java
  • Looking for a CSS Parser in java

I've also seen some examples using regex, but I am not too familiar with how to implement them in Java.

Would anyone have suggestions on how to approach this problem?

asked Nov 21 '11 06:11 by pbojinov




1 Answer

In Java, you have to use a Pattern and a Matcher from the java.util.regex package.

You compile your pattern, create a matcher for your string, and then loop over everything that matches the pattern.

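import java.util.regex.Matcher;
import java.util.regex.Pattern;

Pattern p = Pattern.compile("...");  // put your regex here
Matcher m = p.matcher("your CSS file as a String");
while (m.find()) {
  // m.group() is the whole match; m.group(1), m.group(2), ... are the capture groups
}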

The CSS 2.1 spec states:

The format of a URI value is 'url(' followed by optional white space followed by an optional single quote (') or double quote (") character followed by the URI itself, followed by an optional single quote (') or double quote (") character followed by optional white space followed by ')'. The two quote characters must be the same.

Thus you could use a regex like this one:

url\(\s*(['"]?+)(.*?)\1\s*\)

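The .*? is non-greedy, so it matches as few characters as necessary. The possessive quantifier in ['"]?+ prevents backtracking over the optional quote, and \1 requires the closing quote to be the same as the opening one. The URL itself ends up in group 2.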
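
To put the two pieces together, here is a minimal sketch of how the regex might be used from Java. The class name CssUrlExtractor, the method extractUrls, and the sample CSS are made up for illustration; only Pattern and Matcher come from the JDK. Note that the backslashes and the double quote have to be escaped inside a Java string literal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
final class CssUrlExtractor {
    // The regex above, escaped for a Java string literal.
    // Group 1 is the optional quote, group 2 is the URL itself.
    private static final Pattern URL_PATTERN =
            Pattern.compile("url\\(\\s*(['\"]?+)(.*?)\\1\\s*\\)");

    // Returns the contents of every url(...) found in the given CSS text.
    static List<String> extractUrls(String css) {
        List<String> urls = new ArrayList<>();
        Matcher m = URL_PATTERN.matcher(css);
        while (m.find()) {
            urls.add(m.group(2));
        }
        return urls;
    }

    public static void main(String[] args) {
        // Made-up sample covering single quotes, double quotes, and no quotes.
        String css = "body { background: url( 'img/bg.png' ); }\n"
                   + ".logo { background-image: url(\"logo.gif\") }\n"
                   + ".icon { background: url(icons/star.svg) no-repeat; }";
        for (String url : extractUrls(css)) {
            System.out.println(url);
        }
    }
}
```

Running main on the sample prints img/bg.png, logo.gif and icons/star.svg, one per line. Keep in mind that this matches every url(...), including those in @import rules or inside CSS comments, and that the URLs may be relative to the location of the CSS file, so you will probably still need to filter and resolve them afterwards.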

answered Oct 02 '22 12:10 by Ludovic Kuty