I am trying to extract all the image (.jpg, .png, .gif) URIs from CSS files.
Sample CSS:
.blockpricecont{width:660px;height:75px;background:url('../images/postBack.jpg') repeat-x;/*background:url('../images/tabdata.jpg') repeat-x;*/border: 1px solid #B7B7B7;
Regex used:
images = re.compile("(?:\()(?:'|\")?(.*\.jpg('?))", flags=re.IGNORECASE)
The problem is that a few CSS classes contain commented-out code (/* ---- */), and these comments contain .jpg references. This is what I get from the above regex.
Output:
["../images/postBack.jpg') repeat-x;/*background:url('../images/tabdata.jpg'"]
Expected output:
["../images/postBack.jpg"]
I want my regex to stop at the first match of .jpg, but it continues to the end of the line.
Thanks in advance.
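print(re.findall(r'url\(([^)]+)\)', target_text))
I think that should work, although each match will still include the surrounding quotes, and url(...) entries inside comments are matched too.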
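If you also want to skip the commented-out rules and drop the quotes, one possible approach is to strip the /* ... */ comments first and then match only inside url(...). The following is a rough sketch, not a full CSS parser: the names css, COMMENT_RE, IMAGE_URL_RE and extract_image_urls are made up for illustration, and it assumes the URLs end in .jpg, .png or .gif with no query string after the extension.

```python
import re

# Sample from the question, joined onto a single line.
css = (".blockpricecont{width:660px;height:75px;"
       "background:url('../images/postBack.jpg') repeat-x;"
       "/*background:url('../images/tabdata.jpg') repeat-x;*/"
       "border: 1px solid #B7B7B7;")

# Non-greedy .*? stops at the first closing */ instead of the last one.
# DOTALL lets a comment span several lines.
COMMENT_RE = re.compile(r"/\*.*?\*/", flags=re.DOTALL)

# The character class [^'\")] cannot cross a quote or a closing parenthesis,
# so a match can never run past the end of its own url(...).
IMAGE_URL_RE = re.compile(
    r"url\(\s*['\"]?([^'\")]+?\.(?:jpg|png|gif))['\"]?\s*\)",
    flags=re.IGNORECASE,
)


def extract_image_urls(css_text):
    """Return image URLs found in url(...) outside of CSS comments."""
    without_comments = COMMENT_RE.sub("", css_text)
    return IMAGE_URL_RE.findall(without_comments)


print(extract_image_urls(css))
```

The original pattern overshoots because .* is greedy and grabs as much of the line as it can before the last .jpg. Restricting the characters the group may contain (or using .*?) keeps each match inside a single url(...).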