python regular expression to match js or php url

Question

I tried to match JS and PHP URLs with Python's re module, but the expression below doesn't work. Can anyone help me?

import re, urllib2
response = urllib2.urlopen('https://www.cnn.com')
s = response.read()
p = re.compile(r'^(http|https|//).+?\.(js|php)$')
m = p.findall(s)

for i in m:
    print i

Also, some Web pages use //, not http or https. Is there any way to match those, too?

Wiktor Stribiżew · Accepted Answer

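You seem to want to match URLs that end with the file extension js or php and that may start with http, https or //. Your pattern finds nothing because ^ and $ anchor it to the start and end of the whole string, so it cannot match URLs embedded in the page source.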

Use

import re
s = "https://www.cnn.com/1.js!! http://www.cnn.com/2.php; //some.site.com/3.js,"
res = re.findall(r'(?:\bhttps?:)?//\S*\.(?:js|php)\b', s)
print(res)

This prints ['https://www.cnn.com/1.js', 'http://www.cnn.com/2.php', '//some.site.com/3.js'].
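
To see why the pattern from the question returns nothing, here is a small sketch. The html string is a made-up stand-in for page source (it is not taken from cnn.com), and the names original, grouped and fixed are only for illustration. It also shows a second pitfall: with capturing groups, re.findall returns the groups rather than the whole match, which is why the pattern above uses (?:...).

```python
import re

# Hypothetical page source, used only for illustration.
html = ('<script src="https://example.com/app.js"></script>\n'
        '<a href="//example.com/index.php">x</a>')

# Pattern from the question: ^ and $ tie it to the whole string,
# and the string starts with "<", so nothing matches.
original = re.findall(r'^(http|https|//).+?\.(js|php)$', html)
print(original)  # []

# Anchors removed but capturing groups kept: findall returns tuples of groups.
grouped = re.findall(r'(https?:)?//\S*\.(js|php)\b', html)
print(grouped)   # [('https:', 'js'), ('', 'php')]

# Non-capturing groups: findall returns the full matched URLs.
fixed = re.findall(r'(?:\bhttps?:)?//\S*\.(?:js|php)\b', html)
print(fixed)     # ['https://example.com/app.js', '//example.com/index.php']
```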

Details:

- (?:\bhttps?:)? - an optional non-capturing group consisting of
  - \b - a leading word boundary
  - https?: - http, an optional s (1 or 0 occurrences), and a :
- // - a literal char sequence //
- \S* - zero or more non-whitespace characters
- \. - a literal dot
- (?:js|php) - a non-capturing group matching the literal char sequence js or php
- \b - a trailing word boundary
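
The breakdown above can also be kept next to the pattern itself with re.VERBOSE, which ignores whitespace inside the pattern and allows # comments. This is only a sketch of the same regex in a different layout; the names URL_RE and urls are made up for the example.

```python
import re

# Same pattern as in the answer, laid out with re.VERBOSE so each part is commented.
URL_RE = re.compile(r"""
    (?:\bhttps?:)?   # optional scheme: word boundary, http or https, then a colon
    //               # literal //
    \S*              # zero or more non-whitespace characters (greedy)
    \.               # literal dot
    (?:js|php)       # js or php
    \b               # trailing word boundary
""", re.VERBOSE)

s = "https://www.cnn.com/1.js!! http://www.cnn.com/2.php; //some.site.com/3.js,"
urls = URL_RE.findall(s)
print(urls)  # ['https://www.cnn.com/1.js', 'http://www.cnn.com/2.php', '//some.site.com/3.js']
```

One thing to keep in mind: \S* is greedy, so if two URLs sit in the same run of text with no whitespace between them, they will come back as one long match. Switching to the lazy \S*? may help in that case.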

Tags:

python

regex

Jerry
