While testing on http://gskinner.com/RegExr/ (an online regex tester), the regex [jpg|bmp]
returns results when either jpg or bmp exists. However, when I run this regex in Python, it only returns j or b. How do I make the regex match the whole word "jpg" or "bmp" inside the set? This may have been asked before, but I was not sure how to phrase the question to find the answer. Thanks!
Here is the whole regex, if it helps:
"http://www\S*(?i)\\.(jpg|bmp|png|gif|img|jng|jpeg|jpe|gif|giff)"
It basically just looks for pictures in a URL.
[] denotes a character class. () denotes a capturing group. [a-z0-9] matches one character that is in the range a-z OR 0-9. (a-z0-9) captures the literal text "a-z0-9".
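To make the difference concrete, here is a minimal Python sketch using the standard `re` module. The sample string is made up for illustration.

```python
import re

text = "x9 a-z0-9"  # hypothetical sample input

# Character class: each match is ONE character from the ranges a-z or 0-9.
class_matches = re.findall(r"[a-z0-9]", text)

# Capturing group: the parentheses capture the literal text "a-z0-9".
group_match = re.search(r"(a-z0-9)", text)

print(class_matches)         # ['x', '9', 'a', 'z', '0', '9']
print(group_match.group(1))  # a-z0-9
```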
To match a character that has a special meaning in regex, prefix it with a backslash ( \ ) as an escape sequence. E.g., \. matches "." ; \+ matches "+" ; and \( matches "(" . To match "\" (backslash) itself, use \\ .
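A short Python sketch of the escaping rules, assuming raw strings (r"...") so that Python itself does not consume the backslashes. The sample inputs are made up; `re.escape` is the standard-library helper for escaping a literal string automatically.

```python
import re

# An unescaped "." matches any character (except a newline);
# "\." matches only a literal dot.
unescaped = re.findall(r"a.c", "abc a.c")
escaped = re.findall(r"a\.c", "abc a.c")

# "\+" matches a literal plus sign, "\\" a literal backslash.
plus = re.search(r"1\+1", "1+1=2")
backslash = re.search(r"\\", r"C:\temp")

# re.escape builds an escaped pattern from a literal string.
literal_dot = re.compile(re.escape("a.c"))

print(unescaped)  # ['abc', 'a.c']
print(escaped)    # ['a.c']
```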
\s matches whitespace characters, which include \t , \n , \r , and the space character. \S matches non-whitespace characters.
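As a quick illustration in Python (the sample text is invented), \S+ picks out runs of non-whitespace while \s picks out the individual whitespace characters between them:

```python
import re

text = "one\ttwo\nthree four"  # hypothetical sample with a tab, a newline and a space

tokens = re.findall(r"\S+", text)  # runs of non-whitespace characters
gaps = re.findall(r"\s", text)     # the single whitespace characters

print(tokens)  # ['one', 'two', 'three', 'four']
print(gaps)    # ['\t', '\n', ' ']
```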
Use (jpg|bmp)
instead of square brackets.
Square brackets mean: match a single character from the set inside the brackets.
Edit: you might want something like this: [^ ].*?(jpg|bmp)
or, to require the dot before the extension, [^ ].*?\.(jpg|bmp)
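Here is a rough Python sketch of the group-based approach. It is a variant, not the exact pattern above: it assumes the URLs start with http://www as in the question, and the example.com URLs are placeholders. One thing worth knowing is that `re.findall` returns only the captured group when the pattern contains one, so a non-capturing group (?:...) is used to get the whole URL back.

```python
import re

# Hypothetical sample text; the URLs are placeholders.
text = "see http://www.example.com/cat.jpg and http://www.example.com/dog.bmp"

# With a capturing group, findall returns only what the group captured.
extensions = re.findall(r"http://www\S*?\.(jpg|bmp)", text)

# With a non-capturing group (?:...), findall returns the whole match.
urls = re.findall(r"http://www\S*?\.(?:jpg|bmp)", text)

print(extensions)  # ['jpg', 'bmp']
print(urls)        # ['http://www.example.com/cat.jpg', 'http://www.example.com/dog.bmp']
```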
When you use [] you are creating a character class that contains all the characters between the brackets.
So you are not matching jpg or bmp; you are matching a single character that is either a j
or a p
or a g
or a |
...
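This is easy to check in Python. In the small sketch below (sample strings invented), [jpg|bmp] behaves exactly like the set [bgjmp|]: every match is a single character, and the | is just another member of the set.

```python
import re

# Each character of the input is matched on its own, including the "|".
singles = re.findall(r"[jpg|bmp]", "jpg|bmp")

# The same set written without the duplicate "p" gives identical results.
same_set = re.findall(r"[bgjmp|]", "jpg|bmp")

# Any string made of those characters matches piecewise; "a" is not in the set.
from_map = re.findall(r"[jpg|bmp]", "map")

print(singles)   # ['j', 'p', 'g', '|', 'b', 'm', 'p']
print(from_map)  # ['m', 'p']
```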
You should add an anchor for the end of the string ( $ ) to your regex, to ensure that it checks for the file extension at the very end of the string:
http://www\S*(?i)\\.(jpg|bmp|png|gif|img|jng|jpeg|jpe|gif|giff)$
If you need double escaping, then apply it everywhere in your pattern:
http://www\\S*(?i)\\.(jpg|bmp|png|gif|img|jng|jpeg|jpe|gif|giff)$
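A minimal Python sketch of the anchored check follows. It makes a few assumptions that are not in the original pattern: the extension list is shortened, the helper name is_image_url and the example.com URLs are made up, and case-insensitivity is passed as re.IGNORECASE rather than an inline (?i) in the middle of the pattern, because Python 3.11 and later reject global inline flags that are not at the start of the expression. A raw string is used so single backslashes are enough.

```python
import re

# Shortened extension list for illustration; "$" anchors the extension
# to the end of the string.
IMAGE_URL = re.compile(r"http://www\S*\.(?:jpg|jpeg|bmp|png|gif)$", re.IGNORECASE)

def is_image_url(url):
    """Return True if url ends in one of the listed image extensions."""
    return IMAGE_URL.search(url) is not None

print(is_image_url("http://www.example.com/photo.JPG"))       # True
print(is_image_url("http://www.example.com/photo.jpg.html"))  # False
```

Without the $ anchor, the second URL would also match, since ".jpg" appears in the middle of it.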