How do I rewrite this new way of recognising addresses so that it works in Python? <code>\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))</code>

Gruber’s URL Regular Expression in Python

2 Answers

The original source for that states "This pattern should work in most modern regex implementations" and specifically Perl. Python's regex implementation is modern and similar to Perl's but is missing the [:punct:] character class. You can easily build that using this:

>>> import string, re
>>> pat = r'\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^%s\s]|/)))'
>>> pat = pat % re.sub(r'([-\\\]])', r'\\\1', string.punctuation)

The re.sub() call escapes the characters that would otherwise be special inside the character set (-, \ and ]).
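To see what that escaping step produces, here is a minimal sketch run as a plain script rather than at the prompt. The names `escaped` and `punct_class` are only illustrative and are not part of the original answer.

```python
import re
import string

# Put a backslash in front of '-', '\' and ']', the characters that
# would otherwise be special inside a [...] character set.
escaped = re.sub(r'([-\\\]])', r'\\\1', string.punctuation)
print(escaped)

# Sanity check: a set built from the escaped string matches every
# ASCII punctuation character on its own.
punct_class = re.compile('[%s]' % escaped)
assert all(punct_class.fullmatch(ch) for ch in string.punctuation)
```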

Edit: Using re.escape() works just as well, since it simply puts a backslash in front of every character that could be special in a pattern. That felt crude to me at first, but it certainly works fine for this case.

>>> pat = pat % re.escape(string.punctuation)
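Putting the pieces together, a sketch of how the finished pattern might be used could look like the following. It takes the re.escape() route as an alternative to the re.sub() line (the substitution should only be applied once), and the sample text and the names `template`, `url_re` and `urls` are made up for illustration.

```python
import re
import string

# Gruber's pattern, with %s standing in for the missing [:punct:] class.
template = r'\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^%s\s]|/)))'
url_re = re.compile(template % re.escape(string.punctuation))

# Hypothetical sample text, invented for this sketch.
text = 'See http://example.com/page, or www.example.org/wiki/Foo_(bar).'

# group(0) is the whole address; the inner groups are the pattern's
# own sub-captures and are ignored here.
urls = [m.group(0) for m in url_re.finditer(text)]
print(urls)
```

The trailing comma and full stop are left out of the matches because the last character of an address must be either a balanced parenthesised group, a slash, or a non-punctuation character.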

answered Oct 06 '22 00:10

Peter Hansen

I don't think Python has this expression:

[:punct:]

Wikipedia says [:punct:] is the same as

[-!"#$%&'()*+,./:;<=>?@\[\\\]^_`{|}~]
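A hand-written class like this is easy to get subtly wrong, so it may be worth checking it against string.punctuation. The sketch below assumes the class is written as a raw string in which [, \ and ] are each escaped exactly once. The names `PUNCT_CLASS`, `punct_re` and `matched` are only illustrative.

```python
import re
import string

# ASCII punctuation as a raw-string character class; '[', '\' and ']'
# are escaped, and the leading '-' is taken literally.
PUNCT_CLASS = r'''[-!"#$%&'()*+,./:;<=>?@\[\\\]^_`{|}~]'''

punct_re = re.compile(PUNCT_CLASS)

# Collect every ASCII character the class matches and compare the
# result with Python's own list of punctuation characters.
matched = {chr(i) for i in range(128) if punct_re.fullmatch(chr(i))}
assert matched == set(string.punctuation)
```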

answered Oct 05 '22 23:10

YOU

Tags:

python

regex

gruber

Tobias
