Word boundary to use in unicode text for Python regex

Tags:

python

regex

unicode

I want to use a word boundary in a regex to match some Unicode text. Python's regex treats non-ASCII letters as word boundaries, as shown here:

>>> re.search(r"\by\b","üyü")
<_sre.SRE_Match object at 0x02819E58>

>>> re.search(r"\by\b","ğyğ")
<_sre.SRE_Match object at 0x028250C8>

>>> re.search(r"\by\b","uyu")
>>>

How can I make the word boundary symbol (\b) stop matching next to Unicode letters?

861

asked Oct 15 '13 07:10

Mert Nuhoglu

1 Answer

Pass the re.UNICODE flag, so that \w and \b follow Unicode character properties instead of ASCII only:

>>> re.search(r"\by\b","üyü", re.UNICODE)
>>>
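
The sessions above look like Python 2, where patterns are ASCII-only unless re.UNICODE is given. As a minimal sketch for Python 3 (an assumption, since the question does not state the version): str patterns are Unicode-aware by default, so no flag is needed, and re.ASCII brings back the behaviour described in the question. The variable names below are only illustrative.

```python
import re

pattern = r"\by\b"

# Python 3: str patterns are Unicode-aware by default, so "ü" and "ğ" count
# as word characters and there is no boundary around the inner "y".
unicode_hits = [re.search(pattern, text) for text in ("üyü", "ğyğ", "uyu")]

# re.ASCII limits \w (and therefore \b) to [a-zA-Z0-9_], which reproduces
# the unwanted match from the question.
ascii_hit = re.search(pattern, "üyü", re.ASCII)

# A standalone "y" surrounded by spaces is still found in Unicode mode.
standalone = re.search(pattern, "ü y ğ")

print(unicode_hits)
print(ascii_hit)
print(standalone)
```

In Python 2 it is probably safer to also use unicode literals (u"üyü") for both pattern and text, so that the result does not depend on how the console encodes the bytes.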

132

answered Nov 14 '22 22:11

Michael Brennan
