Raw unicode literal that is valid in Python 2 and Python 3?

Apparently the ur"" syntax was removed in Python 3. However, I need it! "Why?", you may ask. Well, I need the u prefix because the string is unicode and my code has to work on Python 2. As for the r prefix, maybe it's not essential, but the markup format I'm using requires a lot of backslashes, and raw strings would help avoid mistakes.

Here is an example that does what I want in Python 2 but is illegal in Python 3:

tamil_letter_ma = u"\u0bae"
marked_text = ur"\a%s\bthe Tamil\cletter\dMa\e" % tamil_letter_ma

After coming across this problem, I found http://bugs.python.org/issue15096 and noticed this quote:

It's easy to overcome the limitation.

Would anyone care to offer an idea about how?

Related: What exactly do "u" and "r" string flags do in Python, and what are raw string literals?

Jim K asked Oct 08 '15 22:10


1 Answer

Why not just use a raw string literal (r'....')? You don't need the u prefix, because in Python 3 all strings are unicode strings.

>>> tamil_letter_ma = "\u0bae"
>>> marked_text = r"\a%s\bthe Tamil\cletter\dMa\e" % tamil_letter_ma
>>> marked_text
'\\aம\\bthe Tamil\\cletter\\dMa\\e'

To make it work in Python 2.x as well, add the following __future__ import at the very beginning of your source file, so that all string literals in that file become unicode strings.

from __future__ import unicode_literals
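
Putting the two pieces together, a minimal sketch of a module that should behave the same on Python 2.7 and Python 3 might look like this. The two assertions at the end are my own additions for checking the result, not part of the original question.

```python
# -*- coding: utf-8 -*-
# The __future__ import must come before any other statement in the module.
from __future__ import unicode_literals

# Plain literal: escapes are processed, so this is the single character U+0BAE.
tamil_letter_ma = "\u0bae"

# Raw literal: backslashes are kept as-is. On Python 2 this is a unicode
# string because of unicode_literals; on Python 3 it is a str either way.
marked_text = r"\a%s\bthe Tamil\cletter\dMa\e" % tamil_letter_ma

# Illustrative checks (assumption: u"" is accepted, i.e. Python 2 or 3.3+).
assert type(marked_text) is type(u"")
assert marked_text.startswith("\\a" + tamil_letter_ma + "\\b")
```

One caveat, as far as I know: on Python 2 a raw literal under unicode_literals behaves like the old ur"" form, so \uXXXX sequences inside it are still interpreted there, whereas Python 3 leaves them alone. If the markup ever needs a literal backslash followed by u, the two versions will disagree, and a non-raw literal with doubled backslashes is the safer choice for that string.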
falsetru answered Sep 21 '22 16:09