In the upcoming Cython 3.0 version, the 3str language_level (which was introduced with Cython 0.29) becomes the new default instead of the current default 2, i.e. if language_level is not set (how to set), we get the following warning:
FutureWarning: Cython directive 'language_level' not set, using '3str' for now (Py3). This has changed from earlier releases! File: /home/ed/mygithub/cython/foo.pyx
  tree = Parsing.p_module(s, pxd, full_module_name)
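One common way to set the directive is a special comment at the top of the pyx-file, e.g. "# cython: language_level=3str". As a rough sketch, the helper below scans the leading comment lines of a source string for that directive, which can help spot files that would trigger the warning. The name find_language_level and the regular expression are illustrative assumptions, not Cython's own parser, and the sketch ignores directives passed on the command line or from a build script.

```python
import re
from typing import Optional

# Matches "language_level=<value>" inside a "# cython: ..." header comment.
# Simplified on purpose: quoted values are not handled.
_DIRECTIVE = re.compile(
    r"^#\s*cython\s*:.*?\blanguage_level\s*=\s*([0-9a-zA-Z]+)"
)


def find_language_level(source: str) -> Optional[str]:
    """Return the language_level set in the header comments, or None."""
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines may precede the first statement
        if not stripped.startswith("#"):
            break  # directive comments only count before any code
        match = _DIRECTIVE.match(stripped)
        if match:
            return match.group(1)
    return None


with_directive = "# cython: language_level=3str\ndef test():\n    return type('aaa')\n"
without_directive = "def test():\n    return type('aaa')\n"

print(find_language_level(with_directive))
print(find_language_level(without_directive))
```

A result of None means the file itself does not pin a language level, so the default of the installed Cython version applies unless the build sets it elsewhere.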
But what are the differences between the 3str and 3 language levels, and for which code will there be differences in the behavior of modules compiled with the 3str and 3 language levels?
language_level is used to indicate in which Python version the pyx-file is written. Thus for language_level=3 the resulting behavior of the pyx-code is as if it were executed in Python3, even when the resulting extension is run with Python2 (see a more detailed explanation here).
Language level 3str means "Python3 semantics, but with str literals (also in Python2.7)", thus the str in the name. What exactly are the consequences?
Python3: When built with/for Python3, there are no differences between level 3 and level 3str.
In Python3, str is unicode, so the type returned by

# foo.pyx
def test():
    return type("aaa")

will stay the same (str) for language_level=3 and language_level=3str.
Python2: The situation is different when built with/for Python2. With language_level=3 the result of the above test function will be unicode, and with language_level=3str the result will be str (which is bytes in Python2). But also for Python2, in all other cases, 3 and 3str have the same behavior.
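The two cases above can be summarised as a small lookup. This is only a pure-Python model of the rule, not anything Cython provides: the function literal_type and its parameters are made up for illustration. It also includes explicitly prefixed literals (b"..." and u"..."), on the assumption that their type does not depend on the language level.

```python
def literal_type(prefix: str, language_level: str, py_major: int) -> str:
    """Name of the type a string literal gets in the compiled module.

    prefix: "" for an unprefixed literal, "b" or "u" for explicit prefixes.
    language_level: "3" or "3str" (the only levels this sketch models).
    py_major: major version of the Python the extension runs under.
    """
    if language_level not in ("3", "3str"):
        raise ValueError("this sketch only models levels '3' and '3str'")
    if py_major not in (2, 3):
        raise ValueError("py_major must be 2 or 3")
    if prefix == "b":
        # Byte strings: called bytes in Python3, str in Python2.
        return "bytes" if py_major == 3 else "str"
    if prefix == "u":
        # Unicode strings: called str in Python3, unicode in Python2.
        return "str" if py_major == 3 else "unicode"
    if prefix != "":
        raise ValueError("unknown prefix: %r" % prefix)
    # Unprefixed literal: the only case where the two levels disagree,
    # and only when running under Python2.
    if py_major == 3:
        return "str"
    return "unicode" if language_level == "3" else "str"


for py_major in (3, 2):
    for level in ("3", "3str"):
        print(py_major, level, literal_type("", level, py_major))
```

In this model, code that only uses prefixed literals behaves the same under both levels, so the choice between 3 and 3str matters only for unprefixed literals in extensions run with Python2.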
It would be a mistake to think that
cdef char *c_string = "some string"
would fail to build with language_level=3 (and build successfully with 3str for Python2, where "some string" would be bytes) on the grounds that "some string" is unicode and unicode literals can only be coerced to Py_UNICODE*. The literal on the right-hand side isn't a Python object to begin with, but just a C string in the generated C code.
TLDR: 3str does not assume that unprefixed string literals are unicode under Python2.x, which makes migration from Python2.x to Python3 easier.
Not a complete answer, because I don't know of code that highlights the differences, and this still leaves room for questions, but this may be useful. From "What's new in Cython 0.29":
A new language-level
Cython 0.29 supports a new setting for the language_level directive, language_level=3str, which will become the new default language level in Cython 3.0. We already added it now, so that users can opt in and benefit from it right away, and already prepare their code for the coming change. It's an "in between" kind of setting, which enables all the nice Python 3 goodies that are not syntax compatible with Python 2.x, but without requiring all unprefixed string literals to become Unicode strings when the compiled code runs in Python 2.x. This was one of the biggest problems in the general Py3 migration. And in the context of Cython's integration with C code, it got in the way of our users even a bit more than it would in Python code. Our goals are to make it easy for new users who come from Python 3 to compile their code with Cython and to allow existing (Cython/Python 2) code bases to make use of the benefits before they can make a 100% switch.
Also noted by Debian's manpage for cython:
--embed[=<method_name>]
    Generate a main() function that embeds the Python interpreter.
-2
    Compile based on Python-2 syntax and code semantics.
-3
    Compile based on Python-3 syntax and code semantics.
--3str
    Compile based on Python-3 syntax and code semantics without assuming unicode by default for string literals under Python 2.
Lastly, as noted by the Cython docs:
The 3str option enables Python 3 semantics but does not change the str type and unprefixed string literals to unicode when the compiled code runs in Python 2.x.