
What are differences between Cython's language_level 3 and 3str?

Tags: python, cython

In the upcoming Cython 3.0 version, the 3str language_level (introduced in Cython 0.29) becomes the new default instead of the current default 2, i.e. if language_level is not set, we get the following warning:

FutureWarning: Cython directive 'language_level' not set, using '3str' for now (Py3). This has changed from earlier releases! File: /home/ed/mygithub/cython/foo.pyx
  tree = Parsing.p_module(s, pxd, full_module_name)
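For context, the warning goes away once language_level is set explicitly. The sketch below shows two common ways to do that: a header comment at the top of the pyx-file, and a compiler directive passed to cythonize() from a setup.py. The helper names directive_header and cythonize_with_level are hypothetical and exist only for this illustration; the example assumes "2", "3" and "3str" are the accepted values.

```python
# Sketch only: two ways to pin language_level explicitly.
#
# 1) Header comment, placed at the top of the .pyx file before any code:
#        # cython: language_level=3str
# 2) Compiler directive passed to cythonize() in setup.py (see below).

DIRECTIVES = {"language_level": "3str"}  # assumed valid values: "2", "3", "3str"


def directive_header(level):
    """Hypothetical helper: build the header comment for a .pyx file."""
    if level not in ("2", "3", "3str"):
        raise ValueError("unknown language_level: %r" % (level,))
    return "# cython: language_level=%s" % level


def cythonize_with_level(sources, directives=DIRECTIVES):
    """Hypothetical helper: compile `sources` with an explicit language_level."""
    # Imported lazily so this sketch can be read and loaded without Cython.
    from Cython.Build import cythonize
    return cythonize(list(sources), compiler_directives=dict(directives))


print(directive_header("3str"))
```

In a setup.py, the result of cythonize_with_level(["foo.pyx"]) would be passed as ext_modules to setup().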

But what are the differences between 3str and 3 language levels and for which code will there be differences in the behavior of modules compiled with 3str and 3 language levels?

asked Aug 08 '19 14:08 by ead


2 Answers

language_level indicates which Python version the pyx-file is written for. Thus, for language_level=3 the pyx-code behaves as if it were executed in Python3, even when the resulting extension is run with Python2.

Language level 3str means "Python3 semantics, but with str literals (also in Python2.7)" - hence the str in the name. What exactly are the consequences?

Python3: When built in/for Python3 there are no differences between level 3 and level 3str.

In Python3, str is unicode, so the type of

# foo.pyx
def test():
   return type("aaa")

will stay the same (str) for language_level=3 and language_level=3str.

Python2: The situation is different when built with/for Python2. With language_level=3 the result of the above test-function will be unicode, and with language_level=3str the result will be str (which is bytes in Python2). In all other respects, 3 and 3str behave identically under Python2 as well.
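The two cases above can be summarized as a small lookup. This is only a pure-Python model of the behavior described in this answer, not Cython code or a Cython API; the function name literal_type is made up for the illustration.

```python
def literal_type(language_level, python_major):
    """Model: type name that type("aaa") reports for the compiled test().

    language_level: "3" or "3str"
    python_major:   2 or 3, the Python the extension is built for / run with
    """
    if language_level not in ("3", "3str"):
        raise ValueError("this sketch only covers '3' and '3str'")
    if python_major >= 3:
        # In Python3, str is unicode, so both levels agree.
        return "str"
    # Python2: level 3 turns unprefixed literals into unicode,
    # level 3str keeps them as str (which is bytes in Python2).
    return "unicode" if language_level == "3" else "str"


for level in ("3", "3str"):
    for major in (2, 3):
        print("language_level=%-4s Python%d -> %s"
              % (level, major, literal_type(level, major)))
```

The only row in which the two levels differ is Python2.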


It would be a mistake to think that

cdef char *c_string = "some string"

would fail to build with language_level=3 (and would build successfully with 3str for Python2, where "some string" would be bytes) on the grounds that "some string" is unicode and unicode literals can only be coerced to Py_UNICODE*.

The literal on the right-hand side isn't a Python-object to begin with, but just a C-string in the generated C-code.

answered Sep 28 '22 08:09 by ead


TLDR: 3str does not assume that string literals are unicode under Python2.x, which makes migration from Python2.x to Python3 easier.

This is not a complete answer, because I don't have code that highlights the differences and it still leaves room for questions, but the following from "What's new in Cython 0.29" may be useful:

A new language-level

Cython 0.29 supports a new setting for the language_level directive, language_level=3str, which will become the new default language level in Cython 3.0. We already added it now, so that users can opt in and benefit from it right away, and already prepare their code for the coming change. It's an "in between" kind of setting, which enables all the nice Python 3 goodies that are not syntax compatible with Python 2.x, but without requiring all unprefixed string literals to become Unicode strings when the compiled code runs in Python 2.x. This was one of the biggest problems in the general Py3 migration. And in the context of Cython's integration with C code, it got in the way of our users even a bit more than it would in Python code. Our goals are to make it easy for new users who come from Python 3 to compile their code with Cython and to allow existing (Cython/Python 2) code bases to make use of the benefits before they can make a 100% switch.

Also noted by Debian's manpage for cython:

--embed[=<method_name>] Generate a main() function that embeds the Python interpreter.
-2 Compile based on Python-2 syntax and code semantics.
-3 Compile based on Python-3 syntax and code semantics.
--3str Compile based on Python-3 syntax and code semantics without assuming unicode by default for string literals under Python 2.
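As a rough illustration of the flags quoted above, the following sketch maps each language level to its command-line switch and assembles an argument list for the cython executable. The names LEVEL_FLAGS and cython_command are hypothetical, and the sketch assumes a cython command is available on the PATH when the list is actually run.

```python
# Flags as listed in the manpage excerpt above.
LEVEL_FLAGS = {"2": "-2", "3": "-3", "3str": "--3str"}


def cython_command(pyx_path, language_level="3str"):
    """Hypothetical helper: build the argv list for one pyx-file."""
    if language_level not in LEVEL_FLAGS:
        raise ValueError("unknown language_level: %r" % (language_level,))
    return ["cython", LEVEL_FLAGS[language_level], pyx_path]


print(" ".join(cython_command("foo.pyx")))
```

The returned list could be handed to subprocess.run() to perform the compilation.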

Lastly noted by cython docs:

The 3str option enables Python 3 semantics but does not change the str type and unprefixed string literals to unicode when the compiled code runs in Python 2.x.

answered Sep 28 '22 09:09 by Error - Syntactical Remorse