
What does "import" prefer - .pyd (.so) or .py?

I have two files in the same directory, a compiled extension module and a source file:

.
├── a.py
└── a.pyd

It looks like import a actually imports the a.pyd module, but I can't find any official documentation that guarantees this.

Does anyone know the import order for the different file types?

The same question applies to Python extension modules on Unix (.so).

黃瀚嶙 asked Nov 04 '19 02:11


2 Answers

In a typical Python installation, the ExtensionFileLoader class has precedence over the SourceFileLoader that is used for .py files. It's the ExtensionFileLoader which handles imports of .pyd files, and on a Windows machine you will find .pyd registered in importlib.machinery.EXTENSION_SUFFIXES (note: on Linux/macOS it will have .so in there instead).
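As a quick check of the suffix lists mentioned above, a minimal sketch that prints what the running interpreter registers for each loader. The variable names are only illustrative, and the exact suffixes depend on the platform and Python build.

```python
import importlib.machinery as machinery

# Suffixes handled by ExtensionFileLoader on this interpreter
# (".pyd" variants on Windows, ".so" variants on Linux/macOS).
extension_suffixes = list(machinery.EXTENSION_SUFFIXES)

# Suffixes handled by SourceFileLoader (normally just ".py").
source_suffixes = list(machinery.SOURCE_SUFFIXES)

print("extension:", extension_suffixes)
print("source:   ", source_suffixes)
```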

So in the case of a name collision within the same directory (which means a "tie" when looking through sys.path in order), the a.pyd file takes precedence over the a.py file. You can verify this by creating empty a.pyd and a.py files: the statement import a then attempts the DLL load (and fails, of course).
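The experiment above can also be sketched without triggering the failing DLL load, by asking the path finder which loader it would pick. This assumes a standard CPython build with at least one extension suffix; the module name precedence_demo is hypothetical, and the files are created in a temporary directory rather than on sys.path.

```python
import importlib
import importlib.machinery as machinery
import pathlib
import tempfile

MODULE_NAME = "precedence_demo"  # hypothetical name, used only for this demo

with tempfile.TemporaryDirectory() as tmp:
    directory = pathlib.Path(tmp)
    # Two empty files with the same stem: a source file and an "extension".
    (directory / f"{MODULE_NAME}.py").touch()
    (directory / f"{MODULE_NAME}{machinery.EXTENSION_SUFFIXES[0]}").touch()

    importlib.invalidate_caches()
    # Only locate the module; nothing is loaded or executed here.
    spec = machinery.PathFinder.find_spec(MODULE_NAME, [str(directory)])
    loader_name = type(spec.loader).__name__
    origin = spec.origin

print(loader_name, origin)
```

Because only the spec is computed, the empty extension file is never actually loaded.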

To see the precedence in the CPython sources, look at importlib._bootstrap_external._get_supported_file_loaders:

def _get_supported_file_loaders():
    """Returns a list of file-based module loaders.
    Each item is a tuple (loader, suffixes).
    """
    extensions = ExtensionFileLoader, _imp.extension_suffixes()
    source = SourceFileLoader, SOURCE_SUFFIXES
    bytecode = SourcelessFileLoader, BYTECODE_SUFFIXES
    return [extensions, source, bytecode]  # <-- extensions before source!
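The same order can be inspected at runtime. This is only a sketch: _get_supported_file_loaders is a private CPython helper, so it may change or disappear between versions and should not be relied on in real code.

```python
import importlib._bootstrap_external as bootstrap_external  # private module

# Private helper: returns (loader, suffixes) tuples in search order.
loader_details = bootstrap_external._get_supported_file_loaders()

# Keep just the loader class names, in the order they are tried.
loader_order = [loader.__name__ for loader, _suffixes in loader_details]
print(loader_order)
```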

For a doc reference, see http://www.python.org/doc/essays/packages/

What If I Have a Module and a Package With The Same Name?

You may have a directory (on sys.path) which has both a module spam.py and a subdirectory spam that contains an __init__.py (without the __init__.py, a directory is not recognized as a package). In this case, the subdirectory has precedence, and importing spam will ignore the spam.py file, loading the package spam instead. If you want the module spam.py to have precedence, it must be placed in a directory that comes earlier in sys.path.

(Tip: the search order is determined by the list of suffixes returned by the function imp.get_suffixes(). Usually the suffixes are searched in the following order: ".so", "module.so", ".py", ".pyc". Directories don't explicitly occur in this list, but precede all entries in it.)
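The package-over-module rule quoted above still appears to hold with the modern import system. A minimal sketch, assuming a temporary directory that holds both spam.py and a spam package, and using the path finder only to locate the module:

```python
import importlib
import importlib.machinery as machinery
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    directory = pathlib.Path(tmp)
    # A module and a package with the same name, side by side.
    (directory / "spam.py").touch()
    (directory / "spam").mkdir()
    (directory / "spam" / "__init__.py").touch()

    importlib.invalidate_caches()
    spam_spec = machinery.PathFinder.find_spec("spam", [str(directory)])
    # Packages have submodule_search_locations set; plain modules do not.
    is_package = spam_spec.submodule_search_locations is not None
    spam_origin = pathlib.Path(spam_spec.origin).name

print(is_package, spam_origin)
```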

This doc doesn't explicitly mention ".pyd", but that's the Windows equivalent of ".so". I've just tested on a Windows machine, and indeed '.pyd' appears before '.py' in the suffix list.

Note that the reference given above is very old! Since that essay was written, the import system has been completely revamped, and the underlying machinery has been exposed for user access: you can modify sys.meta_path to register your own finders and loaders or change precedence, for example. So it is now possible to arrange for '.py' to be preferred over '.pyd', and it doesn't matter much what imp.get_suffixes() has to say about anything (that function is deprecated, and the imp module was removed in Python 3.12). A default Python installation does not do that, of course, and the default precedence remains as the reference above describes.
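One way such a customization might look, as a sketch rather than a recommendation: instead of touching sys.meta_path, this builds an importlib.machinery.FileFinder with the loader details reordered so that source files are tried first. The module name precedence_demo is hypothetical, and a standard CPython build with at least one extension suffix is assumed.

```python
import importlib.machinery as machinery
import pathlib
import tempfile

MODULE_NAME = "precedence_demo"  # hypothetical name, used only for this demo

# Same loaders as the default, but with source files ahead of extensions.
source_first = (
    (machinery.SourceFileLoader, machinery.SOURCE_SUFFIXES),
    (machinery.ExtensionFileLoader, machinery.EXTENSION_SUFFIXES),
    (machinery.SourcelessFileLoader, machinery.BYTECODE_SUFFIXES),
)

with tempfile.TemporaryDirectory() as tmp:
    directory = pathlib.Path(tmp)
    (directory / f"{MODULE_NAME}.py").touch()
    (directory / f"{MODULE_NAME}{machinery.EXTENSION_SUFFIXES[0]}").touch()

    # A finder for this one directory that uses the reordered loaders.
    finder = machinery.FileFinder(str(directory), *source_first)
    custom_spec = finder.find_spec(MODULE_NAME)
    custom_loader_name = type(custom_spec.loader).__name__

print(custom_loader_name)
```

To apply this to real imports, one option would be to put FileFinder.path_hook(*source_first) at the front of sys.path_hooks and clear sys.path_importer_cache, but that changes import behaviour for the whole process and is easy to get wrong.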

wim answered Nov 15 '22 18:11


Thanks for wim's answer.

import importlib.util
print(importlib.util.find_spec('a'))

shows the result:

ModuleSpec(name='a', loader=<_frozen_importlib_external.ExtensionFileLoader object at 0x02A79EF0>, origin='a.pyd')

Although this doesn't show the ordering of .pyd versus .py,

at least I can tell which file is actually imported as the module.

黃瀚嶙 answered Nov 15 '22 17:11