I am doing some parsing and introspection of various modules, but I don't want to parse built-in modules. Now, there is no special type for built-in modules like there is a types.BuiltinFunctionType, so how do I do this?
>>> import CornedBeef
>>> CornedBeef
<module 'CornedBeef' from '/meatish/CornedBeef.pyc'>
>>> CornedBeef.__file__
'/meatish/CornedBeef.pyc'
>>> del CornedBeef.__file__
>>> CornedBeef
<module 'CornedBeef' (built-in)>
According to Python, a module is apparently built-in if it doesn't have a __file__ attribute. Does this mean that hasattr(SomeModule, '__file__') is the way to check if a module is built in? Surely, it isn't exactly common to del SomeModule.__file__, but is there a more solid way to determine if a module is built-in?
We can also use the inspect module to locate a module. inspect.getfile() takes the module object (not its name) as an argument and returns the path of the file it was loaded from. For a built-in module it raises TypeError instead of returning a path.
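As a rough sketch of how inspect.getfile() behaves for the two kinds of module, the helper below wraps the call and turns the TypeError into None. The name module_path_or_none is made up for this example. Note that any module without a __file__ (not only built-ins) ends up in the None branch.

```python
import inspect
import sys


def module_path_or_none(module):
    """Return the file a module was loaded from, or None if it has none."""
    try:
        return inspect.getfile(module)
    except TypeError:
        # inspect.getfile() raises TypeError for modules without __file__,
        # which is the case for built-in modules such as sys.
        return None


print(module_path_or_none(inspect))  # path to inspect.py
print(module_path_or_none(sys))      # None
```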
sys.builtin_module_names
A tuple of strings giving the names of all modules that are compiled into this Python interpreter. (This information is not available in any other way; sys.modules.keys() only lists the imported modules.)
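A minimal sketch of a check based on that tuple, assuming you have the module object in hand. The function name is_builtin_module is my own, not something from the standard library. Keep in mind that the tuple only covers modules compiled into the interpreter itself; extension modules loaded from shared libraries are not listed.

```python
import os
import sys
import types


def is_builtin_module(module: types.ModuleType) -> bool:
    """Return True if the module is compiled into the running interpreter."""
    # Look the name up in the interpreter's own table instead of relying on
    # the absence of __file__, which can be deleted by hand.
    return module.__name__ in sys.builtin_module_names


print(is_builtin_module(sys))  # sys is compiled into the interpreter
print(is_builtin_module(os))   # os is an ordinary source module
```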
If you consider it simply as asked, builtins, then the accepted answer is obviously correct.
In my case, I was looking for the standard library as well, by which I mean a list of all importable modules shipped with a given Python distribution. Questions about this have been asked several times but I couldn't find an answer that included everything I was looking for.
My use case was bucketing an arbitrary x in a Python import x statement as either:
This will work for virtualenvs or a global install. It queries the distribution of whatever python binary is running the script. The final chunk does reach out of a virtualenv, but I consider that the desired behavior.
# You may need to use setuptools.distutils depending on Python distribution (from setuptools import distutils).
# distutils was removed from the standard library in Python 3.12.
# Import the submodule explicitly: a bare "import distutils" does not guarantee distutils.sysconfig is loaded.
import distutils.sysconfig
import glob
import os
import pkgutil
import sys


def get_python_library():
    # Get the set of importable top-level source modules on sys.path.
    modules = {
        module
        for _, module, package in pkgutil.iter_modules()
        if package is False
    }

    # Glob all the 'top_level.txt' files installed under site-packages.
    site_packages = glob.iglob(os.path.join(
        os.path.dirname(os.__file__), 'site-packages', '*-info', 'top_level.txt'))

    # Read the files for the import names and remove them from the modules set.
    # A top_level.txt may list several names, one per line.
    for txt in site_packages:
        with open(txt) as f:
            modules -= {line.strip() for line in f if line.strip()}

    # Get the system packages.
    system_modules = set(sys.builtin_module_names)

    # Get just the top-level packages from the Python install.
    python_root = distutils.sysconfig.get_python_lib(standard_lib=True)
    _, top_level_libs, _ = next(os.walk(python_root))

    return sorted(top_level_libs + list(modules | system_modules))
Returns
A sorted list of imports: [..., 'imaplib', 'imghdr', 'imp', 'importlib', 'imputil', 'inspect', 'io', ...]
Explanation:
I broke it up into chunks so the reason each group is needed can be clear.

modules
The pkgutil.iter_modules call scans all importable top-level modules on sys.path and returns a generator of (module_loader, name, ispkg) tuples.

site_packages
The import names found under site-packages are removed from the modules list. This roughly corresponds to the third party deps.
pip.get_installed_distributions or site look like alternatives here. But pip returns the module names as they are on PyPI, not as they are when imported into a source file. Certain pathological packages would slip through the cracks, like requests-futures, which is imported as requests_futures, and colors, which is actually ansicolors on PyPI and thus confounds any reasonable heuristic.
This relies on packages including top_level.txt in their package. But this covered 100% of my use cases and seems to work on everything that is correctly configured.

system_modules
These are the modules compiled into the interpreter: sys, gc, errno and some other optional modules.

top_level_libs
The distutils.sysconfig.get_python_lib(standard_lib=True) call returns the top-level directory of the platform independent standard library.
Its top-level packages include email, logging, xml and a few more.

Conclusion
For my 2013 MacBookPro I found 403 modules for the python2.7 install.
>>> print(sys.version)
2.7.10 (default, Jul 13 2015, 12:05:58)
[GCC 4.2.1 Compatible Apple LLVM 6.1.0 (clang-602.0.53)]
>>> print(sys.hexversion)
34015984
>>> python_stdlib = get_python_library()
>>> len(python_stdlib)
403
I put up a gist of the code and output. If you think I am missing a class or have included a bogus module, I would like to hear about it.
* Alternatives
In writing this post I dug around the pip and setuptools API. It is possible that this information is available through a single module, but you would really need to know your way around that API.
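One candidate I am aware of on newer interpreters is importlib.metadata.packages_distributions(), which was added in Python 3.10 and maps top-level import names to the installed distributions that provide them. The sketch below is not what the code above does; it is only an illustration, and the wrapper name import_names_to_distributions is made up. On older Pythons it falls back to an empty mapping.

```python
import importlib.metadata


def import_names_to_distributions():
    """Map top-level import names to the installed distributions providing them."""
    # packages_distributions() only exists on Python 3.10 and later.
    finder = getattr(importlib.metadata, "packages_distributions", None)
    if finder is None:
        return {}
    return {name: list(dists) for name, dists in finder().items()}


third_party = import_names_to_distributions()
# Show a few import names known to come from installed distributions.
print(sorted(third_party)[:10])
```

With requests-futures installed, the key would be the import name requests_futures, which sidesteps the PyPI-name mismatch described earlier.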
Before I started this, I was told that six has a function specifically for this problem. It makes sense that it might exist, but I couldn't find it myself.
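For readers on Python 3.10 or later, sys.stdlib_module_names may cover most of this with no filesystem scanning. Here is a hedged sketch of the bucketing idea built on it. The function name classify_import and the three bucket labels are my own assumptions, not part of any library, and on older interpreters the stdlib bucket is simply never returned.

```python
import sys


def classify_import(name: str) -> str:
    """Bucket an import name as 'builtin', 'stdlib', or 'other'."""
    # Only the top-level package matters: "os.path" is classified as "os".
    top_level = name.split(".")[0]
    if top_level in sys.builtin_module_names:
        return "builtin"
    # sys.stdlib_module_names only exists on Python 3.10 and later.
    if top_level in getattr(sys, "stdlib_module_names", frozenset()):
        return "stdlib"
    return "other"


for candidate in ("sys", "os.path", "requests_futures"):
    print(candidate, classify_import(candidate))
```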