
How to check if a module/library/package is part of the python standard library?

I have installed so many libraries/modules/packages with pip that I can no longer tell which ones are native to the Python standard library and which are not. This causes problems when my code works on my machine but doesn't work anywhere else.

How can I check if a module/library/package that I import in my code is from the python stdlib?

Assume that the checking is done on the machine with all the external libraries/modules/packages, otherwise I could simply do a try-except import on the other machine that doesn't have them.

For example, I am sure these imports work on my machine, but when it's on a machine with only a plain Python install, it breaks:

from bs4 import BeautifulSoup
import nltk
import PIL
import gensim

alvas asked Mar 05 '14 10:03



2 Answers

You'd have to check all modules that have been imported to see if any of these are located outside of the standard library.

The following script is not bulletproof but should give you a starting point:

import sys
import os

external = set()
exempt = set()
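# sys.real_prefix exists only inside a legacy virtualenv and sys.base_prefix
# only on Python 3.3+, so read whichever attributes are present
prefixes = tuple({os.path.abspath(getattr(sys, attr))
                  for attr in ('prefix', 'base_prefix', 'real_prefix')
                  if hasattr(sys, attr)})
paths = (os.path.abspath(p) for p in sys.path)
# Debian/Ubuntu name the third-party directory dist-packages
stdlib = {p for p in paths
          if p.startswith(prefixes)
          and 'site-packages' not in p
          and 'dist-packages' not in p}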
for name, module in sorted(sys.modules.items()):
    if not module or name in sys.builtin_module_names or not getattr(module, '__file__', None):
        # an import sentinel, built-in module or not a real module, really
        exempt.add(name)
        continue

    fname = module.__file__
    if fname.endswith(('__init__.py', '__init__.pyc', '__init__.pyo')):
        fname = os.path.dirname(fname)

    if os.path.dirname(fname) in stdlib:
        # stdlib path, skip
        exempt.add(name)
        continue

    parts = name.split('.')
    for i, part in enumerate(parts):
        partial = '.'.join(parts[:i] + [part])
        if partial in external or partial in exempt:
            # already listed or exempted
            break
        if partial in sys.modules and sys.modules[partial]:
            # just list the parent name and be done with it
            external.add(partial)
            break

for name in external:
    print('%s %s' % (name, sys.modules[name].__file__))

Put this in a new module, import it after all the other imports in your script, and it will print every module that it thinks is not part of the standard library.
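
The same location-based idea can be applied to a single module name instead of everything in sys.modules. The following is only a sketch, assuming Python 3.4+ (for importlib.util.find_spec) and a conventional install layout; the classify helper and its return labels are made up for illustration and are not part of any library:

```python
import importlib.util
import os
import sys
import sysconfig

# Directories the interpreter itself reports as holding its standard library.
_STDLIB_DIRS = tuple(
    os.path.realpath(sysconfig.get_paths()[key]) + os.sep
    for key in ("stdlib", "platstdlib")
)
# Directory names that conventionally hold pip-installed packages.
_EXTERNAL_MARKERS = ("site-packages", "dist-packages")


def classify(name):
    """Guess where a top-level module comes from (hypothetical helper)."""
    if name in sys.builtin_module_names:
        return "built-in"
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return "not found"
    if spec is None:
        return "not found"
    if spec.origin in ("built-in", "frozen"):
        # Assumption: frozen modules ship with the interpreter itself.
        return "stdlib"
    if not spec.origin or not os.path.isabs(spec.origin):
        # For example a namespace package, which has no single file.
        return "unknown"
    origin = os.path.realpath(spec.origin)
    if any(marker in origin.split(os.sep) for marker in _EXTERNAL_MARKERS):
        return "external"
    if origin.startswith(_STDLIB_DIRS):
        return "stdlib"
    return "external"


for module_name in ("sys", "json", "bs4", "nltk"):
    print(module_name, classify(module_name))
```

Like the script above, this is a heuristic: it reports a module in your own project directory as external, and it can be fooled by unusual layouts such as a zipped standard library.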

Martijn Pieters answered Oct 14 '22 21:10


The standard library is defined by the Python documentation, whose library reference lists every module. You can look a module up there, or put the module names into a list and check against it programmatically.
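
As a sketch of the programmatic check: Python 3.10 and later expose such a list as sys.stdlib_module_names, so on those versions you do not have to maintain it yourself. The is_stdlib helper and the FALLBACK_STDLIB set below are hypothetical names, and the fallback list is deliberately incomplete; on older interpreters you would have to fill it in from the documentation.

```python
import sys

# Hypothetical, incomplete fallback for interpreters older than 3.10.
FALLBACK_STDLIB = frozenset({"os", "re", "json", "traceback"})

# sys.stdlib_module_names exists on Python 3.10+ only.
STDLIB_NAMES = (
    frozenset(getattr(sys, "stdlib_module_names", FALLBACK_STDLIB))
    | frozenset(sys.builtin_module_names)
)


def is_stdlib(module_name):
    """Return True if the top-level package of module_name is in the list."""
    top_level = module_name.partition(".")[0]
    return top_level in STDLIB_NAMES


for name in ("os.path", "traceback", "bs4", "nltk", "PIL", "gensim"):
    print(name, is_stdlib(name))
```

This only compares names, so it works on the machine that has all the external packages installed and does not import anything.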

Alternatively, Python 3.4 added an isolated mode (the -I option) in which sys.path contains neither the script's directory nor the per-user site-packages directory, and all PYTHON* environment variables are ignored. In earlier versions of Python you can use -s to skip the per-user site-packages directory and -E to ignore the PYTHON* environment variables such as PYTHONPATH.
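
To see what these options actually switch on, you can start a child interpreter with them and print sys.flags. This is a minimal sketch, assuming Python 3.7+ for subprocess.run(capture_output=...); flags_under is a hypothetical helper name:

```python
import subprocess
import sys


def flags_under(*options):
    """Run a child interpreter with the given options and report three flags."""
    code = (
        "import sys; "
        "print(int(sys.flags.isolated), "
        "int(sys.flags.no_user_site), "
        "int(sys.flags.ignore_environment))"
    )
    result = subprocess.run(
        [sys.executable, *options, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Columns: isolated, no_user_site, ignore_environment
print("-I    :", flags_under("-I"))
print("-s -E :", flags_under("-s", "-E"))
```

Note that none of these options removes the system-wide site-packages directory from sys.path, so packages installed there with pip still import.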

In python2 a very simple way to check if a module is part of the standard library is to clear the sys.path:

>>> import sys
>>> sys.path = []
>>> import numpy
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named numpy
>>> import traceback
>>> import os
>>> import re

However this doesn't work in python3.3+:

>>> import sys
>>> sys.path = []
>>> import traceback
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named 'traceback'
[...]

This is because starting with python3.3 the import machinery was changed, and importing the standard library uses the same mechanism as importing any other module (see the documentation).

In python3.3 the only way to make sure that only stdlib's imports succeed is to add only the standard library path to sys.path, for example:

>>> import os, sys, traceback
>>> lib_path = os.path.dirname(traceback.__file__)
>>> sys.path = [lib_path]
>>> import traceback
>>> import re
>>> import numpy
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named 'numpy'

I used the traceback module to get the library path, since this should work on any system.

For the built-in modules, which are a subset of the stdlib modules, you can check sys.builtin_module_names

Bakuriu answered Oct 14 '22 20:10