As a follow-up to the question Using builtin __import__() in normal cases, I ran a few tests and came across surprising results.
I am comparing the execution time of a classic import statement with that of a call to the __import__ built-in function.
For this purpose, I use the following script in interactive mode:
import timeit

def test(module):
    t1 = timeit.timeit("import {}".format(module))
    t2 = timeit.timeit("{0} = __import__('{0}')".format(module))
    print("import statement: ", t1)
    print("__import__ function:", t2)
    print("t(statement) {} t(function)".format("<" if t1 < t2 else ">"))
As in the linked question, here is the comparison when importing sys, along with some other standard modules:
>>> test('sys')
import statement: 0.319865173171288
__import__ function: 0.38428380458522987
t(statement) < t(function)
>>> test('math')
import statement: 0.10262547545597034
__import__ function: 0.16307580163101054
t(statement) < t(function)
>>> test('os')
import statement: 0.10251490255312312
__import__ function: 0.16240755669640627
t(statement) < t(function)
>>> test('threading')
import statement: 0.11349136644972191
__import__ function: 0.1673617034957573
t(statement) < t(function)
So far so good: import is faster than __import__().
This makes sense to me because, as I wrote in the linked post, I find it logical that the IMPORT_NAME instruction is optimized in comparison with CALL_FUNCTION when the latter results in a call to __import__.
But when it comes to less standard modules, the results reverse:
>>> test('numpy')
import statement: 0.18907936340054476
__import__ function: 0.15840019037769792
t(statement) > t(function)
>>> test('tkinter')
import statement: 0.3798560809537861
__import__ function: 0.15899962771786136
t(statement) > t(function)
>>> test("pygame")
import statement: 0.6624641952621317
__import__ function: 0.16268579177259568
t(statement) > t(function)
What is the reason behind this difference in the execution times?
What is the actual reason why the import statement is faster on standard modules?
On the other hand, why is the __import__ function faster with other modules?
Tests run with Python 3.6.
timeit measures the total execution time, but the first import of a module, whether through import or __import__, is slower than subsequent ones, because it is the only one that actually performs module initialization. It has to search the filesystem for the module's file(s), load the module's source code (slowest), previously created bytecode (slow, but a bit faster than parsing the .py files), or shared library (for C extensions), execute the initialization code, and store the module object in sys.modules. Subsequent imports get to skip all that and retrieve the module object from sys.modules.
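The effect of the sys.modules cache can be observed directly. The following is only a minimal sketch: the helper name time_first_and_cached_import is made up for illustration, and colorsys is picked on the assumption that it is a small standard-library module that is safe to initialize again. The absolute numbers will vary from machine to machine.

```python
import importlib
import sys
import time


def time_first_and_cached_import(name):
    """Return (first, cached) durations in seconds for importing `name`."""
    # Drop any cached copy so the next import does the full work.
    # Assumption: `name` is a module that is safe to initialize again.
    sys.modules.pop(name, None)

    start = time.perf_counter()
    first_module = importlib.import_module(name)
    first = time.perf_counter() - start

    start = time.perf_counter()
    cached_module = importlib.import_module(name)
    cached = time.perf_counter() - start

    # The second call returns the very object stored in sys.modules.
    assert cached_module is first_module is sys.modules[name]
    return first, cached


first, cached = time_first_and_cached_import("colorsys")
print("first import: ", first)
print("cached import:", cached)
```

Since timeit runs the statement many times, only the first iteration of whichever form is timed first pays the initialization cost.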
If you reverse the order the results will be different:
import timeit

def test(module):
    t2 = timeit.timeit("{0} = __import__('{0}')".format(module))
    t1 = timeit.timeit("import {}".format(module))
    print("import statement: ", t1)
    print("__import__ function:", t2)
    print("t(statement) {} t(function)".format("<" if t1 < t2 else ">"))
test('numpy')
import statement: 0.4611093703134608
__import__ function: 1.275512785926014
t(statement) < t(function)
The best way to get unbiased results is to import the module once and then do the timings:
import timeit

def test(module):
    # Import once up front so neither timing pays for module initialization.
    exec("import {}".format(module))
    t2 = timeit.timeit("{0} = __import__('{0}')".format(module))
    t1 = timeit.timeit("import {}".format(module))
    print("import statement: ", t1)
    print("__import__ function:", t2)
    print("t(statement) {} t(function)".format("<" if t1 < t2 else ">"))
test('numpy')
import statement: 0.4826306561727307
__import__ function: 0.9192819125911029
t(statement) < t(function)
So, yes, import is always faster than __import__.
Remember that all modules get cached into sys.modules after the first import, so the time...
Anyway, my results look like this:
#!/bin/bash

itest() {
    echo -n "import $1: "
    python3 -m timeit "import $1"
    echo -n "__import__('$1'): "
    python3 -m timeit "__import__('$1')"
}

itest "sys"
itest "math"
itest "six"
itest "PIL"
import sys: 0.481
__import__('sys'): 0.586
import math: 0.163
__import__('math'): 0.247
import six: 0.157
__import__('six'): 0.273
import PIL: 0.162
__import__('PIL'): 0.265
What is the reason behind this difference in the execution times?
The import statement has a pretty straightforward path to go through. It leads to IMPORT_NAME, which calls import_name and imports the given module (provided the name __import__ has not been overridden):
dis('import math')
  1           0 LOAD_CONST               0 (0)
              2 LOAD_CONST               1 (None)
              4 IMPORT_NAME              0 (math)
              6 STORE_NAME               0 (math)
              8 LOAD_CONST               1 (None)
             10 RETURN_VALUE
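The remark about overriding __import__ can be checked with a small experiment. This is just a sketch: recording_import is a hypothetical hook written for illustration, and colorsys is used on the assumption that it is a harmless standard-library module. It suggests that the import statement looks up __import__ in builtins and calls whatever it finds there.

```python
import builtins

seen = []  # names passed to our hypothetical hook
original_import = builtins.__import__


def recording_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Hypothetical hook: record the requested name, then delegate."""
    seen.append(name)
    return original_import(name, globals, locals, fromlist, level)


builtins.__import__ = recording_import
try:
    # The import statement now goes through recording_import.
    exec("import colorsys", {})
finally:
    builtins.__import__ = original_import  # always restore the real function

print(seen)
```

When the hook is installed, the statement no longer takes its direct path, so timings made under an overridden __import__ would not be comparable with the ones shown here.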
__import__, on the other hand, goes through the generic function call steps that all functions do, via CALL_FUNCTION:
dis('__import__(math)')
  1           0 LOAD_NAME                0 (__import__)
              2 LOAD_NAME                1 (math)
              4 CALL_FUNCTION            1
              6 RETURN_VALUE
Sure, it's a built-in and so faster than normal Python functions, but it is still slower than the import statement with import_name.
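The two code paths can also be compared programmatically instead of by reading dis output. A minimal sketch follows; the helper opnames is introduced here for illustration. Opcode names differ between Python versions (the call opcode is CALL_FUNCTION on 3.6 but CALL on newer releases), so the check only looks for names starting with "CALL".

```python
import dis


def opnames(source):
    """Return the list of opcode names the compiler emits for `source`."""
    code = compile(source, "<example>", "exec")
    return [instruction.opname for instruction in dis.get_instructions(code)]


statement_ops = opnames("import math")
call_ops = opnames("__import__('math')")

# The statement has a dedicated opcode; the call uses the generic call opcode.
print("IMPORT_NAME" in statement_ops)
print("IMPORT_NAME" in call_ops)
print([op for op in call_ops if op.startswith("CALL")])
```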
Because the extra work is the generic function-call machinery, the difference in time between them is constant. Using @MSeifert's snippet (which corrected the unjust timings :-) and adding another print, you can see this:
import sys
import timeit

def test(module):
    # Import once up front so neither timing pays for module initialization.
    exec("import {}".format(module))
    t2 = timeit.timeit("{0} = __import__('{0}')".format(module))
    t1 = timeit.timeit("import {}".format(module))
    print("import statement: ", t1)
    print("__import__ function:", t2)
    print("t(statement) {} t(function)".format("<" if t1 < t2 else ">"))
    print('Diff: {}'.format(t2-t1))

for m in sys.builtin_module_names:
    test(m)
On my machine, there's a constant difference of around 0.17 between them (with the slight variance that is generally expected).
*It is worth noting that these aren't exactly equivalent: __import__ doesn't do any name binding, as the bytecode attests.
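The name-binding difference is easy to see by running both forms in fresh namespaces. A small sketch, again assuming colorsys as an arbitrary standard-library module:

```python
statement_namespace = {}
exec("import colorsys", statement_namespace)

call_namespace = {}
exec("__import__('colorsys')", call_namespace)

# The statement binds the module name; the bare call discards its result.
print("colorsys" in statement_namespace)  # True
print("colorsys" in call_namespace)       # False
```

This is why the timing snippets above assign the result, as in "{0} = __import__('{0}')", to keep the two forms comparable.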