How does SQLAlchemy create_engine import Engine class?

Tags: python, sqlalchemy

I am relatively new to Python and am experimenting with SQLAlchemy. I noticed that, to create an engine, I have to use the create_engine() function, imported via from sqlalchemy import create_engine.

Now, the create_engine function returns an instance of the sqlalchemy.engine.base.Engine class. However, I never imported this class; I only imported the create_engine function. So, how does Python know about the sqlalchemy.engine.base.Engine class?

asked Jan 15 '18 10:01

Joren Sips

1 Answer

You probably don't understand what importing does.

Python imports modules globally. There is a single structure, called sys.modules, that stores imported modules as a dictionary:

>>> import sys
>>> sys.modules
{'builtins': <module 'builtins' (built-in)>, 'sys': <module 'sys' (built-in)>, '_frozen_importlib': <module 'importlib._bootstrap' (frozen)>, '_imp': <module '_imp' (built-in)>, ...}
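As a small sketch of the point above, using only the standard library: sys.modules behaves like an ordinary dictionary keyed by module name, and the sys module itself is registered in it. The variable names are only illustrative.

```python
import sys

# sys.modules maps module names (strings) to module objects.
is_plain_dict = isinstance(sys.modules, dict)

# The sys module is stored there under its own name, so a lookup
# returns the very same object that the import statement bound.
sys_is_registered = sys.modules["sys"] is sys

print(is_plain_dict, sys_is_registered)
```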

When you import SQLAlchemy, you import a package, a structure of multiple modules, where one import triggers more imports. All those imported modules are stored in that same place:

>>> import sqlalchemy
>>> [name for name in sys.modules if 'sqlalchemy' in name]
['sqlalchemy', 'sqlalchemy.sql', 'sqlalchemy.sql.expression', 'sqlalchemy.sql.visitors', 'sqlalchemy.util', 'sqlalchemy.util.compat', 'sqlalchemy.util._collections', 'sqlalchemy.util.langhelpers', 'sqlalchemy.exc', 'sqlalchemy.util.deprecations', 'sqlalchemy.sql.functions', 'sqlalchemy.sql.sqltypes', 'sqlalchemy.sql.elements', 'sqlalchemy.inspection', 'sqlalchemy.sql.type_api', 'sqlalchemy.sql.operators', 'sqlalchemy.sql.base', 'sqlalchemy.sql.annotation', 'sqlalchemy.processors', 'sqlalchemy.cprocessors', 'sqlalchemy.event', 'sqlalchemy.event.api', 'sqlalchemy.event.base', 'sqlalchemy.event.attr', 'sqlalchemy.event.registry', 'sqlalchemy.event.legacy', 'sqlalchemy.sql.schema', 'sqlalchemy.sql.selectable', 'sqlalchemy.sql.ddl', 'sqlalchemy.util.topological', 'sqlalchemy.sql.util', 'sqlalchemy.sql.dml', 'sqlalchemy.sql.default_comparator', 'sqlalchemy.sql.naming', 'sqlalchemy.events', 'sqlalchemy.pool', 'sqlalchemy.log', 'sqlalchemy.interfaces', 'sqlalchemy.util.queue', 'sqlalchemy.engine', 'sqlalchemy.engine.interfaces', 'sqlalchemy.sql.compiler', 'sqlalchemy.sql.crud', 'sqlalchemy.engine.base', 'sqlalchemy.engine.util', 'sqlalchemy.cutils', 'sqlalchemy.engine.result', 'sqlalchemy.cresultproxy', 'sqlalchemy.engine.strategies', 'sqlalchemy.engine.threadlocal', 'sqlalchemy.engine.url', 'sqlalchemy.dialects', 'sqlalchemy.types', 'sqlalchemy.schema', 'sqlalchemy.engine.default', 'sqlalchemy.engine.reflection']

Once a module is loaded from disk and added to that structure, Python doesn't need to load it a second time. Dots separate module names in a hierarchy, so every name with the prefix "sqlalchemy." lives inside the sqlalchemy package as part of a tree structure. There are a lot of sqlalchemy modules here because this is a large project, and they were all loaded (directly or indirectly) by the root package module, sqlalchemy/__init__.py.
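The same behaviour can be observed without SQLAlchemy installed. The sketch below uses the standard library's xml package as a stand-in (my choice, not part of the original answer): importing a nested module registers every level of the dotted hierarchy, and a repeated import is answered from the cache.

```python
import sys
import xml.dom.minidom  # importing a submodule imports its parent packages first

# Every level of the dotted name now has its own entry in sys.modules.
levels = ("xml", "xml.dom", "xml.dom.minidom")
loaded = [name for name in levels if name in sys.modules]

# A second import does not load anything again; it returns the cached object.
import xml.dom.minidom as minidom_again
served_from_cache = minidom_again is sys.modules["xml.dom.minidom"]

# Submodules are also reachable as attributes of their parent package.
nested_attribute = sys.modules["xml"].dom.minidom is minidom_again

print(loaded, served_from_cache, nested_attribute)
```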

The other thing import does is bind a name in your current namespace. Each module is its own 'global' namespace, and all names in that namespace are visible to the code in that module. Your Python script runs as the __main__ module, and all names in it are available to your script. If you create a module foo, then that is a separate namespace with its own names. import adds names to your global namespace from another module. And in Python, names are just references; the objects these names refer to all live on a big pile in memory, called the heap.
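To make the idea of separate namespaces and names-as-references concrete, here is a minimal sketch. The modules foo and bar are hypothetical and built in memory with types.ModuleType, purely for illustration.

```python
import types

# Two hypothetical in-memory modules; each one has its own global namespace.
foo = types.ModuleType("foo")
bar = types.ModuleType("bar")

foo.value = [1, 2, 3]

# Bind a name in bar to the same object, much like `from foo import value` would.
bar.value = foo.value
shares_object = bar.value is foo.value

# Rebinding the name in bar only changes bar's reference;
# foo's namespace still points at the original list.
bar.value = "something else"
foo_unchanged = foo.value == [1, 2, 3]

print(shares_object, foo_unchanged)
```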

The line

from sqlalchemy import create_engine

first makes sure that the object sys.modules['sqlalchemy'] exists, and then adds the name create_engine to your current namespace as a reference to sqlalchemy.create_engine, as if the line create_engine = sys.modules['sqlalchemy'].create_engine had been executed:

>>> sys.modules['sqlalchemy'].create_engine
<function create_engine at 0x10188bbf8>
>>> from sqlalchemy import create_engine
>>> create_engine is sys.modules['sqlalchemy'].create_engine
True
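The same check works for any module. As a sketch that does not need SQLAlchemy, here it is with json.dumps from the standard library standing in for create_engine:

```python
import sys
from json import dumps

# Roughly what the from-import did: find the cached module,
# then bind a local name to one of its attributes.
module = sys.modules["json"]
same_function = dumps is module.dumps

# The function object still records the module it was defined in.
defined_in = dumps.__module__

print(same_function, defined_in)
```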

Again, all names in Python are just references to a big pile of objects in memory.

When you call the create_engine() function, the code for that function is executed, and that function has access to all the globals in the namespace it was defined in. In this case, the function is defined in the sqlalchemy.engine module (the top-level sqlalchemy module itself has imported it with from sqlalchemy.engine import create_engine so you can access it from a more convenient location):

>>> create_engine.__module__
'sqlalchemy.engine'
>>> sys.modules['sqlalchemy.engine']
<module 'sqlalchemy.engine' from '/Users/mjpieters/Development/venvs/stackoverflow-3.6/lib/python3.6/site-packages/sqlalchemy/engine/__init__.py'>
>>> sorted(vars(sys.modules['sqlalchemy.engine']))
['BaseRowProxy', 'BufferedColumnResultProxy', 'BufferedColumnRow', 'BufferedRowResultProxy', 'Compiled', 'Connectable', 'Connection', 'CreateEnginePlugin', 'Dialect', 'Engine', 'ExceptionContext', 'ExecutionContext', 'FullyBufferedResultProxy', 'NestedTransaction', 'ResultProxy', 'RootTransaction', 'RowProxy', 'Transaction', 'TwoPhaseTransaction', 'TypeCompiler', '__all__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', 'base', 'connection_memoize', 'create_engine', 'ddl', 'default', 'default_strategy', 'engine_from_config', 'interfaces', 'reflection', 'result', 'strategies', 'threadlocal', 'url', 'util']

That list contains all the names defined in the module where create_engine is defined. The module was already loaded by code executed when you imported the sqlalchemy package. The function has access to all of them and can return any such object to you. You'll note that there is an Engine name defined there:

>>> sys.modules['sqlalchemy.engine'].Engine
<class 'sqlalchemy.engine.base.Engine'>

So that object is already loaded into Python memory. All the function does is create an instance of that class for you and return it:

>>> engine = create_engine('sqlite:///:memory:')
>>> engine
Engine(sqlite:///:memory:)
>>> type(engine)
<class 'sqlalchemy.engine.base.Engine'>
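The whole mechanism can be reproduced in a few lines without SQLAlchemy. In the sketch below, fake_engine is an invented module name and its Engine class is a toy stand-in, not SQLAlchemy's real implementation. The caller imports only create_engine, yet the function can still build an Engine, because it looks the name up in the globals of the module it was defined in.

```python
import sys
import types

# Source for a hypothetical module that plays the role of sqlalchemy.engine.
source = '''
class Engine:
    def __init__(self, url):
        self.url = url

def create_engine(url):
    # Engine is resolved in this module's globals when the function runs.
    return Engine(url)
'''

# Build the module in memory and register it so the import statement can find it.
fake_engine = types.ModuleType("fake_engine")
exec(source, fake_engine.__dict__)
sys.modules["fake_engine"] = fake_engine

# Only the function is imported here; the Engine name is never bound by this import.
from fake_engine import create_engine

engine = create_engine("sqlite:///:memory:")

# The function carries a reference to its defining module's namespace.
uses_module_globals = create_engine.__globals__ is fake_engine.__dict__

print(type(engine).__name__, type(engine).__module__, uses_module_globals)
```

The __globals__ attribute is what ties a function to the namespace it was defined in, which is why the caller's namespace never needs to contain Engine.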

If you want to learn more about Python and names, I recommend you read Ned Batchelder's essay on Facts and myths about Python names and values.

answered Oct 09 '22 16:10

Martijn Pieters
