I am relatively new to Python and am experimenting with SQLAlchemy. I noticed that, to create an engine, I have to use the create_engine() function, imported via from sqlalchemy import create_engine.
Now, the create_engine function returns an instance of the sqlalchemy.engine.base.Engine class. However, I never imported this class; I only imported the create_engine function. So, how does Python know about the sqlalchemy.engine.base.Engine class?
The create_engine() function of the SQLAlchemy library takes a connection URL and returns an Engine that references both a Dialect and a Pool, which together interpret the DBAPI's module functions as well as the behavior of the database.
Every pool implementation in SQLAlchemy is thread-safe, including the default QueuePool. This means that two threads requesting a connection simultaneously will check out two different connections. By extension, an engine is also thread-safe.
A connection pool is a standard technique for keeping long-running connections in memory for efficient reuse, as well as for managing the total number of connections an application might use simultaneously.
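To make the "two threads, two connections" idea concrete, here is a minimal sketch of a thread-safe pool built only on the standard library. It is not SQLAlchemy's QueuePool; the TinyPool class and its checkout/checkin methods are hypothetical names chosen for illustration, and sqlite3 in-memory connections stand in for real database connections.

```python
import queue
import sqlite3
import threading


class TinyPool:
    """Hypothetical toy pool: keeps idle connections in a thread-safe queue."""

    def __init__(self, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            # check_same_thread=False lets a connection be used by a thread
            # other than the one that created it.
            self._idle.put(sqlite3.connect(":memory:", check_same_thread=False))

    def checkout(self):
        # Queue.get() is thread-safe, so a connection is handed to one thread only.
        return self._idle.get()

    def checkin(self, conn):
        self._idle.put(conn)


pool = TinyPool(size=2)
seen = []  # identities of the connections each thread received
both_checked_out = threading.Barrier(2)


def worker():
    conn = pool.checkout()
    seen.append(id(conn))
    # Hold the connection until the other thread holds one as well.
    both_checked_out.wait(timeout=5)
    conn.execute("SELECT 1")
    pool.checkin(conn)


threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

distinct = len(set(seen))
print(distinct)  # number of different connections that were handed out
```

Because both threads hold a connection at the same time, the queue cannot hand the same object to both of them, which is the property the paragraph above describes.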
You probably don't understand what importing does.
Python imports modules globally. There is a single structure, called sys.modules, that stores imported modules as a dictionary:
>>> import sys
>>> sys.modules
{'builtins': <module 'builtins' (built-in)>, 'sys': <module 'sys' (built-in)>, '_frozen_importlib': <module 'importlib._bootstrap' (frozen)>, '_imp': <module '_imp' (built-in)>, ...}
When you import SQLAlchemy, you import a package, a structure of multiple modules, where one import triggers more imports. All those imported modules are stored in that same place:
>>> import sqlalchemy
>>> [name for name in sys.modules if 'sqlalchemy' in name]
['sqlalchemy', 'sqlalchemy.sql', 'sqlalchemy.sql.expression', 'sqlalchemy.sql.visitors', 'sqlalchemy.util', 'sqlalchemy.util.compat', 'sqlalchemy.util._collections', 'sqlalchemy.util.langhelpers', 'sqlalchemy.exc', 'sqlalchemy.util.deprecations', 'sqlalchemy.sql.functions', 'sqlalchemy.sql.sqltypes', 'sqlalchemy.sql.elements', 'sqlalchemy.inspection', 'sqlalchemy.sql.type_api', 'sqlalchemy.sql.operators', 'sqlalchemy.sql.base', 'sqlalchemy.sql.annotation', 'sqlalchemy.processors', 'sqlalchemy.cprocessors', 'sqlalchemy.event', 'sqlalchemy.event.api', 'sqlalchemy.event.base', 'sqlalchemy.event.attr', 'sqlalchemy.event.registry', 'sqlalchemy.event.legacy', 'sqlalchemy.sql.schema', 'sqlalchemy.sql.selectable', 'sqlalchemy.sql.ddl', 'sqlalchemy.util.topological', 'sqlalchemy.sql.util', 'sqlalchemy.sql.dml', 'sqlalchemy.sql.default_comparator', 'sqlalchemy.sql.naming', 'sqlalchemy.events', 'sqlalchemy.pool', 'sqlalchemy.log', 'sqlalchemy.interfaces', 'sqlalchemy.util.queue', 'sqlalchemy.engine', 'sqlalchemy.engine.interfaces', 'sqlalchemy.sql.compiler', 'sqlalchemy.sql.crud', 'sqlalchemy.engine.base', 'sqlalchemy.engine.util', 'sqlalchemy.cutils', 'sqlalchemy.engine.result', 'sqlalchemy.cresultproxy', 'sqlalchemy.engine.strategies', 'sqlalchemy.engine.threadlocal', 'sqlalchemy.engine.url', 'sqlalchemy.dialects', 'sqlalchemy.types', 'sqlalchemy.schema', 'sqlalchemy.engine.default', 'sqlalchemy.engine.reflection']
Once a module is loaded from disk and added to that structure, Python doesn't need to load it a second time. Dots separate module names in a hierarchy, so everything starting with sqlalchemy. lives inside the sqlalchemy package as a tree structure. There are a lot of sqlalchemy modules here because this is a large project, and they were all loaded (directly or indirectly) by the root package module, sqlalchemy/__init__.py.
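The same behaviour can be observed without SQLAlchemy installed. The following sketch uses the standard-library json package as a stand-in (my choice for illustration, not part of the original answer): importing the package pulls in its submodules, and a repeated import just returns the cached module object.

```python
import importlib
import sys

import json  # importing the package runs json/__init__.py

# json/__init__.py imports its own submodules, so they are cached as well.
loaded = sorted(
    name for name in sys.modules
    if name == "json" or name.startswith("json.")
)
print(loaded)

# A second import is a dictionary lookup; the very same module object comes back.
again = importlib.import_module("json")
same_object = again is sys.modules["json"]
print(same_object)
```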
The other thing import does is bind a name in your current namespace. Each module is a 'global' namespace; all names defined in a module are visible throughout that module. Your Python script is run as the __main__ module, and all names in it are available to your script. If you create a module foo, then that is a separate namespace with its own names. import adds names to your global namespace from another module. And in Python, names are just references; the actual objects each of these names reference all live on a big pile in memory, called the heap.
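A small sketch of separate namespaces, assuming a hypothetical module foo that is built in memory with types.ModuleType rather than written to a foo.py file. The names greeting and greet are made up for this example.

```python
import sys
import types

# Build the hypothetical module "foo" in memory and give it two names.
foo = types.ModuleType("foo")
exec(
    "greeting = 'hello from foo'\n"
    "def greet():\n"
    "    return greeting\n",
    foo.__dict__,
)
sys.modules["foo"] = foo  # register it where the import system looks first

greeting = "hello from the script"  # same name, but in this namespace

from foo import greet  # binds only the name 'greet' here

print(greet())     # greet() looks up 'greeting' in foo's globals, not ours
print(greeting)    # our own 'greeting' is untouched
print(greet is foo.greet)  # both names refer to one function object
```

The imported function keeps using the globals of the module it was defined in, which is why the two greeting values never collide.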
The line
from sqlalchemy import create_engine
first makes sure that the object sys.modules['sqlalchemy'] exists, and then adds the name create_engine to your current namespace as a reference to sqlalchemy.create_engine, as if the line create_engine = sys.modules['sqlalchemy'].create_engine was executed:
>>> sys.modules['sqlalchemy'].create_engine
<function create_engine at 0x10188bbf8>
>>> from sqlalchemy import create_engine
>>> create_engine is sys.modules['sqlalchemy'].create_engine
True
Again, all names in Python are just references to a big pile of objects in memory.
When you call the create_engine() function, the code for that function is executed, and that function has access to all the globals in the namespace it was defined in. In this case the function is defined in the sqlalchemy.engine module (the top-level sqlalchemy module itself has imported it with from sqlalchemy.engine import create_engine so you can access it from a more convenient location):
>>> create_engine.__module__
'sqlalchemy.engine'
>>> sys.modules['sqlalchemy.engine']
<module 'sqlalchemy.engine' from '/Users/mjpieters/Development/venvs/stackoverflow-3.6/lib/python3.6/site-packages/sqlalchemy/engine/__init__.py'>
>>> sorted(vars(sys.modules['sqlalchemy.engine']))
['BaseRowProxy', 'BufferedColumnResultProxy', 'BufferedColumnRow', 'BufferedRowResultProxy', 'Compiled', 'Connectable', 'Connection', 'CreateEnginePlugin', 'Dialect', 'Engine', 'ExceptionContext', 'ExecutionContext', 'FullyBufferedResultProxy', 'NestedTransaction', 'ResultProxy', 'RootTransaction', 'RowProxy', 'Transaction', 'TwoPhaseTransaction', 'TypeCompiler', '__all__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', 'base', 'connection_memoize', 'create_engine', 'ddl', 'default', 'default_strategy', 'engine_from_config', 'interfaces', 'reflection', 'result', 'strategies', 'threadlocal', 'url', 'util']
That list contains all the names defined in the module where create_engine is defined. The module was already loaded by code executed when you imported the sqlalchemy module. The function has access to all of them and can return any such object to you. You'll note that there is an Engine name defined there:
>>> sys.modules['sqlalchemy.engine'].Engine
<class 'sqlalchemy.engine.base.Engine'>
So that object is already loaded into Python memory. All the function does is create an instance of that class for you and return it:
>>> engine = create_engine('sqlite:///:memory:')
>>> engine
Engine(sqlite:///:memory:)
>>> type(engine)
<class 'sqlalchemy.engine.base.Engine'>
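The pattern is not specific to SQLAlchemy. As a rough analogue that needs only the standard library, logging.getLogger() returns an instance of a class the caller never imported. This is a stand-in chosen for illustration, and it assumes the default logger class has not been replaced with logging.setLoggerClass().

```python
import sys
from logging import getLogger  # only the factory function is imported

logger = getLogger("demo")

# The function was defined in the logging module and sees that module's globals.
print(getLogger.__module__)
print(type(logger))

# The class it instantiated is simply a name in those globals.
found_in_globals = getLogger.__globals__["Logger"] is type(logger)
print(found_in_globals)

# The module object holding the class is already in sys.modules.
print(sys.modules["logging"].Logger is type(logger))
```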
If you want to learn more about Python and names, I recommend you read Ned Batchelder's essay on Facts and myths about Python names and values.