I am going over asyncpg's documentation, and I am having trouble understanding why one would use a connection pool instead of a single connection.
In the example given, a pool is used:
async with pool.acquire() as connection:
    async with connection.transaction():
        result = await connection.fetchval('select 2 ^ $1', power)
        return web.Response(
            text="2 ^ {} is {}".format(power, result))
but it could also be done by creating a connection when necessary:
connection = await asyncpg.connect(user='postgres')
async with connection.transaction():
    result = await connection.fetchval('select 2 ^ $1', power)
    return web.Response(
        text="2 ^ {} is {}".format(power, result))
What's the advantage of using pools over connections as necessary?
asyncpg is a database interface library designed specifically for PostgreSQL and Python/asyncio. It is an efficient, clean implementation of the PostgreSQL server binary protocol for use with Python's asyncio framework. asyncpg requires Python 3.6 or later and supports PostgreSQL versions 9.5 to 14.
Connection pooling is the technique of creating a pool of connections once and caching them so that they can be reused. This matters for PostgreSQL in particular: its postmaster process spawns a new server process for each incoming connection, which makes connection establishment relatively expensive.
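The caching idea can be sketched without any database at all. Below is a toy pool built on asyncio.Queue; ToyPool and FakeConnection are hypothetical names for illustration and are not part of asyncpg's API:

```python
import asyncio

class FakeConnection:
    """Stands in for an expensive real connection; counts how many are created."""
    created = 0

    def __init__(self):
        FakeConnection.created += 1

class ToyPool:
    """Creates a fixed set of connections up front and hands them out on demand."""
    def __init__(self, size):
        self._queue = asyncio.Queue()
        for _ in range(size):
            self._queue.put_nowait(FakeConnection())

    async def acquire(self):
        # Waits if every connection is currently leased out.
        return await self._queue.get()

    def release(self, con):
        # Return the connection to the pool for reuse.
        self._queue.put_nowait(con)

async def main():
    pool = ToyPool(size=3)
    for _ in range(100):          # 100 "queries"...
        con = await pool.acquire()
        pool.release(con)
    return FakeConnection.created # ...but only 3 connections ever built

print(asyncio.run(main()))  # → 3
```

The point is that the cost of building a connection is paid `size` times, no matter how many operations run through the pool afterwards.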
asyncpg provides an advanced pool implementation, which eliminates the need for an external connection pooler such as PgBouncer. To create a connection pool, use the asyncpg.create_pool() function. The resulting Pool object can then be used to borrow connections from the pool.
Establishing a connection to a database server is an expensive operation. Connection pools are a common technique for avoiding that cost: a pool keeps the connections open and leases them out as needed.
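A back-of-envelope calculation makes the amortization concrete. The 20 ms / 1 ms figures below are illustrative assumptions, not measurements:

```python
# Illustrative costs (assumptions, not measured values)
setup_ms = 20.0   # TCP + auth handshake for a brand-new connection
query_ms = 1.0    # a trivial query once the connection exists
n_queries = 1000

# Reconnecting for every query pays the setup cost every time.
fresh_each_time = n_queries * (setup_ms + query_ms)

# A pool pays the setup cost once and reuses the connection.
pooled = setup_ms + n_queries * query_ms

print(fresh_each_time)  # → 21000.0 (ms)
print(pooled)           # → 1020.0 (ms)
```

Under these assumptions reuse is roughly 20x faster, which is the same shape of result as the benchmark below.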
It's easy to see the benefits of a pool by doing a simple benchmark:
async def bench_asyncpg_con():
    power = 2
    start = time.monotonic()
    for i in range(1, 1000):
        con = await asyncpg.connect(user='postgres', host='127.0.0.1')
        await con.fetchval('select 2 ^ $1', power)
        await con.close()
    end = time.monotonic()
    print(end - start)
The above completes on my machine in 1.568 seconds.
Whereas the pool version:
async def bench_asyncpg_pool():
    pool = await asyncpg.create_pool(user='postgres', host='127.0.0.1')
    power = 2
    start = time.monotonic()
    for i in range(1, 1000):
        async with pool.acquire() as con:
            await con.fetchval('select 2 ^ $1', power)
    await pool.close()
    end = time.monotonic()
    print(end - start)
Runs in 0.234 seconds, or 6.7 times faster.
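That speedup comes purely from skipping reconnects. The other advantage of a pool over a single shared connection is concurrency: one connection can only run one query at a time, while a pool lets several coroutines query in parallel. The effect can be simulated with asyncio.sleep standing in for query latency (toy code, no real database involved):

```python
import asyncio
import time

async def run_queries(n_connections, n_queries, query_s=0.01):
    # A semaphore models how many queries can be in flight at once,
    # i.e. how many connections are available.
    sem = asyncio.Semaphore(n_connections)

    async def one_query():
        async with sem:
            await asyncio.sleep(query_s)  # stand-in for the server round trip

    start = time.monotonic()
    await asyncio.gather(*(one_query() for _ in range(n_queries)))
    return time.monotonic() - start

single = asyncio.run(run_queries(1, 50))   # one connection: queries serialize
pooled = asyncio.run(run_queries(10, 50))  # "pool" of 10: ten in flight at once
print(f"single={single:.2f}s pooled={pooled:.2f}s")
```

With one "connection" the 50 queries take roughly 50 × 10 ms; with ten, the total drops by about an order of magnitude. In a web application handling concurrent requests, this is usually the decisive argument for a pool.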
Elvis Pranskevichus showed the two benchmarks above. I took the first one, bench_asyncpg_con, and edited it, moving the connection setup outside the loop:
async def bench_asyncpg_con_2():
    power = 2
    start = time.monotonic()
    con = await asyncpg.connect(**data)
    for i in range(1, 1000):
        await con.fetchval('select 2 ^ $1', power)
    await con.close()
    end = time.monotonic()
    print(end - start)
And this ran much faster than bench_asyncpg_pool: I got 0.45 s with it, versus 1.62 s for bench_asyncpg_pool and 63.18 s for bench_asyncpg_con. I'm fairly new to this library, and I'm thinking that a single connection con for a whole project would be a good choice. Correct me if I'm wrong; I will appreciate it.