Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

asyncpg - connection vs connection pool

Tags:

python

I am going over asyncpg's documentation, and I am having trouble understanding why use a connection pool instead of a single connection.

In the example given, a pool is used:

async with pool.acquire() as connection:
    async with connection.transaction():
        result = await connection.fetchval('select 2 ^ $1', power)
        return web.Response(
            text="2 ^ {} is {}".format(power, result))

but it could also be done by creating a connection when necessary:

connection = await asyncpg.connect(user='postgres')
async with connection.transaction():
    result = await connection.fetchval('select 2 ^ $1', power)
    return web.Response(
            text="2 ^ {} is {}".format(power, result))

What's the advantage of using pools over connections as necessary?

like image 703
ursan Avatar asked Feb 15 '17 06:02

ursan


People also ask

What is Asyncpg?

asyncpg is a database interface library designed specifically for PostgreSQL and Python/asyncio. asyncpg is an efficient, clean implementation of PostgreSQL server binary protocol for use with Python's asyncio framework. asyncpg requires Python 3.6 or later and is supported for PostgreSQL versions 9.5 to 14.

What is a connection pool Postgres?

Connection pooling refers to the method of creating a pool of connections and caching those connections so that it can be reused again. PostgreSQL has a postmaster process, which spawns new processes for each new connection to the database.

How do I use Asyncpg?

asyncpg provides an advanced pool implementation, which eliminates the need to use an external connection pooler such as PgBouncer. To create a connection pool, use the asyncpg. create_pool() function. The resulting Pool object can then be used to borrow connections from the pool.


2 Answers

Establishing a connection to a database server is an expensive operation. Connection pools are a common technique allowing to avoid paying that cost. A pool keeps the connections open and leases them out when necessary.

It's easy to see the benefits of a pool by doing a simple benchmark:

async def bench_asyncpg_con():
    power = 2
    start = time.monotonic()
    for i in range(1, 1000):
        con = await asyncpg.connect(user='postgres', host='127.0.0.1')
        await con.fetchval('select 2 ^ $1', power)
        await con.close()

    end = time.monotonic()
    print(end - start)

The above completes on my machine in 1.568 seconds.

Whereas the pool version:

async def bench_asyncpg_pool():
    pool = await asyncpg.create_pool(user='postgres', host='127.0.0.1')
    power = 2
    start = time.monotonic()
    for i in range(1, 1000):
        async with pool.acquire() as con:
            await con.fetchval('select 2 ^ $1', power)

    await pool.close()
    end = time.monotonic()
    print(end - start)

Runs in 0.234 seconds, or 6.7 times faster.

like image 70
Elvis Pranskevichus Avatar answered Sep 17 '22 13:09

Elvis Pranskevichus


Elvis Pranskevichus showed two benchmarks above. I took the first one bench_asyncpg_con and edited it, moving the con outside the loop:

async def bench_asyncpg_con_2():
    power = 2
    start = time.monotonic()
    con = await asyncpg.connect(**data)
    for i in range(1, 1000):
        await con.fetchval('select 2 ^ $1', power)

    await con.close()
    end = time.monotonic()
    print(end - start) 

And this worked much faster than bench_asyncpg_pool

I got 0.45 on this, whereas I got

1.62 on bench_asyncpg_pool and

63.18 on bench_asyncpg_con

I'm kinda newbie in using this lib and thinking that one connection con for a whole project would be a good choice. Correct me if I'm wrong. I will appreciate it

like image 45
GooDeeJAY Avatar answered Sep 16 '22 13:09

GooDeeJAY