Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the difference between aiosqlite and SQLite in multi-threaded mode?

I'm trying to asynchronously process multiple files, and processing each file requires some reads and writes to an SQLite database. I've been looking at some options, and I found the aiosqlite module here. However, I was reading the SQLite documentation here, and it says that it supports multi-threaded mode. In fact, the default mode is "serialized" which means it "can be safely used by multiple threads with no restriction."

I don't understand what the difference is. The aiosqlite documentation says:

aiosqlite allows interaction with SQLite databases on the main AsyncIO event loop without blocking execution of other coroutines while waiting for queries or data fetches. It does this by using a single, shared thread per connection.

I get that there is a difference between aiosqlite and the "multi-threaded" mode on sqlite because the multi-threaded mode requires only one connection per thread, whereas in aiosqlite, you can reuse this single connection across multiple threads. But isn't this the same as serialized mode where it can be "used by multiple threads with no restriction"?

Edit: My question right now is "Is my current understanding below is correct?":

  1. Sqlite in "serialized" mode can be used by multiple threads at one time, so this would be used if I used the threading module in python and spawned multiple threads. Here I have the options of either using a separate connection per thread or sharing the connection across multiple threads.
  2. aiosqlite is used with asyncio. So since asyncio has multiple coroutines that share one thread, aiosqlite also works with one thread. So I create one connection that I share among all the coroutines.
  3. Since aiosqlite is basically a wrapper for sqlite, I can combine the functionality of 1 and 2. So I can have multiple threads where each thread has an asyncio event loop with multiple coroutines. So the basic sqlite functionality will handle the multi-threading and the aiosqlite will handle the coroutines.
like image 795
fooiey Avatar asked Sep 09 '20 14:09

fooiey


Video Answer


1 Answers

First of all about threads:

Sqlite ... can be used by multiple threads at one time

It will still be not the same time because of GIL, Threads are always running concurrently (not in parallel). The only thing that with GIL you don't know when thread will be interrupted. But asyncio allows you to switch between threads "manually" and on waiting for some IO operations (like database communication).


Let me explain differences between different modes:

  • Single-thread - creates single database connection without any mutexes or any other mechanisms to prevent multi-threading issues.
  • Multi-thread - creates single shared database connection with mutexes that locks that connection for each operation/communication with database.
  • Serialized - creates multiple database connections per thread.

Answering questions in update:

  1. Yes

    Sqlite in "serialized" mode can be used by multiple threads at one time, so this would be used if I used the threading module in python and spawned multiple threads. Here I have the options of either using a separate connection per thread or sharing the connection across multiple threads.

  2. Yes, it will share a single connection between them.

    aiosqlite is used with asyncio. So since asyncio has multiple coroutines that share one thread, aiosqlite also works with one thread. So I create one connection that I share among all the coroutines

  3. Yes.

    Since aiosqlite is basically a wrapper for sqlite, I can combine the functionality of 1 and 2. So I can have multiple threads where each thread has an asyncio event loop with multiple coroutines. So the basic sqlite functionality will handle the multi-threading and the aiosqlite will handle the coroutines.

like image 171
wowkin2 Avatar answered Sep 30 '22 02:09

wowkin2