
What are the rules about concurrently accessing a persistent database

It seems the rules about concurrent access are undocumented (on the Haskell side) and simply assume the developer is familiar with the particular backend being used. For production needs this is a perfectly legitimate assumption, but for casual prototyping and development it would be nice if the persistent-* packages were a bit more self-contained.

So, what are the rules governing concurrent access to persistent-sqlite and family? Implicitly, there must be some degree of concurrency allowed if we have pools of connections, but trivially creating a single connection pool and calling replicateM x $ forkIO (useThePool connectionPool) gives the below error.

user error (SQLite3 returned ErrorBusy while attempting to perform step.)

EDIT: Some example code is now below.

In the code below I fork off 6 threads (an arbitrary number; my actual application uses 3). Each thread repeatedly looks up and stores a record (a different record from the ones accessed by the other threads, though that doesn't matter), printing the result of the lookup.

{-# LANGUAGE TemplateHaskell, QuasiQuotes
           , TypeFamilies, FlexibleContexts, GADTs
           , OverloadedStrings #-}
import Control.Concurrent (forkIO, threadDelay)
import Database.Persist
import Database.Persist.Sqlite hiding (get)
import Database.Persist.TH
import Control.Monad
import Control.Monad.IO.Class

share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persist|
SomeData
    myId Int
    myData Double
    MyId myId
|]

main = withSqlitePool "TEST" 40 $ \pool -> do
  runSqlPool (runMigration migrateAll) pool
  mapM_ forkIO [runSqlPool (dbThread i) pool | i <- [0..5]]
  threadDelay maxBound

dbThread :: Int -> SqlPersist IO ()
dbThread i = forever $ do
   x <- getBy (MyId i)
   insert (SomeData i (fromIntegral i))
   liftIO (print x)
   liftIO (threadDelay 100000) -- Just to calm down the CPU,
                               -- not needed for demonstrating
                               -- the problem

NB The values of 40, TEST, and all records are arbitrary for this example. Many values, including more realistic ones, cause the same behavior.

Also note that, while it might be obviously broken when you nest a non-terminating action (via forever) inside of a DB transaction (started by runSqlPool), this isn't the core issue. You can invert those operations and make the transactions arbitrarily small but still end up with periodic exceptions.

The output is usually like:

$ ./so
Nothing
so: user error (SQLite3 returned ErrorBusy while attempting to perform step.)
so: user error (SQLite3 returned ErrorBusy while attempting to perform step.)
so: user error (SQLite3 returned ErrorBusy while attempting to perform step.)
so: user error (SQLite3 returned ErrorBusy while attempting to perform step.)
so: user error (SQLite3 returned ErrorBusy while attempting to perform step.)
so: user error (SQLite3 returned ErrorConstraint while attempting to perform step.)
Thomas M. DuBuisson asked Jan 29 '12 17:01


1 Answer

Something worth noting is that SQLite has issues with locking when stored on NFS-like volumes (vboxsf, NFS, SMB, mvfs, etc.) on many systems which cause SQLite to give that error even before you've successfully opened the database. These volumes may implement fcntl() read/write locks incorrectly. ( http://www.sqlite.org/faq.html#q5 )

Assuming that's not the issue, it's also worth mentioning that SQLite doesn't really natively support concurrent writing "connections" ( http://www.sqlite.org/faq.html#q6 ): several connections can read at the same time, but it uses file system locks to ensure that two writes don't occur at the same time. (See section 3.0 of http://www.sqlite.org/lockingv3.html)

Assuming all of this is known, you may also check which version of sqlite3 is available in your environment, as some changes to the way in which different kinds of locks are acquired occurred in the 3.x series: http://www.sqlite.org/sharedcache.html

Edit: Some additional information from the persistent-sqlite package description: "This package includes a thin sqlite3 wrapper based on the direct-sqlite package, as well as the entire C library."

The phrase 'thin wrapper' made me take a look at the code to see just how thin it is. It doesn't look as if the persistent wrapper has any guard against a statement sent to the pool failing, other than the required guard that translates and emits the error and interrupts execution. I must add the caveat that I am not comfortable with Haskell.

It appears that you will either have to guard against a statement in the pool failing and reattempt it, or limit the pool size to 1 at initialization (which seems less than ideal).
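
To make the "guard and reattempt" option concrete, here is a minimal sketch of a retry wrapper. It uses only base and is not part of persistent: retryOnBusy, its parameters, and the simulated flaky action are hypothetical names made up for illustration. It also assumes the failure reaches you as an IOException whose text mentions ErrorBusy, as in the output shown in the question; other versions of the library may throw a different exception type, in which case the isBusy check would need to change.

```haskell
import Control.Concurrent (threadDelay)
import Control.Exception (IOException, throwIO, try)
import Data.IORef (modifyIORef', newIORef, readIORef)
import Data.List (isInfixOf)

-- | Hypothetical helper (not part of persistent): run an action and, if it
-- fails with an IOException mentioning "ErrorBusy", wait and try again,
-- up to 'retries' additional attempts. Any other failure is rethrown.
retryOnBusy :: Int -> Int -> IO a -> IO a
retryOnBusy retries delayMicros action = go retries
  where
    go n = do
      result <- try action
      case result of
        Right x -> return x
        Left e
          | n > 0 && isBusy e -> threadDelay delayMicros >> go (n - 1)
          | otherwise         -> throwIO e

    -- Assumption: the busy condition is only recognisable from the message.
    isBusy :: IOException -> Bool
    isBusy e = "ErrorBusy" `isInfixOf` show e

-- Simulated database call: fails twice with a busy error, then succeeds.
main :: IO ()
main = do
  attempts <- newIORef (0 :: Int)
  let flaky = do
        modifyIORef' attempts (+ 1)
        n <- readIORef attempts
        if n < 3
          then ioError (userError "SQLite3 returned ErrorBusy while attempting to perform step.")
          else return n
  result <- retryOnBusy 5 1000 flaky
  print result  -- prints 3: the third attempt succeeded
```

In the question's program, the idea would be to wrap each small transaction, for example something like retryOnBusy 10 50000 (runSqlPool action pool), rather than the whole forever loop, so that a retry re-runs only one short transaction. Note that retrying only helps with ErrorBusy; the ErrorConstraint in the output is a different failure and would still propagate.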

Tom B answered Oct 18 '22 00:10