 

SQLite vs serializing to disk

I'm doing a performance comparison to decide whether to serialize data or store it in a DB. The application receives a huge amount of data (x GB) that needs to be persisted at a minimum rate of 18 MB/s (as of now).

Storing in a DB offers easier functionality in terms of searching and accessing data at a later time, data snapshots, data migration, etc., but my tests so far show a huge difference in performance.

The test saves 1000 objects (of about seven-hundred-something KB each), either to their respective columns in a table or to disk by serializing them as a generic List. (The SQLite version ends up with a bit more data.)

  1. Saving to SQLite v3, total size 745 MB: 30.7 seconds (~24.3 MB/s)
  2. Serializing to disk, total size 741 MB: 0.33 seconds (~2245 MB/s)

I haven't done any performance tweaks to SQLite; I just use it out of the box with Fluent NHibernate and the SQLite.Data adapter (no transactions). Still, at first glance that is a huge difference.

Obviously I know that going through an ORM mapper and a DB to write to disk adds overhead compared to serializing, but that is still a lot.

Another consideration is that I want to persist the data right away as I receive it: if there is a power failure, I need the last data received.
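To make that durability requirement concrete, here is a minimal sketch using Python's built-in sqlite3 module purely for illustration (the question uses .NET; the table and helper names are made up). With SQLite's default PRAGMA synchronous=FULL, a committed transaction survives power loss, so committing each message individually is the safe-but-slow baseline:

```python
import sqlite3

def open_durable(path):
    # Hypothetical helper: open a connection tuned for durability over speed.
    conn = sqlite3.connect(path)
    # FULL (the default) fsyncs at each commit, so a committed row survives
    # a power failure; OFF trades that guarantee away for speed.
    conn.execute("PRAGMA synchronous=FULL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS msgs(id INTEGER PRIMARY KEY, body BLOB)"
    )
    conn.commit()
    return conn

def persist(conn, body):
    # One transaction per message: the slowest option, but each message is
    # safely on disk before the call returns.
    with conn:  # commits on success, rolls back on exception
        conn.execute("INSERT INTO msgs(body) VALUES (?)", (body,))
```

This is exactly the trade-off behind the numbers above: per-row commits give crash safety at the cost of one sync per row.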

Any thoughts?

----- Updates (as I continue to investigate solutions) ------

  • Wrapping the 1000 inserts in a transaction brings the time down to ~14 s = 53 MB/s; however, if I throw an exception halfway through, I lose all my data.
  • Using an IStatelessSession seems to improve the time by 0.5-1 s.
  • I didn't see any performance gain from assigning the ID on the entity instead of having it automatically assigned by the table (which would get rid of the (select row_generatedid()) for every insert SQL): Id(x => x.Id).GeneratedBy.Assigned();
  • The nosync() alternative in SQLite is not an option, as the DB might get corrupted in case of a power failure.
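One way out of the all-or-nothing problem in the first bullet is to commit in fixed-size batches: you keep most of the single-transaction speedup, but a failure can only lose the rows in the current, uncommitted batch. A sketch (again in Python's sqlite3 for illustration; the table name and batch size are placeholders):

```python
import sqlite3
from itertools import islice

def insert_in_batches(conn, rows, batch_size=100):
    # Commit every `batch_size` rows. An exception mid-stream rolls back
    # only the current batch; everything committed so far stays on disk.
    it = iter(rows)
    saved = 0
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return saved
        with conn:  # one transaction (and one sync) per batch
            conn.executemany("INSERT INTO t(a, b) VALUES (?, ?)", batch)
        saved += len(batch)
```

The batch size is the tuning knob: larger batches approach the ~14 s single-transaction figure, smaller ones bound how much data a crash can cost.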
Islandwind asked Oct 26 '11

2 Answers

I had a similar problem once and I suggest you go the SQLite route.

As for your performance issues, I'm pretty sure you'll get a very significant boost if you:

  1. execute all INSERTs in a single transaction - each write query must acquire (and release) a lock on the SQLite file, which is very expensive in terms of disk I/O, so you should notice a huge boost***
  2. consider using multi-row INSERTs (this probably won't work for you since you rely on an ORM)
  3. as @user896756 mentioned, you should also prepare your statements
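Point 1 can be illustrated with a minimal sketch (Python's built-in sqlite3 here rather than the asker's NHibernate stack; the table is the t1 from the benchmark below). The autocommit version pays one lock acquisition and one journal sync per row, while the wrapped version pays them once:

```python
import sqlite3

def insert_autocommit(conn, rows):
    # One transaction (lock + journal sync) per statement -- the slow path.
    for row in rows:
        conn.execute("INSERT INTO t1(a, b, c) VALUES (?, ?, ?)", row)
        conn.commit()

def insert_single_transaction(conn, rows):
    # One BEGIN/COMMIT around all rows -- one lock, one sync in total.
    with conn:
        conn.executemany("INSERT INTO t1(a, b, c) VALUES (?, ?, ?)", rows)
```

On a real on-disk database the second function is dramatically faster, which is what the Test 1 vs Test 2 numbers below show.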

Test 1: 1000 INSERTs

CREATE TABLE t1(a INTEGER, b INTEGER, c VARCHAR(100));
INSERT INTO t1 VALUES(1,13153,'thirteen thousand one hundred fifty three');
INSERT INTO t1 VALUES(2,75560,'seventy five thousand five hundred sixty');
... 995 lines omitted
INSERT INTO t1 VALUES(998,66289,'sixty six thousand two hundred eighty nine');
INSERT INTO t1 VALUES(999,24322,'twenty four thousand three hundred twenty two');
INSERT INTO t1 VALUES(1000,94142,'ninety four thousand one hundred forty two');
  • PostgreSQL: 4.373
  • MySQL: 0.114
  • SQLite 2.7.6: 13.061
  • SQLite 2.7.6 (nosync): 0.223

Test 2: 25000 INSERTs in a transaction

BEGIN;
CREATE TABLE t2(a INTEGER, b INTEGER, c VARCHAR(100));
INSERT INTO t2 VALUES(1,59672,'fifty nine thousand six hundred seventy two');
... 24997 lines omitted
INSERT INTO t2 VALUES(24999,89569,'eighty nine thousand five hundred sixty nine');
INSERT INTO t2 VALUES(25000,94666,'ninety four thousand six hundred sixty six');
COMMIT;
  • PostgreSQL: 4.900
  • MySQL: 2.184
  • SQLite 2.7.6: 0.914
  • SQLite 2.7.6 (nosync): 0.757

*** These benchmarks are for SQLite 2; SQLite 3 should be even faster.

Alix Axel answered Oct 23 '22


You should consider using compiled statements for SQLite.

Check this

On insert/update queries there is a huge performance boost: I managed to get 2x to 10x faster execution times using compiled statements, although going from 33 s to 0.3 s is a long way.

On the other hand, SQLite's execution speed also depends on the schema of the table you are using; for example, an index on a large table will slow down inserts.
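The compiled-statement gain can be sketched like this (Python's sqlite3 for illustration: it caches compiled statements keyed on the SQL text, while in .NET the analogue would be preparing the command once and rebinding parameters). The point is to keep the SQL text constant and bind parameters, instead of building a new statement string per row:

```python
import sqlite3

def insert_formatted(conn, rows):
    # Anti-pattern: every statement is distinct text, so SQLite must parse
    # and compile each one from scratch (and values are not safely escaped).
    with conn:
        for a, b in rows:
            conn.execute(f"INSERT INTO t(a, b) VALUES ({a}, {b})")

def insert_prepared(conn, rows):
    # One parameterized statement, compiled once and re-executed with fresh
    # bindings for every row.
    with conn:
        conn.executemany("INSERT INTO t(a, b) VALUES (?, ?)", rows)
```

Both produce the same rows; the prepared form skips the per-row parse/compile step, which is where the 2x-10x comes from.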

Adam Bartha answered Oct 23 '22