
PouchDB: database size

I am storing ~2500 images as attachments in the same document in CouchDB. These images occupy roughly 15 MB on disk, and the resulting CouchDB database is roughly 17 MB.

When I replicate this document to my client through PouchDB, the resulting database is over 40 MB. I ran some tests following these steps:

  1. Upload X image attachments to a CouchDB document.

  2. Compact CouchDB.

  3. Fully clear the client cache.

  4. Reload the client (in my app I replicate data on reload).

This is the result:

Attached files | Size on disk (KB) | Inc (KB) | Size in IndexedDB (KB) | Inc (KB)
17             | 129               |          | 207                    |
27             | 168.2             | 39.2     | 267                    | 60
37             | 219.6             | 51.4     | 335                    | 68
47             | 275.5             | 55.9     | 414                    | 79
57             | 327.7             | 52.2     | 493                    | 79
67             | 384.9             | 57.2     | 579                    | 86
77             | 428.5             | 43.6     | 654                    | 75

So, it seems that:

  1. PouchDB adds roughly 2 KB of control data to each attachment.

  2. This per-attachment overhead grows as more attachments are added (1.6K -> 2.3K -> 2.6K -> 2.8K...).
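
The per-attachment figures above can be reproduced from the table with a small script. This is only a sketch: the `rows` array is transcribed by hand from the table, and `overheadPerAttachment` is a hypothetical helper name, not part of PouchDB. It also prints the ratio between the two increments, since a ratio that stays roughly constant would point to overhead proportional to file size rather than a fixed cost per file.

```javascript
// Measurements transcribed from the table above (sizes in KB).
const rows = [
  { files: 17, diskKB: 129.0, idbKB: 207 },
  { files: 27, diskKB: 168.2, idbKB: 267 },
  { files: 37, diskKB: 219.6, idbKB: 335 },
  { files: 47, diskKB: 275.5, idbKB: 414 },
  { files: 57, diskKB: 327.7, idbKB: 493 },
  { files: 67, diskKB: 384.9, idbKB: 579 },
  { files: 77, diskKB: 428.5, idbKB: 654 },
];

// For each pair of consecutive rows, compare how much the IndexedDB size
// grew with how much the files on disk grew.
function overheadPerAttachment(rows) {
  const result = [];
  for (let i = 1; i < rows.length; i++) {
    const addedFiles = rows[i].files - rows[i - 1].files;
    const diskInc = rows[i].diskKB - rows[i - 1].diskKB;
    const idbInc = rows[i].idbKB - rows[i - 1].idbKB;
    result.push({
      files: rows[i].files,
      // Extra KB used in IndexedDB per added file, beyond the file itself.
      extraKBPerFile: (idbInc - diskInc) / addedFiles,
      // How many KB IndexedDB grew for each KB added on disk.
      growthRatio: idbInc / diskInc,
    });
  }
  return result;
}

for (const r of overheadPerAttachment(rows)) {
  console.log(r.files, r.extraKBPerFile.toFixed(2), r.growthRatio.toFixed(2));
}
```

With these numbers the extra space comes out at roughly 1.7 to 3.1 KB per added file, and the growth ratio stays between about 1.3 and 1.7.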

The images have content_type: image/png in both CouchDB and PouchDB. My understanding is that this should prevent them from being stored as base64. Am I correct?

Has anyone seen this before? Has anyone been able to work around it? This is a big problem when trying to fit an app within the iOS 50 MB storage limit.

EDIT

I went ahead and checked the size of some images in PouchDB against the original files:

  1. File 1: orig size = 7.4K / PouchDB size = 10.2K

  2. File 2: orig size = 5.1K / PouchDB size = 6.8K

So I think the size increase when storing attachments in PouchDB does not come from control data (or at least that part is negligible) but from the way the binary file is stored in the browser's IndexedDB (I am using Chrome for these measurements).

So, is there anything else that needs to be done to avoid this increase in binary size in PouchDB?

Asked by tup on May 18 '14 at 16:05



1 Answer

A very good question, but the answer depends on which adapter you're using, which is not clear from the description.

Edit: just realized you did say Chrome, but I'm keeping the original answer for posterity :)

  • In Node.js we use LevelDB via LevelUP, which stores the binary data directly on disk.
  • In Safari/iOS we use WebSQL, which stores binary blobs, so again no overhead.
  • In everything else we use IndexedDB, which takes in Blob objects at the API level in everything but Chrome, since Chrome doesn't support that yet (there is an open Chromium issue for it).

I'm going to guess you're testing in Chrome. The reason you're seeing the extra storage is that we have to store everything as base64 there as a workaround (see the PouchDB source).
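
To put numbers on that workaround: base64 represents every 3 bytes as 4 ASCII characters, so the encoded data is about 4/3 the size of the original. The Node.js sketch below checks that ratio against the two files measured in the question's edit. The `base64Length` helper is made up for illustration, and the zero-filled buffer is just a stand-in for a real PNG, since only the length matters here.

```javascript
// Number of characters in the padded base64 encoding of `byteLength` bytes:
// every group of up to 3 input bytes becomes 4 output characters.
function base64Length(byteLength) {
  return 4 * Math.ceil(byteLength / 3);
}

// Cross-check the formula against Node's built-in encoder.
const sample = Buffer.alloc(7400); // stand-in for a ~7.4K image
console.log(sample.toString('base64').length === base64Length(sample.length));

// The two files from the question (sizes in KB, as reported there).
const measured = [
  { origKB: 7.4, pouchKB: 10.2 },
  { origKB: 5.1, pouchKB: 6.8 },
];

for (const f of measured) {
  const predictedKB = (f.origKB * 4) / 3;
  console.log(
    `orig ${f.origKB}K, base64 estimate ${predictedKB.toFixed(2)}K, measured ${f.pouchKB}K`
  );
}
```

The estimates come out at about 9.9K and 6.8K, close to the 10.2K and 6.8K reported in the question, which fits the base64 explanation.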

On the bright side, that Chromium bug is pretty active (last comment was 48 hours ago), so presumably the Chrome team is on it and will publish the fix shortly. When they do, PouchDB will automatically detect that blob support is available and start using it.

Update: the Chrome team fixed this, and Blobs are supported in PouchDB as of Chrome v43. :)

Answered by nlawson on Oct 13 '22 at 12:10