Unexpectedly large Realm file size

This question is about two different ways of inserting objects into a Realm. I noticed that the first method is a lot faster, but the resulting file is huge compared with the second method. The difference between the two approaches is whether the write transaction sits outside or inside the for loop.

// Create realm file
let realm = try! Realm(fileURL: banco_url!)

When I add objects like this, the Realm file grows to 75.5MB:

try! realm.write {
    for i in 1...40000 {
        let new_realm_obj = realm_obj(value: ["id" : incrementID(),
                                              "a": "123",
                                              "b": 12.12,
                                              "c": 66,
                                              "d": 13.13,
                                              "e": 0.6,
                                              "f": "01100110",
                                              "g": DateTime,
                                              "h": 3])

        realm.add(new_realm_obj)
        print("🔹 \(i) Added")
    }
}

When I add objects like this, the Realm file only grows to 5.5MB:

for i in 1...40000 {
    let new_realm_obj = realm_obj(value: ["id" : incrementID(),
                                          "a": "123",
                                          "b": 12.12,
                                          "c": 66,
                                          "d": 13.13,
                                          "e": 0.6,
                                          "f": "01100110",
                                          "g": DateTime,
                                          "h": 3])
    try! realm.write {
        realm.add(new_realm_obj)
        print("🔹 \(i) Added")
    }
}

My class to add to the Realm file:

class realm_obj: Object {
    dynamic var id = Int()
    dynamic var a = ""
    dynamic var b = 0.0
    dynamic var c = Int8()
    dynamic var d = 0.0
    dynamic var e = 0.0
    dynamic var f = ""
    dynamic var g = Date()
    dynamic var h = Int8()
}

Auto increment function

func incrementID() -> Int {
    let realm = try! Realm(fileURL: banco_url!)
    return (realm.objects(realm_obj.self).max(ofProperty: "id") as Int? ?? 0) + 1
}

Is there a better or correct way to do this? Why do I get such different file sizes in these cases?

Cilas asked Sep 14 '17 20:09

2 Answers

The large file size when adding all of the objects in a single transaction is due to an unfortunate interaction between Realm's transaction log subsystem and Realm's memory allocation algorithm for large blobs. Realm's memory layout algorithm requires that the file size be at least 8x the size of the largest single blob stored in the Realm file. Transaction log entries, summarizing the modifications made during a single transaction, are stored as blobs within the Realm file.

When you add 40,000 objects in one transaction, you end up with a single transaction log entry that's around 5MB in size. This means that the file has to be at least 40MB in size in order to store it. (I'm not quite sure how it ends up being nearly twice that size again. It might be that the blob size is rounded up to a power of two somewhere along the line…)
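
The arithmetic behind that lower bound can be sketched as follows. This only illustrates the 8x rule described above and is not Realm's actual allocator; the helper `minimumFileSize(forLargestBlob:)` and the constant `blobSizeMultiplier` are hypothetical names, and the 5MB and 100-byte figures are the approximate log entry sizes quoted in this answer.

```swift
// Hypothetical helper illustrating the rule described above: the file must be
// at least 8x the size of the largest single blob it stores.
// This is NOT Realm's allocator, only the arithmetic from the answer.
let blobSizeMultiplier = 8

func minimumFileSize(forLargestBlob blobBytes: Int) -> Int {
    return blobBytes * blobSizeMultiplier
}

let megabyte = 1_000_000

// One transaction for all 40,000 objects: a single log entry of roughly 5MB.
let singleTransaction = minimumFileSize(forLargestBlob: 5 * megabyte)
print(singleTransaction / megabyte) // lower bound in MB

// One transaction per object: each log entry is roughly 100 bytes.
let perObjectTransaction = minimumFileSize(forLargestBlob: 100)
print(perObjectTransaction) // lower bound in bytes
```

The lower bound scales with the largest log entry, not with the amount of data stored, which is why the commit pattern matters so much here.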

When you add one object in each of 40,000 transactions, you still end up with a single transaction log entry, only this time it's just a hundred or so bytes in size. This happens because when Realm commits a transaction, it first attempts to reclaim unused transaction log entries before allocating space for new ones. Since the Realm file is not open elsewhere, the previous entry can be reclaimed as each new commit is performed.

realm/realm-core#2343 tracks improving how Realm stores transaction log entries to avoid the significant overallocation you're seeing.

For now my suggestion would be to split the difference between the two approaches and add groups of objects per write transaction. This will trade off a little performance by increasing the number of commits but will reduce the impact of the memory layout algorithm by reducing the size of the largest transaction log entry you create. From a quick test, committing every 2,000 objects results in a file size of around 4MB, while being significantly quicker than adding each object in a separate write transaction.
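
A minimal sketch of that batching approach, assuming the `realm` instance and the `realm_obj` class from the question. The `chunked(_:size:)` helper is a hypothetical name introduced for illustration, and the Realm calls appear only in comments so that the helper itself runs as plain Swift.

```swift
// Hypothetical helper: split an array into consecutive batches of at most
// `size` elements each.
func chunked<T>(_ items: [T], size: Int) -> [[T]] {
    precondition(size > 0, "batch size must be positive")
    return stride(from: 0, to: items.count, by: size).map { start in
        Array(items[start..<min(start + size, items.count)])
    }
}

// Stand-in data; in the question these would be unmanaged realm_obj instances.
let ids = Array(1...40000)
let batches = chunked(ids, size: 2000)
print(batches.count)

// With Realm, each batch would get its own write transaction, for example:
//
// for batch in chunked(objects, size: 2000) {
//     try! realm.write {
//         realm.add(batch)
//     }
// }
```

The batch size of 2,000 is the figure from the quick test mentioned above; other sizes trade commit overhead against the size of the largest transaction log entry.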

bdash answered Oct 26 '22 06:10

In most cases you should try to minimize the number of write transactions. A write transaction has significant overhead, so if you start a new one for every object you add to Realm, your code will be significantly slower than if you added all the objects in a single write transaction.

In my experience, the best way to add several elements to Realm is to create the elements, append them to an array, and then add the whole array to Realm in a single write transaction.

So this is what you should be doing:

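// Read the current maximum id once. Unmanaged objects are not visible to
// incrementID() until they are added, so calling it per object would
// return the same id every time.
var nextID = incrementID()
var objects = [realm_obj]()
for _ in 1...40000 {
    let newRealmObj = realm_obj(value: ["id": nextID, "a": "123", "b": 12.12, "c": 66, "d": 13.13, "e": 0.6, "f": "01100110", "g": DateTime, "h": 3])
    objects.append(newRealmObj)
    nextID += 1
}
try! realm.write {
    realm.add(objects)
}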

As for the size issue, see the Limitations - File Size part of the Realm documentation. I am not 100% sure of the cause, but I suspect it comes from running code inside the write transaction that doesn't need to be there. My guess is that Realm then creates a lot of intermediate versions of your objects, and since releasing reserved storage is quite an expensive operation, it hasn't happened by the time you check the file size.

Keep in mind that creating objects doesn't need to happen inside a write transaction. You only need a write transaction to modify persisted data in Realm, which includes adding new objects, deleting persisted objects, and modifying persisted objects directly.

Dávid Pásztor answered Oct 26 '22 07:10