I'm running a Python program which uses the shelve module on top of pickle. After running this program, sometimes I get one output file, a.data, but at other times I get three output files: a.data.bak, a.data.dir and a.data.dat.
Why is that?
The shelve module implements persistent storage for arbitrary Python objects which can be pickled, using a dictionary-like API. The shelve module can be used as a simple persistent storage option for Python objects when a relational database is overkill. The shelf is accessed by keys, just as with a dictionary.
The shelve module in Python's standard library is a simple yet effective tool for persistent data storage when a relational database is not required. The shelf object defined in this module is a dictionary-like object that is stored persistently in a disk file.
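As a minimal sketch of that dictionary-like API, the following stores a picklable object under a key and reads it back after reopening the shelf. The helper name roundtrip and the key "key" are made up for illustration; the base name a.data is taken from the question, and a temporary directory is assumed so that nothing is left behind.

```python
import os
import shelve
import tempfile


def roundtrip(value):
    """Store a picklable value in a shelf, reopen the shelf, and return it."""
    with tempfile.TemporaryDirectory() as tmp:
        # Only a base name is given; the dbm backend decides the real file name(s).
        path = os.path.join(tmp, "a.data")
        with shelve.open(path) as shelf:
            shelf["key"] = value  # pickled and written to disk
        with shelve.open(path) as shelf:
            return shelf["key"]  # unpickled on access


if __name__ == "__main__":
    print(roundtrip({"x": [1, 2, 3]}))
```

Which files appear on disk for that base name depends on the backend that shelve ends up using, which is what the answer below explains.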
There is quite some indirection here. Follow me carefully.
The shelve module is implemented on top of the dbm module. This module acts as a facade for three (*) different specific DBM implementations, and it will pick the first module available when creating a new database, in the following order:
- dbm.gnu, the Python module for the GNU DBM library; you would use it directly if you needed the extra functionality it offers over the base dbm module (it lets you iterate over the keys in stored order and 'pack' the database to free up space from deleted objects).
- dbm.ndbm, a proxy module using either the ndbm, BSD DB or GNU DBM libraries (chosen when Python is compiled).
- dbm.dumb, a pure-Python implementation.

It is this range of choices that makes shelve files appear to grow extra extensions on different platforms.
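To see which of the backends listed above can be imported on a given platform, something like the following sketch can be used. The function name available_backends and the CANDIDATES tuple are illustrative; the tuple mirrors the order given in this answer, and newer Python releases may ship additional backends that are not listed here.

```python
import importlib

# Order as described in this answer; an assumption for illustration only.
CANDIDATES = ("dbm.gnu", "dbm.ndbm", "dbm.dumb")


def available_backends(names=CANDIDATES):
    """Return the backend module names that can be imported, in the given order."""
    found = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # The C library behind this backend was not available at build time.
            continue
        found.append(name)
    return found


if __name__ == "__main__":
    print(available_backends())
```

If only dbm.dumb is reported, a new shelf created on that interpreter will most likely use the pure-Python format.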
The dbm.dumb module is the one that adds the .bak, .dat and .dir extensions:
Open a dumbdbm database and return a dumbdbm object. The filename argument is the basename of the database file (without any specific extensions). When a dumbdbm database is created, files with .dat and .dir extensions are created.
The .dir file is moved to .bak as new index dicts are committed for changes made to the data structures (when adding a new key, deleting a key, or by calling .sync() or .close()).
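The behaviour can be reproduced by opening a dbm.dumb database directly and listing what ends up on disk. This is a sketch under the assumption of a throwaway temporary directory; dumb_files is a made-up helper name, and whether a .bak file shows up next to .dat and .dir depends on when the index was last committed.

```python
import dbm.dumb
import os
import tempfile


def dumb_files():
    """Create a dbm.dumb database and return the file names it produced."""
    with tempfile.TemporaryDirectory() as tmp:
        base = os.path.join(tmp, "a.data")  # base name only, no extension
        db = dbm.dumb.open(base, "c")       # "c": create if it does not exist
        db[b"key"] = b"value"               # adding a key updates the index
        db.close()                          # closing commits the index
        return sorted(os.listdir(tmp))


if __name__ == "__main__":
    print(dumb_files())
```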
Getting those three files means that the other two options for dbm (dbm.gnu and dbm.ndbm) are not available on your platform.
The other formats may give you other extensions. The dbm.ndbm module may use .dir, .pag or .db, depending on which library that module was built against.
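For an existing database, the standard library's dbm.whichdb() function can report which backend a given base name belongs to. The sketch below creates a dbm.dumb database on purpose so that the result is predictable; detect_backend is a hypothetical helper name, and in practice you would pass your own shelf's base name (a.data in the question) to dbm.whichdb().

```python
import dbm
import dbm.dumb
import os
import tempfile


def detect_backend():
    """Create a dbm.dumb database and ask dbm.whichdb() to identify it."""
    with tempfile.TemporaryDirectory() as tmp:
        base = os.path.join(tmp, "a.data")
        db = dbm.dumb.open(base, "c")
        db[b"key"] = b"value"
        db.close()
        # whichdb() takes the same base name that was used to open the database.
        return dbm.whichdb(base)


if __name__ == "__main__":
    print(detect_backend())
```

dbm.whichdb() returns None when no database exists for the name, and an empty string when files are present but the format cannot be determined.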
(*) Python 2 had four dbm modules; it would default to the deprecated dbhash module, which in turn was built on top of the bsddb module. Both were removed in Python 3.