I have been using Python's pickle module for implementing a thin file-based persistence layer. The persistence layer (part of a larger library) relies heavily on pickle's persistent_id feature to save objects of specified classes as separate files.
The only issue with this approach is that pickle files are not human editable, and I'd much rather have objects saved in a format that is human readable and editable with a text editor (e.g., YAML or JSON).
Do you know of any library that uses a human-editable format and offers features similar to pickle's persistent_id? Alternatively, do you have suggestions for implementing them on top of a YAML- or JSON-based serialization library, without rewriting a large subset of pickle?
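For context, the persistent_id mechanism mentioned in the question can be sketched roughly as follows. This is only an illustration: the Document class, the STORE dict (which stands in for "one file per object"), and the dumps/loads helpers are hypothetical names, not part of the asker's library.

```python
import io
import pickle

# Hypothetical stand-in for "one file per object": maps a name to stored text.
STORE = {}


class Document:
    def __init__(self, name, text):
        self.name = name
        self.text = text


class RefPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Documents are stored externally; only a reference goes in the pickle.
        if isinstance(obj, Document):
            STORE[obj.name] = obj.text
            return ("Document", obj.name)
        return None  # everything else is pickled normally


class RefUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        kind, name = pid
        if kind != "Document":
            raise pickle.UnpicklingError("unsupported persistent id")
        return Document(name, STORE[name])


def dumps(obj):
    buf = io.BytesIO()
    RefPickler(buf).dump(obj)
    return buf.getvalue()


def loads(data):
    return RefUnpickler(io.BytesIO(data)).load()


payload = dumps({"title": "bundle", "doc": Document("readme", "hello")})
restored = loads(payload)
```

The pickle itself carries only the ("Document", "readme") reference; the document text lives in the external store, which is the behavior a human-editable replacement would need to reproduce.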
I haven't tried this yet myself, but I think you should be able to do this elegantly with PyYAML using what they call "representers" and "resolvers".
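As a rough illustration of the idea, here is a minimal sketch of a PyYAML representer paired with a constructor (the load-side hook for a custom tag). The Point class, the !point tag, and the PointDumper/PointLoader subclasses are made up for this example; only add_representer, add_constructor, represent_mapping and construct_mapping are PyYAML calls.

```python
import yaml


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


def represent_point(dumper, point):
    # Emit the object as a mapping carrying a custom tag.
    return dumper.represent_mapping("!point", {"x": point.x, "y": point.y})


def construct_point(loader, node):
    # Rebuild the object from the tagged mapping.
    fields = loader.construct_mapping(node)
    return Point(fields["x"], fields["y"])


# Subclasses keep the custom hooks from leaking into PyYAML's global classes.
class PointDumper(yaml.SafeDumper):
    pass


class PointLoader(yaml.SafeLoader):
    pass


PointDumper.add_representer(Point, represent_point)
PointLoader.add_constructor("!point", construct_point)

text = yaml.dump({"origin": Point(3, 4)}, Dumper=PointDumper)
restored = yaml.load(text, Loader=PointLoader)
```

The same two hooks are enough to replace an object with a reference on dump and to resolve that reference on load.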
EDIT
After an extensive exchange of comments with the poster, here is a method to achieve the required behavior with PyYAML.
Important Note: If a Persistable instance has another such instance as an attribute, or contained somehow inside one of its attributes, then the contained Persistable instance will not be saved to yet another separate file; rather, it will be saved inline in the same file as the parent Persistable instance. To the best of my understanding, this limitation also existed in the OP's pickle-based system, and may be acceptable for his/her use cases. I haven't found an elegant solution for this which doesn't involve hacking yaml.representer.BaseRepresenter.
import yaml
from functools import partial


class Persistable(object):
    # simulate a unique id
    _unique = 0

    def __init__(self, *args, **kw):
        Persistable._unique += 1
        self.persistent_id = ("%s.%d" %
                              (self.__class__.__name__, Persistable._unique))


def persistable_representer(dumper, data):
    # Write the object to its own file, using the default Dumper so that
    # this representer is not triggered again, then emit only a reference.
    xref = data.persistent_id
    print("Writing to file: %s" % xref)
    with open(xref, 'w') as outfile:
        outfile.write(yaml.dump(data))
    return dumper.represent_scalar('!xref', '%s' % xref)


class PersistingDumper(yaml.Dumper):
    pass


# add_multi_representer also matches subclasses such as Foo and Bar below;
# add_representer would match only direct instances of Persistable.
PersistingDumper.add_multi_representer(Persistable, persistable_representer)
my_yaml_dump = partial(yaml.dump, Dumper=PersistingDumper)


def persistable_constructor(loader, node):
    # Resolve an !xref scalar by loading the file it names.
    xref = loader.construct_scalar(node)
    print("Reading from file: %s" % xref)
    with open(xref, 'r') as infile:
        value = yaml.load(infile.read(), Loader=yaml.Loader)
    return value


yaml.add_constructor('!xref', persistable_constructor)


# example use, also serves as a test
class Foo(Persistable):
    def __init__(self):
        self.one = 1
        Persistable.__init__(self)


class Bar(Persistable):
    def __init__(self, foo):
        self.foo = foo
        Persistable.__init__(self)


foo = Foo()
bar = Bar(foo)

print("=== foo ===")
dumped_foo = my_yaml_dump(foo)
print(dumped_foo)
print(yaml.load(dumped_foo, Loader=yaml.Loader))
print(yaml.load(dumped_foo, Loader=yaml.Loader).one)

print("=== bar ===")
dumped_bar = my_yaml_dump(bar)
print(dumped_bar)
print(yaml.load(dumped_bar, Loader=yaml.Loader))
print(yaml.load(dumped_bar, Loader=yaml.Loader).foo)
print(yaml.load(dumped_bar, Loader=yaml.Loader).foo.one)

baz = Bar(Persistable())
print("=== baz ===")
dumped_baz = my_yaml_dump(baz)
print(dumped_baz)
print(yaml.load(dumped_baz, Loader=yaml.Loader))
From now on use my_yaml_dump instead of yaml.dump when you want to save instances of the Persistable class to separate files. But don't use it inside persistable_representer and persistable_constructor! No special loading function is necessary, just use yaml.load (PyYAML 6 and later require an explicit Loader argument, e.g. Loader=yaml.Loader).
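Regarding the limitation in the Important Note (nested Persistable instances being saved inline), one possible workaround is sketched below. It is not the method above: it changes the on-disk format to a plain mapping of class name and attributes, and it assumes a hypothetical CLASSES registry, a SAVE_DIR temporary directory, and example classes Leaf and Branch. Because the attributes are dumped with the same dumper class, nested instances recursively get their own files.

```python
import os
import tempfile
import yaml

SAVE_DIR = tempfile.mkdtemp()  # hypothetical storage location


class Persistable:
    _unique = 0

    def __init__(self):
        Persistable._unique += 1
        self.persistent_id = "%s.%d" % (type(self).__name__,
                                        Persistable._unique)


class Leaf(Persistable):
    def __init__(self, value):
        super().__init__()
        self.value = value


class Branch(Persistable):
    def __init__(self, child):
        super().__init__()
        self.child = child


CLASSES = {"Leaf": Leaf, "Branch": Branch}  # hypothetical class registry


class NestedDumper(yaml.Dumper):
    pass


class NestedLoader(yaml.SafeLoader):
    pass


def represent_persistable(dumper, data):
    # Dump the attributes with NestedDumper itself, so Persistable values
    # inside them are written to their own files and replaced by !xref.
    record = {"class": type(data).__name__, "state": vars(data)}
    with open(os.path.join(SAVE_DIR, data.persistent_id), "w") as outfile:
        outfile.write(yaml.dump(record, Dumper=NestedDumper))
    return dumper.represent_scalar("!xref", data.persistent_id)


def construct_xref(loader, node):
    name = loader.construct_scalar(node)
    with open(os.path.join(SAVE_DIR, name)) as infile:
        record = yaml.load(infile, Loader=NestedLoader)
    cls = CLASSES[record["class"]]
    obj = cls.__new__(cls)  # bypass __init__, restore attributes directly
    obj.__dict__.update(record["state"])
    return obj


NestedDumper.add_multi_representer(Persistable, represent_persistable)
NestedLoader.add_constructor("!xref", construct_xref)

tree = Branch(Leaf(42))
text = yaml.dump(tree, Dumper=NestedDumper)
restored = yaml.load(text, Loader=NestedLoader)
```

The trade-off is that every persistable class has to be registered by name, and shared or cyclic references between objects are not handled by this sketch.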
Phew, that took some work... I hope this helps!