I'm looking for a fast (xml is way too slow) serialization method that could be used in both Perl and Python.
Unfortunately, I can't use JSON (and many others) because it always changes type of a dict key from integer to a string. I need serialization/deserialization that preserves key type.
Python:
>>> import json
>>> dict_before = {1:'one', 20: 'twenty'}
>>> data = json.dumps(dict_before)
>>> dict_after = json.loads(data)
>>> dict_before
{1: 'one', 20: 'twenty'} #integer keys
>>> dict_after
{u'1': u'one', u'20': u'twenty'} #string keys
Any suggestions are welcome.
You can use yaml.
>>> import yaml
>>> dict_before = {1:'one', 20: 'twenty'}
>>> data = yaml.safe_dump(dict_before)
>>> dict_after = yaml.safe_load(data)
>>> dict_after
{1: 'one', 20: 'twenty'}
I had similar problem. I wanted to share a configuration file in Perl and Python and I had to use yaml.
You can install yaml module in python with:
pip install PyYAML
Although, integer keys will be converted to string in perl => Legal values for Perl hash key
Mu. You have started with the wrong premise.
Perl does not have a meaningful type system that distinguishes between numbers and strings. Any given value can be both. It can't be determined using only the Perl language whether a given value is considered to be a number only (although you can use modules like Devel::Peek
). It is utterly impossible to know what type a given value originally was.
my $x = 1; # an integer (IV), right?
say "x = $x"; # not any more! It's a PVIV now (string and integer)
Furthermore, in a hash map (“dictionary”), the key type is always coerced to a string. In arrays, the key is always coerced to an integer. Other types can only be faked.
This is wonderful when parsing text, but of course introduces endless pain when serializing a data structure. JSON maps perfectly to Perl data structures, so I suggest you stick to that (or YAML, as it is a superset of JSON) to protect yourself from the delusion that a serialization could infer information that isn't possibly there.
What do we take from this?
If interop is important, refrain from using creative dictionary types in Python.
You can always encode type information in the serialization should it really be important (hint: it probably isn't): {"type":"interger dict", "data":{"1":"foo","2":"bar"}}
It would also be premature to dismiss XML as too slow. See this recent article, although I disagree with the methods, and it restricts itself to JS (last week's HN thread for perspective).
If it is native, it will probably be fast enough, so obviously don't use any pure-Perl or pure-Python implementations. This also holds for JSON- and YAML- and whatnot -parsers.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With