Save results in YAML file with python

Tags: python, yaml, numpy

I'm stuck on this small problem after an hour of searching through previous answers. I want to store matrices from my code in a .yaml file.

What I obtain from my code

Matrix
[[  1.00665266e+03   0.00000000e+00   5.08285432e+02]
 [  0.00000000e+00   1.01086937e+03   3.45995536e+02]
 [  0.00000000e+00   0.00000000e+00   1.00000000e+00]]

How I tried to save this matrix (mtx is the shorter name in my code)

import yaml

fname = "calibrationC300.yaml"

# mtx is the 3x3 numpy array printed above
data = dict(
    Matrix=mtx,
)

with open(fname, "w") as f:
    yaml.dump(data, f, default_flow_style=False)

But what I read in my YAML file looks totally wrong (is it just a bad conversion?):

Matrix: !!python/object/apply:numpy.core.multiarray._reconstruct
  args:
  - &id001 !!python/name:numpy.ndarray ''
  - !!python/tuple [0]
  - b
  state: !!python/tuple
  - 1
  - !!python/tuple [3, 3]
  - !!python/object/apply:numpy.dtype
    args: [f8, 0, 1]
    state: !!python/tuple [3, <, null, null, null, -1, -1, 0]
  - false
  - !!binary |
    cWM87e1YkEAAAAAAAAAAAIUEEyb5SH1AAAAAAAAAAACp/Z3yc2qQQFv0vPqb5nZAAAAAAAAAAAAA
    AAAAAAAAAAAAAAAAAPA/

This is the first time I have used YAML files. What am I doing wrong? Is there a way to get the matrix into the YAML file in the simple form (as I obtain it from the code)? Thank you in advance.

asked Feb 05 '23 16:02 by marcoresk


2 Answers

The difference is between float and numpy.float64. YAML uses a more elaborate representation for numpy.float64. You can convert to float if you prefer more readable YAML. See the following example:

import numpy
import yaml

print(yaml.dump({'test': 1, 'data': float(0.2)}, default_flow_style=False))
print(yaml.dump({'test': 2, 'data': numpy.float64(0.2)}, default_flow_style=False))

The output is:

data: 0.2
test: 1

data: !!python/object/apply:numpy.core.multiarray.scalar
- !!python/object/apply:numpy.dtype
  args:
  - f8
  - 0
  - 1
  state: !!python/tuple
  - 3
  - <
  - null
  - null
  - null
  - -1
  - -1
  - 0
- !!binary |
  mpmZmZmZyT8=
test: 2
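
To see what the !!binary entry in that dump actually holds, here is a minimal sketch using only the standard library. It assumes the payload is the raw 8 bytes of a little-endian float64, which is what the f8 and < entries in the dump suggest; the variable names are illustrative.

```python
import base64
import struct

# The !!binary value from the numpy.float64 dump above.
payload = "mpmZmZmZyT8="

# Base64-decode to get the raw bytes stored for the scalar.
raw = base64.b64decode(payload)

# "<d" means little-endian 8-byte float, matching f8 and "<" in the dump.
(value,) = struct.unpack("<d", raw)

print(len(raw), value)  # prints: 8 0.2
```

So the dump is not a bad conversion: the number is still there, just stored as bytes together with the information needed to rebuild the numpy type.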
answered Feb 08 '23 15:02 by Tao Cheng


The only thing wrong here seems to be your expectation of how numpy internals can and should be dumped to YAML.

An easy way to check that the YAML you got is correct is to load what you dumped:

import sys
import ruamel.yaml
import numpy
import pprint

mtx = [[1.00665266e+03, 0.00000000e+00, 5.08285432e+02],
       [0.00000000e+00, 1.01086937e+03, 3.45995536e+02],
       [0.00000000e+00, 0.00000000e+00, 1.00000000e+00],]

data = dict(Matrix=mtx)

yaml_str = ruamel.yaml.dump(data, default_flow_style=False)
data = ruamel.yaml.load(yaml_str)
print(data)

which gives:

{'Matrix': [[1006.65266, 0.0, 508.285432], [0.0, 1010.86937, 345.995536], [0.0, 0.0, 1.0]]}

The special types that numpy uses are not dumped as simple (and readable) YAML, because there is no guarantee that such output could be reloaded as the original type. It might be possible for some constructs, although it easily leads to ambiguity, and AFAIK this simplification is not done for any of the numpy types.
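
If readable output matters more than restoring the exact numpy type, one option is to opt in to the simplification yourself. The sketch below assumes PyYAML (the yaml module from the question) rather than ruamel.yaml; ArrayDumper and represent_ndarray are hypothetical names chosen for illustration, not part of either library.

```python
import numpy as np
import yaml


class ArrayDumper(yaml.SafeDumper):
    """Hypothetical dumper that writes ndarrays as plain YAML sequences."""


def represent_ndarray(dumper, array):
    # tolist() yields nested lists of built-in Python numbers,
    # which the safe dumper already knows how to write.
    return dumper.represent_list(array.tolist())


# Register only on the subclass, so the default dumper is left unchanged.
ArrayDumper.add_representer(np.ndarray, represent_ndarray)

mtx = np.array([[1006.65266, 0.0, 508.285432],
                [0.0, 1010.86937, 345.995536],
                [0.0, 0.0, 1.0]])

text = yaml.dump({"Matrix": mtx}, Dumper=ArrayDumper, default_flow_style=False)
print(text)
```

The trade-off is the ambiguity mentioned above: the output no longer records that the value was an array, so loading it gives nested lists.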

Of course you can dump that YAML without having numpy supply its restore information, by doing:

ruamel.yaml.round_trip_dump(data, sys.stdout)

which gives:

Matrix:
- - 1006.65266
  - 0.0
  - 508.285432
- - 0.0
  - 1010.86937
  - 345.995536
- - 0.0
  - 0.0
  - 1.0

This is much more readable, but it is not something that will ever become a numpy.ndarray automatically when you load() it again from its YAML representation.
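
If you do want the array back, the conversion has to be done by hand on both sides. A minimal sketch, assuming PyYAML and a numpy array named mtx as in the question:

```python
import numpy as np
import yaml

mtx = np.array([[1006.65266, 0.0, 508.285432],
                [0.0, 1010.86937, 345.995536],
                [0.0, 0.0, 1.0]])

# tolist() turns the ndarray into nested lists of built-in floats,
# so the dump contains no numpy-specific tags.
text = yaml.safe_dump({"Matrix": mtx.tolist()}, default_flow_style=False)

# Loading returns nested lists; rebuilding the array is a manual step.
loaded = yaml.safe_load(text)
restored = np.array(loaded["Matrix"])

print(type(loaded["Matrix"]).__name__, restored.shape)
```

Writing text to a file instead of keeping it in a string works the same way: pass the open file object as the second argument to yaml.safe_dump.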

answered Feb 08 '23 14:02 by Anthon