
Parsing puppet-api yaml with python

I am creating a script that needs to parse the YAML output that Puppet produces.

When I make a request against, for example, https://puppet:8140/production/catalog/my.testserver.no, I get some YAML back that looks something like:

--- &id001 !ruby/object:Puppet::Resource::Catalog
  aliases: {}
  applying: false
  classes: 
    - s_baseconfig
    ...
  edges: 
    - &id111 !ruby/object:Puppet::Relationship
      source: &id047 !ruby/object:Puppet::Resource
        catalog: *id001
        exported: 

and so on... The problem is that when I call yaml.load(yamlstream), I get an error like:

yaml.constructor.ConstructorError: could not determine a constructor for the tag '!ruby/object:Puppet::Resource::Catalog'
 in "<string>", line 1, column 5:
   --- &id001 !ruby/object:Puppet::Reso ... 
       ^

As far as I know, this &id001 part is supported in yaml.

Is there any way around this? Can I tell the YAML parser to ignore these tags? I only need a couple of lines from the YAML stream, so maybe regex is my friend here? Has anyone written YAML cleanup regexes before?

You can get the yaml output with curl like:

curl --cert /var/lib/puppet/ssl/certs/$(hostname).pem --key /var/lib/puppet/ssl/private_keys/$(hostname).pem --cacert /var/lib/puppet/ssl/certs/ca.pem -H 'Accept: yaml' https://puppet:8140/production/catalog/$(hostname)

I also found some info about this on the Puppet mailing list at http://www.mail-archive.com/[email protected]/msg24143.html, but I can't get it to work correctly.

asked Dec 02 '11 14:12 by xeor


2 Answers

I emailed Kirill Simonov, the creator of PyYAML, for help parsing a Puppet YAML file.

He gladly helped with the following code. It was written for parsing a Puppet log, but I'm sure you can modify it to parse other Puppet YAML files.

The idea is to register constructors for the Ruby object tags, so that PyYAML can then read the data.

Here goes:

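#!/usr/bin/env python3

import yaml

def construct_ruby_object(loader, suffix, node):
    # Treat any !ruby/object:<Class> node as a plain mapping.
    return loader.construct_yaml_map(node)

def construct_ruby_sym(loader, node):
    # Treat Ruby symbols as plain strings.
    return loader.construct_yaml_str(node)

# Without a Loader argument these register on PyYAML's default (non-safe) loaders.
yaml.add_multi_constructor("!ruby/object:", construct_ruby_object)
yaml.add_constructor("!ruby/sym", construct_ruby_sym)

# yaml.Loader can build arbitrary objects, so only use it on trusted input.
with open('201203130939.yaml', 'r') as stream:
    mydata = yaml.load(stream, Loader=yaml.Loader)
print(mydata)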
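
As a quick check of the same idea against the catalog snippet from the question, here is a minimal sketch. It assumes a reasonably recent PyYAML; the `PuppetLoader` class and the `SAMPLE` string are illustrative names that are not part of Puppet or PyYAML. The constructors are registered on a `SafeLoader` subclass so they stay local to this loader.

```python
import yaml


class PuppetLoader(yaml.SafeLoader):
    """Hypothetical loader: SafeLoader plus the Ruby tag handlers below."""


def construct_ruby_object(loader, suffix, node):
    # Any !ruby/object:<Class> mapping becomes a plain dict.
    return loader.construct_yaml_map(node)


def construct_ruby_sym(loader, node):
    # Ruby symbols become plain strings.
    return loader.construct_yaml_str(node)


PuppetLoader.add_multi_constructor("!ruby/object:", construct_ruby_object)
PuppetLoader.add_constructor("!ruby/sym", construct_ruby_sym)

# Shortened version of the output shown in the question.
SAMPLE = """\
--- &id001 !ruby/object:Puppet::Resource::Catalog
  aliases: {}
  applying: false
  classes:
    - s_baseconfig
  edges:
    - &id111 !ruby/object:Puppet::Relationship
      source: &id047 !ruby/object:Puppet::Resource
        catalog: *id001
"""

catalog = yaml.load(SAMPLE, Loader=PuppetLoader)
print(catalog["classes"])  # ['s_baseconfig']
```

The `*id001` alias resolves back to the top-level dict, so the result contains a reference cycle; avoid naive recursive walks over it.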
answered Oct 12 '22 19:10 by Sharuzzaman Ahmat Raslan


I believe the crux of the matter is that Puppet uses YAML "tags" for ruby-fu, and that confuses the default Python loader. In particular, PyYAML has no idea how to construct a !ruby/object:Puppet::Resource::Catalog, which makes sense, since that's a Ruby object.

Here's a link showing various uses of YAML tags: http://www.yaml.org/spec/1.2/spec.html#id2761292
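
One way to make PyYAML ignore such tags, rather than fail on them, is a catch-all multi-constructor that builds whatever plain structure sits under the tag. The following is only a sketch: `IgnoreTagsLoader`, `construct_untagged`, and the sample document (including its `Example::` class names) are made up for illustration.

```python
import yaml


class IgnoreTagsLoader(yaml.SafeLoader):
    """Hypothetical loader that treats unknown local tags as untagged data."""


def construct_untagged(loader, suffix, node):
    # Dispatch on the node kind and ignore the tag itself.
    if isinstance(node, yaml.MappingNode):
        return loader.construct_yaml_map(node)
    if isinstance(node, yaml.SequenceNode):
        return loader.construct_yaml_seq(node)
    return loader.construct_scalar(node)


# The "!" prefix matches every local tag, e.g. !ruby/object:... and !ruby/sym.
IgnoreTagsLoader.add_multi_constructor("!", construct_untagged)

# Made-up document that only exercises the three node kinds.
SAMPLE = """\
--- !ruby/object:Example::Thing
  title: motd
  tags: !ruby/object:Example::TagList
    - file
  type: !ruby/sym File
"""

data = yaml.load(SAMPLE, Loader=IgnoreTagsLoader)
print(data)
```

Tagged scalars come back as strings, because the tag was the only type information they carried.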

I've gotten past this in a brute-force approach by simply doing something like:

cat the_yaml | sed 's#\!ruby/object.*$##gm' > cleaner.yaml

but now I'm stuck on an issue where the *resource_table* block is confusing PyYAML with its complex keys (the use of '? ' to indicate the start of a complex key, specifically).
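
The sed clean-up can also be done inside the Python script itself. This is a rough sketch using the same pattern as the sed command; `strip_ruby_tags` and `SAMPLE` are illustrative names. It does nothing about the complex-key problem, and it would also mangle any string value that happens to contain "!ruby/object".

```python
import re

import yaml

# Same pattern as the sed command: drop "!ruby/object" through end of line.
RUBY_TAG = re.compile(r"!ruby/object.*$", re.MULTILINE)


def strip_ruby_tags(text):
    """Return the YAML text with the Ruby object tags removed."""
    return RUBY_TAG.sub("", text)


# Shortened version of the output shown in the question.
SAMPLE = """\
--- &id001 !ruby/object:Puppet::Resource::Catalog
  aliases: {}
  applying: false
  classes:
    - s_baseconfig
  edges:
    - &id111 !ruby/object:Puppet::Relationship
      source: &id047 !ruby/object:Puppet::Resource
        catalog: *id001
"""

cleaned = strip_ruby_tags(SAMPLE)
catalog = yaml.safe_load(cleaned)
print(catalog["classes"])  # ['s_baseconfig']
```

The anchors (`&id001`) and aliases (`*id001`) are left in place, since plain YAML already supports them.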

If you find a nice way around that, please let me know... but given how tied at the hip Puppet is to Ruby, it may just be easier to write your script directly in Ruby.
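
One possible way around the complex keys, sketched here without having been tried on a full catalog: PyYAML rejects them because a sequence key becomes a Python list, which cannot be a dict key, so a custom mapping constructor can turn such keys into tuples. `ComplexKeyLoader`, `construct_map_with_tuple_keys`, and the sample data are hypothetical. The sketch assumes the keys are flat sequences of scalars and does not handle merge keys (`<<`).

```python
import yaml


class ComplexKeyLoader(yaml.SafeLoader):
    """Hypothetical SafeLoader subclass that tolerates sequence keys."""


def construct_map_with_tuple_keys(loader, node):
    data = {}
    yield data  # two-step construction, so anchors and aliases keep working
    for key_node, value_node in node.value:
        key = loader.construct_object(key_node, deep=True)
        if isinstance(key, list):
            key = tuple(key)  # lists are unhashable; tuples can be dict keys
        data[key] = loader.construct_object(value_node)


def construct_ruby_object(loader, suffix, node):
    # Ruby objects are built with the same tuple-key mapping logic.
    return construct_map_with_tuple_keys(loader, node)


ComplexKeyLoader.add_constructor(
    "tag:yaml.org,2002:map", construct_map_with_tuple_keys
)
ComplexKeyLoader.add_multi_constructor("!ruby/object:", construct_ruby_object)

# Made-up fragment using the '? ' complex-key syntax.
SAMPLE = """\
--- !ruby/object:Puppet::Resource::Catalog
  resource_table:
    ? - File
      - /etc/motd
    : present
"""

catalog = yaml.load(SAMPLE, Loader=ComplexKeyLoader)
print(catalog["resource_table"][("File", "/etc/motd")])  # present
```

Keys that are themselves mappings would still fail, since a dict is not hashable either.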

answered Oct 12 '22 18:10 by Bret McMillan