
Accessing YAML data in Python

Tags: python, yaml

I have a YAML file that parses into an object, e.g.:

{'name': [{'proj_directory': '/directory/'},
          {'categories': [{'quick': [{'directory': 'quick'},
                                     {'description': None},
                                     {'table_name': 'quick'}]},
                          {'intermediate': [{'directory': 'intermediate'},
                                            {'description': None},
                                            {'table_name': 'intermediate'}]},
                          {'research': [{'directory': 'research'},
                                        {'description': None},
                                        {'table_name': 'research'}]}]},
          {'nomenclature': [{'extension': 'nc'},
                            {'handler': 'script'},
                            {'filename': [{'id': [{'type': 'VARCHAR'}]},
                                          {'date': [{'type': 'DATE'}]},
                                          {'v': [{'type': 'INT'}]}]},
                            {'data': [{'time': [{'variable_name': 'time'},
                                                {'units': 'minutes since 1-1-1980 00:00 UTC'},

                                      {'latitude': [{'variable_n...

I'm having trouble accessing the data in Python and regularly see the error: TypeError: list indices must be integers, not str

I want to be able to access all elements under 'name'. To retrieve each data field, I imagine it would look something like:

import yaml
settings_stream = open('file.yaml', 'r')
settingsMap = yaml.safe_load(settings_stream)
yaml_stream = True

print 'loaded settings for: ',
for project in settingsMap:
    print project + ', ' + settingsMap[project]['project_directory']

I would expect each element to be accessible via something like ['name']['categories']['quick']['directory']

and something a little deeper would just be:

['name']['nomenclature']['data']['latitude']['variable_name']

Or am I completely wrong here?

asked Mar 28 '13 17:03 by frankV



1 Answer

The brackets, [], indicate that you have lists of dicts, not just a dict.

For example, settingsMap['name'] is a list of dicts.

Therefore, you need to select the correct dict in the list using an integer index, before you can select the key in the dict.
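As a minimal sketch of that point, the snippet below uses a trimmed, hand-written copy of the structure from the question (the variable name settings and the trimming are assumptions for illustration, not the asker's full file). It shows the string index failing on the list and the integer index succeeding.

```python
# A trimmed copy of the parsed structure from the question (illustration only).
settings = {
    'name': [
        {'proj_directory': '/directory/'},
        {'categories': [
            {'quick': [{'directory': 'quick'},
                       {'description': None},
                       {'table_name': 'quick'}]},
        ]},
    ]
}

# settings['name'] is a list, so indexing it with a string raises TypeError.
error_message = None
try:
    settings['name']['categories']
except TypeError as exc:
    error_message = str(exc)

# Select the list element by position first, then the key inside that dict.
directory = settings['name'][1]['categories'][0]['quick'][0]['directory']
table_name = settings['name'][1]['categories'][0]['quick'][2]['table_name']
print(directory, table_name)
```

Each integer picks one single-key dict out of a list, and the string that follows picks the value inside that dict.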

So, given your current data structure, you'd need to use:

settingsMap['name'][1]['categories'][0]['quick'][0]['directory']

Or, revise the underlying YAML data structure.


For example, if the data structure looked like this:

settingsMap = {
    'name':
    {'proj_directory': '/directory/',
     'categories': {'quick': {'directory': 'quick',
                              'description': None,
                              'table_name': 'quick'},
                    'intermediate': {'directory': 'intermediate',
                                     'description': None,
                                     'table_name': 'intermediate'},
                    'research': {'directory': 'research',
                                 'description': None,
                                 'table_name': 'research'}},
     'nomenclature': {'extension': 'nc',
                      'handler': 'script',
                      'filename': {'id': {'type': 'VARCHAR'},
                                   'date': {'type': 'DATE'},
                                   'v': {'type': 'INT'}},
                      'data': {'time': {'variable_name': 'time',
                                        'units': 'minutes since 1-1-1980 00:00 UTC'}}}}}

then you could access the same value as above with

settingsMap['name']['categories']['quick']['directory']
# quick
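If editing the YAML file is not an option, one possible workaround is to reshape the parsed object after loading. The sketch below is an assumption-laden illustration, not part of PyYAML: collapse is a hypothetical helper that merges any list made up entirely of single-key dicts into one dict. If two entries in such a list share a key, the later one silently wins, so it only suits data like the example in the question.

```python
def collapse(node):
    """Recursively merge lists of single-key dicts into plain dicts.

    Hypothetical helper for illustration; not a PyYAML function.
    """
    if isinstance(node, list):
        # A non-empty list whose items are all single-key dicts becomes one dict.
        if node and all(isinstance(item, dict) and len(item) == 1 for item in node):
            merged = {}
            for item in node:
                for key, value in item.items():
                    merged[key] = collapse(value)
            return merged
        # Any other list is kept as a list, with its items processed in turn.
        return [collapse(item) for item in node]
    if isinstance(node, dict):
        return {key: collapse(value) for key, value in node.items()}
    return node


# Trimmed stand-in for the object returned by yaml.safe_load (illustration only).
loaded = {
    'name': [
        {'proj_directory': '/directory/'},
        {'categories': [
            {'quick': [{'directory': 'quick'},
                       {'description': None},
                       {'table_name': 'quick'}]},
        ]},
    ]
}

flat = collapse(loaded)
print(flat['name']['categories']['quick']['directory'])
```

After this step the key-only access path from the question works on flat, while the original loaded object is left unchanged.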
answered Nov 15 '22 06:11 by unutbu