
Why is my PanelND factory throwing a KeyError?

Tags: python, pandas

I'm using Pandas version 0.12.0 on Ubuntu 13.04. I'm trying to create a 5D panel object to contain some EEG data split by condition.

How I'm choosing to structure my data:

Let me begin by demonstrating my use of pandas.core.panelnd.create_nd_panel_factory.

Subject = panelnd.create_nd_panel_factory(
    klass_name='Subject',
    axis_orders=['setsize', 'location', 'vfield', 'channels', 'samples'],
    axis_slices={'labels': 'location',
            'items': 'vfield',
            'major_axis': 'major_axis',
            'minor_axis': 'minor_axis'},
    slicer=pd.Panel4D,
    axis_aliases={'ss': 'setsize',
            'loc': 'location',
            'vf': 'vfield',
            'major': 'major_axis',
            'minor': 'minor_axis'}
    # stat_axis=2  # dafuq is this?
    )

Essentially, the organization is as follows:

  • setsize: an experimental condition, can be 1 or 2
  • location: an experimental condition, can be "same", "diff" or None
  • vfield: an experimental condition, can be "lvf" or "rvf"

The last two axes correspond to a DataFrame's major_axis and minor_axis. They have been renamed for clarity:

  • channels: columns, the EEG channels (129 of them)
  • samples: rows, the individual samples. samples can be thought of as a time axis.

What I'm trying to do:

Each experimental condition (subject x setsize x location x vfield) is stored in its own tab-delimited file, which I am reading in with pandas.read_table, obtaining a DataFrame object. I want to create one 5-dimensional panel (i.e. Subject) for each subject, which will contain all experimental conditions (i.e. DataFrames) for that subject.

To start, I'm building a nested dictionary for each subject/Subject:

# ... do some boring stuff to get the text files, etc...
for _, factors in df.iterrows():
    # `factors` is a 5-tuple containing
    #  (subject number, setsize, location, vfield,
    #  and path to the tab-delimited file).
    sn, ss, loc, vf, path = factors
    eeg = pd.read_table(path, sep='\t', names=range(1, 129) + ['ref'], header=None)

    # build nested dict
    subjects.setdefault(sn, {}).setdefault(ss, {}).setdefault(loc, {})[vf] = eeg

# and now attempt to build `Subject`
for sn, d in subjects.iteritems():
    subjects[sn] = Subject(d)
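
For reference, the nested layout that the setdefault chain produces can be sketched without pandas. This is only an illustration: the rows and the placeholder strings standing in for the DataFrames are hypothetical, not taken from the real data.

```python
# Hypothetical rows: (subject number, setsize, location, vfield, data).
# The strings stand in for the DataFrames returned by pd.read_table.
rows = [
    (1, 1, "same", "lvf", "eeg_a"),
    (1, 1, "same", "rvf", "eeg_b"),
    (1, 2, "diff", "lvf", "eeg_c"),
    (2, 1, None, "rvf", "eeg_d"),
]

subjects = {}
for sn, ss, loc, vf, eeg in rows:
    # Each setdefault call creates the next nesting level on first use.
    subjects.setdefault(sn, {}).setdefault(ss, {}).setdefault(loc, {})[vf] = eeg

# One dict per subject, nested as setsize -> location -> vfield -> data.
print(sorted(subjects[1].keys()))
print(subjects[1][1]["same"])
```

Each value in subjects is therefore a three-level dict whose leaves are the per-condition tables, which is what gets passed to Subject(d).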

Full stack trace

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-2-831fa603ca8f> in <module>()
----> 1 import_data()

/home/louist/Dropbox/Research/VSTM/scripts/vstmlib.py in import_data()
     64 
     65     import ipdb; ipdb.set_trace()
---> 66     for sn, d in subjects.iteritems():
     67         subjects[sn] = Subject(d)
     68 

/usr/local/lib/python2.7/dist-packages/pandas/core/panelnd.pyc in __init__(self, *args, **kwargs)
     65         if 'dtype' not in kwargs:
     66             kwargs['dtype'] = None
---> 67         self._init_data(*args, **kwargs)
     68     klass.__init__ = __init__
     69 

/usr/local/lib/python2.7/dist-packages/pandas/core/panel.pyc in _init_data(self, data, copy, dtype, **kwargs)
    250             mgr = data
    251         elif isinstance(data, dict):
--> 252             mgr = self._init_dict(data, passed_axes, dtype=dtype)
    253             copy = False
    254             dtype = None

/usr/local/lib/python2.7/dist-packages/pandas/core/panel.pyc in _init_dict(self, data, axes, dtype)
    293         raxes = [self._extract_axis(self, data, axis=i)
    294                  if a is None else a for i, a in enumerate(axes)]
--> 295         raxes_sm = self._extract_axes_for_slice(self, raxes)
    296 
    297         # shallow copy

/usr/local/lib/python2.7/dist-packages/pandas/core/panel.pyc in _extract_axes_for_slice(self, axes)
   1477         """ return the slice dictionary for these axes """
   1478         return dict([(self._AXIS_SLICEMAP[i], a) for i, a
-> 1479                      in zip(self._AXIS_ORDERS[self._AXIS_LEN - len(axes):], axes)])
   1480 
   1481     @staticmethod

KeyError: 'location'

I understand that panelnd is an experimental feature, but I'm fairly certain that I'm doing something wrong. Can somebody please point me in the right direction? If it is a bug, is there something that can be done about it?

As usual, thank you very much in advance!

Louis Thibault asked Oct 22 '22 01:10


1 Answer

Here is a working example. You need to specify the mapping from your axis names to the internal axis names via axis_slices, with your names as the keys. The factory changes the internal structure, but the fixed pandas axis names still exist (and are somewhat hardcoded via Panel/Panel4D), so you need to provide the mapping.
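
To see why the direction of the mapping matters, here is a minimal pure-Python sketch modeled on the _extract_axes_for_slice line shown in the traceback. It is a simplified stand-in, not the pandas implementation: the function name, the placeholder axis values, and the assumption that four trailing axes are passed in are all mine.

```python
def extract_axes_for_slice(axis_orders, axis_slicemap, axes):
    """Look up each trailing axis name in the slice map (simplified stand-in)."""
    trailing = axis_orders[len(axis_orders) - len(axes):]
    return dict((axis_slicemap[name], a) for name, a in zip(trailing, axes))

axis_orders = ['setsize', 'location', 'vfield', 'channels', 'samples']
axes = ['loc_axis', 'vf_axis', 'chan_axis', 'samp_axis']  # placeholder values

# Mapping from the question: keyed by the internal Panel4D names.
question_slices = {'labels': 'location', 'items': 'vfield',
                   'major_axis': 'major_axis', 'minor_axis': 'minor_axis'}
# Mapping from this answer: keyed by the custom axis names.
answer_slices = {'location': 'labels', 'vfield': 'items',
                 'channels': 'major_axis', 'samples': 'minor_axis'}

try:
    extract_axes_for_slice(axis_orders, question_slices, axes)
    error = None
except KeyError as exc:
    error = exc.args[0]  # the first custom name that is missing from the map

fixed = extract_axes_for_slice(axis_orders, answer_slices, axes)
print(error)
print(fixed['labels'])
```

The lookup is done by the custom axis name, so a map keyed by 'labels', 'items', and so on fails on the first custom name it meets, which matches the KeyError: 'location' in the question.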

I would create a Panel4D first, then your Subject as I did below.

Please post on GitHub or here if you find more bugs. This is not a heavily used feature.

Output

<class 'pandas.core.panelnd.Subject'>
Dimensions: 3 (setsize) x 1 (location) x 1 (vfield) x 10 (channels) x 2 (samples)
Setsize axis: level0_0 to level0_2
Location axis: level1_0 to level1_0
Vfield axis: level2_0 to level2_0
Channels axis: level3_0 to level3_9
Samples axis: level4_1 to level4_2

Code

import pandas as pd
import numpy as np
from pandas.core import panelnd

Subject = panelnd.create_nd_panel_factory(
    klass_name='Subject',
    axis_orders=['setsize', 'location', 'vfield', 'channels', 'samples'],
    axis_slices={'location' : 'labels',
                 'vfield' : 'items',
                 'channels' : 'major_axis',
                 'samples': 'minor_axis'},
    slicer=pd.Panel4D,
    axis_aliases={'ss': 'setsize',
                  'loc': 'labels',
                  'vf': 'items',
                  'major': 'major_axis',
                  'minor': 'minor_axis'})


subjects = dict()
for i in range(3):
    eeg = pd.DataFrame(np.random.randn(10, 2),
                       columns=['level4_1', 'level4_2'],
                       index=["level3_%s" % x for x in range(10)])

    loc, vf = ('level1_0','level2_0')
    subjects["level0_%s" % i] = pd.Panel4D({ loc : { vf : eeg }})

print Subject(subjects)

Jeff answered Nov 15 '22 07:11