I have the following CSV data:
id,gene,celltype,stem,stem,stem,bcell,bcell,tcell
id,gene,organs,bm,bm,fl,pt,pt,bm
134,foo,about_foo,20,10,11,23,22,79
222,bar,about_bar,17,13,55,12,13,88
And I can successfully summarize them this way:
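import numpy as np
import pandas as pd

df = pd.read_csv("http://dpaste.com/1X74TNP.txt", header=None, index_col=[1, 2]).iloc[:, 1:]
# The first two rows hold the two column header levels (cell type, organ)
df.columns = pd.MultiIndex.from_arrays(df.iloc[:2].values)
df = df.iloc[2:].astype(int)
df.index.names = ['cell', 'organ']
df = df.reset_index('organ', drop=True)
# Average columns that share the same (cell type, organ) pair.
# groupby(axis=1) is deprecated since pandas 2.1; df.T.groupby(level=[0, 1]).mean().T is the equivalent.
result = df.groupby(level=[0, 1], axis=1).mean()
result = result.stack().replace(np.nan, 0).unstack()
result = result.swaplevel(0, 1, axis=1).sort_index(axis=1)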
Which looks like:
In [341]: result
Out[341]:
        bm                fl                pt
     bcell stem tcell  bcell stem tcell  bcell stem tcell
cell
foo      0   15    79      0   11     0   22.5    0     0
bar      0   15    88      0   55     0   12.5    0     0
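Since the dpaste link may no longer resolve, here is a minimal sketch that rebuilds an equivalent result from the CSV text shown above. It is not the code from the question: the names CSV_TEXT, raw, data, means and full are made up for illustration, the column-level names "celltype" and "organ" are assumptions, and the transpose-then-groupby step is swapped in for groupby(axis=1).

```python
import io

import pandas as pd

# The CSV from the question, inlined so the example does not need the network.
CSV_TEXT = """\
id,gene,celltype,stem,stem,stem,bcell,bcell,tcell
id,gene,organs,bm,bm,fl,pt,pt,bm
134,foo,about_foo,20,10,11,23,22,79
222,bar,about_bar,17,13,55,12,13,88
"""

raw = pd.read_csv(io.StringIO(CSV_TEXT), header=None)

# Rows 0 and 1 are the two header levels; columns 3 onward hold the measurements.
data = raw.iloc[2:, 3:].astype(int)
data.columns = pd.MultiIndex.from_arrays(
    [raw.iloc[0, 3:].tolist(), raw.iloc[1, 3:].tolist()],
    names=["celltype", "organ"],  # assumed level names
)
data.index = pd.Index(raw.iloc[2:, 1].tolist(), name="cell")

# Average columns sharing the same (celltype, organ) pair by grouping the transpose.
means = data.T.groupby(level=["celltype", "organ"]).mean().T

# Put organ on top and fill every missing (organ, celltype) combination with 0.
full = pd.MultiIndex.from_product(
    [["bm", "fl", "pt"], ["bcell", "stem", "tcell"]]
)
result = means.swaplevel(0, 1, axis=1).reindex(columns=full, fill_value=0)

print(result)
```

The explicit reindex replaces the stack/replace/unstack round trip: it states the wanted column grid directly instead of relying on how stack treats missing values.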
My question is: from result, how can I get the first level of the column index as a list, like this:
['bm','fl','pt']
For a DataFrame with ordinary single-level columns, list(df.columns.values) returns the column names as a list, which you can then print. With MultiIndex columns such as those of result, that call returns one (organ, cell type) tuple per column, so the first level has to be extracted explicitly.
result.columns
returns a pandas MultiIndex (pandas.core.index.MultiIndex in the version used here; exposed as pandas.MultiIndex in current releases)
which has a levels attribute.
list(result.columns.levels[0])
returns
['bm', 'fl', 'pt']
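One thing to keep in mind with levels: a MultiIndex keeps all of its defined levels even after columns have been selected away, so levels[0] can list labels that are no longer present. The sketch below illustrates this on a hypothetical stand-in for result (same column layout, zero-filled values); the variable names are made up for the example.

```python
import pandas as pd

# Hypothetical stand-in for `result`: same column layout, zero-filled values.
columns = pd.MultiIndex.from_product(
    [["bm", "fl", "pt"], ["bcell", "stem", "tcell"]]
)
result = pd.DataFrame(0, index=["foo", "bar"], columns=columns)

# On the full frame, levels[0] is exactly the list asked for.
first_level = list(result.columns.levels[0])
print(first_level)

# After selecting only some top-level groups, the defined levels are unchanged.
subset = result[["bm", "pt"]]
stale = list(subset.columns.levels[0])

# remove_unused_levels() rebuilds the index with only the labels still in use.
pruned = list(subset.columns.remove_unused_levels().levels[0])
print(stale, pruned)
```

If the frame has been filtered before you read the levels, calling remove_unused_levels() first avoids picking up labels that no longer have any columns.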
Additionally, you could use result.columns.get_level_values(level):
>>> result.columns.get_level_values(0).unique()
array(['bm', 'fl', 'pt'], dtype=object)
>>> list(result.columns.get_level_values(0))
['bm', 'bm', 'bm', 'fl', 'fl', 'fl', 'pt', 'pt', 'pt']
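The two approaches can differ in ordering: levels is stored sorted, while get_level_values(0).unique() keeps the labels in the order the columns actually appear. A small sketch on a made-up frame whose columns are deliberately not in alphabetical order (the frame and variable names are assumptions for illustration):

```python
import pandas as pd

# Hypothetical frame whose top-level column labels are not in sorted order.
columns = pd.MultiIndex.from_product([["pt", "bm", "fl"], ["bcell", "stem"]])
frame = pd.DataFrame(0, index=["foo", "bar"], columns=columns)

# Labels in order of first appearance, duplicates removed.
order_of_appearance = frame.columns.get_level_values(0).unique().tolist()

# The defined levels, which pandas stores sorted for labels like these.
sorted_levels = list(frame.columns.levels[0])

print(order_of_appearance)
print(sorted_levels)
```

Calling tolist() gives a plain Python list whether unique() returns an array, as in the output above, or an Index, as in more recent pandas versions.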