Python Pandas slice multiindex by second level index (or any other level)

Tags: python, slice, sorting, pandas, multi-index

There are many posts on slicing level[0] of a MultiIndex by a range of level[1]. However, I cannot find a solution for my problem: I need a range of the level[1] index (Rank) for every value of the level[0] index (First).

My dataframe: First runs from A to Z and Rank from 1 to 400. I need the first 2 and the last 2 rows for each level[0] value (First), but not in the same step.

           Title Score
First Rank 
A     1    foo   100
      2    bar   90
      3    lime  80
      4    lame  70
B     1    foo   400
      2    lime  300
      3    lame  200
      4    dime  100

I am trying to get the last 2 rows for each level[0] value with the code below, but it slices properly only for the first level[0] value.

[IN]  df.ix[df.index.levels[1][-2]:]
[OUT] 
               Title Score
    First Rank 
    A     3    lime  80
          4    lame  70
    B     1    foo   400
          2    lime  300
          3    lame  200
          4    dime  100

The first 2 rows I get by swapping the indices, but I cannot make it work for the last 2 rows.

df.index = df.index.swaplevel("Rank", "First")
df = df.sortlevel()  # sort by Rank
df.ix[1:2]  # produces the first 2 ranks, each with 2 level[1] (First) values
           Title Score
Rank First 
1     A    foo   100
      B    foo   400
2     A    bar   90
      B    lime  300

Of course I can swap this back to get this:

df2 = df.ix[1:2]
df2.index = df2.index.swaplevel("First", "Rank")  # change the order of the indices back
df2.sortlevel()
               Title Score
    First Rank 
    A     1    foo   100
          2    bar   90
    B     1    foo   400
          2    lime  300
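
For readers on a current pandas release: `.ix` and `sortlevel()` have since been removed. The following is a minimal sketch of the same swap-and-slice idea using `swaplevel`, `sort_index`, and `.loc`. The frame is rebuilt here from the sample data in the question, and the names `by_rank` and `first2` are only illustrative.

```python
import pandas as pd

# Rebuild the example frame from the question.
index = pd.MultiIndex.from_product(
    [["A", "B"], [1, 2, 3, 4]], names=["First", "Rank"]
)
df = pd.DataFrame(
    {
        "Title": ["foo", "bar", "lime", "lame", "foo", "lime", "lame", "dime"],
        "Score": [100, 90, 80, 70, 400, 300, 200, 100],
    },
    index=index,
)

# Swap the levels and sort so that Rank becomes the outer, sorted level.
by_rank = df.swaplevel("First", "Rank").sort_index()

# Label-based slice on the outer level: Rank labels 1 through 2, inclusive.
first2 = by_rank.loc[1:2]

# Swap back and re-sort to restore the original (First, Rank) order.
first2 = first2.swaplevel("Rank", "First").sort_index()
print(first2)
```

This reproduces the swap approach only; it does not remove the need to swap levels back and forth.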

Any help is appreciated on how to get, with the same kind of procedure:

- the last 2 rows for index1 (Rank)
- a better way to get the first 2 rows

Edit following feedback by @ako:

Using pd.IndexSlice truly makes it easy to slice any level of the index. Here is a more generic solution, followed by my step-wise approach to get the first and last two rows. More information here: http://pandas.pydata.org/pandas-docs/stable/advanced.html#using-slicers

import pandas as pd

idx = pd.IndexSlice

# Generic form: slice the rows at level[2] of the row index for specific
# labels, and the columns at level[1] of the column index.
df.loc[idx[:, :, ['some label', 'another label']], idx[:, 'yet another label']]

# Thanks to @ako, below is my solution, including how I get the
# top and last 2 rows.

# Top 2
df.loc[idx[:, [1, 2]], :]  # [1, 2] is NOT a row position, it is the Rank label.

# Last 2
ranks = df.index.levels[df.index.names.index("Rank")]  # unique Rank labels, sorted
last2 = list(ranks[-2:])  # the two largest Rank labels
df.loc[idx[:, last2], :]  # assumes every level[0] value has the same Rank labels.
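
The last-2 selection above relies on every First value having the same Rank labels. If the groups differ in length, one possible alternative (not part of the original answers) is to group on the First level and take rows by position with `head` and `tail`. This is a sketch on a small made-up frame in which group B has only three ranks.

```python
import pandas as pd

# Hypothetical frame: group "B" is shorter than group "A".
index = pd.MultiIndex.from_tuples(
    [("A", 1), ("A", 2), ("A", 3), ("A", 4), ("B", 1), ("B", 2), ("B", 3)],
    names=["First", "Rank"],
)
df = pd.DataFrame({"Score": [100, 90, 80, 70, 400, 300, 200]}, index=index)

# head/tail are positional, so sort by (First, Rank) first.
df = df.sort_index()

top2 = df.groupby(level="First").head(2)   # first 2 rows of each group
last2 = df.groupby(level="First").tail(2)  # last 2 rows of each group

print(top2)
print(last2)
```

Because `head` and `tail` count rows rather than match labels, this returns the last two existing ranks of each group even when the labels differ between groups.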


asked Oct 18 '15 03:10

raummensch

2 Answers

Use an indexer (pd.IndexSlice) to slice arbitrary values in arbitrary dimensions: for each level, pass a list of the labels you want, or : to keep everything on that level.

idx = pd.IndexSlice
df.loc[idx[:,[3,4]],:]

           Title  Score
First Rank             
A     3     lime     80
      4     lame     70
B     3     lame    200
      4     dime    100

For reproducing the data:

from io import StringIO
import pandas as pd

s="""
First Rank Title Score
A      1    foo   100
A      2    bar   90
A      3    lime  80
A      4    lame  70
B      1    foo   400
B      2    lime  300
B      3    lame  200
B      4    dime  100
"""
df = pd.read_csv(StringIO(s),
                 sep=r'\s+',
                 index_col=["First", "Rank"])
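
Putting the pieces together, here is a self-contained sketch that rebuilds the sample frame and selects ranks 3 and 4 both with a list of labels and with a label range. The range form (`3:4`) is an addition to the answer above; it assumes the index is sorted, so `sort_index()` is called explicitly.

```python
from io import StringIO

import pandas as pd

s = """
First Rank Title Score
A 1 foo 100
A 2 bar 90
A 3 lime 80
A 4 lame 70
B 1 foo 400
B 2 lime 300
B 3 lame 200
B 4 dime 100
"""
# Whitespace-separated columns; First and Rank become the two index levels.
df = pd.read_csv(StringIO(s), sep=r"\s+", index_col=["First", "Rank"]).sort_index()

idx = pd.IndexSlice

# Explicit list of Rank labels, keeping every First value.
by_list = df.loc[idx[:, [3, 4]], :]

# Label range on Rank (both ends inclusive); needs a sorted index.
by_range = df.loc[idx[:, 3:4], :]

print(by_list)
```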


answered Oct 11 '22 18:10

ako

Another way to slice by the second (sub) level of a multi-level index is to use slice(None) with .loc[]. .loc[] accepts a tuple for a multi-level index: slice(None) in a position means that level is not being sliced, and a single label or a list of labels in a position selects on that level. Hope it helps future readers.

df.loc[(slice(None), [3, 4]), :]
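
As a small illustration of how this relates to pd.IndexSlice, the sketch below checks that the hand-written tuple and the IndexSlice expression are the same key, and then uses it on a frame rebuilt from the question's sample scores. The variable names are only illustrative.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [["A", "B"], [1, 2, 3, 4]], names=["First", "Rank"]
)
df = pd.DataFrame({"Score": [100, 90, 80, 70, 400, 300, 200, 100]}, index=index)

# A bare ":" inside IndexSlice becomes slice(None), so both keys are equal.
key_tuple = (slice(None), [3, 4])
key_idx = pd.IndexSlice[:, [3, 4]]
same_key = key_tuple == key_idx

# Keep every First value, select Rank labels 3 and 4, keep all columns.
result = df.loc[key_tuple, :]

print(same_key)
print(result)
```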

answered Oct 11 '22 19:10

Ash
