Pandas DataFrame sort ignoring the case

I have a Pandas DataFrame in Python. The contents of the DataFrame are from here. I changed the case of the first letter of some entries in the "Single" column. Here is what I have:

import pandas as pd
df = pd.read_csv('test.csv')
print(df)

Position                       Artist                  Single               Year     Weeks
       1                Frankie Laine               I Believe               1953  18 weeks
       2                  Bryan Adams         I Do It for You               1991  16 weeks
       3                  Wet Wet Wet      love Is All Around               1994  15 weeks
       4  Drake (feat. Wizkid & Kyla)               One Dance               2016  15 weeks
       5                        Queen       bohemian Rhapsody  1975/76 & 1991/92  14 weeks
       6                 Slim Whitman              Rose Marie               1955  11 weeks
       7              Whitney Houston  i Will Always Love You               1992  10 weeks

I would like to sort by the Single column in ascending order (a to z). When I run

df.sort_values(by='Single',inplace=True) 

the sort seems to treat uppercase and lowercase letters separately. Here is what I get:

Position                       Artist                  Single               Year     Weeks
       1                Frankie Laine               I Believe               1953  18 weeks
       2                  Bryan Adams         I Do It for You               1991  16 weeks
       4  Drake (feat. Wizkid & Kyla)               One Dance               2016  15 weeks
       6                 Slim Whitman              Rose Marie               1955  11 weeks
       5                        Queen       bohemian Rhapsody  1975/76 & 1991/92  14 weeks
       7              Whitney Houston  i Will Always Love You               1992  10 weeks
       3                  Wet Wet Wet      love Is All Around               1994  15 weeks

So titles that start with an uppercase letter are sorted first, followed by a separate sort of the titles that start with a lowercase letter. I want a single combined sort, regardless of the case of the first letter in the Single column. For example, the row with "bohemian Rhapsody" is in the wrong place: it should be first, but it appears as the 5th row after the sort.

Is there a way to sort a Pandas DataFrame while ignoring the case of the text in the Single column?

edesz asked Jan 15 '17 00:01



2 Answers

You can convert all strings to a single case (upper or lower) and then call argsort(), which returns the integer positions that would sort the converted values. Passing those positions to iloc reorders the DataFrame by Single while ignoring case:

df.iloc[df.Single.str.lower().argsort()] 
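
As a minimal, self-contained sketch of this approach: the frame below is built by hand from four rows of the question instead of reading test.csv, and the names `default_order`, `order` and `sorted_df` are only illustrative. Uppercase letters have lower code points than lowercase ones, which is why the plain sort places "I Believe" ahead of "bohemian Rhapsody"; lowercasing before computing the order removes that difference. The positions are converted to a NumPy array before indexing, which is an extra step not in the one-liner above.

```python
import pandas as pd

# Stand-in for test.csv: a few rows taken from the question.
df = pd.DataFrame({
    "Artist": ["Frankie Laine", "Wet Wet Wet", "Queen", "Slim Whitman"],
    "Single": ["I Believe", "love Is All Around", "bohemian Rhapsody", "Rose Marie"],
})

# The default sort compares code points, so uppercase-initial titles come first.
default_order = df.sort_values(by="Single")["Single"].tolist()

# argsort() on the lowercased column gives the integer positions that would
# sort it; iloc then reorders the original rows, whose text is left unchanged.
order = df["Single"].str.lower().argsort()
sorted_df = df.iloc[order.to_numpy()]
case_insensitive_order = sorted_df["Single"].tolist()

print(default_order)
print(case_insensitive_order)
```

Because the lowercased values are only used to compute the order, the Single column keeps its original capitalization in `sorted_df`.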


Psidom answered Nov 14 '22 04:11


Pandas 1.1.0 introduced the key argument to sort_values as a more intuitive way to achieve this:

df.sort_values(by='Single', inplace=True, key=lambda col: col.str.lower()) 
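
A small runnable sketch of the same call, assuming pandas 1.1.0 or later and using a hand-built frame with three rows from the question in place of test.csv. The function passed as key receives the whole column as a Series and should return a Series of the same length. This variant returns a new frame instead of using inplace=True, and the names `by_title` and `positions` are only illustrative.

```python
import pandas as pd

# Stand-in for test.csv: three rows taken from the question.
df = pd.DataFrame({
    "Position": [1, 5, 7],
    "Single": ["I Believe", "bohemian Rhapsody", "i Will Always Love You"],
})

# key is applied to the "Single" column as a whole before comparison;
# the values stored in the frame keep their original case.
by_title = df.sort_values(by="Single", key=lambda col: col.str.lower())
positions = by_title["Position"].tolist()

print(by_title)
```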

RafG answered Nov 14 '22 02:11