I have a DataFrame in a Python script (using pandas) that needs to be sorted by multiple columns, but the case of the values currently messes up the sorting. For example, a and A are not sorted together: the upper-case letters are sorted first, and then the lower-case ones. Is there an easy way to sort them ignoring case? Currently I have something like this:
df = df.sort_values(['column1', 'column2', 'column3', 'column4', 'column5', 'column6', 'column7'], ascending=[True, True, True, True, True, True, True])
The case must be ignored for all of the columns, and the values must keep their original case in the final sorted DataFrame.
For example column 1 could be sorted like this (ignoring case):
Aaa
aaB
aaC
Bbb
bBc
bbD
CCc
ccd
Also, it would be great if this worked with any number of columns (no hard-coding).
If you just want to sort by the lower-cased values, you could use something like this:
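import pandas as pd

def sort_naive_lowercase(df, columns, ascending=True):
    # Lower-cased copy of the sort columns; df itself keeps its original case.
    df_temp = pd.DataFrame(index=df.index, columns=columns)
    for kol in columns:
        df_temp[kol] = df[kol].str.lower()
    # Sort the copy, then reorder the original rows by the resulting index.
    new_index = df_temp.sort_values(columns, ascending=ascending).index
    return df.reindex(new_index)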
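On newer pandas versions the temporary frame may not be needed. Here is a minimal sketch, assuming pandas 1.1 or later, where sort_values accepts a key callable that is applied to each sort column separately. The sample frame and its values are made up for illustration:

```python
import pandas as pd

# Hypothetical sample data; the column names follow the question.
df = pd.DataFrame({
    "column1": ["bbD", "Aaa", "ccd", "aaB"],
    "column2": ["x", "Y", "z", "W"],
})

columns = ["column1", "column2"]  # any number of string columns

# key receives each sort column as a Series and must return a Series of the
# same length; the original values in df are left unchanged.
result = df.sort_values(columns, key=lambda col: col.str.lower())
```

The rows come back in case-insensitive order while every value keeps its original case. This only works as written if all of the sort columns hold strings.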
If you expect Unicode problems, you might do something like this (borrowing from @nick-hale's comment):
def sort_by_caseless_columns(df, columns, ascending=True):
    # https://stackoverflow.com/a/29247821/1562285
    import unicodedata

    def normalize_caseless(text):
        return unicodedata.normalize("NFKD", text.casefold())

    df_temp = pd.DataFrame(index=df.index, columns=columns)
    for kol in columns:
        df_temp[kol] = df[kol].apply(normalize_caseless)
    new_index = df_temp.sort_values(columns, ascending=ascending).index
    return df.reindex(new_index)
If you have more arguments to pass to sort_values, you can use **kwargs.
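A minimal sketch of the **kwargs idea, assuming every sort column holds strings and the index is unique. The function name sort_caseless is made up for this example:

```python
import pandas as pd

def sort_caseless(df, columns, **kwargs):
    # Hypothetical helper: lower-cased copy of the sort columns only,
    # so df itself keeps its original case.
    df_temp = pd.DataFrame(
        {kol: df[kol].str.lower() for kol in columns}, index=df.index
    )
    # Forward any extra options (ascending, na_position, ...) to sort_values.
    new_index = df_temp.sort_values(columns, **kwargs).index
    # Assumes a unique index; reindex raises on duplicate labels.
    return df.reindex(new_index)

df = pd.DataFrame({"column1": ["bbD", "Aaa", "aaC", "Bbb", "aaB"]})
ascending_result = sort_caseless(df, ["column1"])
descending_result = sort_caseless(df, ["column1"], ascending=False)
```

Anything sort_values understands can then be passed straight through, for example ascending=False or na_position="first".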
If not all of the columns are strings and some are numerical, you might have to include an additional mask or set for the non-numerical columns, so that only those are lower-cased.
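One way to handle mixed columns is to check each column's dtype and lower-case only the non-numerical ones. This is a sketch under that assumption, using is_numeric_dtype from pandas.api.types; the function name sort_caseless_mixed and the sample data are made up for illustration:

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

def sort_caseless_mixed(df, columns, **kwargs):
    # Hypothetical helper: build the sort keys column by column.
    df_temp = pd.DataFrame(index=df.index)
    for kol in columns:
        if is_numeric_dtype(df[kol]):
            # Numerical columns are compared as they are.
            df_temp[kol] = df[kol]
        else:
            # Everything else is assumed to hold strings.
            df_temp[kol] = df[kol].str.lower()
    new_index = df_temp.sort_values(columns, **kwargs).index
    # Assumes a unique index; reindex raises on duplicate labels.
    return df.reindex(new_index)

df = pd.DataFrame({"name": ["b", "A", "a", "B"], "n": [2, 1, 0, 1]})
result = sort_caseless_mixed(df, ["name", "n"])
```

Here "a" and "A" compare as equal on the first column, so the numerical second column decides their order.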