 

sort_values() method in pandas

I have the following subset of data and I need to sort the Education column in ascending order, from 0 to 17.

[image: subset of the suicide_data DataFrame]

I tried the following code without success.

suicide_data.sort_index(axis=0, kind='mergesort')

also...

suicide_data.Education.sort_values()

and...

suicide_data.sort_values('Education')

Here is the error I'm getting...

TypeError: '>' not supported between instances of 'float' and 'str'

The documentation says that str values can be sorted with the sort_values() method. Does anyone know how to sort the Education column in ascending order?

redeemefy asked Feb 27 '17 03:02




1 Answer

It looks like you have mixed types in the Education column of your DataFrame. The error message is telling you that it cannot compare the strings with the floats in that column. Assuming you want to sort the values numerically, convert them to an integer type first and then sort with DataFrame.sort_values. I'd advise doing this anyway, since a mixed-type column is awkward for most other operations too. Note that astype('int') will itself raise if the column contains NaN or strings that are not valid integers, so those values need to be handled first.

suicide_data['Education'] = suicide_data['Education'].astype('int')
# sort_values returns a new DataFrame, so keep the result
suicide_data = suicide_data.sort_values(by='Education')
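If the floats in the column turn out to be NaN (a guess based on the error message, not something the question confirms), the integer cast will not go through. A minimal sketch of one alternative is below, using pd.to_numeric with errors="coerce" in place of astype. The small DataFrame is a hypothetical stand-in for suicide_data, and the Count column is invented for illustration.

```python
import pandas as pd

# Hypothetical stand-in for suicide_data: Education mixes strings with a float NaN.
suicide_data = pd.DataFrame({
    "Education": pd.Series(["12", "3", float("nan"), "17", "0"], dtype=object),
    "Count": [5, 2, 7, 1, 4],  # invented column, only here to show whole rows move
})

# Inspect which Python types actually occur in the column.
kinds = set(suicide_data["Education"].map(type))
print(kinds)

# Convert to numbers; anything that cannot be parsed becomes NaN instead of raising.
suicide_data["Education"] = pd.to_numeric(suicide_data["Education"], errors="coerce")

# sort_values returns a new DataFrame; rows with NaN are placed last by default.
sorted_data = suicide_data.sort_values(by="Education")
print(sorted_data)
```

The column ends up as float rather than int here, because NaN cannot be stored in a plain integer column. Dropping or filling the missing rows first would allow the astype('int') cast.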

It is also worth pointing out that your first attempt,

suicide_data.sort_index(axis=0, kind='mergesort')

would sort your DataFrame by the index, which you don't want, and your second attempt

suicide_data.Education.sort_values()

would only return the sorted Education Series rather than the whole DataFrame (and it fails with the same TypeError while the column holds mixed types).
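To make the difference between the three calls concrete, here is a small sketch on a hypothetical DataFrame whose Education column is already numeric. The data, the Count column and the shuffled index are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical frame: numeric Education column and a deliberately shuffled index.
df = pd.DataFrame(
    {"Education": [12, 3, 17, 0], "Count": [5, 2, 1, 4]},
    index=[2, 0, 3, 1],
)

# First attempt: orders rows by index label, ignoring the Education values.
by_index = df.sort_index(axis=0, kind="mergesort")

# Second attempt: returns only the Education column as a sorted Series.
only_series = df.Education.sort_values()

# Third attempt: returns the whole DataFrame with rows ordered by Education.
by_column = df.sort_values("Education")

print(by_index, only_series, by_column, sep="\n\n")
```

None of these calls modifies df in place, so the result has to be assigned to a name to be kept.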

miradulo answered Sep 27 '22 22:09