Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Sort column in pandas, then sort another column while maintaining previous column sorted

Tags:

python

pandas

So I have some data on lots of publicly traded stock. Each data row contains an id, a date, and some other information. Naturally, a stock might appear many times in the dataframe (i.e Google might have several entries that correspond to different dates at which the price was updated).

I want to be able to sort the ids, then for each sorted block, sort the dates.

NOTE: sorting is done in ascending order for the sake of the example.

    id        date price
0  123  2015/01/13     x
1  114  2017/02/15     y
2   12  2016/12/02     z
3  123  1996/04/26     w
4  114  2014/02/23     u
5  114  1995/05/25     v

Sorting the ids gives:

    id        date price
0   12  2016/12/02     z
1  123  2015/01/13     x
2  123  1996/04/26     w
3  114  2017/02/15     y
4  114  2014/02/23     u
5  114  1995/05/25     v

Sorting the dates WHILE the ids are fixed gives:

    id        date price
0   12  2016/12/02     z
1  123  1996/04/26     w
2  123  2015/01/13     x
3  114  1995/05/25     v
4  114  2014/02/23     u
5  114  2017/02/15     y
like image 826
Ziad BHA Avatar asked Jun 24 '17 15:06

Ziad BHA


People also ask

How do I sort one column based on another in pandas?

To sort the DataFrame based on the values in a single column, you'll use . sort_values() . By default, this will return a new DataFrame sorted in ascending order.

Can you sort by two columns in pandas?

You can sort pandas DataFrame by one or multiple (one or more) columns using sort_values() method and by ascending or descending order.

How do I rearrange the order of columns in pandas?

Reorder Columns using Pandas . Another way to reorder columns is to use the Pandas . reindex() method. This allows you to pass in the columns= parameter to pass in the order of columns that you want to use.

How do I sort a column in descending order in pandas?

We can sort it by using the dataframe. sort_index() function. Alternatively, you can sort the index in descending order by passing in the ascending=False the argument in the function above.


1 Answers

It seems you need DataFrame.sort_values:

df['date'] = pd.to_datetime(df['date'])
df = df.sort_values(['id','date'])
print (df)
    id       date price
2   12 2016-12-02     z
5  114 1995-05-25     v
4  114 2014-02-23     u
1  114 2017-02-15     y
3  123 1996-04-26     w
0  123 2015-01-13     x

Or if id column is string:

df['id'] = df['id'].astype(str)
df['date'] = pd.to_datetime(df['date'])
df = df.sort_values(['id','date'])
print (df)
    id       date price
5  114 1995-05-25     v
4  114 2014-02-23     u
1  114 2017-02-15     y
2   12 2016-12-02     z
3  123 1996-04-26     w
0  123 2015-01-13     x

You can also sort one column descending and another ascending:

df['id'] = df['id'].astype(str)
df['date'] = pd.to_datetime(df['date'])
df = df.sort_values(['id','date'], ascending=[False, True])
print (df)
    id       date price
3  123 1996-04-26     w
0  123 2015-01-13     x
2   12 2016-12-02     z
5  114 1995-05-25     v
4  114 2014-02-23     u
1  114 2017-02-15     y
like image 192
jezrael Avatar answered Oct 17 '22 07:10

jezrael