Pandas changing order of columns after data retrieval

I would like to change the column names of a pandas DataFrame, but I'm finding that the order of the columns changes after the data is retrieved. The code below specifies sector ETF symbols and fetches the data from Yahoo Finance.

The problem is that once I run the code, 'XLY', for example, is no longer the first series in the dataframe, so I cannot just run sec_perf.columns = ['Name1', 'Name2', ...] as I normally would, because the names would be assigned to the wrong columns. What am I messing up here?

import datetime

import pandas as pd
import pandas_datareader.data as web

end = datetime.date.today()
secs = ['XLY', 'XLP', 'XLE',
        'XLF', 'XLV', 'XLI',
        'XLB', 'XLRE', 'XLK', 'XLU']

# Fetch adjusted closing prices for each sector ETF from Yahoo Finance
sec_perf = web.DataReader(secs, 'yahoo',
                          start=datetime.datetime(2016, 12, 31),
                          end=end)['Adj Close']
Merv Merzoug asked Jan 05 '23 15:01



1 Answer

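Use reindex to put the columns back in the order of secs, and assign the result, since it returns a new dataframe:

sec_perf = sec_perf.reindex(columns=secs)

This answer originally used sec_perf.reindex_axis(secs, 1). reindex_axis was deprecated in pandas 0.21 and removed in pandas 1.0, so on current versions use reindex as above.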


You could also use sec_perf[secs] to do the same thing. In timings we ran a while ago, reindex_axis came out quickest of the options, though that method has since been deprecated in pandas in favor of reindex.
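As a rough illustration of the two approaches, here is a minimal sketch that runs without a network connection. The dataframe is a stand-in with made-up prices and alphabetically sorted columns (an assumption about how the data comes back, not something verified against the Yahoo reader), and the names mapping is a hypothetical example of the labels you might want.

```python
import pandas as pd

# Desired column order (the ticker list from the question)
secs = ['XLY', 'XLP', 'XLE', 'XLF', 'XLV', 'XLI', 'XLB', 'XLRE', 'XLK', 'XLU']

# Stand-in for the downloaded data: made-up values, columns sorted alphabetically
sec_perf = pd.DataFrame(
    [[float(i) for i in range(len(secs))]],
    columns=sorted(secs),
)

# Option 1: reindex the columns into the order of secs
ordered = sec_perf.reindex(columns=secs)

# Option 2: plain column selection gives the same result
assert ordered.equals(sec_perf[secs])

# Renaming by ticker does not depend on column position at all.
# The labels below are hypothetical; extend the mapping as needed.
names = {'XLY': 'Consumer Discretionary', 'XLP': 'Consumer Staples'}
renamed = ordered.rename(columns=names)
```

Renaming with a dict keyed on the ticker sidesteps the ordering problem entirely, so it may be the safer choice if the column order cannot be relied on.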

piRSquared answered Jan 08 '23 11:01
