
Keeping columns in the specified order when using usecols in pandas read_csv


I have a csv file with 50 columns of data. I am using Pandas read_csv function to pull in a subset of these columns, using the usecols parameter to choose the ones I want:

cols_to_use = [0,1,5,16,8]
df_ret = pd.read_csv(filepath, index_col=False, usecols=cols_to_use)

The trouble is df_ret contains the correct columns, but not in the order I specified. They are in ascending order, so [0,1,5,8,16]. (By the way the column numbers can change from run to run, this is just an example.) This is a problem because the rest of the code has arrays which are in the "correct" order and I would rather not have to reorder all of them.

Is there any clever pandas way of pulling in the columns in the order specified? Any help would be much appreciated!

asked Oct 13 '16 at 14:10 by AButkov





2 Answers

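read_csv ignores the order of usecols, so the reordering has to happen on the resulting DataFrame. Indexing with [cols_to_use] selects by column label, so this works when cols_to_use holds column names (or integer positions in a file read with header=None, where the labels are those integers). If the file has a header row and cols_to_use holds integer positions, the labels are the header strings and this lookup raises a KeyError. With label-based cols_to_use you can reuse the same list to select the columns in the desired order: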

df_ret = pd.read_csv(filepath, index_col=False, usecols=cols_to_use)[cols_to_use] 
answered Sep 23 '22 at 13:09 by MaxU - stop WAR against UA
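For the case in the question, where cols_to_use holds integer positions and the file has a header row, one possible approach is to reorder by position with iloc instead of by label. This is only a minimal sketch and not part of the answer above: the sample data and the file_order helper name are made up for illustration, and it assumes cols_to_use contains no duplicate positions.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the 50-column file.
csv_text = "a,b,c,d,e\n1,2,3,4,5\n6,7,8,9,10\n"

# Desired order, given as column positions in the file.
cols_to_use = [0, 1, 4, 2]

# read_csv ignores the order of usecols and returns columns in file order.
df = pd.read_csv(io.StringIO(csv_text), index_col=False, usecols=cols_to_use)

# The selected columns now sit in ascending position order, so map each
# requested position to its rank within that sorted order.
file_order = sorted(cols_to_use)
df_ret = df.iloc[:, [file_order.index(c) for c in cols_to_use]]

print(list(df.columns))      # ['a', 'b', 'c', 'e']
print(list(df_ret.columns))  # ['a', 'b', 'e', 'c']
```

Because the mapping is computed from cols_to_use itself, it should keep working when the column numbers change from run to run.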



Just piggybacking off this question here (hi from 2018).

I ran into the same problem with pandas read_csv and wanted a way to reorder the columns using their header strings. It's as simple as defining a list of header strings, index_strings, in the order you want and indexing the result with it:

pd.read_csv(filepath, index_col=False, usecols=cols_to_use)[index_strings] 
answered Sep 24 '22 at 13:09 by PeptideWitch
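As a rough illustration of the header-string approach, the sketch below uses a single list of names both to pick the columns and to order them. The sample data and column names are invented for the example; the answer above does not say what cols_to_use and index_strings contain.

```python
import io

import pandas as pd

# Hypothetical sample data; the real file and its headers are not shown above.
csv_text = "id,name,score,city\n1,Ann,90,Oslo\n2,Bob,85,Rome\n"

# Header strings in the order the rest of the code expects.
index_strings = ["city", "id", "score"]

# usecols picks the columns; the trailing [index_strings] fixes their order.
df_ret = pd.read_csv(
    io.StringIO(csv_text), index_col=False, usecols=index_strings
)[index_strings]

print(list(df_ret.columns))  # ['city', 'id', 'score']
```

Using the same list for both steps avoids a KeyError from asking for a column in the reorder step that usecols did not load.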
