Keeping columns in the specified order when using usecols in pandas read_csv

I have a CSV file with 50 columns of data. I am using the pandas read_csv function to pull in a subset of these columns, using the usecols parameter to choose the ones I want:

cols_to_use = [0,1,5,16,8]
df_ret = pd.read_csv(filepath, index_col=False, usecols=cols_to_use)

The trouble is that df_ret contains the correct columns, but not in the order I specified. They come back in ascending order, so [0,1,5,8,16]. (By the way, the column numbers can change from run to run; this is just an example.) This is a problem because the rest of the code has arrays that are in the "correct" order, and I would rather not have to reorder all of them.

Is there any clever pandas way of pulling in the columns in the order specified? Any help would be much appreciated!

asked Oct 13 '16 14:10

AButkov

2 Answers

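You can reuse the same cols_to_use list to select the columns in the desired order after reading. Note that the trailing [cols_to_use] selects by column label, so this works as written when the list holds header names, or integer positions in a file read with header=None (where the labels are the original column numbers). Integer positions in a file with a header row have to be translated to labels or positions first: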

df_ret = pd.read_csv(filepath, index_col=False, usecols=cols_to_use)[cols_to_use]
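For the integer positions used in the question, here is a minimal sketch of one way to restore the requested order in a single read. It relies on read_csv returning the selected columns in ascending file order, as described in the question, and assumes the positions are unique. The helper name read_csv_in_order and the sample data are hypothetical, not part of pandas.

```python
import io

import pandas as pd


def read_csv_in_order(source, positions, **kwargs):
    """Read the columns at the given integer positions, in the order given.

    Hypothetical helper: assumes `positions` contains unique integers.
    """
    # read_csv returns the selected columns in ascending file order.
    df = pd.read_csv(source, usecols=positions, **kwargs)
    # Map each requested position to its rank among the sorted positions,
    # which is where that column sits in the frame that was just read.
    sorted_positions = sorted(positions)
    order = [sorted_positions.index(p) for p in positions]
    return df.iloc[:, order]


# Made-up sample data with a header row.
csv_text = "a,b,c,d\n1,2,3,4\n5,6,7,8\n"
df_ret = read_csv_in_order(io.StringIO(csv_text), [0, 3, 1], index_col=False)
print(list(df_ret.columns))
```

Because the reordering is positional (iloc), it does not matter whether the file has a header row.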

answered Sep 23 '22 13:09

MaxU - stop WAR against UA

Just piggybacking off this question here (hi from 2018).

I ran into the same problem with pandas read_csv and wanted a way to do the reordering using column header strings. It's as simple as defining a list of header strings (index_strings below) in the order you want; every name in it must be one of the columns that was read.

pd.read_csv(filepath, index_col=False, usecols=cols_to_use)[index_strings]
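As an illustration of the header-string approach, here is a small self-contained sketch. The CSV content and column names are made up for the example, and the same list is used both for usecols and for the reordering.

```python
import io

import pandas as pd

# Made-up sample data; the header names are assumptions for this example.
csv_text = "id,name,score,city\n1,Ann,90,Oslo\n2,Bob,85,Rome\n"

# Header names in the order the rest of the code expects.
index_strings = ["score", "id", "name"]

# usecols picks the subset; the trailing [index_strings] fixes the order.
df = pd.read_csv(
    io.StringIO(csv_text), index_col=False, usecols=index_strings
)[index_strings]
print(list(df.columns))  # ['score', 'id', 'name']
```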

answered Sep 24 '22 13:09

PeptideWitch
