Read certain column in excel to dataframe

I want to read a certain column from an Excel file into a dataframe, and I want to specify that column by its header name.

For example, I have an Excel file with two columns in Sheet 2: "number" in column A and "ForeignKey" in column B. I want to import "ForeignKey" into a dataframe. I read the file with the following script:

xl_file = pd.read_excel('D:/SnapPython/TestDF.xlsx', sheet_name='Sheet 2', usecols=[0,1])

It shows the following in my xl_file:

       number ForeignKey
0       1        abc
1       2        def
2       3        ghi

With a small number of columns, I can get "ForeignKey" by specifying usecols=[1]. However, if there are many columns and I know the column name pattern, it would be easier to specify the column name. I tried the following code, but it gives an empty dataframe.

xl_file = pd.read_excel('D:/SnapPython/TestDF.xlsx', sheet_name='Sheet 2', usecols=['ForeignKey'])

According to the discussion in the following link, the code above works, but only for read_csv.

How to drop a specific column of csv file while reading it using pandas?

Is there a way to do this when reading an Excel file?

Thank you in advance.

asked Jan 09 '19 09:01

Fadri

1 Answer

There is a solution, but read_csv and read_excel do not treat usecols the same way.

From the documentation, for read_csv:

usecols : list-like or callable, default None

For example, a valid list-like usecols parameter would be [0, 1, 2] or ['foo', 'bar', 'baz'].

For read_excel:

usecols : int or list, default None

If None then parse all columns,

If int then indicates last column to be parsed

If list of ints then indicates list of column numbers to be parsed

If string then indicates comma separated list of Excel column letters and column ranges (e.g. "A:E" or "A,C,E:F"). Ranges are inclusive of both sides

So for Excel the string form of usecols must contain column letters, not header names. To read "ForeignKey", which sits in column B:

xl_file = pd.read_excel('D:/SnapPython/TestDF.xlsx', sheet_name='Sheet 2', usecols='B')

and if you also need 'number':

xl_file = pd.read_excel('D:/SnapPython/TestDF.xlsx', sheet_name='Sheet 2', usecols='A,B')

EDIT: you need to pass the Excel column letter, not the header name of the data. The other answer covers this; note that you don't need 'B:B', since 'B' alone will do the trick. However, this is no real improvement over passing column numbers to usecols.
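If the header row is already known, the header names can be translated into the Excel column letters that the string form of usecols expects. The following is a minimal sketch under that assumption; column_letter and usecols_for are hypothetical helper names, not part of pandas, and the headers list is hard-coded from the example sheet.

```python
def column_letter(index):
    """Convert a zero-based column index to an Excel letter (0 -> 'A', 26 -> 'AA')."""
    letters = ""
    index += 1  # work with a one-based position
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def usecols_for(headers, wanted):
    """Build a usecols string such as 'A,B' from header names.

    headers: the sheet's header names in column order (assumed known).
    wanted:  the header names to read.
    """
    positions = [headers.index(name) for name in wanted]
    return ",".join(column_letter(position) for position in positions)


# Header order taken from the example sheet in the question.
headers = ["number", "ForeignKey"]
usecols = usecols_for(headers, ["ForeignKey"])
```

The resulting string can then be passed as usecols=usecols to pd.read_excel. In practice the header list would have to come from the sheet itself, so this only helps when reading the header row is cheap compared with reading the whole sheet.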

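If the whole sheet loads quickly, the simplest way to solve this may be to parse all columns and then select the desired one by name. Note that indexing with a single name returns a Series; use [['ForeignKey']] instead to keep a dataframe: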

xl_file = pd.read_excel('D:/SnapPython/TestDF.xlsx', sheet_name='Sheet 2')['ForeignKey']
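For the "column name pattern" case from the question, the selection can be driven by a predicate on the header name. The sketch below assumes pandas 0.24 or later, where read_excel also accepts a callable (or a list of header names) for usecols; is_wanted and read_matching_columns are hypothetical names, and the small in-memory frame only stands in for the parsed sheet.

```python
import pandas as pd


def is_wanted(header):
    """Return True for header names matching the pattern of interest."""
    return str(header).startswith("Foreign")


def read_matching_columns(path, sheet_name):
    # Assumption: pandas >= 0.24, where usecols may be a callable that is
    # evaluated against each header name.
    return pd.read_excel(path, sheet_name=sheet_name, usecols=is_wanted)


# The same predicate applied after loading everything.
# This frame mimics the parsed example sheet from the question.
df = pd.DataFrame({"number": [1, 2, 3], "ForeignKey": ["abc", "def", "ghi"]})
selected = df.loc[:, [name for name in df.columns if is_wanted(name)]]
```

Selecting with a list of names keeps the result a dataframe even when only one column matches. On older pandas versions only the second approach (load everything, then filter) is available.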

answered Oct 01 '22 16:10

Frayal

Tags: python, pandas, dataframe
