Python Pandas dataframe reading exact specified range in an excel sheet

I have a lot of different tables (and other unstructured data) in an Excel sheet. I need to create a DataFrame from the range 'A3:D20' in 'Sheet2' of the Excel workbook 'data'.

All the examples I come across drill down only to the sheet level; none show how to read an exact range.

import openpyxl
import pandas as pd

wb = openpyxl.load_workbook('data.xlsx')
sheet = wb.get_sheet_by_name('Sheet2')
range = ['A3':'D20']   #<-- how to specify this?
spots = pd.DataFrame(sheet.range) #what should be the exact syntax for this?

print (spots)

Once I get this, I plan to look up data in column A and find its corresponding value in column B.

Edit 1: I realised that openpyxl takes too long, so I switched to pandas.read_excel('data.xlsx','Sheet2') instead, which is much faster, at that stage at least.

Edit 2: For the time being, I have put my data in just one sheet and:

removed all other info
added column names,
applied index_col on my leftmost column
then used wb.loc[]
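
For reference, the index_col plus .loc approach from Edit 2 can be sketched roughly as follows. The frame is built in memory as a stand-in for the cleaned-up sheet, and the column names (spot, price) are made up for illustration; set_index on the leftmost column is used here as the in-memory counterpart of passing index_col=0 to read_excel.

```python
import pandas as pd

# Hypothetical stand-in for the cleaned-up sheet; column names are invented.
df = pd.DataFrame(
    {
        "spot": ["north", "south", "east"],
        "price": [10.5, 12.0, 9.75],
    }
)

# Make the leftmost column the index, as index_col=0 would do in read_excel.
lookup = df.set_index("spot")

# Find the value in the second column for a given key in the first column.
south_price = lookup.loc["south", "price"]
print(south_price)
```

If the keys in the leftmost column are not unique, .loc returns a Series of all matches instead of a single value.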

asked Jul 25 '16 06:07

spiff

2 Answers

Use the following arguments from the pandas read_excel documentation:

skiprows : list-like

Rows to skip at the beginning (0-indexed)

parse_cols : int or list, default None

If None then parse all columns,

If int then indicates last column to be parsed

If list of ints then indicates list of column numbers to be parsed

If string then indicates comma separated list of column names and column ranges (e.g. “A:E” or “A,C,E:F”)

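I imagine the call will look like this. Note that parse_cols has since been renamed usecols, and recent pandas versions accept only the new name. Adding nrows=17 stops the read at row 20, assuming row 3 holds the column names:

df = pd.read_excel(filename, sheet_name='Sheet2', skiprows=2, usecols='A:D', nrows=17)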
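
To avoid working out the offsets by hand, an A1-style range can be translated into read_excel arguments mechanically. This is only a minimal sketch: excel_range_to_kwargs is a hypothetical helper (not part of pandas), and it assumes a recent pandas version where the column selector is called usecols and nrows is supported.

```python
import re


def excel_range_to_kwargs(cell_range, header=True):
    """Translate an A1-style range such as 'A3:D20' into read_excel keyword arguments.

    Hypothetical helper, not a pandas function. If header is True, the first
    row of the range is treated as the column names.
    """
    match = re.fullmatch(r"([A-Za-z]+)(\d+):([A-Za-z]+)(\d+)", cell_range)
    if match is None:
        raise ValueError(f"not an A1-style range: {cell_range!r}")
    first_col, first_row, last_col, last_row = match.groups()
    first_row, last_row = int(first_row), int(last_row)
    n_rows = last_row - first_row + 1  # rows in the range, inclusive
    return {
        "usecols": f"{first_col.upper()}:{last_col.upper()}",
        "skiprows": first_row - 1,  # rows above the range
        "header": 0 if header else None,
        # nrows counts data rows only, so the header row is excluded.
        "nrows": n_rows - 1 if header else n_rows,
    }


kwargs = excel_range_to_kwargs("A3:D20")
print(kwargs)

# Intended use (requires pandas and the workbook on disk):
# import pandas as pd
# df = pd.read_excel("data.xlsx", sheet_name="Sheet2", **kwargs)
```

With header=False the helper sets header=None and nrows=18, so all 18 rows of A3:D20 come back as data with default integer column labels.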

answered Sep 18 '22 07:09

shane

One way to do this is to use the openpyxl module.

Here's an example:

import pandas as pd
from openpyxl import load_workbook

wb = load_workbook(filename='data.xlsx',
                   read_only=True)

ws = wb['Sheet2']

# Read the cell values into a list of lists
data_rows = []
for row in ws['A3':'D20']:
    data_cols = []
    for cell in row:
        data_cols.append(cell.value)
    data_rows.append(data_cols)

# Transform into dataframe
df = pd.DataFrame(data_rows)
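
If the first row of the range contains the column names, the same idea can be wrapped in a small function. The sketch below is an assumption-laden variant, not a replacement for the code above: range_to_dataframe is a hypothetical helper, and the workbook is created in memory with made-up data so the example runs without data.xlsx.

```python
import pandas as pd
from openpyxl import Workbook


def range_to_dataframe(ws, cell_range, header=True):
    """Build a DataFrame from a worksheet range such as 'A3:D20' (hypothetical helper)."""
    rows = [[cell.value for cell in row] for row in ws[cell_range]]
    if header and rows:
        # Treat the first row of the range as the column names.
        return pd.DataFrame(rows[1:], columns=rows[0])
    return pd.DataFrame(rows)


# Small in-memory workbook with invented data, standing in for data.xlsx.
wb = Workbook()
ws = wb.active
ws.title = "Sheet2"
ws["A1"] = "unrelated note above the table"
table = [["key", "value"], ["a", 1], ["b", 2]]
for r, row in enumerate(table, start=3):
    for c, value in enumerate(row, start=1):
        ws.cell(row=r, column=c, value=value)

df = range_to_dataframe(ws, "A3:B5")
print(df)
```

For the workbook in the question, the call would be range_to_dataframe(wb['Sheet2'], 'A3:D20') after loading the file with load_workbook.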

answered Sep 18 '22 07:09

DocZerø
