I am using Python 3.4 and xlrd. I want to sort an Excel sheet by its primary column before processing it. Is there a library that can do this?


How to sort Excel sheet using Python

2 Answers

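There are a couple of ways to do this. The first option is to use xlrd, since you have tagged it. xlrd only reads workbooks, so the sorted data has to be written back with xlwt, and the biggest downside is that xlwt writes the older XLS format, not XLSX. Note also that xlrd 2.0 and later read only .xls files, so opening an .xlsx file as shown here needs an older xlrd release.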

These examples use an Excel document with this format:

Text Excel Layout

Utilizing xlrd and a few modifications from this answer:

import xlwt
from xlrd import open_workbook

target_column = 0     # This example only has 1 column, and it is 0 indexed

book = open_workbook('test.xlsx')
sheet = book.sheets()[0]
data = [sheet.row_values(i) for i in range(sheet.nrows)]    # range, since xrange does not exist on Python 3
labels = data[0]    # Don't sort our headers
data = data[1:]     # Data begins on the second row
data.sort(key=lambda x: x[target_column])

bk = xlwt.Workbook()
sheet = bk.add_sheet(sheet.name)

for idx, label in enumerate(labels):
    sheet.write(0, idx, label)

for idx_r, row in enumerate(data):
    for idx_c, value in enumerate(row):
        sheet.write(idx_r+1, idx_c, value)

bk.save('result.xls')    # Notice this is xls, not xlsx like the original file is

This outputs the following workbook:

XLRD output
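One caveat with the plain sort above on Python 3: xlrd returns numbers as floats, text as strings, and empty cells as empty strings, and comparing a float with a string raises a TypeError. The following is only a sketch of one way around that, not part of the original answer. The helper name sort_rows and the chosen ordering (numbers first, then text, then empty cells) are assumptions made for illustration.

```python
def sort_rows(rows, column, has_header=True):
    """Return rows sorted by one column, keeping the header row first.

    Hypothetical helper: numbers sort before text, and empty cells go last,
    so columns with mixed cell types do not raise TypeError on Python 3.
    """
    header = rows[:1] if has_header else []
    body = rows[1:] if has_header else list(rows)

    def key(row):
        value = row[column]
        if value == "" or value is None:
            return (2, "")              # empty cells last
        if isinstance(value, (int, float)):
            return (0, value)           # numeric cells first
        return (1, str(value))          # everything else compared as text

    return header + sorted(body, key=key)


# Rows shaped like the output of sheet.row_values(i)
rows = [["Header Row"], ["pear"], [3.0], [""], ["apple"], [1.0]]
result = sort_rows(rows, 0)
```

With the xlrd code above, this would take the place of slicing off the labels and calling data.sort directly.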

Another option, and one that can write XLSX output, is to use pandas. The code is also shorter:

import pandas as pd

xl = pd.ExcelFile("test.xlsx")
df = xl.parse("Sheet1")
df = df.sort(columns="Header Row")    # old API: DataFrame.sort was removed in pandas 0.20; newer versions use sort_values(by=...)

writer = pd.ExcelWriter('output.xlsx')
df.to_excel(writer,sheet_name='Sheet1',columns=["Header Row"],index=False)
writer.save()

This outputs:

Pandas Output

In the to_excel call, index is set to False so that the pandas DataFrame index isn't written to the Excel document. The rest of the keywords should be self-explanatory.

answered Oct 10 '22 14:10

Andy

I just wanted to refresh the answer as the Pandas implementation has changed a bit over time. Here's the code that should work now (pandas 1.1.2).

import pandas as pd

xl = pd.ExcelFile("test.xlsx")
df = xl.parse("Sheet1")
df = df.sort_values(by="Header Row")
...

The sort function is now called sort_values, and its columns argument is replaced by by.
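For completeness, here is a rough sketch of the whole read, sort, and write round trip with the newer API. The function names sort_frame and sort_workbook are made up for this example. It also assumes an engine such as openpyxl is installed for .xlsx files. The writer is used as a context manager because recent pandas releases no longer provide writer.save().

```python
import pandas as pd


def sort_frame(df, column, ascending=True):
    """Sort a DataFrame by one column and renumber the index (hypothetical helper)."""
    # mergesort is a stable sort, so rows with equal keys keep their original order
    ordered = df.sort_values(by=column, ascending=ascending, kind="mergesort")
    return ordered.reset_index(drop=True)


def sort_workbook(src, dst, sheet="Sheet1", column="Header Row"):
    """Read one sheet, sort it, and write it to a new workbook (hypothetical helper)."""
    df = pd.read_excel(src, sheet_name=sheet)
    # Leaving the with block saves and closes the file
    with pd.ExcelWriter(dst) as writer:
        sort_frame(df, column).to_excel(writer, sheet_name=sheet, index=False)


# In-memory example, no Excel file needed
example = pd.DataFrame({"Header Row": ["pear", "apple", "fig"]})
sorted_example = sort_frame(example, "Header Row")
```

Calling sort_workbook("test.xlsx", "output.xlsx") would then mirror the original answer's flow.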

answered Oct 10 '22 15:10

akshayranganath


Tags:

python

xlrd

Ree
