From password-protected Excel file to pandas DataFrame

I can open a password-protected Excel file with this:

import sys
import win32com.client

xlApp = win32com.client.Dispatch("Excel.Application")
print("Excel library version:", xlApp.Version)
filename, password = sys.argv[1:3]
xlwb = xlApp.Workbooks.Open(filename, Password=password)
# xlwb = xlApp.Workbooks.Open(filename)
xlws = xlwb.Sheets(1)  # counts from 1, not from 0
print(xlws.Name)
print(xlws.Cells(1, 1).Value)  # that's A1

I'm not sure, though, how to transfer the data to a pandas DataFrame. Do I need to read the cells one by one, or is there a more convenient way to do this?

2 Answers

Simple solution

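import io
import pandas as pd
import msoffcrypto

passwd = 'xyz'
file_path = 'encrypted.xlsx'  # placeholder: path to the protected workbook

decrypted_workbook = io.BytesIO()
with open(file_path, 'rb') as file:
    office_file = msoffcrypto.OfficeFile(file)
    office_file.load_key(password=passwd)  # supply the password
    office_file.decrypt(decrypted_workbook)  # write the decrypted bytes to the buffer

df = pd.read_excel(decrypted_workbook, sheet_name='abc')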

The msoffcrypto module comes from the msoffcrypto-tool package:

pip install --user msoffcrypto-tool

Exporting all sheets of each Excel file in a directory and its sub-directories to separate CSV files

import io
import os
from glob import glob

import msoffcrypto
import pandas as pd

PATH = "Active Cons data"
passwd = 'xyz'

# Scanning all the Excel files in the directory and its sub-directories
excel_files = [y for x in os.walk(PATH) for y in glob(os.path.join(x[0], '*.xlsx'))]

for excel_file in excel_files:
    decrypted_workbook = io.BytesIO()
    with open(excel_file, 'rb') as file:
        office_file = msoffcrypto.OfficeFile(file)
        office_file.load_key(password=passwd)
        office_file.decrypt(decrypted_workbook)

    # sheet_name=None returns a dict mapping each sheet name to its DataFrame
    sheets = pd.read_excel(decrypted_workbook, sheet_name=None)
    for sheet, df in sheets.items():
        # Note: sheets with the same name in different workbooks overwrite each other
        new_file = f"D:\\all_csv\\{sheet}.csv"
        df.to_csv(new_file, index=False)
Source: David Hamann's site (all credit goes to him): https://davidhamann.de/2018/02/21/read-password-protected-excel-files-into-pandas-dataframe/
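
One thing to watch in the export loop: the CSV name is built from the sheet name alone, so two workbooks that both contain a sheet called, say, "Sheet1" would write to the same file. Below is a minimal sketch of one way around that, using only the standard library. The helper names find_workbooks and csv_name are hypothetical (they are not part of pandas or msoffcrypto), and the naming scheme is just an assumption about what a reasonable output name might look like.

```python
from pathlib import Path


def find_workbooks(root):
    """Return every .xlsx file under root, including sub-directories, sorted."""
    return sorted(Path(root).rglob("*.xlsx"))


def csv_name(root, workbook_path, sheet_name):
    """Build a CSV file name that encodes the workbook's location and the sheet.

    Hypothetical scheme: '<relative path joined by _>__<sheet>.csv'.
    """
    relative = Path(workbook_path).relative_to(root).with_suffix("")
    return "_".join(relative.parts) + f"__{sheet_name}.csv"


# Example: a workbook at data/2020/sales.xlsx with a sheet named "Q1"
example = csv_name("data", "data/2020/sales.xlsx", "Q1")
```

In the loop above, new_file could then be built by joining the output directory with csv_name(PATH, excel_file, sheet) instead of using the sheet name on its own.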

Use xlwings. Opening the file first launches the Excel application so that you can enter the password.

import pandas as pd
import xlwings as xw

PATH = '/Users/me/Desktop/xlwings_sample.xlsx'
wb = xw.Book(PATH)
sheet = wb.sheets['sample']

df = sheet['A1:C4'].options(pd.DataFrame, index=False, header=True).value
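
If you would rather stay with the win32com code from the question, reading cells one by one should not be necessary either. As far as I know, the Value of a multi-cell range (for example xlws.UsedRange.Value) comes back as a tuple of row tuples, which can be handed to pandas in one go. Here is a rough sketch under that assumption. It also assumes the first row holds the column headers. The function names split_header and range_to_dataframe are made up for illustration, and the sample tuple merely stands in for real worksheet data.

```python
def split_header(rows):
    """Split a tuple-of-tuples range value into (column names, data rows)."""
    if not rows:
        return [], []
    header, *data = rows
    return list(header), [list(row) for row in data]


def range_to_dataframe(rows):
    """Build a DataFrame from a range value whose first row is the header."""
    import pandas as pd  # imported here so split_header works without pandas

    columns, data = split_header(rows)
    return pd.DataFrame(data, columns=columns)


# Stand-in for xlws.UsedRange.Value; empty cells are assumed to arrive as None.
sample = (("name", "qty"), ("apple", 3.0), ("pear", None))
columns, data = split_header(sample)
```

With the question's variables this would be called as range_to_dataframe(xlws.UsedRange.Value). Note that a single-cell range returns a plain scalar rather than a tuple, so that case would need separate handling.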
