
How to read an xlsx stream using openpyxl

I am trying to read a streamed xlsx file. The user enters the data through a user interface, and the data is then passed to my program as an xlsx stream. How do I read it? I couldn't find any documentation.

cat text.xlsx | python myprogram.py

How do I read this stream? Any help would be appreciated. I am not sure whether openpyxl supports reading from a stream. I am using Python 3.

Kumar Govind asked Sep 11 '25 00:09



2 Answers

The first argument of openpyxl.load_workbook, filename, can be either a path or a file-like object, and sys.stdin is the file-like object representing your program's standard input.

It must be in binary mode, though, so pass sys.stdin.buffer rather than sys.stdin; see the note in the Python docs on binary standard streams.

import sys
from openpyxl import load_workbook

wb = load_workbook(sys.stdin.buffer)
print(wb.sheetnames)

Run:

$ cat test.xlsx | python test.py
['Sheet1', 'Sheet2']
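
One caveat worth noting: an xlsx file is a zip archive, and zip readers need to seek within their input, which a pipe does not support. Depending on the openpyxl and Python versions, passing sys.stdin.buffer directly may therefore fail. A minimal sketch of a more defensive variant, which first copies the stream into an in-memory buffer, could look like this (buffer_stream, load_workbook_from_stream and main are hypothetical names chosen for illustration):

```python
import io
import sys


def buffer_stream(stream):
    """Copy a binary stream into memory and return a seekable BytesIO."""
    return io.BytesIO(stream.read())


def load_workbook_from_stream(stream):
    """Load a workbook from any binary stream by buffering it first."""
    # Imported lazily so buffer_stream() can be used without openpyxl.
    from openpyxl import load_workbook
    return load_workbook(buffer_stream(stream))


def main():
    # sys.stdin.buffer is the binary view of standard input.
    wb = load_workbook_from_stream(sys.stdin.buffer)
    print(wb.sheetnames)
```

Calling main() at the bottom of the script lets it be used in the same pipeline as above. The trade-off is that the whole file is held in memory, which is usually acceptable for spreadsheets of ordinary size.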
Norrius answered Sep 12 '25 16:09



I do not have enough rep to comment, therefore I am posting this as an answer.

As @CheradenineZK mentions, if the input comes from (Azure) blob storage, it needs to be wrapped in a byte buffer. I struggled to figure out exactly how to do this, but the following does the trick for me:

import io
import openpyxl

# myblob is the blob object; its read() is assumed to return the content as bytes
bytes_in = io.BytesIO(myblob.read())
wb = openpyxl.load_workbook(bytes_in)
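
To illustrate why the byte buffer helps, here is a small sketch using only the standard library. It assumes the blob object behaves like a stream that can be read but not seeked; NonSeekableReader is a hypothetical stand-in for such an object, not part of any Azure SDK, and whether a particular blob object is seekable depends on the library in use. Since an xlsx file is a zip archive, the sketch uses zipfile directly:

```python
import io
import zipfile


class NonSeekableReader(io.RawIOBase):
    """Hypothetical stand-in for a blob or pipe: readable, not seekable."""

    def __init__(self, data):
        super().__init__()
        self._inner = io.BytesIO(data)

    def readable(self):
        return True

    def seekable(self):
        return False

    def readinto(self, b):
        # Delegate the actual reading to the in-memory copy.
        return self._inner.readinto(b)


def make_zip_bytes():
    """Build a tiny zip archive in memory (an xlsx file is also a zip)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("hello.txt", "hi")
    return buf.getvalue()


def can_open_as_zip(stream):
    """Return True if zipfile can read the archive from this stream."""
    try:
        with zipfile.ZipFile(stream) as zf:
            zf.namelist()
        return True
    except (zipfile.BadZipFile, OSError):
        return False


data = make_zip_bytes()
direct = can_open_as_zip(NonSeekableReader(data))
buffered = can_open_as_zip(io.BytesIO(NonSeekableReader(data).read()))
print(direct, buffered)
```

The direct attempt should fail because the zip reader has to seek to the end of the archive to find its directory, while the buffered attempt should succeed because io.BytesIO supports seeking.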

Andreas answered Sep 12 '25 16:09
