
Read only the headers of Excel files

I have a large number of Excel files that I need to download from the web. From each one I only need the header (the column names) before moving on. So far I have only managed to download the whole file and read it into a pandas DataFrame, from which I can extract the column names.

Is there a faster way, either by reading the file remotely instead of downloading it, or by parsing only the header instead of the whole Excel file?

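import pandas as pd
import requests

# test_url is the address of one of the Excel files
resp = requests.get(test_url)

with open('test.xls', 'wb') as output:
    output.write(resp.content)

# sheet_name=2 selects the third sheet (the index is zero-based)
headers = pd.ExcelFile("test.xls").parse(sheet_name=2)

headers.columns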

If there is not an efficient way to "partially" download the Excel file to get only the header, is there an efficient way to read only the header after it has already been downloaded?

Josh asked Dec 06 '25 03:12


1 Answer

I would say no, because xls Excel files are binary files, so the pandas ExcelFile parser needs the complete file. If you give it a partial file, it should reject it as invalid (and with good reason).
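To make the "binary file" point concrete: legacy .xls workbooks are stored in the OLE2 compound file container, while .xlsx workbooks are ZIP archives, and each container starts with a fixed signature. The sketch below only inspects those leading bytes. `sniff_excel_container` is a hypothetical helper name, other OLE2 documents (such as old .doc files) share the same signature, and recognising the container says nothing about whether a truncated file can be parsed.

```python
# Leading signatures of the two containers Excel workbooks are stored in.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # compound file (.xls)
ZIP_MAGIC = b"PK\x03\x04"  # first local file header of a ZIP archive (.xlsx)


def sniff_excel_container(prefix: bytes) -> str:
    """Guess the container type from the first bytes of a file.

    Hypothetical helper for illustration: returns "ole2", "zip" or "unknown".
    """
    if prefix.startswith(OLE2_MAGIC):
        return "ole2"
    if prefix.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"


# Demonstration on hand-made byte strings rather than real downloads.
print(sniff_excel_container(OLE2_MAGIC + b"\x00" * 8))
print(sniff_excel_container(b"col_a,col_b\n1,2\n"))
```

Only a handful of bytes are needed for this check, so it can run on the very first chunk of a download.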

If you really want to do that, you will have to thoroughly analyze some of the Excel files you want to process at the byte level and work out the minimum number of bytes needed to reach the names in the first row. Then you have to download the files by driving the HTTP protocol at a low enough level to close the connection, or at least stop reading, as soon as you have enough bytes. Finally, you have to write a dedicated parser and hope that nothing changes in those files, because you are no longer using high-level, maintained tools but only raw binary reads.
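The "stop reading as soon as you have enough bytes" step can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not a tested downloader: `read_prefix`, `fetch_prefix` and the 8192-byte default are hypothetical names and values, and the number of bytes actually needed to reach the header row would have to come from analysing the real files. The `Range` header is only a request. A server may ignore it and send the whole body, which is why the read is capped as well.

```python
import io
import urllib.request


def read_prefix(stream, max_bytes, chunk_size=1024):
    """Read at most max_bytes from a binary stream, then stop."""
    buf = bytearray()
    while len(buf) < max_bytes:
        chunk = stream.read(min(chunk_size, max_bytes - len(buf)))
        if not chunk:  # the stream ended before the limit was reached
            break
        buf.extend(chunk)
    return bytes(buf)


def fetch_prefix(url, max_bytes=8192):
    """Download only the first max_bytes of url (hypothetical helper)."""
    # Ask the server for a byte range; servers are free to ignore this.
    request = urllib.request.Request(
        url, headers={"Range": f"bytes=0-{max_bytes - 1}"}
    )
    # Leaving the with-block closes the response without reading the rest.
    with urllib.request.urlopen(request) as response:
        return read_prefix(response, max_bytes)


# Demonstration on an in-memory stream instead of a real URL.
sample = io.BytesIO(b"x" * 5000)
print(len(read_prefix(sample, 2048)))
```

The bytes returned this way would still need the dedicated parser described above; a regular Excel reader is unlikely to accept a truncated file.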

TL;DR: unless you have a very strong reason to do that, just forget it, because it will be hard, error-prone and hard to maintain, if it is possible at all.

Serge Ballesta answered Dec 07 '25 18:12


