I'm on a Debian GNU/Linux computer, working with Python 2.7.9.
As a part of my job, I have been making python scripts that read inputs in various formats (e.g. Excel, Csv, Txt) and parse the information to more standarized files. It's not my first time opening or working with Excel files.
There's a particular file which is giving me problems, I just can't open it. When I tried with xlrd (version 0.9.3), it gave me the following error:
xlrd.open_workbook('sample.xls')
XLRDError: Unsupported format, or corrupt file: BOF not workbook/worksheet: op=0x0009 vers=0x0002 strm=0x000a build=0 year=0 -> BIFF21
I tried to investigate the matter on my own, found a couple of answers in StackOverflow but I couldn't open it anyway. This particular answer I found may be the problem (the second explanation), but it doesn't include a workaround: https://stackoverflow.com/a/16518707/4345659
A tool that could conert the file to csv/txt would also solve the problem.
I already tried with:
A sample file is available here: https://ufile.io/r4m6j
As a side note, I can open it with LibreOffice Calc and MS Excel, so I could eventually change it to csv that way. The thing is, I need to do it all with a python script.
Thanks in advance!
It seems like MS Problem. The xls file is very strange, maybe you should contact xlrd support.
But I have a crazy workaround for you: xls2ods. It works for me even though xls2csv doesn't (SiC!).
So, install catdoc first:
$sudo apt-get install catdoc
Then convert your xls file to ods and open ods using pyexcel_ods or whatever you prefer. To use pyexcel_ods install it first using pip install pyexcel_ods
.
import subprocess
from pyexcel_ods import get_data
file_basename = 'sample'
returncode = subprocess.call(['xls2ods', '{}.xls'.format(file_basename)])
if returnecode > 0:
# consider to use subprocess.Popen if you need more control on stderr
exit(returncode)
data = get_data('{}.ods'.format(file_basename))
print(data)
I'm getting following output:
OrderedDict([(u'sample',
[[u'labo',
u'codfarm',
u'farmacia',
u'direccion',
u'localidad',
u'nom_medico',
u'matricula',
u'troquel',
u'producto',
u'cant_total']])])
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With