I downloaded a Google spreadsheet as an object in Python.
How can I use openpyxl to work with the workbook without having to save it to disk first?
I know that xlrd can do this with:
book = xlrd.open_workbook(file_contents=downloaded_spreadsheet.read())
where "downloaded_spreadsheet" is my downloaded xlsx file as an object.
Instead of xlrd, I want to use openpyxl because of its better xlsx support (so I have read).
This is what I'm using so far:
    #!/usr/bin/python
    import openpyxl
    import xlrd  # which to use..?
    import re, urllib, urllib2


    class Spreadsheet(object):
        def __init__(self, key):
            super(Spreadsheet, self).__init__()
            self.key = key


    class Client(object):
        def __init__(self, email, password):
            super(Client, self).__init__()
            self.email = email
            self.password = password

        def _get_auth_token(self, email, password, source, service):
            url = "https://www.google.com/accounts/ClientLogin"
            params = {
                "Email": email,
                "Passwd": password,
                "service": service,
                "accountType": "HOSTED_OR_GOOGLE",
                "source": source
            }
            req = urllib2.Request(url, urllib.urlencode(params))
            return re.findall(r"Auth=(.*)", urllib2.urlopen(req).read())[0]

        def get_auth_token(self):
            source = type(self).__name__
            return self._get_auth_token(self.email, self.password, source, service="wise")

        def download(self, spreadsheet, gid=0, format="xls"):
            url_format = "https://spreadsheets.google.com/feeds/download/spreadsheets/Export?key=%s&exportFormat=%s&gid=%i"
            headers = {
                "Authorization": "GoogleLogin auth=" + self.get_auth_token(),
                "GData-Version": "3.0"
            }
            req = urllib2.Request(url_format % (spreadsheet.key, format, gid), headers=headers)
            return urllib2.urlopen(req)


    if __name__ == "__main__":
        email = "[email protected]"  # (your email here)
        password = '.....'
        spreadsheet_id = "......"  # (spreadsheet id here)

        # Create client and spreadsheet objects
        gs = Client(email, password)
        ss = Spreadsheet(spreadsheet_id)

        # Request a file-like object containing the spreadsheet's contents
        downloaded_spreadsheet = gs.download(ss)

        # book = xlrd.open_workbook(file_contents=downloaded_spreadsheet.read(), formatting_info=True)
        # It works.. alas xlrd doesn't support the xlsx functionality that I want,
        # i.e. being able to read the cell color data.
I hope someone can help, because I have been struggling for months to get the color data from a given cell in a Google spreadsheet. (I know the Google API doesn't support it.)
Reading an Excel file can be orders of magnitude slower with openpyxl than with xlrd.
Developers describe openpyxl as "A Python library to read/write Excel 2010 xlsx/xlsm files". On the other hand, pandas is described as "Powerful data structures for data analysis".
If you are working with large files or are particularly concerned about speed then you may find XlsxWriter a better choice than OpenPyXL. XlsxWriter is a Python module that can be used to write text, numbers, formulas and hyperlinks to multiple worksheets in an Excel 2007+ XLSX file.
The docs for load_workbook say:
    :param filename: the path to open or a file-like object
So it was capable of this all along: it reads a path or takes a file-like object. I only had to convert the file-like object returned by urlopen to a byte stream with:
    from io import BytesIO
    from openpyxl import load_workbook

    # input_excel is the file-like object returned by urlopen
    wb = load_workbook(filename=BytesIO(input_excel.read()))
and I can read every piece of data in my Google spreadsheet.
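For anyone who wants to try the pattern without a Google account, here is a minimal sketch that round-trips a workbook entirely in memory and reads back a cell's fill color. The in-memory workbook, the cell "A1" and the red fill are made up for illustration; they only stand in for the bytes that urlopen would return. Note that openpyxl reads xlsx, not xls, so the export would presumably have to be requested with format="xlsx" rather than the "xls" default in the question's download method.

```python
from io import BytesIO

from openpyxl import Workbook, load_workbook
from openpyxl.styles import PatternFill

# Build a small workbook in memory as a stand-in for the downloaded file.
# (Hypothetical data: one cell with a solid red fill.)
source_wb = Workbook()
source_ws = source_wb.active
source_ws["A1"] = "status"
source_ws["A1"].fill = PatternFill(fill_type="solid", fgColor="FFFF0000")

# Save to a byte buffer instead of a path; nothing touches the disk.
buffer = BytesIO()
source_wb.save(buffer)
xlsx_bytes = buffer.getvalue()  # plays the role of input_excel.read()

# Same pattern as in the answer: wrap the raw bytes in BytesIO and load.
wb = load_workbook(filename=BytesIO(xlsx_bytes))
cell = wb.active["A1"]

cell_value = cell.value
fill_rgb = cell.fill.fgColor.rgb  # ARGB hex string for a solid RGB fill

print(cell_value, fill_rgb)
```

This assumes the fill is stored as a plain RGB color. If the exported file uses theme or indexed colors instead, fgColor.rgb may not hold a usable hex string, and the theme or indexed attributes of the color would need to be checked.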