I want to download a CSV file stored in Azure storage into a stream and use it directly in my Python script. I got the download working with help from Thomas, but now the pandas read_csv method fails with this error: pandas.io.common.EmptyDataError: No columns to parse from file. So I assume the downloaded CSV stream is actually empty, but when I check the storage account, the CSV file is fine and has all its data in it. What is the problem here? Below is the code from Thomas:
from azure.storage.blob import BlockBlobService
import io
from io import BytesIO, StringIO
import pandas as pd
from shutil import copyfileobj

with BytesIO() as input_blob:
    with BytesIO() as output_blob:
        block_blob_service = BlockBlobService(account_name='my account', account_key='mykey')
        # Download the blob into the in-memory stream
        block_blob_service.get_blob_to_stream('my counter', 'datatest1.csv', input_blob)
        df = pd.read_csv(input_blob)
        print(df)
        copyfileobj(input_blob, output_blob)
        #print(output_blob)
        # Create a new blob
        block_blob_service.create_blob_from_stream('my counter', 'datatest2.csv', output_blob)
If I don't execute the read_csv line, create_blob_from_stream creates an empty file, but if I do execute the read_csv line, I get this error:
pandas.parser.TextReader.cinit (pandas\parser.c:6171) pandas.io.common.EmptyDataError: No columns to parse from file
The file itself is stored fine in blob storage, with all its data in it.
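I finally figured it out, after spending so much time on this!
You have to execute:
input_blob.seek(0)
before using the stream, once the download has written the data into input_blob. Writing leaves the stream position at the end of the data, so read_csv finds nothing left to read until the stream is rewound to the start.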
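The rewind behavior can be reproduced without Azure or pandas. Below is a minimal sketch using only the standard library. The fake_download function and the CSV_BYTES sample data are hypothetical stand-ins for get_blob_to_stream and the real blob, not part of the Azure SDK. The same reasoning suggests that output_blob in the question also needs a seek(0) after copyfileobj, before it is handed to create_blob_from_stream, which would explain the empty file that was created. That part is an assumption based on how BytesIO behaves, not something verified against the SDK.

```python
import io
from shutil import copyfileobj

# Hypothetical sample data standing in for the real blob contents.
CSV_BYTES = b"id,value\n1,10\n2,20\n"


def fake_download(stream):
    """Hypothetical stand-in for get_blob_to_stream: writes bytes into the stream."""
    stream.write(CSV_BYTES)


input_blob = io.BytesIO()
fake_download(input_blob)

# After the write, the stream position sits at the end of the data.
position_after_download = input_blob.tell()

# Reading now yields nothing, which is what read_csv ran into.
read_without_rewind = input_blob.read()

# Rewind, then read: the data is all there.
input_blob.seek(0)
read_after_rewind = input_blob.read()

# Any reader leaves the position at the end again, so rewind before
# copying, and rewind the copy before handing it to the next consumer.
output_blob = io.BytesIO()
input_blob.seek(0)
copyfileobj(input_blob, output_blob)
output_blob.seek(0)
copied = output_blob.read()

print(position_after_download, read_without_rewind, read_after_rewind == copied)
```

In the original script this would mean calling input_blob.seek(0) once before pd.read_csv and again before copyfileobj, since read_csv consumes the stream as well.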