I am trying to read an Excel file with a .xlsx extension from Azure Blob Storage in my Azure Data Factory dataset. It throws the following error:
Error found when processing 'Csv/Tsv Format Text' source 'Filename.xlsx' with row number 3: found more columns than expected column count: 1.
What are the right column and row delimiters for Excel files to be read in Azure Data Factory?
The service supports both ".xls" and ".xlsx". The Excel format is supported for the following connectors: Amazon S3, Amazon S3 Compatible Storage, Azure Blob, Azure Data Lake Storage Gen1, Azure Data Lake Storage Gen2, Azure Files, File System, FTP, Google Cloud Storage, HDFS, HTTP, Oracle Cloud Storage, and SFTP.
In Azure SQL Database, you cannot import directly from Excel. You must first export the data to a text (CSV) file. Before you can run a distributed query, you have to enable the ad hoc distributed queries server configuration option, as shown in the following example.
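A minimal sketch of that configuration change, using the standard sp_configure syntax, run against the SQL Server instance that will execute the distributed query:

```sql
-- Make advanced server options visible.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
-- Enable ad hoc distributed queries (OPENROWSET / OPENDATASOURCE).
EXEC sp_configure 'Ad Hoc Distributed Queries', 1;
RECONFIGURE;
```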
An XLSX file is a Microsoft Excel Open XML Format Spreadsheet file. Open one with Excel, Excel Viewer, Google Sheets, or another spreadsheet program.
Update March 2022: ADF now has better support for Excel via Mapping Data Flows:
https://docs.microsoft.com/en-us/azure/data-factory/format-excel
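With that support, the file from the question can be read by defining the dataset with the Excel format instead of DelimitedText, so no delimiters need to be configured at all. A rough sketch of such a dataset definition, assuming a Blob Storage linked service called AzureBlobStorageLS, a container called mycontainer, and a worksheet called Sheet1 (all placeholders):

```json
{
    "name": "ExcelDataset",
    "properties": {
        "type": "Excel",
        "linkedServiceName": {
            "referenceName": "AzureBlobStorageLS",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "location": {
                "type": "AzureBlobStorageLocation",
                "container": "mycontainer",
                "fileName": "Filename.xlsx"
            },
            "sheetName": "Sheet1",
            "firstRowAsHeader": true
        }
    }
}
```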
Excel files use a proprietary format and are not simple delimited text files, so there are no column or row delimiters to set. As indicated here, Azure Data Factory previously had no direct option to import Excel files, e.g. you could not create a Linked Service to an Excel file and read it easily, so a workaround was required.
Let us know how you get on.