Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Azure Data Factory CSV with double quotes

I have a pipeline that retrieves an FTP hosted CSV file. It is comma delimited with double quote identifiers. The issue exists where a string is encapsulated in double quotes, but the string itself contains double quotes.

string example: "Spring Sale" this year.

How it looks in the csv (followed and lead by two null columns):

"","""Spring Sale"" this year",""

SSIS handles this fine, but Data Factory wants to transform it into an extra column that isn't separated by a comma. I have removed the extra quotes on this line and it works fine.

Is there a way around this besides altering the source?

like image 270
Dan Peter Avatar asked Jun 07 '19 22:06

Dan Peter


People also ask

Can CSV have double quotes?

You can put quotes, dashes and spaces in the CSV file. Fields that contain a special character (comma, newline, or double quote), must be enclosed in double quotes. If a field's value contains a double quote character it is escaped by placing another double quote character next to it.

What is quote character in Azure data Factory?

The single character to quote column values if it contains column delimiter. The default value is double quotes " . When quoteChar is defined as empty string, it means there is no quote char and column value is not quoted, and escapeChar is used to escape the column delimiter and itself.


1 Answers

I got this to work using Escape character set as quote (") with the Azure Data Factory Copy Task. Screen shot:

ADF Copy Task

This was based on a file as per your spec:

"","""Spring Sale"" this year",""

and also worked as in insert into an Azure SQL Database table. The sample JSON:

{
    "name": "DelimitedText1",
    "properties": {
        "linkedServiceName": {
            "referenceName": "linkedService2",
            "type": "LinkedServiceReference"
        },
        "type": "DelimitedText",
        "typeProperties": {
            "location": {
                "type": "AzureBlobStorageLocation",
                "fileName": "quotes.txt",
                "container": "someContainer"
            },
            "columnDelimiter": ",",
            "escapeChar": "\"",
            "quoteChar": "\""
        },
        "schema": [
            {
                "name": "Prop_0",
                "type": "String"
            },
            {
                "name": "Prop_1",
                "type": "String"
            },
            {
                "name": "Prop_2",
                "type": "String"
            }
        ]
    }
}

Maybe the example file is too simple but it did work for me in this configuration.

Alternately, just use SSIS and host it in Data Factory.

like image 141
wBob Avatar answered Oct 27 '22 04:10

wBob