
Write a string containing commas and double quotes to CSV

I'm trying to produce a Google Shopping feed of 30,000+ items in NetSuite, a CRM system that runs server-side JavaScript, which it calls SuiteScript 2.0. Essentially, it's just JavaScript with a few more restrictions. I've been tasked with outputting this product feed as a CSV.

The problem is that the product descriptions of these items contain variable amounts of commas, double quotes, single quotes and HTML. At first, it was just the commas causing me problems, so after a bit of research, I wrapped the strings I was outputting in double quotes:

//This function isn't terribly important, but is referenced below

function sanitizeString (desc) {
    var itemDesc;
    if (desc) {
        itemDesc = desc.replace(/(\r\n|\n|\r|\s+|\t| )/gm,' ');
        itemDesc = itemDesc.replace(/,/g, '\,');
        itemDesc = itemDesc.replace(/"/g, '\"');
        itemDesc = itemDesc.replace(/'/g, '\'');
        itemDesc = itemDesc.replace(/ +(?= )/g,'');
    } else {
        itemDesc = '';
    }
    return itemDesc;
}

var row = '';

for (var i = 0; i < columns.length; i++) {
    var col = columns[i];
    row += '"' + sanitizeString(val[col]) + '"';
    if (i != columns.length - 1) {
        row += ',';
    }
}
newFeed.appendLine({value: row});

However, it seems that these double quotes are interacting strangely with double quotes within the string, causing some weird formatting, even though my sanitizeString() function should be escaping them. Any time a description contains a double quote, the next row doesn't get its own line; it gets appended to the last column.

So, naturally, I escaped the external quotes like this:

row += '\"' + sanitizeString(val[col]) + '\"';

Doing that makes things go completely haywire: a lot of items don't get pushed to new lines, and I max out the number of columns I'm allowed because it just keeps on going.

The other natural solution would be to go edit the product descriptions, but I'm not terribly anxious to do that for 30,000+ items...

Does anybody know what might be going on here? I feel like there's something really simple I'm overlooking...

B1gJ4k3 asked Oct 09 '17 02:10


2 Answers

It turns out that, according to the CSV specs, to include double quotes within a string that is already quoted, you need to use two double quotes (""). I changed:

itemDesc = itemDesc.replace(/"/g, '\"');

to

itemDesc = itemDesc.replace(/"/g, '""');

I also removed

itemDesc = itemDesc.replace(/,/g, '\,');
itemDesc = itemDesc.replace(/'/g, '\'');

These are unnecessary, since the column in the CSV is already being quoted.
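Putting those changes together, a minimal sketch of the corrected helper might look like the following. It also shows why the original replacements had no effect: in a JavaScript string literal, '\"' is the same one-character string as '"' (and '\,' the same as ','), so those calls replaced each character with itself. Note that buildRow is a hypothetical wrapper name for the loop from the question, and the sample data is made up for illustration.

```javascript
// Sketch only: sanitizeString follows the question's code with the fix applied;
// buildRow is a hypothetical name wrapping the loop from the question.
function sanitizeString(desc) {
    if (!desc) {
        return '';
    }
    return desc
        .replace(/\s+/g, ' ')   // collapse newlines, tabs and runs of spaces
        .replace(/"/g, '""');   // CSV escaping: double every embedded quote
}

function buildRow(columns, val) {
    var cells = [];
    for (var i = 0; i < columns.length; i++) {
        // Every field is quoted, so embedded commas need no extra handling.
        cells.push('"' + sanitizeString(val[columns[i]]) + '"');
    }
    return cells.join(',');
}

// The backslash "escapes" in the original code were no-ops:
console.log('\"' === '"'); // true
console.log('\,' === ','); // true

// Made-up sample item for illustration.
var item = { title: 'Shelf, 24"', desc: 'It\'s 24" wide\r\nand <b>sturdy</b>' };
console.log(buildRow(['title', 'desc'], item));
// "Shelf, 24""","It's 24"" wide and <b>sturdy</b>"
```

The resulting string can then be passed to newFeed.appendLine({value: row}) as in the question.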

B1gJ4k3 answered Sep 26 '22 15:09


I use this simple function to convert a string[][] to a CSV string. It quotes a cell if it contains a double quote, a comma, or any whitespace other than plain spaces:

/**
 * Takes an array of arrays and returns a `,` separated CSV string.
 * @param {string[][]} table
 * @returns {string}
 */
export function toCSV(table: string[][]) {
    return table
        .map(row =>
            row
                .map(cell => {
                    // We remove blanks and check if the column contains
                    // other whitespace,`,` or `"`.
                    // In that case, we need to quote the column.
                    if (cell.replace(/ /g, '').match(/[\s,"]/)) {
                        return '"' + cell.replace(/"/g, '""') + '"';
                    }
                    return cell;
                })
                .join(',')
        )
        .join('\n');
}
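
For use from plain JavaScript, the same logic can be sketched without the TypeScript export and type annotation. This version uses ES5-style function expressions on the assumption that the target runtime may not support arrow functions; the sample table is made up for illustration.

```javascript
// Plain-JavaScript sketch of the toCSV function above (same logic, no types).
function toCSV(table) {
    return table
        .map(function (row) {
            return row
                .map(function (cell) {
                    // Ignore plain spaces, then quote the cell if it still
                    // contains other whitespace, a comma or a double quote.
                    if (/[\s,"]/.test(cell.replace(/ /g, ''))) {
                        return '"' + cell.replace(/"/g, '""') + '"';
                    }
                    return cell;
                })
                .join(',');
        })
        .join('\n');
}

// Made-up sample data for illustration.
var table = [
    ['id', 'description'],
    ['1', 'Plain text'],
    ['2', 'Shelf, 24" wide'],
    ['3', 'Line one\nLine two']
];
console.log(toCSV(table));
// id,description
// 1,Plain text
// 2,"Shelf, 24"" wide"
// 3,"Line one
// Line two"
```

Cells that contain only letters and plain spaces are left unquoted, while the embedded newline in the last row stays inside a single quoted field.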

Michael_Scharf answered Sep 22 '22 15:09