Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to parse CSV data that contains newlines in field using JavaScript

Given a CSV file that has newline/return characters in certain fields, how do we parse the data without splitting a field into multiple rows.

Example CSV Data:

ID;Name;Country;ISO-2;Address;Latitude;Longitude
022wje3;Europa;Italy;IT;"Viale Kennedy 3
34073 Grado";"45,67960";"13,40070"
024oua5;Hiberia;Italy;IT;"Via XXIV Maggio 8
00187 Rome";"41,89720";"12,48680"
028gupn;Regal Riverside;Hong Kong;HK;"34-36 Tai Chung Kiu Road
Shatin
Hong Kong";"22,38260";"114,19600"
02j7qry;Okaliptus Holiday Villas Apart;Turkey;TR;"Sevket Sabanci Caddesi No. 70
Bahçelievler Mevkii
Turgutreis";"37,02130";"27,25120"
02pc99z;California Apartementos;Spain;ES;"Prat d'en Carbó
43840 Salou";"41,07620";"1,14667"
02tu1jz;Elvis Presley's Heartbreak;United States;US;"3677 Elvis Presley Blvd.
Memphis
Tennessee 38116";"35,04850";"-90,02710"

Note: fields are actually separated by semicolon ; because Address can contain commas

Each row has 7 fields but we don't want to mistakenly parse data in a field containing newline characters as multiple rows...

We found a few Perl-focussed answers on StackOverflow:

  • How to parse CSVs with newline and commas inside a field in Perl?
  • Text::CSV parsing when data contains newline

but I'm a bit rusty on Perl and have not discovered a JS-focussed answer.

like image 433
nelsonic Avatar asked Mar 29 '16 15:03

nelsonic


People also ask

Can CSV have newlines?

New Line Characters (Reference: https://en.wikipedia.org/wiki/Newline.) If you upload a CSV file that contains new line characters in the fields other than Text Area Field, it will be saved to the system but only Text Area Field preserves the new line if you edit the Asset.

Can JavaScript parse CSV?

Comma Separated Values (CSV) is used as a common format to exchange tabular data between spreadsheets and relational databases. In a JavaScript action, you can parse CSV data using the csv library.

How do you read and parse CSV file in JavaScript?

To convert or parse CSV data into an array , you need to use JavaScript's FileReader class, which contains a method called readAsText() that will read a CSV file data and parse the result as a string text. The FileReader class is a web API, so this solution only works in the browser.


1 Answers

Have you tried CSVToArray by Ben Nadel?

/**
 * CSVToArray parses any String of Data including '\r' '\n' characters,
 * and returns an array with the rows of data.
 * @param {String} CSV_string - the CSV string you need to parse
 * @param {String} delimiter - the delimeter used to separate fields of data
 * @returns {Array} rows - rows of CSV where first row are column headers
 */
function CSVToArray (CSV_string, delimiter) {
   delimiter = (delimiter || ","); // user-supplied delimeter or default comma

   var pattern = new RegExp( // regular expression to parse the CSV values.
     ( // Delimiters:
       "(\\" + delimiter + "|\\r?\\n|\\r|^)" +
       // Quoted fields.
       "(?:\"([^\"]*(?:\"\"[^\"]*)*)\"|" +
       // Standard fields.
       "([^\"\\" + delimiter + "\\r\\n]*))"
     ), "gi"
   );

   var rows = [[]];  // array to hold our data. First row is column headers.
   // array to hold our individual pattern matching groups:
   var matches = false; // false if we don't find any matches
   // Loop until we no longer find a regular expression match
   while (matches = pattern.exec( CSV_string )) {
       var matched_delimiter = matches[1]; // Get the matched delimiter
       // Check if the delimiter has a length (and is not the start of string)
       // and if it matches field delimiter. If not, it is a row delimiter.
       if (matched_delimiter.length && matched_delimiter !== delimiter) {
         // Since this is a new row of data, add an empty row to the array.
         rows.push( [] );
       }
       var matched_value;
       // Once we have eliminated the delimiter, check to see
       // what kind of value was captured (quoted or unquoted):
       if (matches[2]) { // found quoted value. unescape any double quotes.
        matched_value = matches[2].replace(
          new RegExp( "\"\"", "g" ), "\""
        );
       } else { // found a non-quoted value
         matched_value = matches[3];
       }
       // Now that we have our value string, let's add
       // it to the data array.
       rows[rows.length - 1].push(matched_value);
   }
   return rows; // Return the parsed data Array
}

in your case invoke it with:

var rows = CSVToArray(CSV_string, ';');

where CSV_string is your string of CSV data.

like image 92
iteles Avatar answered Oct 19 '22 00:10

iteles