Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Parsing CSV in PHP and trying to keep line breaks in field contents

Tags:

php

csv

I have a csv file in which one of the columns there are line breaks. The column is a description field so it stores lines of text, bulleted points, and most importantly line breaks--and sometimes two between paragraphs. To complicate matters, the description field also contains quotes.

I have tried everything I know to get the content (and keeping the formatting) into a variable. I have tried file_get_contents, str_getcsv and fgetcsv to no avail. One of the functions let me put the entire description field into a variable, but it stripped all the new lines, so everything is in one giant paragraph. Another one separately parsed each paragraph within the description field, instead of one single unit.

This is the general format of my csv file:

"391","The Great Gatsby","The Great Gatsby, F. Scott Fitzgerald’s third book, stands as the supreme achievement of his career. 

This exemplary novel of the Jazz Age has been acclaimed by generations of readers. The story of the fabulously wealthy Jay Gatsby and his love for the beautiful Daisy Buchanan, of lavish parties on Long Island at a time when The New York Times noted “gin was the national drink and sex the national obsession,” it is an exquisitely crafted tale of America in the 1920s.

The Great Gatsby is one of the great classics of twentieth-century literature.","No","Yes",

I'd like to have a variable where $array[0]=391, $array[1]=The Great Gatsby and most importantly $array[2] contains the text in the description with all the line breaks and formatting, including the quotes within the field itself.

Or if there's a better way to parse this using PHP, please let me know.

like image 905
whistler Avatar asked Aug 09 '14 15:08

whistler


People also ask

Can csv have line breaks?

New Line Characters (Reference: https://en.wikipedia.org/wiki/Newline.) If you upload a CSV file that contains new line characters in the fields other than Text Area Field, it will be saved to the system but only Text Area Field preserves the new line if you edit the Asset.

How do I skip the first row while reading a CSV file in PHP?

php $row = 1; if (($handle = fopen("ptt. csv", "r")) !== FALSE) { while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) { $num = count($data); $row++; for ($c=0; $c < $num; $c++) { if(strpos($data[$c], 'Finished') !==


2 Answers

if someone still having this problem in 2019 my solution was fgetcsv($file). Automatically respect the linebreaks inside text fields.

like image 182
nyu.exe Avatar answered Sep 28 '22 04:09

nyu.exe


I recently wrote a function myself for precisely this use-case.

The PHP function str_getcsv() is for interpreting a single line of CSV data. Often, you can simply explode the file on new lines, and then parse each line individually, but what do you do when fields themselves may contain new lines?

The trick is to use str_getcsv() twice: once to split the file into lines, and again to split each individual line.

/**
 * Receives CSV string and returns as an array.
 *
 * *************************************************************************
 *
 * Based on code by fab at tradermail dot info at http://php.net/manual/en/function.str-getcsv.php#119666
 */
function csv_to_array($csv, $delimiter=',', $header_line=true) {
    // CSV from external sources may have Unix or DOS line endings. str_getcsv()
    // requires that the "delimiter" be one character only, so we don't want
    // to pass the DOS line ending \r\n to that function. So first we ensure
    // that we have Unix line endings only.
    $csv = str_replace("\r\n", "\n", $csv);

    // Read the CSV lines into a numerically indexed array. Use str_getcsv(),
    // rather than splitting on all linebreaks, as fields may themselves contain
    // linebreaks.
    $all_lines = str_getcsv($csv, "\n");
    if (!$all_lines) {
        return false;
    }

    $csv = array_map(
        function(&$line) use ($delimiter) {
            return str_getcsv($line, $delimiter);
        },
        $all_lines
    );

    if ($header_line) {
        // Use the first row's values as keys for all other rows.
        array_walk(
            $csv,
            function(&$a) use ($csv) {
                $a = array_combine($csv[0], $a);
            }
        );
        // Remove column header row.
        array_shift($csv);
    }

    return $csv;
}

Bonus: This function can also (if $header_line) return an associative array, using the first line as key names.

like image 38
TRiG Avatar answered Sep 28 '22 04:09

TRiG