
Specify Newline character ('\n') in reading csv using Python

I want to read a csv file with each line dictated by a newline character ('\n') using Python 3. This is my code:

import csv
with open('input_data.csv', newline='\n') as f:
    csvread = csv.reader(f)
    batch_data = [line for line in csvread]

The code above raised this error:

batch_data = [line for line in csvread]
_csv.Error: new-line character seen in unquoted field - do you need to open the file in universal-newline mode?

After reading posts such as "CSV new-line character seen in unquoted field error", I also tried these alternatives:

with open('input_data.csv', 'rU', newline='\n') as f:
    csvread = csv.reader(f)
    batch_data = [line for line in csvread]


with open('input_data.csv', 'rU', newline="\n") as f:
    csvread = csv.reader(f)
    batch_data = [line for line in csvread]

No luck getting this to work yet. Any suggestions?

I am also reading the documentation about newline: "If newline='' is not specified, newlines embedded inside quoted fields will not be interpreted correctly, and on platforms that use \r\n line endings on write an extra \r will be added. It should always be safe to specify newline='', since the csv module does its own (universal) newline handling."

So my understanding of the newline argument is:

1) it is necessary;

2) but does newline='' mean the input file is split into lines on an empty or space character?

enaJ asked Nov 07 '16 23:11



1 Answer

  1. newline='' is correct in all csv cases, and failing to specify it is an error in many cases. The docs recommend it for the very reason you're encountering.

  2. newline='' doesn't mean "empty space" is used for splitting; it's specifically documented on the open function:

If [newline] is '', universal newlines mode is enabled, but line endings are returned to the caller untranslated.

So with newline='', all original \r and \n characters are returned unchanged. Normally, in universal newlines mode, any newline-like sequence (\r, \n, or \r\n) is converted to \n in the input. You don't want that for CSV input, because the csv module does its own newline handling and needs to see the original characters, for example to preserve a \r\n embedded in a quoted field. (The Excel dialect's \r\n line terminator applies when writing; csv.reader recognizes either \r or \n as end-of-line.)
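To make the difference concrete, here is a minimal sketch that reads the same bytes twice, once with the default newline handling and once with newline=''. The sample data and the temporary file are made up for illustration; they are not from the question.

```python
import csv
import os
import tempfile

# Hypothetical sample: a quoted field that contains an embedded \r\n.
raw = b'id,note\r\n1,"line one\r\nline two"\r\n'

with tempfile.NamedTemporaryFile(delete=False, suffix=".csv") as tmp:
    tmp.write(raw)
    path = tmp.name

try:
    # Default (newline=None): every \r\n is translated to \n before csv sees it.
    with open(path) as f:
        translated_text = f.read()
    with open(path) as f:
        translated_rows = list(csv.reader(f))

    # newline='': line endings reach the csv module untranslated.
    with open(path, newline='') as f:
        untranslated_text = f.read()
    with open(path, newline='') as f:
        untranslated_rows = list(csv.reader(f))
finally:
    os.remove(path)

print(repr(translated_text))
print(repr(untranslated_text))
print(translated_rows[1])    # embedded newline was rewritten to \n
print(untranslated_rows[1])  # embedded \r\n is preserved
```

Both reads parse into two rows here, which is why the problem is easy to miss: the default mode silently changes the content of the quoted field.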

Your code should be:

import csv
with open('input_data.csv', newline='') as f:
    csvread = csv.reader(f)
    batch_data = list(csvread)
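
As a sketch of one way the original error can arise, assume the file uses bare \r line endings (the question does not show the file, so this is only a guess at the cause). With newline='\n' the file is split on \n only, so the reader receives a single line with \r characters in the middle of it:

```python
import csv
import os
import tempfile

# Hypothetical file with bare \r line endings (an assumption about the input).
with tempfile.NamedTemporaryFile(delete=False, suffix=".csv") as tmp:
    tmp.write(b"a,b\rc,d\r")
    path = tmp.name

try:
    # newline='\n': lines are split on \n only, so csv sees "a,b\rc,d\r" as one line.
    error = None
    try:
        with open(path, newline='\n') as f:
            list(csv.reader(f))
    except csv.Error as exc:
        error = exc

    # newline='': \r is recognized as a line ending and passed through untranslated.
    with open(path, newline='') as f:
        rows_ok = list(csv.reader(f))
finally:
    os.remove(path)

print(type(error).__name__, error)
print(rows_ok)
```

The exact wording of the csv.Error message varies between Python versions, so the sketch only relies on the exception type.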

If that doesn't work, you need to look at your CSV dialect and make sure you're initializing csv.reader correctly.
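
For instance, if the file turned out to be semicolon-delimited (a hypothetical case, not something stated in the question), the reader would need to be told so. A small sketch, using io.StringIO as a stand-in for the opened file:

```python
import csv
import io

# Hypothetical semicolon-delimited data; io.StringIO stands in for an open file.
# newline='' on StringIO also leaves line endings untranslated.
sample = io.StringIO("name;age\r\nalice;30\r\n", newline='')

# Default dialect expects commas, so each line becomes a single field.
default_rows = list(csv.reader(sample))

# Passing the delimiter explicitly splits the fields as intended.
sample.seek(0)
semicolon_rows = list(csv.reader(sample, delimiter=';'))

print(default_rows)
print(semicolon_rows)
```

Other formatting parameters such as quotechar can be passed to csv.reader in the same way.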

ShadowRanger answered Sep 26 '22 14:09