
"Line contains NULL byte" in CSV reader (Python)

Tags: python, csv

I'm trying to write a program that reads a .csv file (input.csv) and writes to a new file (corrected.csv) only the rows that begin with a certain element, as listed in a text file (output.txt).

This is what my program looks like right now:

import csv

lines = []
with open('output.txt','r') as f:
    for line in f.readlines():
        lines.append(line[:-1])

with open('corrected.csv','w') as correct:
    writer = csv.writer(correct, dialect = 'excel')
    with open('input.csv', 'r') as mycsv:
        reader = csv.reader(mycsv)
        for row in reader:
            if row[0] not in lines:
                writer.writerow(row)

Unfortunately, I keep getting this error, and I have no clue what it's about.

Traceback (most recent call last):
  File "C:\Python32\Sample Program\csvParser.py", line 12, in <module>
    for row in reader:
_csv.Error: line contains NULL byte

Credit to all the people here for even getting me to this point.

James Roseman asked Oct 25 '11 19:10



2 Answers

I've solved a similar problem with an easier solution:

import codecs
import csv
# Note: the 'U' mode flag was removed in Python 3.11; use 'r' there.
csvReader = csv.reader(codecs.open('file.csv', 'rU', 'utf-16'))

The key was using the codecs module to open the file with the UTF-16 encoding. Many other encodings are available; check the documentation.
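
For what it's worth, on Python 3 the built-in open() takes an encoding argument directly, so something like the sketch below should do the same job. It assumes the file really is UTF-16 with a byte order mark; the name sample_utf16.csv and the helper read_utf16_csv are made up for illustration. The reason this helps: in UTF-16 every ASCII character is stored as two bytes, one of which is 0x00, so reading the file with an 8-bit encoding leaves NUL characters in every line.

```python
import csv
import os
import tempfile


def read_utf16_csv(path):
    """Return all rows of a UTF-16 encoded CSV file as lists of strings."""
    # newline='' lets the csv module handle line endings itself,
    # as the csv documentation recommends.
    with open(path, 'r', encoding='utf-16', newline='') as f:
        return list(csv.reader(f))


# Build a small UTF-16 sample file so the sketch is self-contained
# (hypothetical file name, written to a temporary directory).
sample_path = os.path.join(tempfile.mkdtemp(), 'sample_utf16.csv')
with open(sample_path, 'w', encoding='utf-16', newline='') as f:
    csv.writer(f).writerows([['id', 'name'], ['1', 'alpha']])

# The raw bytes contain NUL bytes, which is what trips up a reader
# that decodes the file with an 8-bit encoding.
with open(sample_path, 'rb') as f:
    raw = f.read()
print(b'\x00' in raw)

rows = read_utf16_csv(sample_path)
print(rows)
```

If the real file turns out to be in some other encoding, only the encoding argument needs to change.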

K. David C. answered Sep 20 '22 11:09


I'm guessing you have a NUL byte in input.csv. You can test that with

if '\0' in open('input.csv').read():
    print("you have null bytes in your input file")
else:
    print("you don't")

If you do,

reader = csv.reader(x.replace('\0', '') for x in mycsv) 

may get you around that. Or it may indicate that you have UTF-16 or something else 'interesting' in the .csv file.
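
To tell those two cases apart, a rough sketch along these lines might help. It assumes that a UTF-16 file starts with a byte order mark (not every file does, and a UTF-32 little-endian BOM begins with the same two bytes, so treat it as a hint only). The helper names looks_like_utf16 and read_rows_without_nuls, the latin-1 default, and the sample file are my own choices for illustration, not anything from the csv module.

```python
import codecs
import csv
import os
import tempfile


def looks_like_utf16(path):
    """Return True if the file starts with a UTF-16 byte order mark."""
    with open(path, 'rb') as f:
        head = f.read(2)
    return head in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)


def read_rows_without_nuls(path, encoding='latin-1'):
    """Read a CSV file, dropping NUL characters from every line first.

    latin-1 is an assumed default: it can decode any byte, so stray
    bytes will not raise a decoding error here.
    """
    with open(path, 'r', encoding=encoding, newline='') as f:
        # Same idea as the generator expression above: strip '\0'
        # from each line before the csv reader sees it.
        cleaned = (line.replace('\0', '') for line in f)
        return list(csv.reader(cleaned))


# Hypothetical sample: an 8-bit file with one stray NUL byte in it.
nul_path = os.path.join(tempfile.mkdtemp(), 'stray_nul.csv')
with open(nul_path, 'wb') as f:
    f.write(b'a,b\x00\r\n1,2\r\n')

is_utf16 = looks_like_utf16(nul_path)
rows = read_rows_without_nuls(nul_path)
print(is_utf16, rows)
```

If looks_like_utf16 comes back True, stripping NULs is probably the wrong fix and opening the file with the right encoding is the better route.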

retracile answered Sep 16 '22 11:09