python csv reader not reading all rows

I have 5008 rows in a CSV file, 5009 counting the header. I create and write this file within the same script. But when I read it back at the end, with either pandas' pd.read_csv or Python 3's csv module, and print the length, I get 4967. I checked the file for unusual characters that might confuse Python but don't see any. All the data is delimited by commas.

I also opened it in Sublime Text, and it shows 5009 rows, not 4967.

I could try other pandas methods like merge or concat, but if Python won't read the CSV correctly, that's no use.

This is one method I tried:

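import csv
import pandas as pd

# xlsfile is defined earlier in the script (not shown here).
# Note: error_bad_lines is deprecated since pandas 1.3; newer versions use on_bad_lines='skip'.
df1 = pd.read_csv('out.csv', quoting=csv.QUOTE_NONE, error_bad_lines=False)
df2 = pd.read_excel(xlsfile)

print(len(df1))  # 4967
print(len(df2))  # 5008

# Copy the CSV columns into the Excel frame (aligned on the row index).
df2['Location'] = df1['Location']
df2['Sublocation'] = df1['Sublocation']
df2['Zone'] = df1['Zone']
df2['Subnet Type'] = df1['Subnet Type']
df2['Description'] = df1['Description']

newfile = input("Enter a name for the combined csv file: ")
print('Saving to new csv file...')
df2.to_csv(newfile, index=False)
print('Done.')

# target is defined earlier in the script (not shown here).
target.close()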

Another way I tried is:

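import pandas as pd
import xlrd

dfcsv = pd.read_csv('out.csv')

# Read every row of the first worksheet, header included.
# xlsfile is defined earlier in the script (not shown here).
wb = xlrd.open_workbook(xlsfile)
ws = wb.sheet_by_index(0)
xlsdata = []
for rx in range(ws.nrows):
    xlsdata.append(ws.row_values(rx))

print(len(dfcsv))  # 4967
print(len(xlsdata))  # 5009

df1 = pd.DataFrame(data=dfcsv)
df2 = pd.DataFrame(data=xlsdata)

# Place the two frames side by side, matching rows by index.
df3 = pd.concat([df2, df1], axis=1)

newfile = input("Enter a name for the combined csv file: ")
print('Saving to new csv file...')
df3.to_csv(newfile, index=False)
print('Done.')

# target is defined earlier in the script (not shown here).
target.close()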

But no matter which way I try, the CSV file is the actual issue: Python seems to be writing it correctly but not reading it correctly.

Edit: The weirdest part is that I'm getting no encoding errors, or any errors at all, when running the code.

Edit 2: I tested the first code example with the nrows parameter; it works up to 4000 rows. As soon as I specify 5000 rows, it reads only 4967.

Edit 3: I manually saved a CSV file with my data instead of using the one written by the program, and it read 5008 rows. Why is Python not writing the CSV file correctly?
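
One thing that may be worth ruling out, given that target.close() only appears at the end of both snippets: if target is the handle that wrote out.csv (an assumption, since the writing code is not shown), then reading the file before it is closed or flushed can miss the last rows, which may still be sitting in the write buffer. Below is a minimal sketch of that effect using only the standard csv module. The column names, values, and temporary file are made up for illustration.

```python
import csv
import os
import tempfile


def count_rows(path):
    """Count the rows csv.reader sees in the file at `path`."""
    with open(path, newline="") as f:
        return sum(1 for _ in csv.reader(f))


def demo_unflushed_write(n_rows=5008):
    """Write a header plus n_rows, counting rows before and after close()."""
    # Hypothetical scratch file, standing in for out.csv.
    path = os.path.join(tempfile.mkdtemp(), "out.csv")

    target = open(path, "w", newline="")
    writer = csv.writer(target)
    writer.writerow(["Location", "Zone"])
    for i in range(n_rows):
        writer.writerow([f"site-{i}", f"zone-{i % 7}"])

    # Reading at this point can miss the tail of the file: the last chunk
    # of data may still be held in the writer's in-memory buffer.
    before_close = count_rows(path)

    target.close()  # flushes any buffered data to disk
    after_close = count_rows(path)
    return before_close, after_close


if __name__ == "__main__":
    before, after = demo_unflushed_write()
    print("rows seen before close():", before)
    print("rows seen after close(): ", after)
```

If the two counts differ, closing the handle (or writing inside a with block) before calling pd.read_csv would be the thing to try.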

How do I read all lines in a csv file in Python?

Step 1: Load the CSV file into a file object with open().
Step 2: Create a reader object by passing that file object to csv.reader().
Step 3: Loop over the reader object with a for loop to get each row.
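
The three steps above can be sketched as follows. This is a minimal illustration with the standard csv module; the function name read_all_rows is made up for the example.

```python
import csv


def read_all_rows(path):
    """Return every row of the CSV file at `path` as a list of lists."""
    rows = []
    # Step 1: open the file; newline="" lets the csv module handle line endings.
    with open(path, newline="") as f:
        # Step 2: wrap the file object in a reader.
        reader = csv.reader(f)
        # Step 3: iterate over the reader to get each row.
        for row in reader:
            rows.append(row)
    return rows
```

Calling len(read_all_rows('out.csv')) gives the number of rows the reader parsed, header included, which can then be compared with the line count shown by a text editor.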

I ran into this issue too. I realized that some of my lines had open-ended (unclosed) quotes, which was for some reason interfering with the reader.

So for example, some rows were written as:

GO:0000026  molecular_function  "alpha-1
GO:0000027  biological_process  ribosomal large subunit assembly
GO:0000033  molecular_function  "alpha-1

and this led to rows being read incorrectly. (Unfortunately, I don't know enough about how the csv reader works to tell you why. Hopefully someone can clarify the quote behavior!)
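
On the quote behavior: by default, csv.reader treats " as the quote character, and a field that starts with a quote runs until the closing quote, with any line breaks in between kept as part of the field. An unclosed quote can therefore swallow the following lines into one row. Here is a small sketch of that, assuming the sample rows above are tab-separated (the SAMPLE string and the function name are illustrative):

```python
import csv
import io

# Tab-separated sample modelled on the rows above; two fields open a
# quote that is never closed.
SAMPLE = (
    'GO:0000026\tmolecular_function\t"alpha-1\n'
    "GO:0000027\tbiological_process\tribosomal large subunit assembly\n"
    'GO:0000033\tmolecular_function\t"alpha-1\n'
)


def parse_with_default_quoting(text):
    """Parse tab-separated text with csv.reader's default quote handling."""
    return list(csv.reader(io.StringIO(text), delimiter="\t"))


if __name__ == "__main__":
    rows = parse_with_default_quoting(SAMPLE)
    print("physical lines:", len(SAMPLE.splitlines()))
    print("rows parsed:   ", len(rows))
    for row in rows:
        print(row)
```

The parsed row count comes out lower than the number of physical lines, because the text after the first unclosed quote is folded into a single field.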

I just removed the quotes and it worked out.

Edited: This option works too, if you want to maintain the quotes:

quotechar=None
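
With the standard csv module, quotechar=None has to go together with quoting=csv.QUOTE_NONE (the same flag the question passes to pd.read_csv). A minimal sketch under that assumption, with an illustrative function name and made-up sample text:

```python
import csv
import io


def parse_ignoring_quotes(text, delimiter="\t"):
    """Parse text with quote handling disabled, so '"' is an ordinary character."""
    reader = csv.reader(
        io.StringIO(text),
        delimiter=delimiter,
        quoting=csv.QUOTE_NONE,  # do not treat quotes specially
    )
    return list(reader)


if __name__ == "__main__":
    sample = 'GO:0000026\tmolecular_function\t"alpha-1\nGO:0000027\tbiological_process\tassembly\n'
    for row in parse_ignoring_quotes(sample):
        print(row)
```

Each physical line then yields one row, and the stray quote stays in the field as a literal character.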
