
read_csv doesn't read the column names correctly on this file?

I have a CSV file as follows:

0 5
1 10
2 15
3 20
4 25

I want to load it into a DataFrame with x and y as the column names, then plot it. However, when I assign the names x and y I get a messed-up DataFrame. What is happening?

import pandas as pd

column_names = ['x', 'y']
x = pd.read_csv('csv-file.csv', header=None, names=column_names)
print(x)

          x   y
0   0 5 NaN
1  1 10 NaN
2  2 15 NaN
3  3 20 NaN
4  4 25 NaN

I've also tried leaving out header=None, to no avail.

Vyraj asked May 31 '16 18:05




1 Answer

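read_csv splits on commas by default, so each whitespace-separated line is read as a single field and the second column is filled with NaN. Add the parameter sep=r"\s+" (or delim_whitespace=True, which is deprecated since pandas 2.2) to read_csv: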

import io
import pandas as pd

temp = u"""0 5
1 10
2 15
3 20
4 25"""
# after testing, replace io.StringIO(temp) with the filename
column_names = ['x', 'y']
# the raw string r"\s+" splits on any run of whitespace
df = pd.read_csv(io.StringIO(temp), sep=r"\s+", header=None, names=column_names)

print(df)
   x   y
0  0   5
1  1  10
2  2  15
3  3  20
4  4  25

Or:

import io

column_names = ['x', 'y']
# delim_whitespace=True is equivalent to sep=r"\s+" (deprecated since pandas 2.2)
df = pd.read_csv(io.StringIO(temp),
                 delim_whitespace=True,
                 header=None,
                 names=column_names)

print(df)
   x   y
0  0   5
1  1  10
2  2  15
3  3  20
4  4  25
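
To see why the original call misbehaves, here is a minimal sketch that compares the default comma separator with a whitespace separator. It assumes the same five lines of data, inlined through io.StringIO rather than read from a file; the names raw, broken and fixed are only illustrative and do not come from the question.

```python
import io
import pandas as pd

# Same data as in the question, inlined so the example is self-contained
# (assumption: the real file contains exactly these space-separated lines).
raw = "0 5\n1 10\n2 15\n3 20\n4 25"
column_names = ["x", "y"]

# Default separator is a comma. The lines contain no commas, so each
# whole line becomes one field and lands in 'x'; 'y' has no data.
broken = pd.read_csv(io.StringIO(raw), header=None, names=column_names)
print(broken["x"].tolist())
print(broken["y"].isna().all())

# Splitting on runs of whitespace yields two integer columns.
fixed = pd.read_csv(io.StringIO(raw), sep=r"\s+", header=None, names=column_names)
print(fixed["x"].tolist())
print(fixed["y"].tolist())
```

Once the columns are parsed as numbers, something like fixed.plot(x="x", y="y") should give the plot the question asks for, provided matplotlib is installed.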
jezrael answered Nov 07 '22 14:11