
Pandas: No column names in data file

I am trying to process a dataset to play with for data science, but it does not have column names. The output of df.head() is shown below:

   1  73                  Not in universe   0  0.1   0.2  Not in universe.1
0  2  58   Self-employed-not incorporated   4   34     0    Not in universe
1  3  18                  Not in universe   0    0     0        High school
2  4   9                  Not in universe   0    0     0    Not in universe
3  5  10                  Not in universe   0    0     0    Not in universe
4  6  48                          Private  40   10  1200    Not in universe

What I would like to see is:

0  1  73                  Not in universe   0  0.1   0.2  Not in universe.1
1  2  58   Self-employed-not incorporated   4   34     0    Not in universe
2  3  18                  Not in universe   0    0     0        High school
3  4   9                  Not in universe   0    0     0    Not in universe
4  5  10                  Not in universe   0    0     0    Not in universe
5  6  48                          Private  40   10  1200    Not in universe

I could assign arbitrary column names, but is there a nicer way?

chintan s asked Jul 06 '16 15:07



1 Answer

You loaded the file without specifying whether it has a header row. By default, read_csv infers the header from the first row, so your first data row was consumed as the column names. If the file has no header row, pass header=None:

df = pd.read_csv(file_path, header=None)
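To see the difference side by side, here is a minimal sketch. The two-row sample and the column names passed to names= are made up for illustration and are not taken from the asker's file; only header= and names= are actual read_csv parameters.

```python
import io

import pandas as pd

# Hypothetical two-row sample standing in for the headerless file.
raw = io.StringIO(
    "1,73,Not in universe,0\n"
    "2,58,Self-employed-not incorporated,4\n"
)

# Default behaviour: the first data row is consumed as the header.
inferred = pd.read_csv(raw)

# header=None keeps every row as data and labels the columns 0..n-1.
raw.seek(0)
df = pd.read_csv(raw, header=None)

# Optionally supply your own labels (illustrative names, not from the source).
raw.seek(0)
named = pd.read_csv(
    raw, header=None, names=["id", "age", "worker_class", "wage"]
)

print(inferred.shape, df.shape)
print(named.head())
```

With header=None the columns get integer labels, so the row of mangled names such as 0.1 and Not in universe.1 no longer appears. If you later find real column names, you can pass them through names= or assign them to df.columns.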
EdChum answered Oct 14 '22 01:10
