I am trying to process a dataset to experiment with for data science, but it does not have column names. The output of df.head() is shown below:
1 73 Not in universe 0 0.1 0.2 Not in universe.1
0 2 58 Self-employed-not incorporated 4 34 0 Not in universe
1 3 18 Not in universe 0 0 0 High school
2 4 9 Not in universe 0 0 0 Not in universe
3 5 10 Not in universe 0 0 0 Not in universe
4 6 48 Private 40 10 1200 Not in universe
What I would like to see is:
0 1 73 Not in universe 0 0.1 0.2 Not in universe.1
1 2 58 Self-employed-not incorporated 4 34 0 Not in universe
2 3 18 Not in universe 0 0 0 High school
3 4 9 Not in universe 0 0 0 Not in universe
4 5 10 Not in universe 0 0 0 Not in universe
5 6 48 Private 40 10 1200 Not in universe
I could assign arbitrary column names, but is there a nicer way?
You loaded the file without specifying whether it has a header row. By default, read_csv infers the header from the first row, which is why your first data row ended up as the column names. Since the file has no header row, pass header=None:
df = pd.read_csv(file_path, header=None)
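To see the difference end to end, here is a minimal sketch. It assumes the file is comma-separated and uses a small in-memory sample modelled on the rows in the question. The `raw` buffer and the `col_0` ... `col_6` names are hypothetical placeholders, not part of the original dataset.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for the real file (assumed comma-separated).
raw = io.StringIO(
    "1,73,Not in universe,0,0,0,Not in universe\n"
    "2,58,Self-employed-not incorporated,4,34,0,Not in universe\n"
    "3,18,Not in universe,0,0,0,High school\n"
)

# Default behaviour: the first line is consumed as the header,
# so only two data rows remain.
inferred = pd.read_csv(raw)

# With header=None every line is kept as data and the columns
# are labelled with integers 0..6.
raw.seek(0)
df = pd.read_csv(raw, header=None)

# Optionally supply placeholder names instead of integer labels.
raw.seek(0)
named = pd.read_csv(raw, header=None, names=[f"col_{i}" for i in range(7)])

print(df.head())
```

If you later learn the real column names, you can pass them through `names=` in the same way, or assign them afterwards with `df.columns = [...]`.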