
Importing a CSV file in pandas into a pandas dataframe

Tags:

python

pandas

csv

I have a CSV file taken from a SQL dump that looks like this (first few lines, shown with head file.csv in the terminal):

??AANAT,AANAT1576,4
AANAT,AANAT1704,1
AAP,AAP-D-12-00691,8
AAP,AAP-D-12-00834,3

When I use the pd.read_csv('file.csv') command I get an error "ValueError: No columns to parse from file".

Any ideas on how to import the CSV file into a table and avoid the error?

ELABORATION OF QUESTION (following Ed's comment)

I have tried header = None, skiprows=1 to avoid the ?? (which appear when using the head command from the terminal).

The file path to the extract is http://goo.gl/jyYlIK

user7289 asked Sep 29 '14 10:09


1 Answer

The ?? characters you see are non-printable characters. Looking at your raw CSV file in a hex editor shows that they are the bytes FF FE, which is the UTF-16 little-endian byte-order mark (BOM).

So all you need to do is pass utf-16 as the encoding and the file reads in fine:

In [46]:

df = pd.read_csv('otherfile.csv', encoding='utf-16', header=None)
df
Out[46]:
       0               1   2
0  AANAT       AANAT1576   4
1  AANAT       AANAT1704   1
2    AAP  AAP-D-12-00691   8
3    AAP  AAP-D-12-00834   3
4    AAP  AAP-D-13-00215  10
5    AAP  AAP-D-13-00270   7
6    AAP  AAP-D-13-00435   5
7    AAP  AAP-D-13-00498   4
8    AAP  AAP-D-13-00530   0
9    AAP  AAP-D-13-00747   3
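
If no hex editor is at hand, the same check can be done from Python by reading the first few bytes of the file in binary mode and comparing them against the BOM constants in the standard codecs module. The following is only a minimal sketch: the helper name sniff_bom is made up for illustration, and it only recognises files that actually start with a BOM.

```python
import codecs


def sniff_bom(path):
    """Guess a codec name from the file's byte-order mark, or return None.

    sniff_bom is a hypothetical helper, not part of pandas or the stdlib.
    """
    with open(path, "rb") as fh:
        head = fh.read(4)  # the longest BOM checked here is 4 bytes

    # Check UTF-32 first: its little-endian BOM (FF FE 00 00) begins with
    # the UTF-16 little-endian BOM (FF FE).
    if head.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    if head.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    return None  # no BOM found; the encoding has to be determined another way
```

The returned name can be passed straight to read_csv, for example pd.read_csv(path, encoding=sniff_bom(path) or 'utf-8', header=None). Falling back to utf-8 when no BOM is present is an assumption of this sketch, not something the file guarantees.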
EdChum answered Oct 22 '22 00:10