
Why does pandas read_csv issue this warning? (elementwise comparison failed)

I have a set of LED data saved on disk (led.csv) that looks like this:

 , LEDC1, LEDC2, LEDC3
0, 54859, 11349, 56859
1, 54850, 12135, 56890
2, 54836, 12400, 56892
3, 54840, 15725, 56897
4, 54841, 19038, 56896
5, 54837, 21232, 56911
.,  ... ,  ... ,  ...

I am reading this data from the .csv file using the pandas read_csv function:

data = pd.read_csv("Data/led.csv", index_col=0)

Providing the index_col argument to this function issues the following (numpy) warning:

C:\Program Files\Python\lib\site-packages\numpy\lib\arraysetops.py:466: 
FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
mask |= (ar1 == a)

What is the proper way of reading a .csv file that already contains an index column using pandas?

Any insight on the warning is much appreciated...

Mr.Uyan asked Feb 16 '18 00:02



2 Answers

This is one way to get the same result as with index_col = 0 but without the warning. It may not be the most concise way though:

data = pd.read_csv("Data/led.csv")
data.set_index([data.columns.values[0]], inplace=True)
data.index.names = [None]

This is a great post on this type of error and, below it, there is a solution for a named column, e.g., index_col=['0'].
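To check that the two approaches give the same DataFrame, here is a minimal sketch. It assumes an in-memory stand-in for led.csv (the `csv_text` string is hypothetical, built from the first rows in the question, without the padding spaces) so that it runs without the real file:

```python
import io
import pandas as pd

# Hypothetical stand-in for Data/led.csv, using the first rows from the question.
csv_text = """,LEDC1,LEDC2,LEDC3
0,54859,11349,56859
1,54850,12135,56890
2,54836,12400,56892
"""

# Approach from the question: let read_csv build the index directly.
direct = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Approach from this answer: read everything, then promote the first column.
manual = pd.read_csv(io.StringIO(csv_text))
manual = manual.set_index(manual.columns[0])
manual.index.name = None  # drop the auto-generated column name

# Compare values, dtypes, and index labels of the two results.
print(direct.equals(manual))
```

Whether the first call still emits the FutureWarning depends on the installed pandas and NumPy versions, so the sketch only compares the resulting frames.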

F Humphreys answered Sep 21 '22 14:09


I'm not sure exactly why you get this warning, but my guess is that it can occur if your index column mixes numeric and non-numeric data. NumPy then gets confused when it tries to check whether the index is ordered.
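If you want to test that guess on your own file, one option is to look for index values that do not parse as numbers. This is only a sketch: the `csv_text` sample and the column name `idx` are made up for illustration, and the non-numeric label "x" is planted deliberately.

```python
import io
import pandas as pd

# Hypothetical data: the row label "x" is deliberately non-numeric.
csv_text = """idx,LEDC1
0,54859
1,54850
x,54836
"""

data = pd.read_csv(io.StringIO(csv_text))

# Coerce the would-be index column to numbers; values that fail become NaN.
as_numbers = pd.to_numeric(data["idx"], errors="coerce")

# Rows whose label could not be parsed as a number.
bad_rows = data[as_numbers.isna()]
print(bad_rows)
```

An empty `bad_rows` would suggest the index column is purely numeric and that the cause lies elsewhere.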

A possible hack:

data = pd.read_csv("Data/led.csv")

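# assuming first column is named '0'
# coerce non-numeric labels to NaN, replace them with 0, then cast to int
# (fillna must run before astype(int), which cannot convert NaN)
data['0'] = pd.to_numeric(data['0'], errors='coerce').fillna(0).astype(int)
data = data.set_index('0')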

jpp answered Sep 19 '22 14:09