I have a set of LED data saved on disk (led.csv) that looks like this:
, LEDC1, LEDC2, LEDC3
0, 54859, 11349, 56859
1, 54850, 12135, 56890
2, 54836, 12400, 56892
3, 54840, 15725, 56897
4, 54841, 19038, 56896
5, 54837, 21232, 56911
., ... , ... , ...
I am reading this data from the .csv file using the pandas read_csv function:
import pandas as pd
data = pd.read_csv("Data/led.csv", index_col=0)
Providing the index_col argument to this function issues the following (numpy) warning:
C:\Program Files\Python\lib\site-packages\numpy\lib\arraysetops.py:466:
FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
mask |= (ar1 == a)
What is the proper way of reading a .csv file with existing index using Pandas?
Any insight on the warning is much appreciated...
The pandas read_csv() function reads a CSV file into a DataFrame, a two-dimensional data structure with labeled axes. Its header argument specifies which row is used for the column names and takes an int or a list of ints. The default, header='infer', behaves like header=0 when no names are passed, so the first row of the CSV file is treated as the column names.
The counterpart DataFrame.to_csv() function converts a DataFrame to CSV data. To write the CSV data to a file, pass a path or file object to the function; otherwise the CSV data is returned as a string.
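As a rough illustration of the header and to_csv behaviour described above, here is a minimal sketch. The CSV text is a made-up in-memory sample (an assumption, not the asker's file), read through io.StringIO so nothing touches the disk.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample data, used only for illustration.
csv_text = "LEDC1,LEDC2,LEDC3\n54859,11349,56859\n54850,12135,56890\n"

# header=0: the first row supplies the column names.
with_header = pd.read_csv(StringIO(csv_text), header=0)

# header=None: every row is treated as data; columns are numbered 0, 1, 2.
no_header = pd.read_csv(StringIO(csv_text), header=None)

# With no path or file object given, to_csv returns the CSV text as a string.
as_text = with_header.to_csv(index=False)
```

Passing index=False to to_csv keeps the default integer index out of the output, which avoids creating an extra unnamed first column when the file is read back.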
This is one way to get the same result as with index_col = 0 but without the warning. It may not be the most concise way though:
data = pd.read_csv("Data/led.csv")
data = data.set_index(data.columns[0])  # use the first column as the index
data.index.name = None  # drop the auto-generated index name
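To check that the two approaches really give the same frame, here is a minimal sketch on a small in-memory sample. The CSV text below is an assumption modelled on the question's data (without the padding spaces), not the actual led.csv.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample with an unnamed first column, as in the question.
csv_text = (
    ",LEDC1,LEDC2,LEDC3\n"
    "0,54859,11349,56859\n"
    "1,54850,12135,56890\n"
    "2,54836,12400,56892\n"
)

# Approach from the question: let read_csv build the index directly.
via_index_col = pd.read_csv(StringIO(csv_text), index_col=0)

# Approach from this answer: read everything, then promote the first column.
via_set_index = pd.read_csv(StringIO(csv_text))
via_set_index = via_set_index.set_index(via_set_index.columns[0])
via_set_index.index.name = None

# True when both frames hold the same index, columns and values.
same = via_index_col.equals(via_set_index)
```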
This is a great post on this type of warning and, below it, there is a solution for a named column, e.g., index_col=['0'].
I'm not sure exactly why you get the warning, though my guess is that it can occur if your index column mixes numeric and non-numeric data. numpy then gets confused when it tries to check whether the index is ordered.
A possible hack:
data = pd.read_csv("Data/led.csv")
# assuming first column is named '0'
# fill missing values before casting: astype(int) raises on NaN
data['0'] = data['0'].fillna(0).astype(int)
data = data.set_index('0')
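If the guess about mixed numeric and non-numeric index values is right, a plain integer cast would fail on the non-numeric entries. Below is a minimal sketch of one alternative that coerces them first with pd.to_numeric. The helper name read_with_clean_index, the sentinel value -1 and the sample text are all assumptions made for illustration.

```python
from io import StringIO

import pandas as pd


def read_with_clean_index(source, column="0", fill=-1):
    """Hypothetical helper: read a CSV and use `column` as an integer index.

    Entries in `column` that cannot be parsed as numbers become `fill`.
    """
    data = pd.read_csv(source)
    # errors="coerce" turns unparseable entries into NaN instead of raising.
    numeric = pd.to_numeric(data[column], errors="coerce")
    # Replace NaN before casting, because NaN cannot be cast to int.
    data[column] = numeric.fillna(fill).astype(int)
    return data.set_index(column)


# Hypothetical sample whose first column is named '0' and has one bad entry.
csv_text = "0,LEDC1,LEDC2\n0,54859,11349\n1,54850,12135\nx,54836,12400\n"
cleaned = read_with_clean_index(StringIO(csv_text))
```

A sentinel such as -1 keeps the bad rows visible instead of merging them with a real index value of 0. Dropping those rows instead may be preferable, depending on the data.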