
Pandas - Drop function error (label not contained in axis) [duplicate]

I have a CSV file that looks like the following:

    index,Avg,Min,Max
    Build1,56.19,39.123,60.1039
    Build2,57.11,40.102,60.2
    Build3,55.1134,35.129404123,60.20121

Based on my question here, I am able to add some relevant information to this CSV via this short script:

    import pandas as pd

    df = pd.read_csv('newdata.csv')
    print(df)

    df_out = pd.concat(
        [df.set_index('index'), df.set_index('index').agg(['max', 'min', 'mean'])]
    ).rename(
        index={'max': 'Max', 'min': 'Min', 'mean': 'Average'}
    ).reset_index()

    with open('newdata.csv', 'w') as f:
        df_out.to_csv(f, index=False)

This results in this CSV:

    index,Avg,Min,Max
    Build1,56.19,39.123,60.1039
    Build2,57.11,40.102,60.2
    Build3,55.1134,35.129404123,60.20121
    Max,57.11,40.102,60.20121
    Min,55.1134,35.129404123,60.1039
    Average,56.1378,38.1181347077,60.16837

I would now like to be able to update this CSV. For example, if I ran a new build (Build4, for instance), I could add it and then redo the Max, Min, and Average rows. My idea is to delete the rows labelled Max, Min, and Average, add my new row, and recompute the stats. I believe the code I need is as simple as this (shown just for Max, but there would be lines for Min and Average as well):

    df = pd.read_csv('newdata.csv')
    df = df.drop('Max')

However, this always results in a ValueError: labels ['Max'] not contained in axis

I created the CSV files in Sublime Text. Could this be part of the issue? I have read other SO posts about this, and none seem to help with my issue.

I am unsure if this is allowed, but here is a download link to my CSV just in case something is wrong with the file itself.

I would be okay with two possible answers:

  1. How to fix this drop issue
  2. How to add more builds and update the statistics (a method without drop)
Abdall asked Jul 05 '17 16:07



2 Answers

You can specify the axis argument. The default is axis=0, which refers to rows; columns are axis=1.

So to drop the column named Max, the code would be:

    df = df.drop('Max', axis=1)

Edit: looking at this piece of code:

    df = pd.read_csv('newdata.csv')
    df = df.drop('Max')

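The code you used does not specify that the first column of the CSV file contains the index for the dataframe, so pandas creates a default index on the fly. This index is purely numerical (0, 1, 2, ...), and the first CSV column is loaded as an ordinary column named index. So your index does not contain "Max", and a row-wise drop cannot find that label.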

Try the following:

    df = pd.read_csv("newdata.csv", index_col=0)
    df = df.drop("Max", axis=0)

This makes pandas use the first column of the CSV file as the index, so the label "Max" exists on the row axis and the drop should now work.
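For the second part of the question (adding a build and redoing the statistics), the same idea can be extended. The following is only a minimal sketch under some assumptions: the helper name `update_stats` and the Build4 values are made up for illustration, and the CSV text is inlined with `io.StringIO` so the example runs without the real file.

```python
import io
import pandas as pd

# CSV content as produced by the question's script (summary rows included).
CSV_TEXT = """index,Avg,Min,Max
Build1,56.19,39.123,60.1039
Build2,57.11,40.102,60.2
Build3,55.1134,35.129404123,60.20121
Max,57.11,40.102,60.20121
Min,55.1134,35.129404123,60.1039
Average,56.1378,38.1181347077,60.16837
"""

STAT_LABELS = ["Max", "Min", "Average"]


def update_stats(df, new_label, new_values):
    """Hypothetical helper: drop old summary rows, add one build, recompute."""
    # errors="ignore" lets this also work on a file that has no summary rows yet.
    builds = df.drop(STAT_LABELS, errors="ignore")
    # Assigning to a new index label appends a row.
    builds.loc[new_label] = new_values
    stats = builds.agg(["max", "min", "mean"]).rename(
        index={"max": "Max", "min": "Min", "mean": "Average"}
    )
    return pd.concat([builds, stats])


# index_col=0 makes the first CSV column the row index, so labels can be dropped.
df = pd.read_csv(io.StringIO(CSV_TEXT), index_col=0)

# The Build4 numbers below are invented for the example.
updated = update_stats(df, "Build4", [58.0, 41.0, 61.0])
print(updated)
```

To write the result back with the same header as the original file, something like `updated.to_csv('newdata.csv', index_label='index')` should do, since the row labels now live in the index rather than in a column.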

error answered Nov 03 '22 09:11


To delete a particular column in pandas, simply do:

    del df['Max']
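Note that this removes the column named Max, whereas the question is about the row labelled Max. As a small sketch of the difference (using a two-row frame built by hand from the question's data, with the labels already in the index):

```python
import pandas as pd

# Two rows taken from the question's CSV; "Max" is both a row label and a column name.
df = pd.DataFrame(
    {"Avg": [56.19, 57.11], "Min": [39.123, 40.102], "Max": [60.1039, 60.20121]},
    index=["Build1", "Max"],
)

without_col = df.drop(columns="Max")  # returns a copy without the Max column
without_row = df.drop(index="Max")    # returns a copy without the Max row

del df["Max"]  # removes the Max column in place; the Max row is still there

print(without_col)
print(without_row)
print(df)
```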
glegoux answered Nov 03 '22 07:11