pandas: subtracting two columns and saving the result as an absolute value

I have a CSV file opened in pandas and a new DataFrame that I'm building from it. I need to add a column (the last two lines, currently commented out) holding the absolute value of the difference between two other columns. Everything I've tried so far raises an error.

import pandas as pd
import numpy as np

df = pd.read_csv(filename_read)
ids = df['id']

oosDF = pd.DataFrame()
oosDF['id'] = ids
oosDF['pred'] = pred
oosDF['y'] = df['target']
#oosDF['diff'] = oosdF['pred'] - oosDF['y']
#oosDF['diff'] = oosDF.abs()
Sam B. asked Feb 19 '18 13:02


2 Answers

I think you need to create the new DataFrame from a subset of columns (column names in double []) and then take the absolute value of the difference between the two columns:

oosDF = df[['id','pred', 'target']].rename(columns={'target':'y'})
oosDF['diff'] = (oosDF['pred'] - oosDF['y']).abs()
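For anyone who wants to try this end to end, here is a minimal sketch of the approach above. The sample values are made up, and it assumes pred is already a column of df (in the question it is a separate variable), so adjust accordingly.

```python
import pandas as pd

# Hypothetical sample data standing in for the CSV from the question.
# Assumption: 'pred' is already a column of df.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "pred": [0.5, 2.0, 3.5],
    "target": [1.0, 2.0, 3.0],
})

# Select several columns by passing a list inside [], then rename 'target' to 'y'.
oosDF = df[["id", "pred", "target"]].rename(columns={"target": "y"})

# Subtract the two columns, then take the absolute value of the resulting Series.
oosDF["diff"] = (oosDF["pred"] - oosDF["y"]).abs()

print(oosDF)
```

The parentheses matter: without them, .abs() would apply only to oosDF['y'] before the subtraction.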
jezrael answered Sep 27 '22 20:09


In your first commented line, you have oosdF instead of oosDF.

In your second commented line, you're setting the column to the result of abs() applied to the whole DataFrame. Apply it to the single column instead: oosDF['diff'] = oosDF['diff'].abs()
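Putting both fixes together, a rough sketch of the question's code with the two lines uncommented might look like this. The small DataFrame and the pred array are hypothetical stand-ins for the CSV contents and the model predictions, which aren't shown in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for pd.read_csv(filename_read) and the predictions.
df = pd.DataFrame({"id": [10, 11, 12], "target": [3.0, 1.0, 4.0]})
pred = np.array([2.5, 1.5, 4.0])

oosDF = pd.DataFrame()
oosDF["id"] = df["id"]
oosDF["pred"] = pred
oosDF["y"] = df["target"]

# Fix 1: spell oosDF the same way on both sides of the assignment.
oosDF["diff"] = oosDF["pred"] - oosDF["y"]
# Fix 2: take abs() of the one column, not of the whole DataFrame.
oosDF["diff"] = oosDF["diff"].abs()

print(oosDF)
```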

Hope this helps!

David Stevens answered Sep 27 '22 19:09