Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Replace ones in binary columns with values from another column

I have a data frame that looks like this:

df = pd.DataFrame({"value": [4, 5, 3], "item1": [0, 1, 0], "item2": [1, 0, 0], "item3": [0, 0, 1]})
df

  value item1   item2   item3
0   4   0      1         0
1   5   1      0         0
2   3   0      0         1

Basically what I want to do is replace the value of the one hot encoded elements with the value from the "value" column and then delete the "value" column. The resulting data frame should be like this:

df_out = pd.DataFrame({"item1": [0, 5, 0], "item2": [4, 0, 0], "item3": [0, 0, 3]})

   item1    item2   item3
0   0        4      0
1   5        0      0
2   0        0      3
like image 746
gorjan Avatar asked Dec 05 '18 12:12

gorjan


People also ask

How do I replace a value in a column with another value in pandas?

DataFrame. replace() function is used to replace values in column (one value with another value on all columns). This method takes to_replace, value, inplace, limit, regex and method as parameters and returns a new DataFrame. When inplace=True is used, it replaces on existing DataFrame object and returns None value.

How do you replace a column in a DataFrame with another column?

In order to replace a value in Pandas DataFrame, use the replace() method with the column the from and to values.

How do I replace multiple column values in pandas?

Pandas replace multiple values in column replace. By using DataFrame. replace() method we will replace multiple values with multiple new strings or text for an individual DataFrame column. This method searches the entire Pandas DataFrame and replaces every specified value.


2 Answers

Why not just multiply?

df.pop('value').values * df

   item1  item2  item3
0      0      5      0
1      4      0      0
2      0      0      3

DataFrame.pop has the nice effect of in-place removing and returning a column, so you can do this in a single step.


if the "item_*" columns have anything besides 1 in them, then you can multiply with bools:

df.pop('value').values * df.astype(bool)

   item1  item2  item3
0      0      5      0
1      4      0      0
2      0      0      3

If your DataFrame has other columns, then do this:

df
   value  name  item1  item2  item3
0      4  John      0      1      0
1      5  Mike      1      0      0
2      3  Stan      0      0      1

# cols = df.columns[df.columns.str.startswith('item')]
cols = df.filter(like='item').columns
df[cols] = df.pop('value').values * df[cols]

df
  name  item1  item2  item3
0  John      0      5      0
1  Mike      4      0      0
2  Stan      0      0      3
like image 87
cs95 Avatar answered Sep 20 '22 17:09

cs95


You could do something like:

df = pd.DataFrame([df['value']*df['item1'],df['value']*df['item2'],df['value']*df['item3']])
df.columns = ['item1','item2','item3']

EDIT: As this answer will not scale well to many columns as @coldspeed comments, it should be done iterating a loop:

 cols = ['item1','item2','item3']
 for c in cols:
     df[c] *= df['value']
 df.drop('value',axis=1,inplace=True)
like image 27
horro Avatar answered Sep 17 '22 17:09

horro