I have a pandas DataFrame of users and categories, but the values for those categories are spread across multiple columns.
| user | category | val1 | val2 | val3 |
| ------ | ------------------| -----| ---- | ---- |
| user 1 | c1 | 3 | NA | None |
| user 1 | c2 | NA | 4 | None |
| user 1 | c3 | NA | NA | 7 |
| user 2 | c1 | 5 | NA | None |
| user 2 | c2 | NA | 7 | None |
| user 2 | c3 | NA | NA | 2 |
I want to get it so the values are compressed into a single column.
| user | category | value|
| ------ | ------------------| -----|
| user 1 | c1 | 3 |
| user 1 | c2 | 4 |
| user 1 | c3 | 7 |
| user 2 | c1 | 5 |
| user 2 | c2 | 7 |
| user 2 | c3 | 2 |
Ultimately, I want to get a matrix like the following:
np.array([[3, 4, 7], [5, 7, 2]])
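You can use pd.DataFrame.bfill with axis=1 to backfill values across the selected columns, then take the first column, which now holds the first non-null value in each row.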
val_cols = ['val1', 'val2', 'val3']
df['value'] = pd.to_numeric(df[val_cols].bfill(axis=1).iloc[:, 0], errors='coerce')
print(df)
user category val1 val2 val3 value
0 user 1 c1 3.0 NaN None 3.0
1 user 1 c2 NaN 4.0 None 4.0
2 user 1 c3 NaN NaN 7 7.0
3 user 2 c1 5.0 NaN None 5.0
4 user 2 c2 NaN 7.0 None 7.0
5 user 2 c3 NaN NaN 2 2.0
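To get from the compressed column to the matrix the question asks for, one option is to pivot on user and category. The following is a minimal sketch, assuming each (user, category) pair appears exactly once; the sample frame is rebuilt from the table in the question, and the `matrix` name is only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the sample data from the question (val3 kept as object dtype
# so it holds None, as in the printed output above).
df = pd.DataFrame({
    "user": ["user 1"] * 3 + ["user 2"] * 3,
    "category": ["c1", "c2", "c3"] * 2,
    "val1": [3, np.nan, np.nan, 5, np.nan, np.nan],
    "val2": [np.nan, 4, np.nan, np.nan, 7, np.nan],
    "val3": pd.Series([None, None, 7, None, None, 2], dtype=object),
})

# Same step as the answer: backfill along each row, keep the first column.
val_cols = ["val1", "val2", "val3"]
df["value"] = pd.to_numeric(df[val_cols].bfill(axis=1).iloc[:, 0], errors="coerce")

# One row per user, one column per category; pivot raises if a
# (user, category) pair is duplicated.
matrix = df.pivot(index="user", columns="category", values="value").to_numpy()
print(matrix)
```

Note that pivot sorts the users and categories, so the row and column order follows the sorted labels rather than the original row order. If duplicates are possible, pivot_table with an explicit aggregation would be the usual alternative.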