I have the following dataset, df:
id medication_date
1 2000-01-01
1 2000-01-04
1 2000-01-06
2 2000-04-01
2 2000-04-02
2 2000-04-03
I would like to first reshape the data set into days after the first observation per patient:
id day1 day2 day3 day4
1 yes no no yes
2 yes yes yes no
Ultimately I want to plot the table above, with the days as columns and each cell black if yes and white if not.
Any help is really appreciated.
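Convert each patient's sparse Series of medication dates (the 'yes' days) into a dense Series by adding the missing days as 'no', then reset the index so that each patient's first date becomes position 0 (2000-01-01 -> 0, 2000-04-01 -> 0). Finally, unstack the result into one row per id; patients with a shorter date range are padded with 'no'.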
import pandas as pd

def f(sr):
    # Create missing dates
    dti = pd.date_range(sr.min(), sr.max(), freq='D')
    # Fill the Series with 'yes' or 'no'
    return (pd.Series('yes', index=sr.tolist())
              .reindex(dti, fill_value='no')
              .reset_index(drop=True))

df['medication_date'] = pd.to_datetime(df['medication_date'])
out = (df.groupby('id')['medication_date'].apply(f).unstack(fill_value='no')
         .rename(columns=lambda x: f'day{x+1}').reset_index())
Output:
>>> out
   id day1 day2 day3 day4 day5 day6
0   1  yes   no   no  yes   no  yes
1   2  yes  yes  yes   no   no   no
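For comparison, here is a minimal alternative sketch that reaches the same table by computing each row's day offset from the patient's first date and cross-tabulating. It is not the method above, just another route under the assumption that the data looks like the sample in the question; the names first, offset, wide and alt are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2],
    "medication_date": ["2000-01-01", "2000-01-04", "2000-01-06",
                        "2000-04-01", "2000-04-02", "2000-04-03"],
})
df["medication_date"] = pd.to_datetime(df["medication_date"])

# Day offset from each patient's first observation (0 = first day).
first = df.groupby("id")["medication_date"].transform("min")
offset = (df["medication_date"] - first).dt.days

# Count observations per (id, offset), then add columns for offsets
# that no patient has, so every day from 0 to the maximum is present.
wide = (pd.crosstab(df["id"], offset)
          .reindex(columns=range(int(offset.max()) + 1), fill_value=0))

# Turn counts into 'yes'/'no' and label the columns day1, day2, ...
alt = (wide.gt(0)
           .apply(lambda col: col.map({True: "yes", False: "no"}))
           .rename(columns=lambda d: f"day{d + 1}")
           .rename_axis(columns=None)
           .reset_index())

print(alt)
```

Because the cross-tabulation counts rows, this version should also tolerate a date that appears twice for the same id, which the reindex-based function would likely reject as a duplicate index label.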
Update: to plot the table in black (yes) and white (no):
import matplotlib.pyplot as plt
from matplotlib.colors import LinearSegmentedColormap
colors = ["white", "black"]
cmap = LinearSegmentedColormap.from_list('Custom', colors, len(colors))
plt.matshow(out.set_index('id').eq('yes').astype(int), cmap=cmap)
plt.show()
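If you also want the day names and ids on the axes, a sketch along these lines may help. It rebuilds a stand-in for out so it runs on its own, and it swaps in imshow with matplotlib's built-in "binary" colormap (0 is white, 1 is black) instead of the custom colormap; the axis label text is my own assumption.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for `out`, copied from the output shown earlier.
out = pd.DataFrame({
    "id": [1, 2],
    "day1": ["yes", "yes"], "day2": ["no", "yes"], "day3": ["no", "yes"],
    "day4": ["yes", "no"], "day5": ["no", "no"], "day6": ["yes", "no"],
})

# 1 where medication was taken, 0 otherwise; rows are ids, columns are days.
grid = out.set_index("id").eq("yes").astype(int)

fig, ax = plt.subplots()
# "binary" maps 0 to white and 1 to black.
ax.imshow(grid.to_numpy(), cmap="binary", vmin=0, vmax=1, aspect="auto")

# Label the ticks with the day columns and the patient ids.
ax.set_xticks(range(grid.shape[1]))
ax.set_xticklabels(list(grid.columns))
ax.set_yticks(range(grid.shape[0]))
ax.set_yticklabels([str(i) for i in grid.index])
ax.set_xlabel("day since first observation")
ax.set_ylabel("id")
```

Call plt.show() afterwards to display the figure, or fig.savefig(...) to write it to a file.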
