
Python: creating plot based on observation dates (not as a time series)

I have the following dataset:

df
id medication_date 
1  2000-01-01
1  2000-01-04
1  2000-01-06
2  2000-04-01
2  2000-04-02
2  2000-04-03

I would like to first reshape the dataset so that each column is a day counted from the patient's first observation:

id day1 day2 day3 day4 
1  yes  no   no   yes 
2  yes  yes  yes  no

Ultimately, I want to plot this table, with the days as columns and each cell black if yes and white if no.

Any help is really appreciated.

Economist_Ayahuasca asked Apr 11 '26 08:04


1 Answer

Convert each patient's sparse Series of dates (only the 'yes' days) into a dense one by adding the missing days as 'no', then reset the index so that every patient starts at 0 (2000-01-01 -> 0, 2000-04-01 -> 0). Finally, reshape the result so that each day becomes a column.

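import pandas as pd

def f(sr):
    # Every calendar day between the patient's first and last observation
    dti = pd.date_range(sr.min(), sr.max(), freq='D')
    # 'yes' on observed dates, 'no' on the days in between,
    # then replace the dates with a 0-based day counter
    return (pd.Series('yes', index=sr.tolist())
              .reindex(dti, fill_value='no')
              .reset_index(drop=True))

# df is the DataFrame from the question
df['medication_date'] = pd.to_datetime(df['medication_date'])
# One row per id, one column per day; shorter patients are padded with 'no'
out = (df.groupby('id')['medication_date'].apply(f).unstack(fill_value='no')
         .rename(columns=lambda x: f'day{x+1}').reset_index())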

Output:

>>> out
   id day1 day2 day3 day4 day5 day6
0   1  yes   no   no  yes   no  yes
1   2  yes  yes  yes   no   no   no
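
As a cross-check, the same table can be built without a per-group function. This is only a minimal sketch of an alternative, not the method used above: it computes each row's day offset from the patient's first date and tabulates the offsets with pd.crosstab. The names first, offset and taken are made up for this illustration, and the result holds booleans rather than 'yes'/'no' strings.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2],
    "medication_date": pd.to_datetime([
        "2000-01-01", "2000-01-04", "2000-01-06",
        "2000-04-01", "2000-04-02", "2000-04-03",
    ]),
})

# Days elapsed since each patient's first observation (0 on the first day).
first = df.groupby("id")["medication_date"].transform("min")
offset = (df["medication_date"] - first).dt.days

# Count observations per (id, day offset), then turn counts into booleans.
taken = pd.crosstab(df["id"], offset).gt(0)

# Add the offsets on which no patient was observed, so no day is skipped.
taken = taken.reindex(columns=range(int(offset.max()) + 1), fill_value=False)
taken.columns = [f"day{d + 1}" for d in taken.columns]

print(taken)
```

If the 'yes'/'no' strings are needed, taken.replace({True: 'yes', False: 'no'}) should convert the booleans back.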

Update: to plot the table with black for yes and white for no:

import matplotlib.pyplot as plt
from matplotlib.colors import LinearSegmentedColormap

# Two-level colormap: 0 -> white, 1 -> black
colors = ["white", "black"]
cmap = LinearSegmentedColormap.from_list('Custom', colors, len(colors))
# Convert 'yes'/'no' to 1/0 and draw one cell per (id, day)
plt.matshow(out.set_index('id').eq('yes').astype(int), cmap=cmap)
plt.show()
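
The plot above has no tick labels, so here is a hedged variation that labels the columns with the day names and the rows with the patient ids. It is a sketch only: it rebuilds out from the printed output so that it runs on its own, and it swaps in matplotlib's built-in "binary" colormap (white for low values, black for high) instead of the custom one. The name grid is introduced just for this example.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Reshaped table, copied from the printed output above.
out = pd.DataFrame({
    "id": [1, 2],
    "day1": ["yes", "yes"],
    "day2": ["no", "yes"],
    "day3": ["no", "yes"],
    "day4": ["yes", "no"],
    "day5": ["no", "no"],
    "day6": ["yes", "no"],
})

# 1 where medication was taken, 0 otherwise.
grid = out.set_index("id").eq("yes").astype(int)

fig, ax = plt.subplots()
# vmin/vmax pin 0 to white and 1 to black even if a table is all 'yes' or all 'no'.
ax.matshow(grid.to_numpy(), cmap="binary", vmin=0, vmax=1)

# Label columns with the day names and rows with the patient ids.
ax.set_xticks(range(grid.shape[1]))
ax.set_xticklabels(grid.columns)
ax.set_yticks(range(grid.shape[0]))
ax.set_yticklabels(grid.index)
ax.set_ylabel("id")
```

Call plt.show() afterwards to display the figure, or fig.savefig(...) to write it to a file.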

Corralien answered Apr 13 '26 22:04


