Lets say I have the following data set, turned into a dataframe:
data = [
    ['Job 1', datetime.date(2019, 6, 9), 'Jim', 'Tom'],
    ['Job 1', datetime.date(2019, 6, 9), 'Bill', 'Tom'],
    ['Job 1', datetime.date(2019, 6, 9), 'Tom', 'Tom'],
    ['Job 1', datetime.date(2019, 6, 10), 'Bill', None],
    ['Job 2', datetime.date(2019,6,10), 'Tom', 'Tom']
]
df = pd.DataFrame(data, columns=['Job', 'Date', 'Employee', 'Manager'])
This yields a dataframe that looks like:
     Job        Date Employee Manager
0  Job 1  2019-06-09      Jim     Tom
1  Job 1  2019-06-09     Bill     Tom
2  Job 1  2019-06-09      Tom     Tom
3  Job 1  2019-06-10     Bill    None
4  Job 2  2019-06-10      Tom     Tom
What I am trying to generate is a pivot on each unique Job/Date combo, with a column for Manager, and a column for a string with comma separated, non-manager employees. A couple of things to assume:
I'd like the resulting dataframe to look like:
     Job        Date  Manager     Employees
0  Job 1  2019-06-09      Tom     Jim, Bill
1  Job 1  2019-06-10     None          Bill
2  Job 2  2019-06-10      Tom          None
Which leads to my questions:
I suspect 1) is possible, and 2) might be more difficult. If 2) is a no, I can get around it in other ways later in my code.
The tricky part here is removing the Manager from the Employee column.
u = df.melt(['Job', 'Date'])
f = u[~u.duplicated(['Job', 'Date', 'value'], keep='last')].astype(str)
f.pivot_table(
    index=['Job', 'Date'],
    columns='variable', values='value',
    aggfunc=','.join
).rename_axis(None, axis=1)
                  Employee Manager
Job   Date
Job 1 2019-06-09  Jim,Bill     Tom
      2019-06-10      Bill    None
Job 2 2019-06-10       NaN     Tom
                        Group to aggregate, then fix the Employees by removing the Manager and setting to None where appropriate. Since the employees are unique, sets will work nicely here to remove the Manager.
s = df.groupby(['Job', 'Date']).agg({'Manager': 'first', 'Employee': lambda x: set(x)})
s['Employee'] = [', '.join(x.difference({y})) for x,y in zip(s.Employee, s.Manager)]
s['Employee'] = s.Employee.replace({'': None})
                 Manager   Employee
Job   Date                         
Job 1 2019-06-09     Tom  Jim, Bill
      2019-06-10    None       Bill
Job 2 2019-06-10     Tom       None
                        If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With