Let's take a look at a simple HR dataframe as an example:
| Name | Year | Department | Salary | New column: hit100after |
|---|---|---|---|---|
| John | 2018 | R&D | 80 | 0 |
| Marie | 2018 | Legal | 90 | 0 |
| Jill | 2018 | Legal | 100 | 0 |
| John | 2019 | R&D | 85 | 0 |
| Marie | 2019 | Legal | 95 | 0 |
| Jill | 2019 | Legal | 105 | 1 |
| John | 2020 | R&D | 90 | 0 |
| Marie | 2020 | Legal | 100 | 2 |
| Jill | 2020 | Legal | 110 | 2 |
Question: How long (in years) did it take each person to reach a salary >= 100 AFTER the first person in the same department hit that number? (see the new column 'hit100after')
I am actually struggling with the implementation. My plan would be to group by department, find the first occurrence of a salary >= 100, and mark that year with a new boolean column or a second dataframe. Then I would calculate the difference between the year each individual reached >= 100 and the year of the first occurrence in their department.
Any idea how to code that for a large dataframe?
You can `groupby()` the Department and find the `Year.min()` where `Salary >= 100`:

```python
first100 = df.groupby('Department').apply(
    lambda g: g[g.Salary >= 100].Year.min())
# Department
# Legal    2018.0
# R&D         NaN
# dtype: float64
```
Then `map()` those `first100` years by Department and subtract from `Year`. The results are in `Result` for comparison against `hit100after`:

```python
df['Result'] = df.Year - df.Department.map(first100)
df.loc[df.Salary < 100, 'Result'] = 0
#     Name  Year Department  Salary  hit100after  Result
# 0   John  2018        R&D      80            0     0.0
# 1  Marie  2018      Legal      90            0     0.0
# 2   Jill  2018      Legal     100            0     0.0
# 3   John  2019        R&D      85            0     0.0
# 4  Marie  2019      Legal      95            0     0.0
# 5   Jill  2019      Legal     105            1     1.0
# 6   John  2020        R&D      90            0     0.0
# 7  Marie  2020      Legal     100            2     2.0
# 8   Jill  2020      Legal     110            2     2.0
```
| method | %timeit with 1 million rows |
|---|---|
| this answer | 3.07 ms ± 23.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) |
| query() answer | 4.33 ms ± 354 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) |
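For anyone who wants to reproduce the timings: the benchmark frame is not shown in the answer, so here is one hypothetical way to build a 1-million-row frame with the same columns (the departments, year range, and salary distribution are arbitrary assumptions):

```python
import numpy as np
import pandas as pd

n = 1_000_000
rng = np.random.default_rng(0)  # seeded for repeatability

big = pd.DataFrame({
    'Department': rng.choice(['R&D', 'Legal', 'HR', 'Sales'], n),
    'Year':       rng.integers(2000, 2021, n),
    'Salary':     rng.integers(50, 150, n),
})

# Then time the answer's two lines, e.g. in IPython:
# %timeit big.Year - big.Department.map(
#     big.groupby('Department').apply(lambda g: g[g.Salary >= 100].Year.min()))
```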