I have DataFrame like below:
df = pd.DataFrame([
("i", 1, 'GlIrbixGsmCL'),
("i", 1, 'GlIrbixGsmCL'),
("i", 1, '3IMR1UteQA'),
("c", 1, 'GlIrbixGsmCL'),
("i", 2, 'GlIrbixGsmCL'),
], columns=['type', 'cid', 'userid'])
Expected output like:
For more details:
i_counts, c_counts => df.groupby(["cid","type"]).size()
i_ucounts, c_ucounts => df.groupby(["cid","type"])["userid"].nunique()
i_frequency,u_frequency => df.groupby(["cid","type"])["userid"].value_counts()
Looks it's a little complex for me, how to do with pandas to get the expected result?
The related screenshots:
In pandas you can get the count of the frequency of a value that occurs in a DataFrame column by using Series. value_counts() method, alternatively, If you have a SQL background you can also get using groupby() and count() method.
Pandas is an open-source Python library designed to deal with data analysis and data manipulation. Citing the official website, “pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language.”
After grouping a DataFrame object on one column, we can apply count() method on the resulting groupby object to get a DataFrame object containing frequency count. This method can be used to count frequencies of objects over single or multiple columns.
Pandas DataFrame count() Method The count() method counts the number of not empty values for each row, or column if you specify the axis parameter as axis='columns' , and returns a Series object with the result for each row (or column).
This is how I would approach this:
aggfuncs= {
'counts': ('userid', 'count'),
'ucounts': ('userid', 'nunique'),
'frequency': ('userid', lambda S: S.value_counts().to_dict()),
}
output = df.groupby(['cid', 'type']).agg(**aggfuncs).unstack()
output.columns = output.columns.map(lambda tup: '_'.join(tup[::-1]))
output:
c_counts i_counts c_ucounts i_ucounts c_frequency i_frequency
cid
1 1.0 3.0 1.0 2.0 {'GlIrbixGsmCL': 1} {'GlIrbixGsmCL': 2, '3IMR1UteQA': 1}
2 NaN 1.0 NaN 1.0 NaN {'GlIrbixGsmCL': 1}
I think that is the core of what you want. You will need some cosmetic amendments to get the output exactly as in your example (e.g. fillna etc.).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With