I have a dataset like:
import pandas as pd

df = pd.DataFrame({'scientist': ["Wendelaar Bonga", " Sjoerd E.", "Grätzel", " Michael", "Willett", "Walter C.",
                                 "Kessler", "Ronald C.", "Witten, Edward", "Wang, Zhong Lin"],
                   'SubjectField': ["Biomedical Engineering", "Inorganic & Nuclear Chemistry",
                                    "Organic Chemistry", "Biomedical Engineering", "Developmental Biology",
                                    "Mechanical Engineering & Transports", "Biomedical Engineering", "Microbiology",
                                    "Cardiovascular System & Hematology", "Biomedical Engineering"]})
I want to count the number of scientists in each subject field and extract the subject fields that have more than 2 scientists. This is my code to count the number of scientists:
number_of_scientists_in_fields = df.groupby(['SubjectField'])['scientist'].count()
How can I extract the subject fields that have more than 2 scientists?
Use value_counts, as follows:
fields = df.value_counts('SubjectField').to_frame('count')
res = fields[fields['count'] > 2]
print(res)
Output
                        count
SubjectField
Biomedical Engineering      4
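If you would rather keep the groupby count from the question, the same filter can probably be applied directly to the resulting Series with a boolean mask. The following is only a minimal sketch under that assumption: it rebuilds the question's DataFrame so that it runs on its own, and the names counts and popular are made up for illustration.

```python
import pandas as pd

# Same data as in the question, repeated so the snippet is self-contained.
df = pd.DataFrame({
    'scientist': ["Wendelaar Bonga", " Sjoerd E.", "Grätzel", " Michael", "Willett",
                  "Walter C.", "Kessler", "Ronald C.", "Witten, Edward", "Wang, Zhong Lin"],
    'SubjectField': ["Biomedical Engineering", "Inorganic & Nuclear Chemistry",
                     "Organic Chemistry", "Biomedical Engineering", "Developmental Biology",
                     "Mechanical Engineering & Transports", "Biomedical Engineering",
                     "Microbiology", "Cardiovascular System & Hematology",
                     "Biomedical Engineering"],
})

# Count scientists per subject field (the groupby from the question, with df instead of data).
counts = df.groupby('SubjectField')['scientist'].count()

# Keep only the fields with more than 2 scientists (illustrative variable name).
popular = counts[counts > 2]
print(popular)

# If only the field names are needed, take them from the index.
popular_fields = popular.index.tolist()
print(popular_fields)
```

The result here is a Series indexed by SubjectField rather than a DataFrame; calling popular.to_frame('count') should give the same shape as the value_counts answer above.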