Remove groups by condition

Suppose I have the following dataframe

using DataFrames
df = DataFrame(A = 1:10, B = ["a","a","b","b","b","c","c","c","c","d"])
grouped_df = groupby(df, "B")

This gives me four groups. How can I drop the groups that have fewer than, say, 2 rows? In this example, that means keeping only groups a, b, and c. I can easily do it with a for loop, but I don't think that is the optimal way.

asked Mar 04 '21 23:03

user1691278

1 Answer

If you want the result to remain grouped, then filter is the simplest option:

julia> filter(x -> nrow(x) > 1, grouped_df)
GroupedDataFrame with 3 groups based on key: B
First Group (2 rows): B = "a"
 Row │ A      B
     │ Int64  String
─────┼───────────────
   1 │     1  a
   2 │     2  a
⋮
Last Group (4 rows): B = "c"
 Row │ A      B
     │ Int64  String
─────┼───────────────
   1 │     6  c
   2 │     7  c
   3 │     8  c
   4 │     9  c
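
As a variation on the filter call above, here is a minimal sketch that inspects which groups survive and, alternatively, gets a flat data frame directly through the ungroup keyword of filter. The names min_rows, kept, and flat are hypothetical and only used for illustration; the sketch assumes a DataFrames.jl 1.x release.

```julia
using DataFrames

df = DataFrame(A = 1:10, B = ["a","a","b","b","b","c","c","c","c","d"])
grouped_df = groupby(df, "B")

# Hypothetical threshold: keep groups with at least this many rows.
min_rows = 2

# Keep the result grouped, as in the answer above.
kept = filter(g -> nrow(g) >= min_rows, grouped_df)

# Inspect what survived: number of groups and their key values.
n_groups = length(kept)
kept_keys = [k.B for k in keys(kept)]

# Same condition, but return an ordinary DataFrame instead of a GroupedDataFrame.
flat = filter(g -> nrow(g) >= min_rows, grouped_df; ungroup = true)
```

Writing the condition as nrow(g) >= min_rows rather than nrow(g) > 1 keeps the threshold in one place if it needs to change later.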

If you want to get an ungrouped data frame in a single operation, you can use combine and return an empty data frame for the groups you want to drop, e.g.:

julia> combine(grouped_df, x -> nrow(x) < 2 ? DataFrame() : x)
9×2 DataFrame
 Row │ B       A
     │ String  Int64
─────┼───────────────
   1 │ a           1
   2 │ a           2
   3 │ b           3
   4 │ b           4
   5 │ b           5
   6 │ c           6
   7 │ c           7
   8 │ c           8
   9 │ c           9
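
Note that combine puts the grouping column B first. If the original column order matters, one possible alternative (a sketch, not part of the answer above) is to attach each row's group size with transform, filter on it, and then drop the helper column. The names group_size, with_size, and result are hypothetical.

```julia
using DataFrames

df = DataFrame(A = 1:10, B = ["a","a","b","b","b","c","c","c","c","d"])
grouped_df = groupby(df, "B")

# Add a helper column holding the size of the group each row belongs to.
# transform returns an ungrouped data frame with rows in their original order.
with_size = transform(grouped_df, nrow => :group_size)

# Keep rows from groups with at least 2 rows, then drop the helper column.
result = select(filter(:group_size => >=(2), with_size), Not(:group_size))
```

This costs an extra temporary column, so the combine version is likely the more direct choice when column order is not a concern.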

answered Oct 05 '22 18:10

Bogumił Kamiński
