Ungroup after grouping by just one variable in dplyr

Tags: r, dplyr

I have many units that are each measured repeatedly.

>df
Item value  year
1     20     1990
1     20     1991
2     30     1990
2     15     1990
2     5      1991
3     10     1991
4     15     1990
5     10     1991
5      5     1991

I am trying to use dplyr to remove items that have a low number of observations. On this toy data, let's say that I want to remove items with fewer than 2 observations.

>df <- df %>% 
  group_by(Item) %>% 
  tally() %>% 
  filter(n>1)

Item  n
1     2
2     3
5     2

The problem is that I would like to expand this back to the original data, but with this filter applied. I tried the ungroup command, but it only seems to have an effect when grouping by two variables. How can I filter by item counts and then get my original variables back, i.e. value and year? It should look like this:

>df
Item value  year
1     20     1990
1     20     1991
2     30     1990
2     15     1990
2     5      1991
5     10     1991
5      5     1991


asked Jul 28 '17 08:07

Alex

1 Answer

More simply, use dplyr's row_number()

library(dplyr)

df <- read.table("clipboard", header = TRUE, stringsAsFactors = FALSE)

df %>% 
  group_by(Item) %>% 
  filter(max(row_number()) > 1) %>%
  ungroup()

# A tibble: 7 x 3
   Item value  year
  <int> <int> <int>
1     1    20  1990
2     1    20  1991
3     2    30  1990
4     2    15  1990
5     2     5  1991
6     5    10  1991
7     5     5  1991
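
As a possible alternative, here is a minimal sketch that uses n(), which returns the size of the current group, in place of max(row_number()). The data frame is rebuilt inline (an assumption made here so the example does not depend on the clipboard), and the name kept is only illustrative.

```r
library(dplyr)

# Toy data from the question, built inline instead of read from the clipboard
df <- data.frame(
  Item  = c(1, 1, 2, 2, 2, 3, 4, 5, 5),
  value = c(20, 20, 30, 15, 5, 10, 15, 10, 5),
  year  = c(1990, 1991, 1990, 1990, 1991, 1991, 1990, 1991, 1991)
)

# n() is the number of rows in the current group, so filter() keeps every
# row of the groups that have more than one row; value and year are untouched
kept <- df %>%
  group_by(Item) %>%
  filter(n() > 1) %>%
  ungroup()

kept
```

The original attempt lost value and year because tally() collapses each group to a single summary row; ungroup() only drops the grouping and cannot restore the collapsed rows. Filtering inside the grouped data avoids the collapse altogether.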


answered Sep 19 '22 17:09

r.bot
