
Add row in each group using dplyr and add_row()

If I add a new row to the iris dataset with:

library(dplyr)  # provides %>% and re-exports as_tibble() and add_row()

iris <- as_tibble(iris)

iris %>% 
    add_row(.before = 0)

# A tibble: 151 × 5
    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
          <dbl>       <dbl>        <dbl>       <dbl>   <chr>
1            NA          NA           NA          NA    <NA> <--- Good!
2           5.1         3.5          1.4         0.2  setosa
3           4.9         3.0          1.4         0.2  setosa

It works. So, why can't I add a new row on top of each "subset" with:

iris %>% 
 group_by(Species) %>% 
 add_row(.before=0)

Error: is.data.frame(df) is not TRUE
Dan asked Apr 13 '17 23:04


2 Answers

If you want to use a grouped operation, you need do(), as JasonWang described in his comment. Other verbs such as mutate() or summarise() expect a result with either the same number of rows as the group (50 in your case) or a single row (e.g. when summarising).
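
JasonWang's comment is not reproduced on this page, so the exact call is an assumption. A do()-based version would presumably look something like this sketch (the name with_blank_rows is made up for illustration):

```r
library(dplyr)  # also re-exports add_row() from tibble

# Inside do(), `.` stands for the rows of the current group as an
# ungrouped tibble, so add_row() accepts it.
with_blank_rows <- iris %>%
  group_by(Species) %>%
  do(add_row(., .before = 0))

nrow(with_blank_rows)  # 150 original rows plus one blank row per species
```

Because `.` still contains the Species column here, the inserted rows should have NA for Species as well.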

As you probably know, do() can be slow and should be a last resort when you cannot get the result another way. Your task is quite simple because it only involves adding extra rows to your data frame, which can be done with plain indexing: an NA in the row index yields a row of NAs (look at the output of iris[NA, ]).

What you want is essentially to create a vector

indices <- c(NA, 1:50, NA, 51:100, NA, 101:150)

(since the first group is in rows 1 to 50, the second one in 51 to 100 and the third one in 101 to 150).

The result is then iris[indices, ].
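
A quick way to check this, sketched on the plain built-in iris data frame rather than the tibble from the question (the names df and padded are made up for illustration):

```r
# Plain data frame, not the tibble version from the question.
df <- datasets::iris

# One NA ahead of each block of 50 rows.
indices <- c(NA, 1:50, NA, 51:100, NA, 101:150)
padded <- df[indices, ]

nrow(padded)                  # 153
which(is.na(padded$Species))  # 1 52 103
```

Note that with this approach every column of the inserted rows is NA, including Species.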

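A more general way of building this vector uses group_indices(). In dplyr 1.0.0 and later, passing the grouping column directly to group_indices() is deprecated, so group the data first:

indices <- seq(nrow(iris)) %>% 
    split(group_indices(group_by(iris, Species))) %>% 
    map(~ c(NA, .x)) %>%
    unlist(use.names = FALSE)

(map() comes from purrr, which I assume you have loaded since you tagged this with tidyverse.)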
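
The same idea can be wrapped in a small helper that works for any grouping column. This is only a sketch: the name prepend_blank_rows is made up here, and it swaps in base R's split() on the column values in place of group_indices() and map().

```r
# Hypothetical helper: put one all-NA row in front of each group of `df`,
# where groups are defined by the column named in `group_col`.
prepend_blank_rows <- function(df, group_col) {
  # Row numbers per group; drop = TRUE skips unused factor levels.
  rows_by_group <- split(seq_len(nrow(df)), df[[group_col]], drop = TRUE)
  indices <- unlist(lapply(rows_by_group, function(i) c(NA, i)),
                    use.names = FALSE)
  df[indices, , drop = FALSE]
}

result <- prepend_blank_rows(datasets::iris, "Species")
nrow(result)  # 153
```

Two caveats with this sketch: the output is ordered by group, so unsorted input gets reordered, and split() drops rows whose grouping value is NA.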

konvas answered Sep 17 '22 16:09


A more recent approach is to use group_modify() instead of do().

iris %>%
  as_tibble() %>%
  group_by(Species) %>% 
  group_modify(~ add_row(.x, .before = 0))
#> # A tibble: 153 x 5
#> # Groups:   Species [3]
#>    Species Sepal.Length Sepal.Width Petal.Length Petal.Width
#>    <fct>          <dbl>       <dbl>        <dbl>       <dbl>
#>  1 setosa          NA          NA           NA          NA  
#>  2 setosa           5.1         3.5          1.4         0.2
#>  3 setosa           4.9         3            1.4         0.2
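
One detail worth noting in this output: Species is filled in for the new rows while the other columns are NA. That is presumably because group_modify() hands each group to the function without the grouping column and adds it back afterwards. A sketch that checks this (the name res is made up for illustration):

```r
library(dplyr)  # also re-exports add_row() from tibble

res <- iris %>%
  group_by(Species) %>%
  group_modify(~ add_row(.x, .before = 0)) %>%
  ungroup()

# The blank rows: every measurement is NA, but Species is kept.
res %>% filter(is.na(Sepal.Length))
```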
Alexlok answered Sep 17 '22 16:09