R rolling up rows to a single row (continuous & factor variables)

I am trying to roll up a bunch of rows for one day into a single row. I would like it in dplyr if possible. I know that my code is far from correct, but this was how far I got:

data %>%
  group_by(DAY) %>%
  summarise_each(funs(Sum = n()), SEX, GROUP, TOTAL)

Original:

DAY     SEX     GROUP   TOTAL
7/1/14  FEMALE  A       1
7/1/14  FEMALE  B       1
7/1/14  FEMALE  B       1
7/1/14  FEMALE  A       1
7/1/14  MALE    A       1
7/1/14  MALE    B       2

New:

DAY     FEMALE  MALE    GROUP_A GROUP_B TOTAL
7/1/14  4       2       3       3       7

Another way with data.table, tested on a data.frame with more than one day.

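library(data.table)
# Per DAY: count the levels of SEX and GROUP with table(), sum TOTAL,
# and return everything as one list so each element becomes a column
setDT(data)[, as.list(c(table(SEX), table(GROUP), TOTAL = sum(TOTAL))), by = DAY]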

#      DAY FEMALE MALE A B TOTAL
#1: 7/1/14      3    0 1 2     3
#2: 8/1/14      1    2 2 1     4

EDIT: another, less manual, option (you don't need to know which variables are factors and which are numeric), thanks to some help from @jangorecki and @DavidArenburg

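# Flag the numeric and factor columns; [-1] drops DAY, the grouping column
wh_num <- sapply(data, is.numeric)[-1]
wh_fact <- sapply(data, is.factor)[-1]
# Per DAY: tabulate every factor column and sum every numeric column
setDT(data)[, as.list(c(lapply(.SD[, wh_fact, with = FALSE], table), 
                        lapply(.SD[, wh_num, with = FALSE], sum), 
                        recursive = TRUE)), by = DAY]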

#      DAY SEX.FEMALE SEX.MALE GROUP.A GROUP.B TOTAL
#1: 7/1/14          3        0       1       2     3
#2: 8/1/14          1        2       2       1     4
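
For readers who want to see the per-column logic without data.table, here is a minimal base R sketch of the same idea: tabulate every non-numeric column, sum every numeric one, and bind one row per day. The names days_df, roll_up_day, value_cols, by_day, rolled and result are hypothetical and only used for illustration; days_df is meant to mirror the two-day sample data listed at the end of this answer.

```r
# Hypothetical copy of the two-day sample data, kept as a plain data.frame
days_df <- data.frame(
  DAY   = c("7/1/14", "7/1/14", "7/1/14", "8/1/14", "8/1/14", "8/1/14"),
  SEX   = factor(c("FEMALE", "FEMALE", "FEMALE", "FEMALE", "MALE", "MALE")),
  GROUP = factor(c("A", "B", "B", "A", "A", "B")),
  TOTAL = c(1L, 1L, 1L, 1L, 1L, 2L)
)

# Hypothetical helper: collapse one day's rows into a single named vector.
# Numeric columns are summed; every other column is counted with table().
roll_up_day <- function(rows) {
  parts <- lapply(rows, function(col) {
    if (is.numeric(col)) sum(col) else table(col)
  })
  unlist(parts)  # names become SEX.FEMALE, SEX.MALE, GROUP.A, ...
}

# Split the non-DAY columns by day, roll up each piece, and stack the rows
value_cols <- setdiff(names(days_df), "DAY")
by_day     <- split(days_df[value_cols], days_df$DAY)
rolled     <- do.call(rbind, lapply(by_day, roll_up_day))
result     <- data.frame(DAY = rownames(rolled), rolled, row.names = NULL)
print(result)
```

This relies on SEX and GROUP being factors: a factor keeps all of its levels inside each per-day subset, so table() reports zero counts and every day yields the same set of columns.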

Data:

data <- structure(list(DAY = c("7/1/14", "7/1/14", "7/1/14", "8/1/14", 
"8/1/14", "8/1/14"), SEX = structure(c(1L, 1L, 1L, 1L, 2L, 2L
), .Label = c("FEMALE", "MALE"), class = "factor"), GROUP = structure(c(1L, 
2L, 2L, 1L, 1L, 2L), .Label = c("A", "B"), class = "factor"), 
    TOTAL = c(1L, 1L, 1L, 1L, 1L, 2L)), .Names = c("DAY", "SEX", 
"GROUP", "TOTAL"), row.names = c(NA, -6L), class = "data.frame")

It may seem a little arcane, but here is a short incantation:

# dat is assumed to hold the question's six rows (the answer does not define it)
res <- dat %>% group_by(DAY) %>%
  summarise_each(funs(ifelse(is.numeric(.), sum(.), list(table(.)))))

data.frame(DAY=res$DAY, t(unlist(res[, 2:ncol(res)])))
#      DAY SEX.FEMALE SEX.MALE GROUP.A GROUP.B TOTAL
# 1 7/1/14          4        2       3       3     7

Here, each column is summarised as a table if it is not numeric, or summed if it is (the TOTAL column). The table has to be wrapped in a list because summarise_each expects a single value per group and column. The last line then unlists the result and expands it into a regular data.frame.
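
Since summarise_each() and funs() are deprecated in recent dplyr releases, here is a minimal sketch of the same roll-up written with plain summarise(). It is not the answer's method: it names each level by hand instead of detecting column types, which is more typing but produces exactly the column names the question asks for. The data frame dat below is a hypothetical reconstruction of the question's six rows.

```r
library(dplyr)

# Hypothetical reconstruction of the question's data (one day, six rows)
dat <- data.frame(
  DAY   = rep("7/1/14", 6),
  SEX   = c("FEMALE", "FEMALE", "FEMALE", "FEMALE", "MALE", "MALE"),
  GROUP = c("A", "B", "B", "A", "A", "B"),
  TOTAL = c(1, 1, 1, 1, 1, 2)
)

res <- dat %>%
  group_by(DAY) %>%
  summarise(
    FEMALE  = sum(SEX == "FEMALE"),  # count rows per level by summing TRUEs
    MALE    = sum(SEX == "MALE"),
    GROUP_A = sum(GROUP == "A"),
    GROUP_B = sum(GROUP == "B"),
    TOTAL   = sum(TOTAL)             # kept last so it does not shadow the input column
  )

print(res)
```

Counting with sum(condition) works because TRUE is treated as 1 and FALSE as 0, so no reshaping step is needed afterwards.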

Tags: r, dplyr

yokota

2 Answers

Cath

Rorschach
