R function to combine rows based on duplicate times

Question

I have a large dataset with duplicate timestamps (rows). Each duplicate row holds data in different columns, and I would like to combine them into a single row. The data looks like this:

date              P1    PT1    P2    PT2    P3    PT3
5/5/2011@11:40    NA    NA     NA    NA     9.4   10.1
5/5/2011@11:40    5.6   10.2   8.5   10.1   NA    NA

I would like to get to this:

date              P1    PT1    P2    PT2    P3    PT3
5/5/2011@11:40    5.6   10.2   8.5   10.1   9.4   10.1

My dataset is 10-minute data spanning ten years, and the repeats are somewhat random. The @ sign was added so the timestamps display properly.

I've tried rbind and rbind.row.names to no avail.

Thanks!

mikebader · Accepted Answer

You can use the summarize() function in dplyr. The following works for this example, but it does not check whether duplicate rows hold conflicting values; it simply takes the maximum non-missing value in each column for each date.

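library(dplyr)

# Example data: two rows share a timestamp, each holding part of the measurements
df <- tribble(
    ~date,            ~P1, ~PT1, ~P2, ~PT2, ~P3, ~PT3,
    "5/5/2011@11:40", NA,  NA,   NA,  NA,   9.4, 10.1,
    "5/5/2011@11:40", 5.6, 10.2, 8.5, 10.1, NA,  NA
)

# Collapse to one row per date, keeping the maximum non-missing value per column.
# The lambda replaces passing na.rm through across(), which is deprecated in dplyr >= 1.1.0.
df %>%
    group_by(date) %>%
    summarize(across(starts_with("P"), ~ max(.x, na.rm = TRUE)))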

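In other words, it works if you are sure that, for each date, every column holds either a single number or NA across the duplicate rows. If a column is NA in every row for a date, max() with na.rm = TRUE returns -Inf with a warning.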
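If you are not sure the duplicates never disagree, one option is to replace max() with a small helper that checks for conflicts. The following is only a sketch, not part of the original answer: combine_unique() is a hypothetical helper name, and the third row of the example data is made up to show the all-NA case.

```r
library(dplyr)

# Hypothetical helper (an assumption, not a dplyr function): return the single
# non-missing value in x, NA if every value is missing, and stop with an
# error if the duplicate rows disagree.
combine_unique <- function(x) {
    vals <- unique(x[!is.na(x)])
    if (length(vals) == 0) return(NA_real_)
    if (length(vals) > 1) {
        stop("conflicting values: ", paste(vals, collapse = ", "))
    }
    vals
}

# Example data; the 11:50 row is invented for illustration
df <- tribble(
    ~date,            ~P1, ~PT1, ~P2, ~PT2, ~P3, ~PT3,
    "5/5/2011@11:40", NA,  NA,   NA,  NA,   9.4, 10.1,
    "5/5/2011@11:40", 5.6, 10.2, 8.5, 10.1, NA,  NA,
    "5/5/2011@11:50", 5.7, 10.3, NA,  NA,   NA,  NA
)

# One row per date; columns with no data for a date stay NA instead of -Inf
combined <- df %>%
    group_by(date) %>%
    summarize(across(starts_with("P"), combine_unique))

print(combined)
```

Compared with max(), this keeps columns with no data as NA and fails loudly when two rows for the same date carry different numbers, so silent data loss is less likely. Whether an error is the right reaction to a conflict depends on your data; you could just as well return NA or the mean there.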

Tags: dataframe, r, dplyr, rbind

schultz45
