Merging complementary rows of a dataframe with R

Tags:

I have such a data frame

0     weekday day month year hour basal bolus carb period.h
1    Tuesday  01    03 2016  0.0  0.25    NA   NA        0
2    Tuesday  01    03 2016 10.9    NA    NA   67       10
3    Tuesday  01    03 2016 10.9    NA  4.15   NA       10
4    Tuesday  01    03 2016 12.0  0.30    NA   NA       12
5    Tuesday  01    03 2016 17.0  0.50    NA   NA       17
6    Tuesday  01    03 2016 17.6    NA    NA   33       17
7    Tuesday  01    03 2016 17.6    NA  1.35   NA       17
8    Tuesday  01    03 2016 18.6    NA    NA   44       18
9    Tuesday  01    03 2016 18.6    NA  1.80   NA       18
10   Tuesday  01    03 2016 18.9    NA    NA   17       18
11   Tuesday  01    03 2016 18.9    NA  0.70   NA       18
12   Tuesday  01    03 2016 22.0  0.40    NA   NA       22
13 Wednesday  02    03 2016  0.0  0.25    NA   NA        0
14 Wednesday  02    03 2016  9.7    NA    NA   39        9
15 Wednesday  02    03 2016  9.7    NA  2.65   NA        9
16 Wednesday  02    03 2016 11.2    NA    NA   13       11
17 Wednesday  02    03 2016 11.2    NA  0.30   NA       11
18 Wednesday  02    03 2016 12.0  0.30    NA   NA       12
19 Wednesday  02    03 2016 12.0    NA    NA   16       12
20 Wednesday  02    03 2016 12.0    NA  0.65   NA       12

If you look at the lines 2 and 3, you notice that they correspond exactly to the same day & time: just for the line #2 the "carb" is not NA, and the "bolus" is not NA (These are data about diabete).

I want to merge such lines into a single one:

2    Tuesday  01    03 2016 10.9    NA    NA   67       10
3    Tuesday  01    03 2016 10.9    NA  4.15   NA       10

2    Tuesday  01    03 2016 10.9    NA    4.15   67       10

I could of course do a brutal double loop over each line, but I look for a cleverer and faster way.

619

asked Oct 11 '16 21:10

Xavier Prudent

1 Answers

You can group your data frame by the common identifier columns weekday, day, month, year, hour, period.h here and then sort and take the first element from the remaining columns which you would like to merge, sort() function by default will remove NAs in the vector to be sorted and thus you will end up with non-NA elements for each column within each group; if all elements in a column are NA, sort(col)[1] returns NA:

library(dplyr)
df %>% 
       group_by(weekday, day, month, year, hour, period.h) %>% 
       summarise_all(funs(sort(.)[1]))

#      weekday   day month  year  hour period.h basal bolus  carb
#       <fctr> <int> <int> <int> <dbl>    <int> <dbl> <dbl> <int>
# 1    Tuesday     1     3  2016   0.0        0  0.25    NA    NA
# 2    Tuesday     1     3  2016  10.9       10    NA  4.15    67
# 3    Tuesday     1     3  2016  12.0       12  0.30    NA    NA
# 4    Tuesday     1     3  2016  17.0       17  0.50    NA    NA
# 5    Tuesday     1     3  2016  17.6       17    NA  1.35    33
# 6    Tuesday     1     3  2016  18.6       18    NA  1.80    44
# 7    Tuesday     1     3  2016  18.9       18    NA  0.70    17
# 8    Tuesday     1     3  2016  22.0       22  0.40    NA    NA
# 9  Wednesday     2     3  2016   0.0        0  0.25    NA    NA
# 10 Wednesday     2     3  2016   9.7        9    NA  2.65    39
# 11 Wednesday     2     3  2016  11.2       11    NA  0.30    13
# 12 Wednesday     2     3  2016  12.0       12  0.30  0.65    16

Instead of sort(), maybe a more appropriate function to use here is na.omit():

df %>% group_by(weekday, day, month, year, hour, period.h) %>% 
       summarise_all(funs(na.omit(.)[1]))

145

answered Nov 05 '22 16:11

Psidom

Related questions
                            
                                How to define y axis breaks in a facetted plot using ggplot2?
                            
                                Error : Could not find build tools necessary to build
                            
                                Grouping rows on the basis of row differences in R [duplicate]
                            
                                Print custom object in rmarkdown and knitr (PDF and HTML)
                            
                                Restart shiny-server on centos
                            
                                Shiny app crashes on Mac in RStudio
                            
                                Testing the Proportional Odds Assumption in R
                            
                                My corrplot does not fit in the screen?
                            
                                R doesn't recognize command line option. (WARNING: unknown option '-d')
                            
                                Conditional reactivity Shiny
                            
                                is it possible to run CDO within a R script?
                            
                                Julia pmap performance
                            
                                mailR: Sending mail to multiple addresses from shiny
                            
                                scale_fill_manual is not working in geom_bar
                            
                                how to pass column names in dplyr select without evaluation
                            
                                readr - how to update col_spec object from spec()
                            
                                Map does not render in shiny app with leaflet and raster
                            
                                R :: override C compiler using Makevars
                            
                                Understanding reactive functions in Shiny
                            
                                Error in x[j] : invalid subscript type 'list' while using subset in R

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Merging complementary rows of a dataframe with R

Tags:

dataframe

r

row

Xavier Prudent

People also ask

1 Answers

Psidom

Recent Activity

Donate For Us