Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Merge two rows in data.frame

Tags:

r

Similar to Merge rows in one data.frame and Merge two rows in one dataframe, when the rows are disjoint and contain nulls I face the following problem for which the aforementioned posts could really help.

The data that I have looks like this

| Date     | Checkin | Origin | Checkout | Destination |
| 03-07-17 | 08:00   | A      |          |             |
| 03-07-17 |         | A      | 09:00    | B           |
| 03-07-17 | 17:00   | B      |          |             |
| 03-07-17 |         | B      | 18:00    | A           |
| 04-07-17 | 08:00   | A      |          |             |
| 04-07-17 |         | A      | 09:00    | B           |
| 04-07-17 | 17:00   | B      |          |             |
| 04-07-17 |         | B      | 18:00    | A           |

Now I want to aggregate this into 4 rows looking like:

| Date     | Checkin | Origin | Checkout | Destination |
| 03-07-17 | 08:00   | A      | 09:00    | B           |
| 03-07-17 | 17:00   | B      | 18:00    | A           |
| 04-07-17 | 08:00   | A      | 09:00    | B           |
| 04-07-17 | 17:00   | B      | 18:00    | A           |

Any ideas? Thanks!

like image 900
Pascal Avatar asked Mar 09 '23 13:03

Pascal


1 Answers

An idea via dplyr,

library(dplyr)

df %>% 
 group_by(Date, Origin) %>% 
 summarise_all(funs(trimws(paste(., collapse = ''))))
 A tibble: 4 x 5
 Groups:   Date [?]
        Date   Origin Checkin Checkout Destination
       <chr>    <chr>   <chr>    <chr>       <chr>
1  03-07-17   A         08:00    09:00           B
2  03-07-17   B         17:00    18:00           A
3  04-07-17   A         08:00    09:00           B
4  04-07-17   B         17:00    18:00           A

DATA

dput(df)
structure(list(Date = c(" 03-07-17 ", " 03-07-17 ", " 03-07-17 ", 
" 03-07-17 ", " 04-07-17 ", " 04-07-17 ", " 04-07-17 ", " 04-07-17 "
), Checkin = c(" 08:00   ", "         ", " 17:00   ", "         ", 
" 08:00   ", "         ", " 17:00   ", "         "), Origin = c(" A      ", 
" A      ", " B      ", " B      ", " A      ", " A      ", " B      ", 
" B      "), Checkout = c("          ", " 09:00    ", "          ", 
" 18:00    ", "          ", " 09:00    ", "          ", " 18:00    "
), Destination = c("             ", " B           ", "             ", 
" A           ", "             ", " B           ", "             ", 
" A           ")), .Names = c("Date", "Checkin", "Origin", "Checkout", 
"Destination"), row.names = c(NA, -8L), class = "data.frame")
like image 155
Sotos Avatar answered Mar 11 '23 13:03

Sotos