For instance, suppose I have the following dataframe:
ID<-c("A", "A", "B", "B", "B", "C")
StartDate<-as.Date(c("2018-01-01", "2019-02-05", "2016-04-18", "2020-03-03", "2021-12-13", "2014-03-03"), "%Y-%m-%d")
TermDate<-as.Date(c("2018-02-01", NA, "2016-05-18", "2020-04-03", "2021-12-15", "2014-04-03"), "%Y-%m-%d")
df<-data.frame(ID=ID, StartDate=StartDate, TermDate=TermDate)
ID StartDate TermDate
1 A 2018-01-01 2018-02-01
2 A 2019-02-05 <NA>
3 B 2016-04-18 2016-05-18
4 B 2020-03-03 2020-04-03
5 B 2021-12-13 2021-12-15
6 C 2014-03-03 2014-04-03
What I'm ultimately trying to get is the following:
ID StartDate TermDate
1 A 2018-01-01 <NA>
2 B 2016-04-18 2021-12-15
3 C 2014-03-03 2014-04-03
The functions first and last, available in both dplyr and data.table, could help here.
library(dplyr)
df %>%
group_by(ID) %>%
summarise(StartDate = first(StartDate),
TermDate = last(TermDate))
# ID StartDate TermDate
#* <chr> <date> <date>
#1 A 2018-01-01 NA
#2 B 2016-04-18 2021-12-15
#3 C 2014-03-03 2014-04-03
With data.table:
library(data.table)
setDT(df)[, .(StartDate = first(StartDate), TermDate = last(TermDate)), ID]
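Using min and max instead of first and last removes the need to sort the data first, if it is not already sorted. Note that max returns NA for any ID that has a missing TermDate, which matches the desired output here: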
df %>% group_by(ID) %>%
summarise(StartDate = min(StartDate),
TermDate = max(TermDate))
# A tibble: 3 x 3
ID StartDate TermDate
* <chr> <date> <date>
1 A 2018-01-01 NA
2 B 2016-04-18 2021-12-15
3 C 2014-03-03 2014-04-03
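The NA for ID A above comes from how max treats missing values. A minimal sketch in base R, using a made-up vector term_a that mirrors A's two TermDate values, shows both behaviours:

```r
# A's two TermDate values from the question: one closed term, one still open
term_a <- as.Date(c("2018-02-01", NA))

# Default: any NA in the group makes the result NA
with_na <- max(term_a)

# With na.rm = TRUE the NA is dropped and the latest known date is returned
without_na <- max(term_a, na.rm = TRUE)

with_na
without_na
```

Keeping the default is what you want if an NA TermDate means the term is still open. Add na.rm = TRUE only if the latest known date should win instead.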
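To see the difference, suppose your df is not sorted by StartDate within each ID, like this. first and last then return the wrong dates for A: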
> df
ID StartDate TermDate
1 A 2019-02-05 <NA>
2 A 2018-01-01 2018-02-01
3 B 2016-04-18 2016-05-18
4 B 2020-03-03 2020-04-03
5 B 2021-12-13 2021-12-15
6 C 2014-03-03 2014-04-03
df %>% group_by(ID) %>%
summarise(StartDate = first(StartDate),
TermDate = last(TermDate))
# A tibble: 3 x 3
ID StartDate TermDate
* <chr> <date> <date>
1 A 2019-02-05 2018-02-01
2 B 2016-04-18 2021-12-15
3 C 2014-03-03 2014-04-03
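If you would rather keep first and last, one option is to sort inside the pipeline before grouping. This is a minimal sketch, assuming the TermDate you want is the one on the row with the latest StartDate for each ID. The name df_unsorted is only for illustration:

```r
library(dplyr)

# Same data as in the question, but with the two "A" rows swapped
df_unsorted <- data.frame(
  ID = c("A", "A", "B", "B", "B", "C"),
  StartDate = as.Date(c("2019-02-05", "2018-01-01", "2016-04-18",
                        "2020-03-03", "2021-12-13", "2014-03-03")),
  TermDate = as.Date(c(NA, "2018-02-01", "2016-05-18",
                       "2020-04-03", "2021-12-15", "2014-04-03"))
)

res <- df_unsorted %>%
  arrange(ID, StartDate) %>%   # put each ID's rows in chronological order
  group_by(ID) %>%
  summarise(StartDate = first(StartDate),
            TermDate  = last(TermDate))
res
```

With the rows sorted first, A gets StartDate 2018-01-01 and an NA TermDate again, as in the desired output.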
We can also do
library(dplyr)
df %>%
group_by(ID) %>%
summarise(StartDate = StartDate[1],
TermDate = TermDate[n()])
Another data.table option
setDT(df)[
,
as.list(
setNames(
data.frame(.SD)[cbind(c(1, .N), c(1, 2))],
names(.SD)
)
), ID
]
gives
ID StartDate TermDate
1: A 2018-01-01 <NA>
2: B 2016-04-18 2021-12-15
3: C 2014-03-03 2014-04-03
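For completeness, the same first-row/last-row idea can be sketched in base R without any packages. This is not taken from the answers above; it assumes the rows are already in date order within each ID, and the names df_base and res_base are made up for the example:

```r
# Rebuild the question's data so the example is self-contained
df_base <- data.frame(
  ID = c("A", "A", "B", "B", "B", "C"),
  StartDate = as.Date(c("2018-01-01", "2019-02-05", "2016-04-18",
                        "2020-03-03", "2021-12-13", "2014-03-03")),
  TermDate = as.Date(c("2018-02-01", NA, "2016-05-18",
                       "2020-04-03", "2021-12-15", "2014-04-03"))
)

# Split into one data frame per ID, keep the first StartDate and the
# last TermDate of each piece, then stack the one-row results
res_base <- do.call(rbind, lapply(split(df_base, df_base$ID), function(g) {
  data.frame(ID = g$ID[1],
             StartDate = g$StartDate[1],
             TermDate = g$TermDate[nrow(g)])
}))
res_base
```

The row names of res_base will be the ID values. Call rownames(res_base) <- NULL if you prefer plain row numbers.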