I have the following data frames a, b, and c:
Year<-rep(c("2002","2003"),1)
Crop<-c("TTT","RRR")
a<-data.frame(Year,Crop)
Year<-rep(c("2002","2003"),2)
ProductB<-c("A","A","B","B")
b<-data.frame(Year,ProductB)
Year<-rep(c("2002","2003"),3)
Location<-c("XX","XX","YY","YY","ZZ","ZZ")
c<-data.frame(Year,Location)
and want to combine them. When I use the merge
function, I get the Cartesian product, which is not what I want.
d<-merge(a,b,by="Year")
e<-merge(d,c,by="Year")
I would like the data frame to look like:
Year Crop ProductB Location
2002 TTT A XX
2002 NA B YY
2002 NA NA ZZ
2003 RRR A XX
2003 NA B YY
2003 NA NA ZZ
Is this possible? Thanks for your help
Here's one way using data.table.
require(data.table) ## 1.9.2
# (1)
setDT(a)[, GRP := 1:.N, by=Year]
setDT(b)[, GRP := 1:.N, by=Year]
setDT(c)[, GRP := 1:.N, by=Year]
# (2)
merge(a, merge(b, c, by=c("Year", "GRP"), all=TRUE),
      by=c("Year", "GRP"), all=TRUE)
# Year GRP Crop ProductB Location
# 1: 2002 1 TTT A XX
# 2: 2002 2 NA B YY
# 3: 2002 3 NA NA ZZ
# 4: 2003 1 RRR A XX
# 5: 2003 2 NA B YY
# 6: 2003 3 NA NA ZZ
- (1) - setDT converts the data.frame to a data.table by reference, and then we create a new column GRP that numbers the rows within each Year. With this, each combination of Year, GRP is unique within a table.
- (2) - we merge on the two columns Year, GRP, with all=TRUE so that rows without a match are kept and filled with NA.
.N is a built-in variable that holds the number of rows in the current group.
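If you'd rather not depend on data.table, the same idea can be sketched in base R. This is only an illustration, not the approach shown above: it swaps in ave() to build the within-Year row counter, the helper name add_grp is made up for this example, and the GRP column is dropped at the end to match the layout asked for in the question.

```r
# Rebuild the three data frames from the question.
a <- data.frame(Year = c("2002", "2003"), Crop = c("TTT", "RRR"))
b <- data.frame(Year = rep(c("2002", "2003"), 2),
                ProductB = c("A", "A", "B", "B"))
c <- data.frame(Year = rep(c("2002", "2003"), 3),
                Location = c("XX", "XX", "YY", "YY", "ZZ", "ZZ"))

# Hypothetical helper: number the rows within each Year (1, 2, 3, ...).
add_grp <- function(df) {
  df$GRP <- ave(seq_len(nrow(df)), df$Year, FUN = seq_along)
  df
}

keys <- c("Year", "GRP")
dfs <- lapply(list(a, b, c), add_grp)

# Full outer join of all three tables on (Year, GRP).
res <- Reduce(function(x, y) merge(x, y, by = keys, all = TRUE), dfs)

# Drop the helper column so only the requested columns remain.
res$GRP <- NULL
print(res)
```

Because every (Year, GRP) pair occurs at most once per table, the merge can no longer multiply rows; shorter tables simply contribute NA for the GRP values they lack.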