Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Excluding unique dates

Tags:

r

I have data that looks like this in excel and extends to more (Date5, Date6....)

 Date1   Value1 Date2    Value2 Date3    Value3 Date4   Value4
 1/2/2004   17  1/3/2004    27  1/1/2004    17  1/3/2004    31
 1/3/2004   26  1/4/2004    30  1/3/2004    29  1/4/2004    36
 1/4/2004   22  1/5/2004    22  1/4/2004    28  1/5/2004    33
 1/5/2004   17  1/6/2004    28  1/5/2004    36  1/6/2004    50
 1/13/2004  15  1/7/2004    17  1/12/2004   15  1/8/2004    9
 1/14/2004  10  1/14/2004   21  1/14/2004   12  1/14/2004   11

And i want to exclude any values that its associated date doesn't exist in all series.

for the sample data i posted the result should look like this:

Date    Value1  Value2  Value3  Value4
1/3/2004    26    27    29      31
1/4/2004    22    30    28      36
1/5/2004    17    22    36      33
1/14/2004   10    21    12      11
like image 806
G_z Avatar asked Nov 11 '15 09:11

G_z


1 Answers

Date <- Reduce(intersect, list(df$Date1, df$Date2, df$Date3, df$Date4))
Value1 <- df[df$Date1 %in% Date, ]$Value1
Value2 <- df[df$Date2 %in% Date, ]$Value2
Value3 <- df[df$Date3 %in% Date, ]$Value3
Value4 <- df[df$Date4 %in% Date, ]$Value4
data.frame(Date, Value1, Value2, Value3, Value4)


# Date Value1 Value2 Value3 Value4
# 1  1/3/2004     26     27     29     31
# 2  1/4/2004     22     30     28     36
# 3  1/5/2004     17     22     36     33
# 4 1/14/2004     10     21     12     11

As mentioned by @docendo discimus that this could be lengthy in case of multiple columns, an updated way would be

Date <- Reduce(intersect, list(df$Date1, df$Date2, df$Date3, df$Date4))
Values <- df[, seq(0, ncol(df), by=2)]
Dates <- df[, seq(1, ncol(df), by=2)]
mat <- apply(Dates, 2, function(x) {x %in% Date})
data.frame(Date, matrix(Values[mat], nrow = 4))

# Date X1 X2 X3 X4
# 1  1/3/2004 26 27 29 31
# 2  1/4/2004 22 30 28 36
# 3  1/5/2004 17 22 36 33
# 4 1/14/2004 10 21 12 11

As per @David 's comments, this can be further improved by using

Values <- df[c(FALSE, TRUE)]
Dates <- df[c(TRUE, FALSE)]
Date <- Reduce(intersect, as.list(Dates))
mat <- apply(Dates, 2, function(x) {x %in% Date}) 
data.frame(Date, matrix(Values[mat], nrow = ncol(df)/2))

#        Date X1 X2 X3 X4
# 1  1/3/2004 26 27 29 31
# 2  1/4/2004 22 30 28 36
# 3  1/5/2004 17 22 36 33
# 4 1/14/2004 10 21 12 11
like image 122
Ronak Shah Avatar answered Oct 26 '22 06:10

Ronak Shah