
Merge producing unexpected results in R

I am trying to merge:

to_graph <- structure(list(Teacher = c("BS", "BS", "FA"
), Level = structure(c(2L, 1L, 1L), .Label = c("BE", "AE", "ME", 
"EE"), class = "factor"), Count = c(2L, 25L, 28L)), .Names = c("Teacher", 
"Level", "Count"), row.names = c(NA, 3L), class = "data.frame")

and

graph_avg <- structure(list(Teacher = structure(c(1L, 1L, 2L), .Label = c("BS", 
"FA"), class = "factor"), Count.Fraction = c(0.0740740740740741, 
0.925925925925926, 1)), .Names = c("Teacher", "Count.Fraction"
), row.names = c(NA, -3L), class = "data.frame")

with merge(to_graph, graph_avg, by="Teacher"), but instead of getting what I expect (3 rows), I get:

  Teacher Level Count Count.Fraction
1      BS    AE     2     0.07407407
2      BS    AE     2     0.92592593
3      BS    BE    25     0.07407407
4      BS    BE    25     0.92592593
5      FA    BE    28     1.00000000

Any ideas? Thank you!

Jeff Erickson asked Feb 22 '23 07:02



1 Answer

Not sure what you're trying to accomplish. merge is doing what it's supposed to here.

Let's look at both data.frames and the merged result:

graph_avg
  Teacher Count.Fraction
1      BS     0.07407407
2      BS     0.92592593
3      FA     1.00000000

to_graph
  Teacher Level Count
1      BS    AE     2
2      BS    BE    25
3      FA    BE    28

merge(to_graph, graph_avg)
  Teacher Level Count Count.Fraction
1      BS    AE     2     0.07407407
2      BS    AE     2     0.92592593
3      BS    BE    25     0.07407407
4      BS    BE    25     0.92592593
5      FA    BE    28     1.00000000
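The row count above can be checked by hand: for each Teacher value, merge returns one row per pairing of matching rows from the two sides. A minimal sketch, assuming simplified versions of the two data.frames (plain character columns instead of the factors in the question):

```r
# Simplified copies of the question's data (factors replaced by character columns)
to_graph <- data.frame(Teacher = c("BS", "BS", "FA"),
                       Level = c("AE", "BE", "BE"),
                       Count = c(2L, 25L, 28L))
graph_avg <- data.frame(Teacher = c("BS", "BS", "FA"),
                        Count.Fraction = c(0.0740740740740741, 0.925925925925926, 1))

# How many times each key appears on each side
n_x <- table(to_graph$Teacher)
n_y <- table(graph_avg$Teacher)

# Rows per key = (rows on the left) * (rows on the right), summed over keys
expected_rows <- sum(as.integer(n_x) * as.integer(n_y[names(n_x)]))

merged <- merge(to_graph, graph_avg, by = "Teacher")
nrow(merged) == expected_rows
```

BS contributes 2 * 2 = 4 rows and FA contributes 1 * 1 = 1 row, which accounts for the 5 rows shown.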

Now, if I'm going to merge those, I've got to look and see what's common and what I'm going to get for an outcome. Teacher, you have that in both. But if I try to merge on just Teacher, what do I do with BS? It appears twice in both data.frames, so there's no unique identifier that says which BS row on one side belongs with which BS row on the other. If it appeared only once in one of them, it would be easy to solve. Level tells the two BS rows apart in to_graph, but graph_avg has no Level column to match on. So merge does the only thing that doesn't lose any of your data: it returns every combination of the matching rows.

merge is really handy for situations where you've got a small data.frame, say with each teacher in it once, and it has the teacher's age or sex in it. You could merge that into another data.frame with repeated measures on teacher, and every time the teacher appears you'd also know those values. But for what you're doing it's not the right tool.
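To illustrate the case where merge fits, here is a small sketch. The teacher_info data.frame and its Age values are made up for illustration and are not part of the question's data; the point is only that each teacher appears once on that side.

```r
# Repeated measures per teacher (simplified copy of the question's data)
to_graph <- data.frame(Teacher = c("BS", "BS", "FA"),
                       Level = c("AE", "BE", "BE"),
                       Count = c(2L, 25L, 28L))

# Hypothetical lookup table: one row per teacher, invented Age values
teacher_info <- data.frame(Teacher = c("BS", "FA"),
                           Age = c(41L, 35L))

# Each to_graph row matches exactly one teacher_info row,
# so the result has the same number of rows as to_graph
with_age <- merge(to_graph, teacher_info, by = "Teacher")
nrow(with_age) == nrow(to_graph)
```

Because the key is unique in teacher_info, no rows are duplicated and every BS row simply picks up the same Age.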

If these are really your data.frames, and the rows of both are already in the same order, use cbind instead.

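# Name the new column explicitly; otherwise it is labelled graph_avg$Count.Fraction
cbind(to_graph, Count.Fraction = graph_avg$Count.Fraction)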

  Teacher Level Count Count.Fraction
1      BS    AE     2     0.07407407
2      BS    BE    25     0.92592593
3      FA    BE    28     1.00000000

That's probably what you were looking for.
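If you would rather stay with merge, one possible workaround is to give both data.frames a second key that numbers the rows within each teacher, then merge on both columns. This is only a sketch: the Row column is a name introduced here, and it assumes the rows for each teacher are in the same order in both data.frames.

```r
# Simplified copies of the question's data
to_graph <- data.frame(Teacher = c("BS", "BS", "FA"),
                       Level = c("AE", "BE", "BE"),
                       Count = c(2L, 25L, 28L))
graph_avg <- data.frame(Teacher = c("BS", "BS", "FA"),
                        Count.Fraction = c(0.0740740740740741, 0.925925925925926, 1))

# Number the rows within each teacher: 1, 2 for BS and 1 for FA
to_graph$Row  <- ave(seq_along(to_graph$Teacher),  to_graph$Teacher,  FUN = seq_along)
graph_avg$Row <- ave(seq_along(graph_avg$Teacher), graph_avg$Teacher, FUN = seq_along)

# Teacher + Row is now unique on both sides, so each row matches exactly once
result <- merge(to_graph, graph_avg, by = c("Teacher", "Row"))
nrow(result)
```

With a unique key on both sides, the result has 3 rows, one per original row, rather than every BS combination.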

John answered Mar 05 '23 04:03
