
How do I perform a function on each row of a data frame and have just one element of the output inserted as a new column in that row?

Tags: r, row

It is easy to do an exact binomial test on two values, but what if one wants to run the test on a whole set of success counts and trial counts? I created a data frame of test sensitivities and potential numbers of enrollees in a study, and then for each row I calculated how many successes that would be. Here is the code.

sens <- seq(from = 0.1, to = 0.5, by = 0.05)
enroll <- seq(from = 20, to = 200, by = 20)
df <- expand.grid(sens = sens, enroll = enroll)
df <- transform(df, succes = sens * enroll)

But now, how do I use each row's combination of successes and number of trials to do the binomial test?

I am only interested in the upper limit of the 95% confidence interval of the binomial test. I want that single number to be added to the data frame as a column called "upper.limit".

I thought of something along the lines of

binom.test(succes,enroll)$conf.int    

Alas, conf.int gives something like

[1] 0.1266556 0.2918427
attr(,"conf.level")
[1] 0.95

All I want is just 0.2918427.

Furthermore, I have a feeling that there has to be a do.call in there somewhere, and maybe even an lapply, but I do not know how that would go through the whole data frame. Or should I perhaps be using plyr?

Clearly my head is spinning. Please make it stop.

Farrel asked Nov 25 '10 00:11


2 Answers

If your expression gives you (almost) what you want, then take just the second element of conf.int:

binom.test(succes,enroll)$conf.int[2]
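To see why the `[2]` index is enough, here is a minimal sketch for a single row. The counts (2 successes in 20 trials) are taken from the first row of the question's data frame; the variable names `ci` and `upper` are only illustrative.

```r
# One row of the grid: 2 successes out of 20 trials
ci <- binom.test(2, 20)$conf.int

# conf.int is a length-2 numeric vector (lower, upper) that also
# carries a "conf.level" attribute
upper <- ci[2]  # subsetting keeps the upper bound and drops the attribute

print(upper)
```

Because subsetting with `[` drops the "conf.level" attribute, the result is a plain single number that can be stored in a data frame column.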

Then apply it across the rows:

> df$UCL <- apply(df, 1, function(x)  binom.test(x[3],x[2])$conf.int[2] )
> head(df)
  sens enroll succes       UCL
1 0.10     20      2 0.3169827
2 0.15     20      3 0.3789268
3 0.20     20      4 0.4366140
4 0.25     20      5 0.4910459
5 0.30     20      6 0.5427892
6 0.35     20      7 0.5921885
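Note that apply() converts the data frame to a matrix first, so the positional indices x[3] and x[2] only work while every column is numeric. As a hedged alternative (not part of the original answer), the sketch below uses mapply() to walk the succes and enroll columns in parallel by name, and stores the result in the "upper.limit" column the question asked for.

```r
# Rebuild the data frame from the question
sens   <- seq(from = 0.1, to = 0.5, by = 0.05)
enroll <- seq(from = 20, to = 200, by = 20)
df <- expand.grid(sens = sens, enroll = enroll)
df <- transform(df, succes = sens * enroll)

# mapply() calls the function once per row, pairing succes[i] with enroll[i];
# each call returns only the upper confidence limit
df$upper.limit <- mapply(
  function(x, n) binom.test(x, n)$conf.int[2],
  df$succes, df$enroll
)

head(df)
```

This version would keep working if a character or factor column were later added to df, since the columns are referenced by name rather than by position.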
IRTFM answered Oct 02 '22 14:10


Here you go:

R> newres <- do.call(rbind, apply(df, 1, function(x) { 
+                     bt <- binom.test(x[3], x[2])$conf.int; 
+                     newdf <- data.frame(t(x), UCL=bt[2]) }))
R>
R> head(newres)
  sens enroll succes     UCL
1 0.10     20      2 0.31698
2 0.15     20      3 0.37893
3 0.20     20      4 0.43661
4 0.25     20      5 0.49105
5 0.30     20      6 0.54279
6 0.35     20      7 0.59219
R> 

This uses apply to loop over the rows of your existing data, compute the test, and return the value you want by sticking it into a new (one-row) data.frame. We then glue all 90 of those data.frame objects into a single new one with do.call(rbind, ...) over the list we got from apply.

Ah yes, if you just want to directly insert a single column the other answer rocks as it is simple. My longer answer shows how to grow or construct a data.frame during the sweep of apply.
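The same build-a-row-then-bind pattern can be sketched with lapply() over row indices, which makes the intermediate list explicit. This is only an illustration under assumptions: the names `rows` and `LCL` are not in the original answer, and the lower limit is added purely to show that several new columns can be attached at once.

```r
# Rebuild the data frame from the question
sens   <- seq(from = 0.1, to = 0.5, by = 0.05)
enroll <- seq(from = 20, to = 200, by = 20)
df <- expand.grid(sens = sens, enroll = enroll)
df <- transform(df, succes = sens * enroll)

# Build a list holding one single-row data.frame per input row
rows <- lapply(seq_len(nrow(df)), function(i) {
  ci <- binom.test(df$succes[i], df$enroll[i])$conf.int
  data.frame(df[i, ], LCL = ci[1], UCL = ci[2])
})

# do.call(rbind, rows) is equivalent to rbind(rows[[1]], rows[[2]], ...)
newres <- do.call(rbind, rows)

head(newres)
```

For a single new column the direct assignment is simpler; the list-and-rbind form is mainly useful when each row produces several values.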

Dirk Eddelbuettel answered Oct 02 '22 13:10