loop: selection of variable for correlation function in r

Question

Here is what I intend to do (for a fairly large number of variables and dataset):

mygroupdf <- data.frame (varname = c("A", "B", "c1", "D2",
    "E", "F", "g1"), group = c(1, 1, 1, 2,3,3,4))

> mygroupdf
      varname group
  1       A     1
  2       B     1
  3      c1     1
  4      D2     2
  5       E     3
  6       F     3
  7      g1     4

This dataframe only consists of information for grouping of variables:

group 1 = A, B, c1
group 2 = D2
group 3 = E, F
group 4 = g1

Second dataset - contains actual data

set.seed(1234)
dataf <- data.frame (yvar = rnorm (10, 10,3), 
    A = sample(c(1,0), 10, T), B = sample(c(1,0), 10, T), 
    c1 = sample (c(1,0), 10, T), D2 = sample (c(1,0), 10, T), 
    E= sample (c(1,0), 10, T),F = sample (c(1,0), T), 
    g1 = sample (c(1,0), 10, T))

# manual workout:
xtemp <- dataf$A* dataf$B * dataf$c1 # all from group 1
# I error in previous version it is * not + 
# (is product of all members of a group i.e. 
 xtemp <- dataf$D2 (- group 2)
 xtemp <- dataf$E * dataf$F (- group 3)
 xtemp <- dataf$G (- group 4)

Then correlation of the product with Yvar:

x <- cor(dataf$yvar, xtemp)

I want to wrap it to a function so that I can apply it to the 1000 groups of variables in my dataset.

   corrfun <- function (x, V1, V2, V3) {
           xtemp <- V1 * V2  + V3
           x <- cor(dataf$yvar, xtemp)
           return (x)
          }

As different groups have different variables, I am not sure how can I build such a function and apply to whole dataset. Help please !

Edits: process:

enter image description here

Justin · Accepted Answer

I'll wager a guess...

corrfun <- function (group.no, x=dataf, x.lookup=mygroupdf) {
  xtemp <- apply(x[x.lookup$varname[x.lookup$group == group.no]], 1, prod)

  out <- cor(x$yvar, xtemp)

  return (out)
}

>     corrfun(1)
[1] 0.35593
> corrfun(2)
[1] 0.4181311
>

loop: selection of variable for correlation function in r

Tags:

variables

loops

dataframe

r

SHRram

1 Answers

Justin

Recent Activity

Donate For Us

loop: selection of variable for correlation function in r

Tags:

variables

loops

dataframe

r

SHRram

1 Answers

Justin

Related questions

Recent Activity

Donate For Us