Partial Correlations in R [closed]

I am trying to compute a partial correlation in R. I have the two data sets that I want to compare and currently only one controlled variable (this will change in the future).

I have looked online to try to work this out myself but it is difficult to understand the terminology used on the websites I have looked at. Can someone please explain how I would go about doing this and perhaps provide a simple example?

Data is in the following form:

                Project.Name Bugs.Project Changes.Project Orgs.Project
1     platform_external_svox            4             161            2
3 platform_packages_apps_Nfc           13             223            2
5      platform_system_media           36             307            2
7     platform_external_mtpd            2              30            2
9            platform_bionic           42            1061            4

I want the correlation between Bugs.Project and Orgs.Project with Changes.Project as a controlled variable. I have downloaded the ppcor library since it looks like it has the functionality that I need. I am unsure how to use it, however. How do I add my data to a matrix and use the pcor function?

This is what I've been trying:

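library(ppcor)

# Columns 2, 4 and 3 of projRelateBugsOrgs are Bugs.Project, Orgs.Project
# and Changes.Project; they keep those names inside y.data.
y.data <- data.frame(
  bpp = c(projRelateBugsOrgs[2]),
  opp = c(projRelateBugsOrgs[4]),
  cpp = c(projRelateBugsOrgs[3])
)

test <- pcor(y.data)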

I just used an example I found and tried to use my data in place of theirs. I don't understand my output.

It looks like this:

$estimate
                Bugs.Project Orgs.Project Changes.Project
Bugs.Project       1.0000000    0.3935535       0.9749296
Orgs.Project       0.3935535    1.0000000      -0.1800788
Changes.Project    0.9749296   -0.1800788       1.0000000

$p.value
                Bugs.Project Orgs.Project Changes.Project
Bugs.Project     0.00000e+00  2.09795e-07       0.0000000
Orgs.Project     2.09795e-07  0.00000e+00       0.0264442
Changes.Project  0.00000e+00  2.64442e-02       0.0000000

$statistic
                Bugs.Project Orgs.Project Changes.Project
Bugs.Project        0.000000     5.190442       53.122165
Orgs.Project        5.190442     0.000000       -2.219625
Changes.Project    53.122165    -2.219625        0.000000

$n
[1] 150

$gp
[1] 1

$method
[1] "pearson"

I think I want something from the $estimate table, but I'm not exactly sure what it's giving me.

user1897691 asked Jan 10 '13 04:01



1 Answer

Reading from help('pcor'), the Value section says:

Value

estimate   a matrix of the partial correlation coefficient between two variables
p.value    a matrix of the p value of the test
statistic  a matrix of the value of the test statistic
n          the number of samples
gp         the number of given variables
method     the correlation method used

The Details section gives:

Details

Partial correlation is the correlation of two variables while controlling for a third or more other variables.
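
To make that definition concrete, here is a minimal base R sketch, assuming a single control variable and Pearson correlation. The data are synthetic and made up for illustration (they are not your data), and the variable names are only stand-ins. It computes the partial correlation two ways: by correlating the residuals left after regressing each variable on the control, and by the closed-form first-order formula.

```r
# Synthetic data, made up for illustration only (not the asker's data).
set.seed(42)
n <- 150
changes <- rnorm(n)                           # control variable
bugs    <- 0.9 * changes + rnorm(n, sd = 0.3)
orgs    <- 0.4 * changes + rnorm(n)

# 1. Residual approach: remove the linear effect of the control from each
#    variable, then correlate what is left.
res_bugs <- resid(lm(bugs ~ changes))
res_orgs <- resid(lm(orgs ~ changes))
r_resid  <- cor(res_bugs, res_orgs)

# 2. Closed-form first-order formula built from the pairwise correlations.
r_bo <- cor(bugs, orgs)
r_bc <- cor(bugs, changes)
r_oc <- cor(orgs, changes)
r_formula <- (r_bo - r_bc * r_oc) / sqrt((1 - r_bc^2) * (1 - r_oc^2))

# Both routes agree up to floating-point error.
stopifnot(isTRUE(all.equal(r_resid, r_formula)))
```

The residual version is the one that generalises: with more controls you add them to the right-hand side of both regressions.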

For your result

$estimate
                Bugs.Project Orgs.Project Changes.Project
Bugs.Project       1.0000000    0.3935535       0.9749296
Orgs.Project       0.3935535    1.0000000      -0.1800788
Changes.Project    0.9749296   -0.1800788       1.0000000

The partial correlation of Changes.Project and Orgs.Project is -0.1800788. This is the correlation of Changes.Project and Orgs.Project controlling for Bugs.Project.

The partial correlation of Changes.Project and Bugs.Project is 0.9749296. This is the correlation of Changes.Project and Bugs.Project controlling for Orgs.Project.

The partial correlation of Orgs.Project and Bugs.Project is 0.3935535. This is the correlation of Orgs.Project and Bugs.Project controlling for Changes.Project.
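
If you only want one of these numbers, you can index the matrix by row and column name. The sketch below rebuilds the matrix from the values printed in your question so that it runs on its own; the object name estimate is a stand-in for test$estimate, which I am assuming behaves like an ordinary named matrix.

```r
# Stand-in for test$estimate, rebuilt from the values printed in the question.
vars <- c("Bugs.Project", "Orgs.Project", "Changes.Project")
estimate <- matrix(
  c(1.0000000,  0.3935535,  0.9749296,
    0.3935535,  1.0000000, -0.1800788,
    0.9749296, -0.1800788,  1.0000000),
  nrow = 3, byrow = TRUE, dimnames = list(vars, vars)
)

# Bugs vs. Orgs, controlling for the remaining column (Changes.Project).
r_bugs_orgs <- estimate["Bugs.Project", "Orgs.Project"]

# The matrix is symmetric, so the order of the two names does not matter.
stopifnot(r_bugs_orgs == estimate["Orgs.Project", "Bugs.Project"])
r_bugs_orgs
```

With the real result the same indexing should apply to the other matrices too, for example test$p.value["Bugs.Project", "Orgs.Project"].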

You could get the same information (if you are only interested in this third case) from:

pcor.test(y.data$Orgs.Project, y.data$Bugs.Project, y.data$Changes.Project)
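
Since you mention that you will have more than one controlled variable later, here is a hedged sketch of that case. The data are synthetic, and Devs.Project is a hypothetical second control column that I made up; it is not in your data. The base R part regresses both variables on all controls and correlates the residuals. The ppcor part runs only if the package is installed and passes the control columns together as the third argument of pcor.test().

```r
# Synthetic data, made up for illustration; Devs.Project is a hypothetical
# second control column that does not appear in the question.
set.seed(1)
n <- 150
d <- data.frame(Changes.Project = rnorm(n), Devs.Project = rnorm(n))
d$Bugs.Project <- 0.8 * d$Changes.Project + 0.3 * d$Devs.Project + rnorm(n)
d$Orgs.Project <- 0.5 * d$Devs.Project + rnorm(n)

# Base R route: regress each variable of interest on all controls, then
# correlate the residuals.
res_bugs <- resid(lm(Bugs.Project ~ Changes.Project + Devs.Project, data = d))
res_orgs <- resid(lm(Orgs.Project ~ Changes.Project + Devs.Project, data = d))
r_partial <- cor(res_bugs, res_orgs)

# ppcor route, run only if the package is installed: the third argument of
# pcor.test() can hold several control columns.
if (requireNamespace("ppcor", quietly = TRUE)) {
  fit <- ppcor::pcor.test(d$Orgs.Project, d$Bugs.Project,
                          d[, c("Changes.Project", "Devs.Project")])
  print(c(base = r_partial, ppcor = fit$estimate))
}
```

For Pearson correlation the two estimates should match up to floating-point error; pcor.test() additionally reports the p value and test statistic.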
mnel answered Nov 14 '22 21:11