Expand data frame into combinations of row pairs

Tags: dataframe, r

I have a data frame that contains an identifier / key column followed by several value columns. I want to expand the data frame by taking unique pairs of entries in the key column as the new rows, and transform the value columns by applying a binary operation to the entries from the corresponding rows.

E.g.

> Test_data
         SYS dE_water_free dE_water_periodic dE_membrane_periodic    RTlogKi
1 4NTJ_D294N       -56.542           -56.642                   NA -0.9629731
2  4NTJ_wild      -171.031          -162.030                   NA -0.8877264
3 4PXZ_D294N       -53.430           -50.810                   NA -1.1301124
4  4PXZ_wild       -59.990           -57.320                   NA -1.2318835
5 4PY0_D294N       -77.040           -72.880                   NA -1.1351579
6  4PY0_wild       -79.080           -74.950                   NA -1.2297302

Some of the columns may or may not contain missing value(s).

What I would like is to take each pair of SYS entries, e.g. SYS1 and SYS2, and compute a binary operation on the corresponding value rows, e.g. dE_water_free(SYS==SYS1) - dE_water_free(SYS==SYS2), and so on for the other value columns:

        SYS1       SYS2   dE_water_free   dE_water_periodic   ...etc.
1 4NTJ_D294N  4NTJ_wild         114.489             105.610
2 4NTJ_D294N 4PXZ_D294N          -3.112               5.832
... etc.

I can use the function combn() to get an array of pairs from the SYS column to form the entries in SYS1 and SYS2, but I'm not sure how to use it to build the new data frame...

I know one option would be to use something like mapply and build each column individually by hand, then paste them all into a new data frame, but that seems clunky and slow. There should be a more automatic function to do this, like reshape, merge, or recast, but I can't figure out how to make that work.

asked May 18 '15 20:05 by wmsmith


1 Answer

outer is well suited for this type of problem:

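# Named vector: dE_water_free values labelled by their SYS identifier
de_wf <- with(Test_data, setNames(dE_water_free, SYS))
# Entry [i, j] of the result is de_wf[i] - de_wf[j]
outer(de_wf, de_wf, `-`)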

produces:

           4NTJ_D294N 4NTJ_wild 4PXZ_D294N 4PXZ_wild 4PY0_D294N 4PY0_wild
4NTJ_D294N      0.000   114.489     -3.112     3.448     20.498    22.538
4NTJ_wild    -114.489     0.000   -117.601  -111.041    -93.991   -91.951
4PXZ_D294N      3.112   117.601      0.000     6.560     23.610    25.650
4PXZ_wild      -3.448   111.041     -6.560     0.000     17.050    19.090
4PY0_D294N    -20.498    93.991    -23.610   -17.050      0.000     2.040
4PY0_wild     -22.538    91.951    -25.650   -19.090     -2.040     0.000
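
The matrix holds every ordered pair, including the zero diagonal. To get from it to the long SYS1 / SYS2 layout asked for in the question, one option is to keep only the entries above the diagonal. This is a minimal sketch rather than part of the answer as given; `diff_mat`, `idx` and `pairs_long` are names made up for illustration, and `de_wf` is rebuilt by hand from the question's data so the block runs on its own.

```r
# dE_water_free from the question, as a named vector (same as de_wf above).
de_wf <- c(`4NTJ_D294N` = -56.542, `4NTJ_wild` = -171.031,
           `4PXZ_D294N` = -53.430, `4PXZ_wild` = -59.990,
           `4PY0_D294N` = -77.040, `4PY0_wild` = -79.080)

# diff_mat[i, j] = de_wf[i] - de_wf[j]
diff_mat <- outer(de_wf, de_wf, `-`)

# Row/column positions strictly above the diagonal: each unordered pair once.
idx <- which(upper.tri(diff_mat), arr.ind = TRUE)

# Long format: one row per pair, value = SYS1 minus SYS2.
pairs_long <- data.frame(
  SYS1 = rownames(diff_mat)[idx[, "row"]],
  SYS2 = colnames(diff_mat)[idx[, "col"]],
  dE_water_free = diff_mat[idx],
  stringsAsFactors = FALSE
)
```

With six systems this gives choose(6, 2) = 15 rows. The rows come out in column-major order of the matrix, so sort them afterwards if a particular pair order matters.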
answered Sep 20 '22 02:09 by BrodieG
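
To handle all value columns at once and produce the data frame described in the question, a possible sketch is to let combn() generate pairs of row indices and subtract the two row subsets. It assumes every column other than SYS is numeric and that the operation wanted is SYS1 minus SYS2. The names `pair_idx`, `value_cols`, `diffs` and `pair_diffs` are illustrative, and Test_data is retyped from the question.

```r
# Test_data as shown in the question.
Test_data <- data.frame(
  SYS = c("4NTJ_D294N", "4NTJ_wild", "4PXZ_D294N",
          "4PXZ_wild", "4PY0_D294N", "4PY0_wild"),
  dE_water_free = c(-56.542, -171.031, -53.430, -59.990, -77.040, -79.080),
  dE_water_periodic = c(-56.642, -162.030, -50.810, -57.320, -72.880, -74.950),
  dE_membrane_periodic = NA_real_,
  RTlogKi = c(-0.9629731, -0.8877264, -1.1301124,
              -1.2318835, -1.1351579, -1.2297302),
  stringsAsFactors = FALSE
)

# All unique pairs of row numbers: a 2 x choose(n, 2) matrix.
pair_idx <- combn(nrow(Test_data), 2)
i <- pair_idx[1, ]   # rows playing the role of SYS1
j <- pair_idx[2, ]   # rows playing the role of SYS2

# Every column except the key is treated as a value column (assumption).
value_cols <- setdiff(names(Test_data), "SYS")

# Element-wise difference of the two row subsets; NA stays NA.
diffs <- Test_data[i, value_cols] - Test_data[j, value_cols]

pair_diffs <- data.frame(
  SYS1 = Test_data$SYS[i],
  SYS2 = Test_data$SYS[j],
  diffs,
  row.names = NULL,
  stringsAsFactors = FALSE
)
```

The first two rows pair 4NTJ_D294N with 4NTJ_wild and with 4PXZ_D294N, matching the order in the desired output. For a binary operation other than subtraction, swapping `-` for another vectorised operator in the `diffs` line should be enough.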