
Operation on multiple (70) columns by another column in R

For the following data, I want to replace each column with its values divided by the corresponding value of len (i.e. A/len, B/len, C/len, ...).

The ... stands for more columns, up to 70. With this many columns, how should I proceed?

 A    B    C     D    E     F   ...   len

 2    4    5     7    8     8          5
 5    8    3     1    0     4          6
 8    9    3     9    6     2          12
 2    6    2     6    7     8          10
 1    2    4     2    9     5          20
rach asked Apr 06 '15 18:04



1 Answer

If your data frame df is exactly as shown, with len as the last column, you can simply do

df[-ncol(df)] / df$len
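
To see why this works: dividing a data frame by a vector with one value per row applies the division down each column, so every column is divided row by row by len. Here is a minimal sketch on the first three columns of the question's data; the names scaled and result are just for illustration, and the cbind() step is only needed if you want len kept in the output.

```r
# First three columns of the question's sample data, with len as the last column
df <- data.frame(
  A   = c(2, 5, 8, 2, 1),
  B   = c(4, 8, 9, 6, 2),
  C   = c(5, 3, 3, 2, 4),
  len = c(5, 6, 12, 10, 20)
)

# Drop the last column (len), then divide each remaining column row-wise by len
scaled <- df[-ncol(df)] / df$len

# Optionally put len back next to the scaled columns
result <- cbind(scaled, len = df$len)
```

The same two lines apply unchanged to 70 columns, since nothing in them depends on the column names or count.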

If there are other columns to exclude from the division (an ID column, say) and you still want them in the result, you can do something like

with(df, cbind(ID, df[!names(df) %in% c("ID", "len")]/len, len))
#   ID         A        B    C         D    E         F len
# 1  1 0.4000000 0.800000 1.00 1.4000000 1.60 1.6000000   5
# 2  2 0.8333333 1.333333 0.50 0.1666667 0.00 0.6666667   6
# 3  3 0.6666667 0.750000 0.25 0.7500000 0.50 0.1666667  12
# 4  4 0.2000000 0.600000 0.20 0.6000000 0.70 0.8000000  10
# 5  5 0.0500000 0.100000 0.20 0.1000000 0.45 0.2500000  20
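
A variation on the same idea, in case the excluded columns are not neatly first and last: pick the columns to scale by name and assign back into them, which leaves the column order untouched. This is only a sketch, assuming the excluded columns are called ID and len as above; cols and out are illustrative names.

```r
# Sample data with an ID column in front and len at the end
df <- read.table(text = "ID A B C D E F len
1 2 4 5 7 8 8 5
2 5 8 3 1 0 4 6
3 8 9 3 9 6 2 12
4 2 6 2 6 7 8 10
5 1 2 4 2 9 5 20", header = TRUE)

# Columns to scale: everything except ID and len
cols <- setdiff(names(df), c("ID", "len"))

# Work on a copy so the original df stays as it was
out <- df
out[cols] <- out[cols] / out$len
```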

Also, as suggested by David in the comments, you can use data.table, which updates the columns in place (by reference)

library(data.table)
x <- c(1L, ncol(df))
setDT(df)[, names(df)[-x] := lapply(.SD, "/", df$len), .SDcols = -x]

which results in

#    ID         A        B    C         D    E         F len
# 1:  1 0.4000000 0.800000 1.00 1.4000000 1.60 1.6000000   5
# 2:  2 0.8333333 1.333333 0.50 0.1666667 0.00 0.6666667   6
# 3:  3 0.6666667 0.750000 0.25 0.7500000 0.50 0.1666667  12
# 4:  4 0.2000000 0.600000 0.20 0.6000000 0.70 0.8000000  10
# 5:  5 0.0500000 0.100000 0.20 0.1000000 0.45 0.2500000  20
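
Note that setDT() converts df itself and := modifies it in place. If you would rather keep the original data frame, a possible sketch is to convert a copy with as.data.table() and select the columns by name instead of position. The names dt and cols are illustrative, and this assumes the data.table package is installed.

```r
library(data.table)

df <- read.table(text = "ID A B C D E F len
1 2 4 5 7 8 8 5
2 5 8 3 1 0 4 6
3 8 9 3 9 6 2 12
4 2 6 2 6 7 8 10
5 1 2 4 2 9 5 20", header = TRUE)

# as.data.table() returns a copy, so df itself is left unchanged
dt <- as.data.table(df)

# Columns to scale: everything except ID and len
cols <- setdiff(names(dt), c("ID", "len"))

# Replace each selected column by itself divided by len, by reference on dt
dt[, (cols) := lapply(.SD, "/", len), .SDcols = cols]
```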

where df is

df <- read.table(text = "ID A    B    C     D    E     F   len
1  2    4    5     7    8     8    5
2  5    8    3     1    0     4    6
3  8    9    3     9    6     2   12
4  2    6    2     6    7     8   10
5  1    2    4     2    9     5   20", header = TRUE)
Rich Scriven answered Nov 28 '22 14:11