
Pasting the column name to each value of a dataframe in R

Tags: dataframe, r

Hoping this isn't a repeat; I've had a search around but can't find quite what I'm looking for.

I have a dataframe (df) in R:

  1 2 3 4 5 
1 1 0.5 0.5 0 1
2 0.5 0.5 0.5 0 1
3 1 1 0 0 1
4 1 1 0 0 1 
5 1 1 0 0 1

(with the 1-5 indicating row and column names)

I would like to paste the column name to each cell, separated by a ":" so that it looks like this:

  1 2 3 4 5 
1 1:1 2:0.5 3:0.5 4:0 5:1 
2 1:0.5 2:0.5 3:0.5 4:0 5:1 
3 1:1 2:1 3:0 4:0 5:1 
4 1:1 2:1 3:0 4:0 5:1 
5 1:1 2:1 3:0 4:0 5:1 

However, my actual data is quite a bit larger.

I currently have

apply(df, 2, function(x) paste(colnames(df)[x], x, sep=":"))

Of course this doesn't work as colnames(df)[x] doesn't make any sense. Is there anything I can put in that first 'paste' term to get this sorted? Or another function to do a better job?

Thanks.

Emma Sylvester asked Jan 16 '17 20:01



2 Answers

As an alternative to looping, you can use col(., as.factor = TRUE) to create a matrix of column names, then paste it to the data (coerced to matrix).

df[] <- paste(col(df, TRUE), as.matrix(df), sep = ":")

Resulting in:

      1     2     3   4   5
1   1:1 2:0.5 3:0.5 4:0 5:1
2 1:0.5 2:0.5 3:0.5 4:0 5:1
3   1:1   2:1   3:0 4:0 5:1
4   1:1   2:1   3:0 4:0 5:1
5   1:1   2:1   3:0 4:0 5:1

Actually, with these particular column names, as.factor = TRUE is not necessary, because the names happen to be the same as the column numbers. It would be necessary for column names that differ from the column numbers. For this particular example, it could be

df[] <- paste(col(df), as.matrix(df), sep = ":")
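To make the difference concrete, here is a minimal sketch using a made-up data frame (df2, with columns a and b, is an assumption for illustration and not part of the question's data). It shows what col() pastes with and without as.factor = TRUE when the names are not the column numbers:

```r
# Hypothetical data frame whose column names differ from the column numbers
df2 <- data.frame(a = c(1, 0.5), b = c(0, 1))

# Without as.factor, col() returns integer column indices
plain <- paste(col(df2), as.matrix(df2), sep = ":")

# With as.factor = TRUE, col() returns a factor labelled with the column names,
# so paste() uses the names instead of the indices
named <- paste(col(df2, as.factor = TRUE), as.matrix(df2), sep = ":")

plain  # "1:1" "1:0.5" "2:0" "2:1"
named  # "a:1" "a:0.5" "b:0" "b:1"
```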

P.S. Since your data is 100% numeric, you should really be using a matrix instead of a data frame.
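If you do go the matrix route, a rough sketch of the same idea might look like this. The small df2 below is made up for illustration, and indexing colnames(m) by col(m) is just one way to line the names up with the cells:

```r
# Small illustrative data frame (not the question's data)
df2 <- data.frame(`1` = c(1, 0.5), `2` = c(0, 1), check.names = FALSE)

# Convert to a numeric matrix; column names are carried over
m <- as.matrix(df2)

# col(m) gives the column index of every cell, so colnames(m)[col(m)]
# is the column name of every cell, in the same column-major order as m.
# Assigning into m[] keeps the dimensions and turns m into a character matrix.
m[] <- paste(colnames(m)[col(m)], m, sep = ":")

m
```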

Data:

df <- structure(list(`1` = c(1, 0.5, 1, 1, 1), `2` = c(0.5, 0.5, 1, 
1, 1), `3` = c(0.5, 0.5, 0, 0, 0), `4` = c(0L, 0L, 0L, 0L, 0L
), `5` = c(1L, 1L, 1L, 1L, 1L)), .Names = c("1", "2", "3", "4", 
"5"), class = "data.frame", row.names = c("1", "2", "3", "4", 
"5"))
Rich Scriven answered Oct 07 '22 00:10


To explain my comment, Map is a multivariate version of lapply, so

df <- data.frame(`1` = c(1, 0.5, 1, 1, 1), 
                 `2` = c(0.5, 0.5, 1, 1, 1), 
                 `3` = c(0.5, 0.5, 0, 0, 0), 
                 `4` = c(0L, 0L, 0L, 0L, 0L), 
                 `5` = c(1L, 1L, 1L, 1L, 1L), 
                 check.names = FALSE)

df[] <- Map(paste, names(df), df, sep = ':')

df
##       1     2     3   4   5
## 1   1:1 2:0.5 3:0.5 4:0 5:1
## 2 1:0.5 2:0.5 3:0.5 4:0 5:1
## 3   1:1   2:1   3:0 4:0 5:1
## 4   1:1   2:1   3:0 4:0 5:1
## 5   1:1   2:1   3:0 4:0 5:1

Here Map takes the first element of names(df), i.e. "1", and pastes it to the first element of df, i.e. the first column, then does the same for each remaining name/column pair. Map returns a plain list; assigning it to df[] keeps df's data.frame class, and therefore the original structure.
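To see what Map is doing, it may help to write the calls out by hand. This is only a sketch on a made-up two-column data frame (df2 and its columns a and b are assumptions for illustration):

```r
# Hypothetical two-column data frame
df2 <- data.frame(a = c(1, 0.5), b = c(0L, 1L))

# Map calls paste() once per column, pairing each name with its column
via_map <- Map(paste, names(df2), df2, sep = ":")

# The same calls written out by hand
by_hand <- list(a = paste("a", df2$a, sep = ":"),
                b = paste("b", df2$b, sep = ":"))

# Compare the two lists element by element
identical(via_map$a, by_hand$a)
identical(via_map$b, by_hand$b)

# via_map is a plain list; assigning into df2[] puts it back in data frame shape
df2[] <- via_map
class(df2)
```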

If your data is a matrix, you can do the same thing with sweep:

mat <- matrix(c(1, 0.5, 1, 1, 1, 0.5, 0.5, 1, 1, 1, 0.5, 0.5, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1), 
              5, 5, 
              dimnames = list(c("1", "2", "3", "4", "5"), 
                              c("1", "2", "3", "4", "5")))

mat[] <- sweep(mat, 2, colnames(mat), function(x, y) paste(y, x, sep = ':'))

mat
##   1       2       3       4     5    
## 1 "1:1"   "2:0.5" "3:0.5" "4:0" "5:1"
## 2 "1:0.5" "2:0.5" "3:0.5" "4:0" "5:1"
## 3 "1:1"   "2:1"   "3:0"   "4:0" "5:1"
## 4 "1:1"   "2:1"   "3:0"   "4:0" "5:1"
## 5 "1:1"   "2:1"   "3:0"   "4:0" "5:1"
alistaire answered Oct 07 '22 00:10