Hoping this isn't a repeat. I've searched around but can't find quite what I'm looking for.
I have a data frame (df) in R:
    1   2   3 4 5
1   1 0.5 0.5 0 1
2 0.5 0.5 0.5 0 1
3   1   1   0 0 1
4   1   1   0 0 1
5   1   1   0 0 1
(with the 1-5 indicating row and column names)
I would like to paste the column name to each cell, separated by a ":" so that it looks like this:
      1     2     3   4   5
1   1:1 2:0.5 3:0.5 4:0 5:1
2 1:0.5 2:0.5 3:0.5 4:0 5:1
3   1:1   2:1   3:0 4:0 5:1
4   1:1   2:1   3:0 4:0 5:1
5   1:1   2:1   3:0 4:0 5:1
However, my actual data is quite a bit larger.
I currently have
apply(df, 2, function(x) paste(colnames(df)[x], x, sep=":"))
Of course this doesn't work as colnames(df)[x] doesn't make any sense. Is there anything I can put in that first 'paste' term to get this sorted? Or another function to do a better job?
Thanks.
As an alternative to looping, you can use col(., as.factor = TRUE) to create a matrix of column names, then paste it to the data (coerced to matrix).
df[] <- paste(col(df, TRUE), as.matrix(df), sep = ":")
Resulting in:
      1     2     3   4   5
1   1:1 2:0.5 3:0.5 4:0 5:1
2 1:0.5 2:0.5 3:0.5 4:0 5:1
3   1:1   2:1   3:0 4:0 5:1
4   1:1   2:1   3:0 4:0 5:1
5   1:1   2:1   3:0 4:0 5:1
Actually, with these particular column names, as.factor = TRUE is not necessary. But it would be necessary for column names that are not the same as the column numbers. For this particular example, it could be
df[] <- paste(col(df), as.matrix(df), sep = ":")
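To make the difference concrete, here is a small sketch with a made-up data frame d (not the asker's data) whose column names are letters rather than numbers. Without as.factor = TRUE, col() only gives the column positions; with it, you get the labels.

```r
# Hypothetical data frame whose column names differ from the column numbers
d <- data.frame(a = c(1, 0.5), b = c(0, 1))

# col(d) yields column indices; col(d, as.factor = TRUE) yields column labels
by_index <- paste(col(d), as.matrix(d), sep = ":")
by_label <- paste(col(d, as.factor = TRUE), as.matrix(d), sep = ":")

# Assigning into labelled[] keeps the data frame shape and names
labelled <- d
labelled[] <- by_label
```

Here by_index starts with "1:1" while by_label starts with "a:1", so the as.factor = TRUE form is the one to use whenever the names carry meaning.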
P.S. You should really be using a matrix with 100% numeric data, instead of a data frame.
Data:
df <- structure(list(`1` = c(1, 0.5, 1, 1, 1), `2` = c(0.5, 0.5, 1,
1, 1), `3` = c(0.5, 0.5, 0, 0, 0), `4` = c(0L, 0L, 0L, 0L, 0L
), `5` = c(1L, 1L, 1L, 1L, 1L)), .Names = c("1", "2", "3", "4",
"5"), class = "data.frame", row.names = c("1", "2", "3", "4",
"5"))
To explain my comment, Map is a multivariate version of lapply, so
df <- data.frame(`1` = c(1, 0.5, 1, 1, 1),
`2` = c(0.5, 0.5, 1, 1, 1),
`3` = c(0.5, 0.5, 0, 0, 0),
`4` = c(0L, 0L, 0L, 0L, 0L),
`5` = c(1L, 1L, 1L, 1L, 1L),
check.names = FALSE)
df[] <- Map(paste, names(df), df, sep = ':')
df
##       1     2     3   4   5
## 1   1:1 2:0.5 3:0.5 4:0 5:1
## 2 1:0.5 2:0.5 3:0.5 4:0 5:1
## 3   1:1   2:1   3:0 4:0 5:1
## 4   1:1   2:1   3:0 4:0 5:1
## 5   1:1   2:1   3:0 4:0 5:1
Here Map takes the first element of names(df), i.e. 1, and pastes it to the first element of df, i.e. the first column, then does the same for each remaining name and column pair. Map returns a plain list, but assigning it to df[] keeps the data.frame class, and therefore the original structure.
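If it helps, the Map call can be spelled out as an explicit loop over the columns. This is only a sketch on a small made-up data frame (df2 is my own name, not from the question), meant to show what Map is doing pair by pair.

```r
# Hypothetical two-column data frame with numeric-looking names
df2 <- data.frame(`1` = c(1, 0.5), `2` = c(0.5, 0), check.names = FALSE)

# Map version: pairs each name with its column
via_map <- df2
via_map[] <- Map(paste, names(df2), df2, sep = ":")

# Equivalent loop: paste the j-th name onto the j-th column
via_loop <- df2
for (j in seq_along(df2)) {
  via_loop[[j]] <- paste(names(df2)[j], df2[[j]], sep = ":")
}

# Compare the two results column by column
same <- all(mapply(identical, via_map, via_loop))
```

The loop is arguably easier to read, while the Map form is shorter and avoids modifying the data frame one column at a time.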
If your data is a matrix, you can do the same thing with sweep:
mat <- matrix(c(1, 0.5, 1, 1, 1, 0.5, 0.5, 1, 1, 1, 0.5, 0.5, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1),
5, 5,
dimnames = list(c("1", "2", "3", "4", "5"),
c("1", "2", "3", "4", "5")))
mat[] <- sweep(mat, 2, colnames(mat), function(x, y) paste(y, x, sep = ':'))
mat
##   1       2       3       4     5
## 1 "1:1"   "2:0.5" "3:0.5" "4:0" "5:1"
## 2 "1:0.5" "2:0.5" "3:0.5" "4:0" "5:1"
## 3 "1:1"   "2:1"   "3:0"   "4:0" "5:1"
## 4 "1:1"   "2:1"   "3:0"   "4:0" "5:1"
## 5 "1:1"   "2:1"   "3:0"   "4:0" "5:1"
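One detail worth spelling out is why the result is assigned into mat[] rather than to mat directly. As far as I can tell, sweep just returns whatever the function returns, and paste returns a plain character vector with no dimensions. A small sketch on a made-up 2 x 2 matrix (m, flat and labelled are my own names):

```r
# Hypothetical 2 x 2 matrix with non-numeric column names
m <- matrix(c(1, 0.5, 0, 1), nrow = 2,
            dimnames = list(c("r1", "r2"), c("a", "b")))

# sweep() hands back what FUN returns; paste() returns a plain vector
flat <- sweep(m, 2, colnames(m), function(x, y) paste(y, x, sep = ":"))

# Assigning into labelled[] reuses the dim and dimnames copied from m
labelled <- m
labelled[] <- flat
```

So flat is just a character vector in column order, and the [] assignment is what puts it back into matrix form with the row and column names intact.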