I have a dataset with several columns, one of which is a column for reaction times. These reaction times are comma separated to denote the reaction times (of the same participant) for the different trials.
For example: row 1 (i.e.: the data from participant 1) has the following under the column "reaction times"
reaction_times
2000,1450,1800,2200
Hence these are the reaction times of participant 1 for trials 1,2,3,4
.
I now want to create a new data set in which the reaction times for these trials all form individual columns. This way I can calculate the mean reaction time for each trial.
trial 1 trial 2 trial 3 trial 4
participant 1: 2000 1450 1800 2200
I tried the colsplit
from the reshape2
package but that doesn't seem to split my data into new columns (perhaps because my data is all in 1 cell).
Any suggestions?
To convert a comma separated string to a numeric array:Call the split() method on the string to get an array containing the substrings. Use the map() method to iterate over the array and convert each string to a number. The map method will return a new array containing only numbers.
To parse a string with commas to a number: Use the replace() method to remove all the commas from the string. The replace method will return a new string containing no commas. Convert the string to a number.
The CONCATENATE function in Excel can transform a column list into a list separated by commas in a cell. Please follow these instructions: 1. Type the formula =CONCATENATE(TRANSPOSE(A1:A7)&",") in a blank cell adjacent to the list's initial data, for example, cell C1.
I think you are looking for the strsplit() function;
a = "2000,1450,1800,2200"
strsplit(a, ",")
[[1]]
[1] "2000" "1450" "1800" "2200"
Notice that strsplit returns a list, in this case with only one element. This is because strsplit takes vectors as input. Therefore, you can also put a long vector of your single cell characters into the function and get back a splitted list of that vector. In a more relevant example this look like:
# Create some example data
dat = data.frame(reaction_time =
apply(matrix(round(runif(100, 1, 2000)),
25, 4), 1, paste, collapse = ","),
stringsAsFactors=FALSE)
splitdat = do.call("rbind", strsplit(dat$reaction_time, ","))
splitdat = data.frame(apply(splitdat, 2, as.numeric))
names(splitdat) = paste("trial", 1:4, sep = "")
head(splitdat)
trial1 trial2 trial3 trial4
1 597 1071 1430 997
2 614 322 1242 1140
3 1522 1679 51 1120
4 225 1988 1938 1068
5 621 623 1174 55
6 1918 1828 136 1816
and finally, to calculate the mean per person:
apply(splitdat, 1, mean)
[1] 1187.50 361.25 963.75 1017.00 916.25 1409.50 730.00 1310.75 1133.75
[10] 851.25 914.75 881.25 889.00 1014.75 676.75 850.50 805.00 1460.00
[19] 901.00 1443.50 507.25 691.50 1090.00 833.25 669.25
A nifty, if rather heavy-handed, way is to use read.csv
in conjunction with textConnection
. Assuming your data is in a data frame, df
:
x <- read.csv(textConnection(df[["reaction times"]]))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With