I found this code:
string = c("G1:E001", "G2:E002", "G3:E003")
> sapply(strsplit(string, ":"), "[", 2)
[1] "E001" "E002" "E003"
Clearly strsplit(string, ":") returns a list of length 3, where each component i is a character vector of length 2 containing Gi and E00i.
But why do the two extra arguments "[", 2 have the effect of selecting only the E00i elements? As far as I can see, the only arguments accepted by the function are:
sapply(X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE)
strsplit() is a built-in R function that splits each element of a character vector into substrings. Its main arguments are x, the input character vector, and split, the pattern to split on. It returns a list with one item per input string, each item holding the pieces of that string.
You could use sub to get the expected output instead of using strsplit/sapply:
sub('.*:', '', string)
#[1] "E001" "E002" "E003"
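One caveat worth checking: the two approaches agree only when every string contains exactly one colon. A small sketch of where they diverge, using a made-up vector (`tricky` is hypothetical, not from the question):

```r
# Hypothetical inputs: one normal, one with two colons, one with no colon
tricky <- c("G1:E001", "G2:E002:extra", "G3")

# The greedy '.*:' strips everything up to the LAST colon;
# a string without a colon is returned unchanged
via_sub <- sub(".*:", "", tricky)

# strsplit + `[` picks the second piece, or NA when there is no second piece
via_split <- sapply(strsplit(tricky, ":"), `[`, 2)

print(via_sub)
print(via_split)
```

Which behaviour is preferable depends on what malformed input should produce, so it may be worth deciding that before picking one form.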
Regarding your code, the output of strsplit is a list, and a list can be processed with the apply family of functions (sapply/lapply/vapply/rapply etc.). In this case, each list element has a length of 2 and we are selecting the second element.
strsplit(string, ":")
#[[1]]
#[1] "G1" "E001"
#[[2]]
#[1] "G2" "E002"
#[[3]]
#[1] "G3" "E003"
lapply(strsplit(string, ":"), `[`, 2)
#[[1]]
#[1] "E001"
#[[2]]
#[1] "E002"
#[[3]]
#[1] "E003"
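In the case of sapply, the default option is simplify=TRUE, which is why you got a character vector back. With simplify=FALSE it returns a list, just like lapply: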
sapply(strsplit(string, ":"), `[`, 2, simplify=FALSE)
#[[1]]
#[1] "E001"
#[[2]]
#[1] "E002"
#[[3]]
#[1] "E003"
The [ can be replaced by an anonymous function:
sapply(strsplit(string, ":"), function(x) x[2], simplify=FALSE)
#[[1]]
#[1] "E001"
#[[2]]
#[1] "E002"
#[[3]]
#[1] "E003"
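If a plain character vector is the goal, vapply is another option from the same family. The sketch below assumes every piece you select is a single string; vapply checks that against the template character(1) and raises an error otherwise, instead of silently changing the shape of the result.

```r
string <- c("G1:E001", "G2:E002", "G3:E003")

# The third argument is FUN.VALUE (the expected shape of each result);
# the trailing 2 is forwarded to `[` through ...
second <- vapply(strsplit(string, ":"), `[`, character(1), 2)

print(second)
```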
Look at the docs for ?sapply:
sapply(X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE)
FUN: the function to be applied to each element of ‘X’: see
‘Details’. In the case of functions like ‘+’, ‘%*%’, the
function name must be backquoted or quoted.
...: optional arguments to ‘FUN’.
Therein lies your answer. In your case, FUN is [. The "optional arguments to FUN" is 2 in your case, since it gets matched to ... in your call. So in this case, sapply is calling [ with each value in the list as the first argument and 2 as the second. Consider:
x <- c("G1", "E001") # this is the result of `strsplit` on the first value
Then:
`[`(x, 2) # equivalent to x[2]
# [1] "E001"
This is what sapply is doing in your example, except that it applies [ to every length-2 character vector returned by strsplit.
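To make the forwarding of ... concrete, here is a minimal sketch of the mechanism. `apply_each` is a hypothetical helper written for illustration only; it is not how sapply is actually implemented, but it resolves FUN with match.fun and passes the extra arguments along in the same spirit.

```r
# Hypothetical helper, for illustration only
apply_each <- function(X, FUN, ...) {
  FUN <- match.fun(FUN)            # accepts a function or its name as a string
  out <- vector("list", length(X))
  for (i in seq_along(X)) {
    out[[i]] <- FUN(X[[i]], ...)   # each element first, then the extras
  }
  out
}

string <- c("G1:E001", "G2:E002", "G3:E003")
parts <- strsplit(string, ":")

# Passing "[" as a string or `[` as a function gives the same result
by_name <- unlist(apply_each(parts, "[", 2))
by_fun  <- unlist(apply_each(parts, `[`, 2))

print(by_name)
```

Because match.fun looks the function up by name, the quoted "[" in the question and the backquoted `[` in the answers behave the same way.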
The output of strsplit() is a list. Here [ is the function passed to sapply(): it is applied to each member of that list and called with the additional argument 2, so it selects the second item of each member. sapply() ensures that this is done for every member of the list.
> strsplit(string, ":")
#[[1]]
#[1] "G1" "E001"
#
#[[2]]
#[1] "G2" "E002"
#
#[[3]]
#[1] "G3" "E003"
#
> str(strsplit(string, ":"))
#List of 3
# $ : chr [1:2] "G1" "E001"
# $ : chr [1:2] "G2" "E002"
# $ : chr [1:2] "G3" "E003"
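Since the 2 is just an argument handed to [, changing it changes which piece is selected. A short sketch, reusing the vector from the question (the names `first_piece` and `second_piece` are illustrative):

```r
string <- c("G1:E001", "G2:E002", "G3:E003")
parts <- strsplit(string, ":")

# Index 1 selects the first piece of every member
first_piece <- sapply(parts, `[`, 1)

# `[[` also works here, because each call selects a single element
second_piece <- sapply(parts, `[[`, 2)

print(first_piece)
print(second_piece)
```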