I am trying to subset the columns of a data.frame
using the interval of column names.
For instance, the data.frame A:
A
ID1 ID2 ID3
1   5   01901
2   5   01902
For example, I want to create a variable b from columns ID2 through ID3 of A:
b=A[,"ID2":"ID3"]
Error in "ID1":"ID3" : NA/NaN argument
In addition: Warning messages:
1: In `[.data.frame`(A, , "ID1":"ID3") : NAs introduced by coercion
2: In `[.data.frame`(A, , "ID1":"ID3") : NAs introduced by coercion
The result I want:
b
ID2 ID3
5 01901
5 01902
When I use the column indexes, it works. But when I use the column names, as above, it does not.
Two approaches using base R's data.frame:
First, subset by known name:
b = A[, c('ID2', 'ID3')]
Second, subset by an interval, when the columns are known to sit next to each other in that order:
# Column names
colvars = names(A)
# Position of the first column in the range
start_loc = match("ID2", colvars)
# Position of the last column in the range
end_loc = match("ID3", colvars)
# Subset the range of columns
b = A[, start_loc:end_loc]
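The match-based lookup can be wrapped in a small function, and base R's subset() also accepts an unquoted column range in its select argument. The sketch below assumes the example data from the question, with ID3 stored as character so the leading zeros survive. The helper name cols_between is hypothetical, not part of base R.

```r
# Example data reconstructed from the question.
# Assumption: ID3 is character, which keeps the leading zeros.
A <- data.frame(
  ID1 = c(1, 2),
  ID2 = c(5, 5),
  ID3 = c("01901", "01902"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the columns from `first` to `last`, by name.
cols_between <- function(df, first, last) {
  idx <- match(c(first, last), names(df))
  if (anyNA(idx)) {
    stop("column not found: ",
         paste(c(first, last)[is.na(idx)], collapse = ", "))
  }
  # drop = FALSE keeps a data.frame even when the range is a single column
  df[, idx[1]:idx[2], drop = FALSE]
}

b <- cols_between(A, "ID2", "ID3")

# Base R alternative: subset() evaluates the unquoted range ID2:ID3
# against the column positions.
b2 <- subset(A, select = ID2:ID3)
```

Both forms depend on the column order: if the columns of A are rearranged, the same range selects different columns, so selecting by an explicit vector of names is the safer choice when the order is not guaranteed.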
If you are not restricted to data.frame, you can convert it to a data.table, and then your range syntax will work:
data.table::setDT(A)[, ID2:ID3, with = FALSE]
ID2 ID3
1: 5 1901
2: 5 1902
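Note that setDT() converts A to a data.table by reference, so A itself changes class. If the original data.frame should stay untouched, a variant using as.data.table(), which returns a copy, might look like the sketch below. It assumes the data.table package is installed and uses the example data from the question, with ID3 as character.

```r
# Assumes the data.table package is installed.
A <- data.frame(
  ID1 = c(1, 2),
  ID2 = c(5, 5),
  ID3 = c("01901", "01902"),
  stringsAsFactors = FALSE
)

# as.data.table() returns a copy, so A remains a plain data.frame.
DT <- data.table::as.data.table(A)

# Same range selection as above, applied to the copy.
b <- DT[, ID2:ID3, with = FALSE]

class(A)  # still "data.frame"
```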