Consider the following code:
a = "col1"
b = "col2"
d = data.frame(a=c(1,2,3),b=c(4,5,6))
This code produces the following data frame:
  a b
1 1 4
2 2 5
3 3 6
However, the desired data frame is:
  col1 col2
1    1    4
2    2    5
3    3    6
Further, I'd like to be able to do something like d$a, which would then grab d$col1, since a = "col1". How can I tell R that "a" is a variable and not the name of a column?
To access a specific column in a dataframe by name, you use the $ operator in the form df$name where df is the name of the dataframe, and name is the name of the column you are interested in. This operation will then return the column you want as a vector.
To select a single column, use square brackets [] with the column name of the column of interest.
After creating your data frame, you can rename its columns with colnames (see ?colnames). For example, you would have:
d = data.frame(a=c(1,2,3), b=c(4,5,6))
colnames(d) <- c("col1", "col2")
You can also name your variables when you create the data frame. For example:
d = data.frame(col1=c(1,2,3), col2=c(4,5,6))
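If the column names are themselves held in variables, as a and b are in the question, one option is to build the data frame first and attach the names afterwards. This is a minimal sketch that reuses the question's a, b, and d; the second data frame d2 is a hypothetical name added only for comparison, and setNames() is the base R helper from the stats package.

```r
a <- "col1"
b <- "col2"

# Build the data frame without meaningful names, then label the
# columns with the strings stored in a and b.
d <- setNames(data.frame(c(1, 2, 3), c(4, 5, 6)), c(a, b))

# Equivalent two-step form using colnames(), as in the answer above.
d2 <- data.frame(c(1, 2, 3), c(4, 5, 6))
colnames(d2) <- c(a, b)

print(d)
```

Writing data.frame(a = ..., b = ...) cannot work here because the left-hand side of = in the call is taken as the literal column name, not evaluated as a variable.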
Further, if you have the names of columns stored in variables, as in
a <- "col1"
you can't use $ to select a column via d$a. R will look for a column whose name is literally a. Instead, you can use either d[[a]] or d[, a].
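The difference can be checked directly. The sketch below assumes the renamed data frame d from the answer and the variable a from the question; the result names (by_dollar and so on) are hypothetical and exist only to hold the values for inspection.

```r
d <- data.frame(col1 = c(1, 2, 3), col2 = c(4, 5, 6))
a <- "col1"

# $ does not evaluate a; it looks for a column called "a" and finds none.
by_dollar <- d$a

# [[ ]] and [ , ] evaluate a first, then look up the column "col1".
by_double <- d[[a]]
by_matrix <- d[, a]

# Single brackets without a comma keep the result as a one-column data frame.
by_single <- d[a]

print(is.null(by_dollar))
print(by_double)
print(by_matrix)
print(by_single)
```

Here d[[a]] and d[, a] both return a plain numeric vector, while d[a] returns a data frame, which matters if later code expects one or the other.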