
passing a string as a data frame column name

Tags:

dataframe

r

I have a data frame called data.df with columns col1, col2, col3, ..., col15. The data frame has no designated class attribute; any column could potentially serve as the class variable. I would like to use an R variable called target that identifies the column to be treated as the class, as follows:

target<-data.df$col3

and then use that field (target) as input to several learners, such as PART and J48 from the RWeka package:

part<-PART(target~.,data=data.df,control=Weka_control(M=200,R=FALSE))
j48<-J48(target~.,data=data.df,control=Weka_control(M=200,R=FALSE)) 

The idea is to be able to change 'target' only once at the beginning of my R code. How can this be done?

Harry Wells asked Nov 02 '11 10:11


1 Answer

I sometimes manage to get a lot done by using strings to reference columns. It works like this:

> df <- data.frame(numbers=seq(5))
> df
  numbers
1       1
2       2
3       3
4       4
5       5
> df$numbers
[1] 1 2 3 4 5
> df[['numbers']]
[1] 1 2 3 4 5

You can then have a variable target hold the name of your desired column as a string and look the column up with data.df[[target]]. I don't know about RWeka, but many libraries such as ggplot2 can take string references for columns (e.g. the aes_string function instead of aes).
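For the formula part of the question, one possible approach is to build the formula from the string with base R's as.formula and paste. This is only a sketch: the small data.df below is a made-up stand-in for the asker's data, and target.values and target.formula are names introduced here for illustration. I have not tested the result with RWeka, so the learner call is left as a comment.

```r
# Hypothetical stand-in for data.df; the real data has col1..col15.
data.df <- data.frame(
  col1 = c(1.2, 2.3, 3.1, 4.8, 5.0, 6.7),
  col2 = c(0.5, 0.1, 0.9, 0.4, 0.8, 0.2),
  col3 = factor(c("a", "b", "a", "b", "a", "b"))
)

# Change the class column in this one place only.
target <- "col3"

# Pull the column out by its name held in a string.
target.values <- data.df[[target]]

# Build the formula `col3 ~ .` from the string.
target.formula <- as.formula(paste(target, "~ ."))
print(target.formula)

# The formula could then be handed to a learner in place of `target ~ .`,
# for example (assumption, not tested here):
# part <- PART(target.formula, data = data.df,
#              control = Weka_control(M = 200, R = FALSE))
```

Note that pasting the name directly assumes it is a syntactically valid R name; a column name containing spaces or other special characters would need to be wrapped in backticks before being pasted into the formula.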

metakermit answered Sep 21 '22 15:09