
select numeric columns and one column specified by name from data frame

Tags: r, numeric, scale

I have a data frame which contains both numeric and non-numeric columns, say

df <- data.frame(v1=1:20,v2=1:20,v3=1:20,v4=letters[1:20],v5=letters[1:20])

To select only the non-numeric columns I would use

fixCol <- !sapply(df,is.numeric)

But now I also want to include a specific numeric column, say v2. My data frame is very big and the order of its columns changes, so I cannot index it by position; I want to use the name 'v2'. I tried

fixCol$v2 = TRUE

but that gives me the warning In fixCol$v2 = TRUE : Coercing LHS to a list. fixCol is then a list rather than a logical vector, so I can no longer use it to subset my original data frame:

df[,fixCol]

gives: Error in .subset(x, j) : invalid subscript type 'list'

In the end my goal is to scale all numeric columns of my data frame except this one specified column, using something like this

scaleCol = !fixCol
df_scaled = cbind(df[,fixCol], sapply(df[,scaleCol],scale))

How can I best do this?

Ciska asked Feb 12 '16 10:02



1 Answer

We can use an OR condition (|) to build a logical index that is TRUE for the non-numeric columns and for the column named 'v2', and then subset the columns of 'df' with it.

df1 <- df[!sapply(df, is.numeric)|names(df)=='v2']
head(df1,2)
#  v2 v4 v5
#1  1  a  a
#2  2  b  b
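
The same idea can be carried through to the scaling step the question asks about. The following is a minimal sketch, assuming the example df from the question. It is not part of the original answer: assigning with fixCol["v2"] instead of fixCol$v2 is one way to keep fixCol a logical vector, and the as.numeric() wrapper around scale() is a choice made here so that each scaled column stays a plain numeric vector.

```r
# Example data from the question
df <- data.frame(v1 = 1:20, v2 = 1:20, v3 = 1:20,
                 v4 = letters[1:20], v5 = letters[1:20])

# Named logical vector: TRUE for the columns to leave untouched
fixCol <- !sapply(df, is.numeric)

# `[` assignment by name keeps fixCol a logical vector;
# `$` assignment is what coerced it to a list in the question
fixCol["v2"] <- TRUE

# The remaining columns are the numeric ones to scale
scaleCol <- !fixCol

# Scale in place so the original column order is preserved.
# scale() returns a one-column matrix; as.numeric() turns it back into a vector.
df_scaled <- df
df_scaled[scaleCol] <- lapply(df[scaleCol], function(x) as.numeric(scale(x)))

head(df_scaled, 2)
```

Assigning back into a copy of df, rather than using cbind(), avoids reordering the columns and still works when only one column is left to scale.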
akrun answered Oct 23 '22 05:10