I am trying to do some feature engineering on my train and test data. I am well versed in Python but new to R.
#Row binding train & test set for feature engineering
train_test = rbind(train, test)
It seems that my train and test data have different numbers of columns. How can I resolve this so that only the columns common to both data frames are kept?
Error in rbind(deparse.level, ...) :
numbers of columns of arguments do not match
I would find the column names of both data frames, take their intersection (the common names), and select only those columns from each data frame before binding:
train_names <- colnames(train)
test_names <- colnames(test)
common_names <- intersect(train_names, test_names)
train_test <- rbind(train[common_names], test[common_names])
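To make the effect concrete, here is a minimal sketch on made-up data. The toy `train` and `test` data frames and their columns (`id`, `x`, `target`) are assumptions for illustration, not the asker's actual data.

```r
# Hypothetical toy data: train carries an extra column (the target), test does not
train <- data.frame(id = 1:3, x = c(0.5, 1.2, 3.1), target = c(0, 1, 0))
test  <- data.frame(id = 4:5, x = c(2.2, 0.7))

# Keep only the column names present in both data frames
common_names <- intersect(colnames(train), colnames(test))

# Both pieces now have identical columns, so rbind succeeds
train_test <- rbind(train[common_names], test[common_names])

dim(train_test)       # 5 rows, 2 columns
colnames(train_test)  # "id" "x"
```

Note that `intersect()` keeps the column order of its first argument, so the result follows the order in `train`.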
Find the common columns:
common_cols <- intersect(colnames(train), colnames(test))
Now perform the rbind:
train_test <- rbind(subset(train, select = common_cols),
                    subset(test, select = common_cols))
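Since the goal is feature engineering, it may also help to see which columns get dropped and to keep a way of splitting the combined data back apart afterwards. The sketch below assumes made-up toy data, and `is_train` is a hypothetical marker column name, not something from the original question.

```r
# Hypothetical toy data for illustration only
train <- data.frame(id = 1:3, x = c(0.5, 1.2, 3.1), target = c(0, 1, 0))
test  <- data.frame(id = 4:5, x = c(2.2, 0.7))

common_cols <- intersect(colnames(train), colnames(test))

# Columns that will be lost from each side when restricting to common_cols
dropped_from_train <- setdiff(colnames(train), common_cols)
dropped_from_test  <- setdiff(colnames(test), common_cols)

# Tag each row with its origin before binding (is_train is an assumed name)
train_part <- cbind(subset(train, select = common_cols), is_train = TRUE)
test_part  <- cbind(subset(test, select = common_cols), is_train = FALSE)
train_test <- rbind(train_part, test_part)

# ... feature engineering on train_test would go here ...

# Split back into the two sets, dropping the marker column
train_fe <- train_test[train_test$is_train, common_cols]
test_fe  <- train_test[!train_test$is_train, common_cols]
```

Checking `dropped_from_train` first is a cheap way to confirm that only expected columns (typically the target) are being discarded.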