I have a data frame, and for one particular column I want to strip out the last underscore and everything after it.
So:
test <- data.frame(label = c('test_test_test', 'test_tom_cat', 'tset_eat_food', 'tisk - tisk'),
                   stuff = c('blah', 'blag', 'gah', 'nah'),
                   numbers = c(1, 2, 3, 4))
should become
result <- data.frame(label = c('test_test', 'test_tom', 'tset_eat', 'tisk - tisk'),
                     stuff = c('blah', 'blag', 'gah', 'nah'),
                     numbers = c(1, 2, 3, 4))
So far I have:
library(dplyr)
test %>%
  mutate(label = gsub('_.*', '', label))
but that drops everything from the first underscore and gives me
wrong_result <- data.frame(label = c('test', 'test', 'tset', 'tisk - tisk'),
                           stuff = c('blah', 'blag', 'gah', 'nah'),
                           numbers = c(1, 2, 3, 4))
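We can use sub, which is part of base R, so no external packages are needed. The pattern "_[^_]+$" matches an underscore followed by one or more non-underscore characters up to the end of the string, which is exactly the last underscore and everything after it.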
test$label <- sub("_[^_]+$", "", test$label)
test$label
#[1] "test_test" "test_tom" "tset_eat" "tisk - tisk"
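To check the sub() approach against the data from the question, here is a minimal base-R sketch. The helper name strip_last_segment is made up for illustration and is not part of any package.

```r
# Rebuild the example data from the question.
test <- data.frame(
  label = c("test_test_test", "test_tom_cat", "tset_eat_food", "tisk - tisk"),
  stuff = c("blah", "blag", "gah", "nah"),
  numbers = c(1, 2, 3, 4)
)

# Hypothetical helper: "_[^_]+$" matches an underscore followed by one or more
# non-underscore characters running to the end of the string, so only the last
# underscore-delimited segment is removed. Strings with no such segment are
# returned unchanged.
strip_last_segment <- function(x) sub("_[^_]+$", "", x)

test$label <- strip_last_segment(test$label)
```

If you would rather stay in the dplyr pipeline from the question, the same call should drop in as mutate(label = sub("_[^_]+$", "", label)).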
This will also work on this data, using a greedy capture group to keep everything before the last underscore:
gsub('(.*)_\\w+', '\\1', test$label)
#[1] "test_test" "test_tom" "tset_eat" "tisk - tisk"
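The two patterns agree on the example data, but they are not equivalent in general, because the capture-group pattern is not anchored to the end of the string and \w does not match characters such as "-". The sketch below uses a made-up input, "a_b-c", to show the difference. The third pattern is an illustrative anchored variant, not something from the answers above.

```r
# "a_b-c" is an invented edge case: its last segment contains a non-word character.
x <- c("test_tom_cat", "a_b-c")

# Anchored at the end: removes the last underscore and everything after it.
anchored <- sub("_[^_]+$", "", x)

# Unanchored capture group: \w+ stops at "-", so the tail "-c" is left behind.
unanchored <- gsub("(.*)_\\w+", "\\1", x)

# Illustrative variant: a greedy group followed by "_.*" consumes the rest of
# the string, so nothing after the last underscore survives.
greedy <- sub("(.*)_.*", "\\1", x)

print(anchored)    # "test_tom" "a"
print(unanchored)  # "test_tom" "a-c"
print(greedy)      # "test_tom" "a"
```

For labels made only of letters, digits, and underscores, all three should give the same result; the anchored forms are the safer choice when the last segment may contain other characters.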