How to remove n number of characters of a string in R after a specific character?

Tags:

r

stringr

My data frame is:

df <- data.frame(player = c("Taiwo Awoniyi/e5478b87", "Jacob Bruun Larsen/4e204552", "Andi Zeqiri/d01231f0"), goals = c(2,5,7))

I want to remove the "/" and the ID that follows it in the "player" column. Ideally, I would end up with:

df <- data.frame(player = c("Taiwo Awoniyi", "Jacob Bruun Larsen", "Andi Zeqiri"), goals = c(2,5,7))

I am unsure how to approach this, since player names vary greatly in length and some numbers are larger than others.

Juan Avilez asked Sep 17 '25 19:09


2 Answers

We could use separate() from tidyr with extra = 'drop', which silently discards everything after the first "/" (many thanks to Onyambu):

library(dplyr)
library(tidyr)

df %>% 
  separate(player, "player", sep = "/", extra = "drop")
#>               player goals
#> 1      Taiwo Awoniyi     2
#> 2 Jacob Bruun Larsen     5
#> 3        Andi Zeqiri     7

TarJae answered Sep 20 '25 11:09


You can use a backreference to keep the substring you want: capture it with a negated character class that matches any character except the /, then replace the whole string with the captured group:

library(dplyr)

df %>%
  mutate(player = sub("([^/]+).*", "\\1", player))
#>               player goals
#> 1      Taiwo Awoniyi     2
#> 2 Jacob Bruun Larsen     5
#> 3        Andi Zeqiri     7
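
Since the question is tagged stringr, here is a minimal sketch of the same idea with `str_remove()`, which deletes the first match of a pattern. It assumes every ID starts at the first "/" in the string; the data frame is rebuilt from the question so the snippet runs on its own.

```r
library(stringr)

# Sample data copied from the question
df <- data.frame(
  player = c("Taiwo Awoniyi/e5478b87", "Jacob Bruun Larsen/4e204552", "Andi Zeqiri/d01231f0"),
  goals = c(2, 5, 7)
)

# Delete the first "/" and everything that follows it
df$player <- str_remove(df$player, "/.*")

print(df$player)
```

If a name could itself contain a "/", this pattern would cut too early, so a more specific pattern would be needed in that case.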

More simply, you can remove the / and everything after it:

df %>%
  mutate(player = sub("/.*", "", player))

In base R syntax:

df$player <- sub("/.*", "", df$player)
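
As a quick check on the sample data, the sketch below compares two patterns in base R. The IDs after the / contain letters as well as digits, so a pattern that strips only the / and digits leaves stray letters behind, while cutting at the / does not. The names `x`, `cut_at_slash` and `digits_only` are made up for illustration.

```r
# Player strings copied from the question
x <- c("Taiwo Awoniyi/e5478b87", "Jacob Bruun Larsen/4e204552", "Andi Zeqiri/d01231f0")

# Remove the first "/" and everything after it
cut_at_slash <- sub("/.*", "", x)

# Remove only "/" and digits; letters inside the IDs survive
digits_only <- gsub("[/0-9]", "", x)

print(cut_at_slash)
print(digits_only)
```

On this data the first pattern returns the clean names, whereas the second returns values such as "Taiwo Awoniyieb".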

Chris Ruehlemann answered Sep 20 '25 11:09