
Remove all characters from text outside of punctuation

Tags:

r

I have a dataset that looks something like the following:

ID    Type                 Count
1     **Radisson**             8
2     **Renaissance**          9
3     **Hilton** New York Only 8
4     **Radisson** East Cost   8

I want to get a dataset that looks like this:

ID    Type                 Count
1     **Radisson**             8
2     **Renaissance**          9
3     **Hilton**               8
4     **Radisson**             8

Or even without the asterisks, if at all possible.

Are there any solutions?

akash87 asked Nov 21 '25 04:11


1 Answer

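You could use gsub to keep only the starred name at the beginning of each string and drop everything after it. The pattern captures the leading **...** group, and the replacement "\\1" substitutes the whole string with just that group.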

df <- data.frame(Type = c("**Radisson**", "**Renaissance**", "**Hilton** New York Only",
                          "**Radisson** East Cost"),
                 Count = c(8, 9, 8, 8))

gsub("^(\\*{2}.*\\*{2}).*", "\\1", df$Type, perl = TRUE)

[1] "**Radisson**"    "**Renaissance**" "**Hilton**"      "**Radisson**" 

So, to update the column in place:

df$Type <- gsub("^(\\*{2}.*\\*{2}).*", "\\1", df$Type, perl = TRUE)
df

             Type Count
1    **Radisson**     8
2 **Renaissance**     9
3      **Hilton**     8
4    **Radisson**     8
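
The question also asks whether the asterisks themselves can be dropped. A minimal sketch of one way to do that, assuming every value of interest starts with a name wrapped in ** (this variant is not part of the original answer): move the capture group inside the stars so that only the name is kept.

```r
# Same sample data as in the answer above.
df <- data.frame(
  Type = c("**Radisson**", "**Renaissance**",
           "**Hilton** New York Only", "**Radisson** East Cost"),
  Count = c(8, 9, 8, 8),
  stringsAsFactors = FALSE
)

# Capture only the text between the leading pair of "**" markers.
# ".*?" is lazy, so it stops at the first closing "**".
# perl = TRUE is kept for consistency with the answer above.
df$Type <- sub("^\\*{2}(.*?)\\*{2}.*$", "\\1", df$Type, perl = TRUE)
df$Type
```

With the sample data this should leave Type as "Radisson", "Renaissance", "Hilton", "Radisson". Strings that do not start with ** do not match the pattern and are returned unchanged.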
erocoar answered Nov 22 '25 18:11


