R regex - extract words beginning with @ symbol

Tags: regex, r, stringr

I'm trying to extract Twitter handles from tweets using R's stringr package. For example, suppose I want to get all words in a vector that begin with "A". I can do this like so:

library(stringr)

# Get all words that begin with "A"
str_extract_all(c("hAi", "hi Ahello Ame"), "(?<=\\b)A[^\\s]+")

[[1]]
character(0)

[[2]]
[1] "Ahello" "Ame"

Great. Now let's try the same thing using "@" instead of "A":

str_extract_all(c("h@i", "hi @hello @me"), "(?<=\\b)\\@[^\\s]+")

[[1]]
[1] "@i"

[[2]]
character(0)

Why does this example give the opposite of the result I was expecting, and how can I fix it?

220

asked Mar 14 '19 20:03

Ben

1 Answer

It looks like you probably mean:

str_extract_all(c("h@i", "hi @hello @me", "@twitter"), "(?<=^|\\s)@[^\\s]+")
# [[1]]
# character(0)
# [[2]]
# [1] "@hello" "@me" 
# [[3]]
# [1] "@twitter"

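The \b in a regular expression is a word boundary, and it occurs "Between two characters in the string, where one is a word character and the other is not a word character." Since a space and "@" are both non-word characters, there is no boundary before the "@" in "hi @hello", so nothing matches there. In "h@i", by contrast, "h" is a word character and "@" is not, so there is a boundary between them, which is why "@i" was returned.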
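
To make the boundary rule concrete, here is a small check using stringr's str_detect. The variable names are only illustrative, and the patterns are written for this demonstration rather than taken from the question:

```r
library(stringr)

# "h" is a word character and "@" is not, so a boundary sits between them
h_then_at <- str_detect("h@i", "h\\b@")
h_then_at
# [1] TRUE

# a space and "@" are both non-word characters, so no boundary separates them
space_then_at <- str_detect("hi @hello", " \\b@")
space_then_at
# [1] FALSE

# at the very start of a string, \b only matches if a word character follows
start_then_at <- str_detect("@twitter", "^\\b@")
start_then_at
# [1] FALSE
```

The last case shows that a \b-based pattern would also miss a handle at the very start of a tweet.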

With this revision, the "@" only matches when it is at the start of the string or immediately preceded by a whitespace character.
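
As a rough sketch of how the pattern behaves on messier input, the example below runs it on two made-up strings (the tweets vector is hypothetical). Because [^\\s]+ takes every non-whitespace character, trailing punctuation comes along with the handle. The second call swaps in \\w+, which is an assumption on my part that handles consist only of letters, digits and underscores, not something stated in the question:

```r
library(stringr)

# Hypothetical sample input, not from the original question
tweets <- c("email me at ben@example.com", "thanks @hello, and @me!")

# Pattern from the answer: "@" at the start or after whitespace,
# followed by any run of non-whitespace characters
with_punct <- str_extract_all(tweets, "(?<=^|\\s)@[^\\s]+")
with_punct
# [[1]]
# character(0)
#
# [[2]]
# [1] "@hello," "@me!"

# Assumed variant: \\w+ stops at the first non-word character
handles_only <- str_extract_all(tweets, "(?<=^|\\s)@\\w+")
handles_only
# [[1]]
# character(0)
#
# [[2]]
# [1] "@hello" "@me"
```

In both versions the email address is skipped, since its "@" follows a letter rather than whitespace.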

200

answered Sep 27 '22 22:09

MrFlick

