R regex to parse token after @ also with no additional tokens in string

Question

I have a problem parsing an address out of text strings. The usual string looks like "@address token token token" or "@address token token /ntoken".

string <- c("@address token token token", "@address token token /ntoken")
gsub("^\\.?@([a-z0-9_]{1,25})[^a-z0-9_]+.*$", "\\1", string)

Both strings are parsed correctly:

[1] "address" "address"

Yet in some cases the address is the only token in the string, and then the regex returns the address including the @:

string <- c("@address token token token", "@address token token /ntoken", "@address")
gsub("^\\.?@([a-z0-9_]{1,25})[^a-z0-9_]+.*$", "\\1", string)
# [1] "address"  "address"  "@address"

How can I make the regex handle the one-token case as well?

Braj · Accepted Answer

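"in some circumstances the address will be the only token in the string, then regex will return the address including the @"

That happens because in that case there is no match. [^a-z0-9_]+ requires at least one non-token character after the address, and "@address" has none, so gsub returns the input unchanged.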
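One way to confirm the no-match explanation is to test the original pattern with grepl, which reports whether each string matches at all. This is only a small check using the strings from the question; the variable names pattern and matched are illustrative.

```r
string <- c("@address token token token",
            "@address token token /ntoken",
            "@address")

# Original pattern from the question: "+" demands at least one
# non-token character after the address.
pattern <- "^\\.?@([a-z0-9_]{1,25})[^a-z0-9_]+.*$"

# TRUE where the pattern matches, FALSE where gsub will leave the input as-is.
matched <- grepl(pattern, string)
print(matched)
```

The last element comes back FALSE, which is why gsub hands "@address" back untouched.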

Just make a slight change: replace [^a-z0-9_]+ with [^a-z0-9_]? so that the separator after the address is optional.

^\.?@([a-z0-9_]{1,25})[^a-z0-9_]?.*$
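Applied to the strings from the question, the change might look like the sketch below. Note that inside an R string literal each backslash has to be doubled. The second pattern, strict, is not part of the answer: it is an assumed variant that makes the whole "separator plus rest" part optional, which may be preferable if names longer than 25 characters should be rejected rather than truncated.

```r
string <- c("@address token token token",
            "@address token token /ntoken",
            "@address")

# Pattern from the answer, written as an R string (backslashes doubled).
loose <- "^\\.?@([a-z0-9_]{1,25})[^a-z0-9_]?.*$"
result <- gsub(loose, "\\1", string)
print(result)

# Assumed variant (not from the answer): the tail is optional as a unit,
# so the address must be followed by a non-token character or the end.
strict <- "^\\.?@([a-z0-9_]{1,25})([^a-z0-9_].*)?$"
result_strict <- gsub(strict, "\\1", string)

# The two patterns differ on an over-long name (30 characters here).
long_name <- paste0("@", strrep("a", 30))
loose_long  <- gsub(loose,  "\\1", long_name)  # keeps the first 25 characters
strict_long <- gsub(strict, "\\1", long_name)  # no match, input unchanged
```

Since the pattern is anchored with ^ and $, it can match at most once per string, so sub would work equally well in place of gsub.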

