Extract e-mail address from string using R

Question

These are five Twitter user descriptions. The idea is to extract the e-mail address from each string.

This is the code I've tried. It works, but there is probably something better: I'd rather avoid unlist() and do it in one go with a regex. I've seen similar questions for Python/Perl/PHP but not for R. I know I could use grep(..., perl = TRUE), but that shouldn't be the only way to do it. If it works, of course, it helps.

ds <- c("#MillonMusical | #PromotorMusical | #Diseñador | Contacto :        ezequielife@gmail.com | #Instagram : Ezeqielgram | 01-11-11 |           @_MillonMusical @flowfestar", "LipGLosSTudio by: SAndry RUbio           Maquilladora PRofesional estudiande de diseño profesional de maquillaje     artistico lipglosstudio@hotmail.com/", "Medico General Barranquillero   radicado con su familia en Buenos Aires para iniciar Especialidad       Medico Quirurgica. email jaenpavi@hotmail.com", "msn =
    rdt031169@hotmail.comskype = ronaldotorres-br", "Aguante piscis /       manuarias17@gmail.com  buenos aires"
    )

ds <- unlist(strsplit(ds, ' '))
ds <- ds[grep("mail.", ds)]

> print(ds)
[1] "		ezequielife@gmail.com"  "lipglosstudio@hotmail.com/"
[3] "jaenpavi@hotmail.com"       "rdt031169@hotmail.comskype"
[5] "/		manuarias17@gmail.com"

It would be nice to split this one, "rdt031169@hotmail.comskype", perhaps by requiring the match to end in .com or .com.ar, which would make sense for what I'm working on.

Jilber Urbina · Accepted Answer

Here's one alternative:

> regmatches(ds, regexpr("[[:alnum:]]+\\@[[:alpha:]]+\\.com", ds))
[1] "ezequielife@gmail.com"     "lipglosstudio@hotmail.com" "jaenpavi@hotmail.com"      "rdt031169@hotmail.com"    
[5] "manuarias17@gmail.com"
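
Note that the split-and-grep step from the question is not required for this to work: regexpr() searches inside each element, so the same call can be run on the unsplit descriptions. A minimal sketch, assuming a shortened two-element vector called `raw` (a made-up name; the strings are trimmed from the question's data):

```r
# `raw` is a hypothetical name; the strings are trimmed from the question's data
raw <- c("Aguante piscis /       manuarias17@gmail.com  buenos aires",
         "msn = rdt031169@hotmail.comskype = ronaldotorres-br")

# regexpr() locates the first match in each element; regmatches() extracts it.
# Inside an R string literal the regex backslash must be doubled ("\\.").
emails <- regmatches(raw, regexpr("[[:alnum:]]+@[[:alpha:]]+\\.com", raw))
print(emails)
# [1] "manuarias17@gmail.com" "rdt031169@hotmail.com"
```

The `@` is not a regex metacharacter, so it can be left unescaped; only the dot needs `\\.`.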

Based on @Frank's comment: if you want to keep a country identifier after .com, such as the .com.ar in your example, try this:

> ds <- c(ds, "fulanito13@somemail.com.ar")  # a new e-mail address
> regmatches(ds, regexpr("[[:alnum:]]+\\@[[:alpha:]]+\\.com(\\.[a-z]{2})?", ds))
[1] "ezequielife@gmail.com"      "lipglosstudio@hotmail.com"  "jaenpavi@hotmail.com"       "rdt031169@hotmail.com"     
[5] "manuarias17@gmail.com"      "fulanito13@somemail.com.ar"
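
One thing to keep in mind: regmatches() combined with regexpr() drops elements that contain no match, so the result can be shorter than the input. If the result has to stay aligned with the descriptions, something along these lines might help. This is only a sketch: `extract_email` is a hypothetical helper name, not part of base R, and the third test string is invented for illustration.

```r
# Hypothetical helper: one value per input element, NA where no address is found
extract_email <- function(x, pattern = "[[:alnum:]]+@[[:alpha:]]+\\.com(\\.[a-z]{2})?") {
  m <- regexpr(pattern, x)            # start of first match per element, -1 if none
  hit <- !is.na(m) & m != -1          # elements that actually matched
  out <- rep(NA_character_, length(x))
  out[hit] <- regmatches(x, m)        # regmatches() returns only the matched elements
  out
}

res <- extract_email(c("email jaenpavi@hotmail.com",
                       "fulanito13@somemail.com.ar",
                       "no address here"))
print(res)
# [1] "jaenpavi@hotmail.com"       "fulanito13@somemail.com.ar" NA
```

The pattern is the same one used above, so it still misses local parts containing dots or underscores and domains that do not end in .com.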

Tags: string, regex, r, perl

Asked by marbel · 1 answer, by Jilber Urbina
