Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Extract email address from string using tsql

I'm trying to extract email addresses from an existing comments field and put it into its own column. The string may be something like this "this is an example comment with an email address of [email protected]" or just literally the email itself "[email protected]".

I figure the best thing to do would be to find the index of the '@' symbol and search in both directions until either the end of the string was hit or there was a space. Can anyone help me out with this implementation?

like image 641
Anthony Avatar asked Apr 13 '15 01:04

Anthony


2 Answers

I know wewesthemenace already answered the question, but his/her solution seems over complicated. Why concatenate the left and right sides of the email address together? I'd rather just find the beginning and the end of the email address and then use substring to return the email address like so:

My Table

DECLARE @Table TABLE (comment NVARCHAR(50));
INSERT INTO @Table
VALUES ('blah [email protected]'),            --At the end
        ('blah [email protected] blah blah'), --In the middle
        ('[email protected] blah'),           --At the beginning
        ('no email');

Actual Query:

SELECT  comment,        
        CASE
            WHEN CHARINDEX('@',comment) = 0 THEN NULL
            ELSE SUBSTRING(comment,beginningOfEmail,endOfEmail-beginningOfEmail)
        END email
FROM @Table
CROSS APPLY (SELECT CHARINDEX(' ',comment + ' ',CHARINDEX('@',comment))) AS A(endOfEmail)
CROSS APPLY (SELECT DATALENGTH(comment)/2 - CHARINDEX(' ',REVERSE(' ' + comment),CHARINDEX('@',REVERSE(' ' + comment))) + 2) AS B(beginningOfEmail)

Results:

comment                                            email
-------------------------------------------------- --------------------------------------------------
blah [email protected]                     [email protected]
blah [email protected] blah blah           [email protected]
[email protected] blah                     [email protected]
no email                                           NULL
like image 66
Stephan Avatar answered Oct 04 '22 11:10

Stephan


You can search for '@' in the string. Then you get the string at the LEFT and RIGHT side of '@'. You then want to REVERSE the LEFT side and get first occurrence of ' ' then get the SUBSTRING from there. Then REVERSE it to get the original form. Same principle apply to the RIGHT side without doing REVERSE.

Example string: 'some text [email protected] some text'

  1. LEFT = 'some text someemail'
  2. RIGHT = '@domain.org some text'
  3. Reverse LEFT = 'liameemos txet emos'
  4. SUBSTRING up to the first space = 'liameemos'
  5. REVERSE(4) = someemail
  6. SUBSTRING (2) up to the first space = '@domain.org'
  7. Combine 5 and 6 = '[email protected]'

Your query would be:

;WITH CteEmail(email) AS(
    SELECT '[email protected]' UNION ALL
    SELECT 'some text [email protected] some text' UNION ALL
    SELECT 'no email'
)
,CteStrings AS(
    SELECT
        [Left] = LEFT(email, CHARINDEX('@', email, 0) - 1),
        Reverse_Left = REVERSE(LEFT(email, CHARINDEX('@', email, 0) - 1)),
        [Right] = RIGHT(email, CHARINDEX('@', email, 0) + 1)
    FROM CteEmail
    WHERE email LIKE '%@%'
)
SELECT *,
    REVERSE(
        SUBSTRING(Reverse_Left, 0, 
            CASE
                WHEN CHARINDEX(' ', Reverse_Left, 0) = 0 THEN LEN(Reverse_Left) + 1
                ELSE CHARINDEX(' ', Reverse_Left, 0)
            END
        )
    )
    +
    SUBSTRING([Right], 0,
        CASE
            WHEN CHARINDEX(' ', [Right], 0) = 0 THEN LEN([Right]) + 1
            ELSE CHARINDEX(' ', [Right], 0)
        END
    )
FROM CteStrings

Sample Data:

email
----------------------------------------
[email protected]
some text [email protected] some text
no email

Result

---------------------
[email protected]
[email protected]
like image 21
Felix Pamittan Avatar answered Oct 04 '22 11:10

Felix Pamittan