Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL Email Verification Function using Regex

I am trying to create a function in SQL Server 2005 to check to see if an email is in a valid format with regular expressions.

Here is what I have so far:

CREATE FUNCTION isValidEmailFormat
(
    @Email varchar(100)
)
RETURNS bit
AS
BEGIN
   DECLARE @Result bit

   SET @Result = (SELECT CASE
                     WHEN @Email LIKE '%[a-zA-Z0-9_\-]+@([a-zA-Z0-9_\-]+\.)+  (com|org|edu|nz|au])%' 
                       THEN 1
                       ELSE 0
                  END AS Valid)
   RETURN @Result
END

Am I doing something wrong with my regular expression? Or do I have to do something more to compare a varchar to a regular expression?

-Edit-

Right now, whatever string I put in returns a 0, even if the email format itself is correct.

like image 833
Cody Kosmerick Avatar asked Mar 20 '13 12:03

Cody Kosmerick


2 Answers

The short answer is that no, it cannot be done. The syntax for LIKE is not the same as (and way less powerful than) regular expressions. See also SQL Server Regular expressions in T-SQL

But you can make the hop to .Net and do the matching there. You can instantiate a VBScript.RegExp inside T-SQL using sp_OACreate and use that.

CREATE FUNCTION dbo.isValidEmailFormat
(
    @Email varchar(100)
)
RETURNS bit
AS
BEGIN
    DECLARE @pattern varchar(4000)
    SET @pattern = '[a-zA-Z0-9_\-]+@([a-zA-Z0-9_\-]+\.)+(com|org|edu|nz|au)'
    DECLARE @Result bit

    DECLARE @objRegexExp INT
    EXEC sp_OACreate 'VBScript.RegExp', @objRegexExp OUT

    EXEC sp_OASetProperty @objRegexExp, 'Pattern', @pattern
    EXEC sp_OASetProperty @objRegexExp, 'IgnoreCase', 1
    EXEC sp_OASetProperty @objRegexExp, 'MultiLine', 0
    EXEC sp_OASetProperty @objRegexExp, 'Global', false
    EXEC sp_OASetProperty @objRegexExp, 'CultureInvariant', true

    EXEC sp_OAMethod @objRegexExp, 'Test', @Result OUT, @Email

    EXEC sp_OADestroy @objRegexExp

    RETURN @Result
END

Do take a peek at Regex email verification error - using JavaScript to see if you want to be a bit less restrictive on what characters are allowed.

like image 140
flup Avatar answered Oct 18 '22 10:10

flup


I used @flup 's answer and updated the RegExp pattern as follows:

  • dots are allowed in the portion before @, but not as first or last character there
  • the expression is anchored to start and end of string with ^ and $
  • specific TLDs and country codes were added for the data I was using

The TLDs are whitelisted explicitly to exclude many typos where .com was entered as .co or .cm -- valid country codes for Colombia or Cameroon but not for my data.

The updated line:

SET @pattern = '^([a-zA-Z0-9_\-]+\.)*[a-zA-Z0-9_\-]+@([a-zA-Z0-9_\-]+\.)+(com|org|edu|net|ca|au|coop|de|ee|es|fm|fr|gr|ie|in|it|jp|me|nl|nu|ru|uk|us|za)$'

I hope this adds value to an already-excellent answer.

like image 20
Lars Avatar answered Oct 18 '22 11:10

Lars