
Fastest way to remove non-numeric characters from a VARCHAR in SQL Server

I'm writing an import utility that is using phone numbers as a unique key within the import.

I need to check that the phone number does not already exist in my DB. The problem is that phone numbers in the DB could contain dashes, parentheses, and possibly other characters. I wrote a function to remove these, but it is slow: with thousands of records in my DB and thousands of records to import at once, the process can be unacceptably slow. I've already put an index on the phone number column.

I tried using the script from this post:
T-SQL trim &nbsp (and other non-alphanumeric characters)

But that didn't speed it up any.

Is there a faster way to remove non-numeric characters? Something that can perform well when 10,000 to 100,000 records have to be compared.

Whatever is done needs to perform fast.

Update
Given what people responded with, I think I'm going to have to clean the fields before I run the import utility.

To answer the question of what I'm writing the import utility in: it is a C# app. I'm comparing BIGINT to BIGINT now, with no need to alter DB data, and I'm still taking a performance hit with a very small set of data (about 2000 records).

Could comparing BIGINT to BIGINT be slowing things down?

I've optimized the code side of my app as much as I can (removed regexes, removed unnecessary DB calls). Although I can't isolate SQL as the source of the problem anymore, I still feel like it is.

Dan Herbert asked Sep 19 '08 22:09




1 Answer

I saw this solution with T-SQL code and PATINDEX. I like it :-)

CREATE Function [fnRemoveNonNumericCharacters](@strText VARCHAR(1000))
RETURNS VARCHAR(1000)
AS
BEGIN
    -- Loop while the string still contains a character outside 0-9
    WHILE PATINDEX('%[^0-9]%', @strText) > 0
    BEGIN
        -- Remove the first non-digit character found
        SET @strText = STUFF(@strText, PATINDEX('%[^0-9]%', @strText), 1, '')
    END
    RETURN @strText
END
David Coster answered Sep 22 '22 04:09
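
The following is not part of the original answer. It is a rough sketch of how the function above might be combined with the asker's conclusion that the fields should be cleaned before the import runs. A scalar function like this runs once per row, and wrapping an indexed column in a function inside a WHERE clause generally prevents SQL Server from seeking on that index. One possible workaround is to store the cleaned value once and index it. The table dbo.Contacts, the columns Phone and PhoneDigits, the index name, and the variable @IncomingDigits are all hypothetical. The sketch also assumes the function lives in the dbo schema and that no cleaned number is longer than 50 digits.

```sql
-- Sanity check of the function on a literal value.
-- Assumes the function was created in the dbo schema.
SELECT dbo.fnRemoveNonNumericCharacters('(555) 123-4567') AS Digits;  -- returns 5551234567

-- Hypothetical table and column names; adjust to the real schema.
-- GO is the batch separator used by SSMS and sqlcmd.
ALTER TABLE dbo.Contacts ADD PhoneDigits VARCHAR(50) NULL;
GO

-- One-time cleanup: pay for the row-by-row function once,
-- rather than on every comparison during an import.
UPDATE dbo.Contacts
SET PhoneDigits = dbo.fnRemoveNonNumericCharacters(Phone);
GO

-- Index the cleaned column so lookups can seek instead of scan.
CREATE INDEX IX_Contacts_PhoneDigits ON dbo.Contacts (PhoneDigits);
GO

-- During the import, compare against the pre-cleaned, indexed column.
-- @IncomingDigits is a hypothetical value that the importing app
-- has already stripped down to digits.
DECLARE @IncomingDigits VARCHAR(50);
SET @IncomingDigits = '5551234567';

SELECT CASE
           WHEN EXISTS (SELECT 1
                        FROM dbo.Contacts
                        WHERE PhoneDigits = @IncomingDigits)
           THEN 1
           ELSE 0
       END AS AlreadyExists;
```

With this layout the function never appears in the import-time query, so the lookup is a plain equality test on an indexed column. New or changed rows would still need PhoneDigits kept in sync, for example by the application or a trigger; that part is left out here.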