Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can't add index to persisted computed column because it is "of a type that is invalid for use as a key"

I have a table like this with a computed column:

CREATE TABLE PhoneNumbers
(
    [PhoneNumberID] int identity(1,1) not null primary key clustered,
    [Number] varchar(20), /* Entire number, like (800) 555-5000 */
    [Digits] AS dbo.RegExReplace(Number, '[^0-9]', '') PERSISTED  /* Like 8005555000 */
)

It's created fine, and the Digits column works great as expected, BUT it doesn't seem to behave as a "PERSISTED" column. When I do a query with Digits in the WHERE clause it's VERY slow. When I try to add an index to the Digits column I get: Column 'Digits' in table 'PhoneNumbers' is of a type that is invalid for use as a key column in an index.

It seems like that column isn't really being treated as PERSISTED and is being recomputed on every query and won't let me add an index.

The RegExReplace is a C# CLR function defined as follows:

[SqlFunction(IsDeterministic = true, IsPrecise = true)]
public static SqlString RegExReplace(SqlString expression, SqlString pattern, SqlString replace)

Any ideas on how to get that Digits column to act like a persisted column or allow me to add an index?!

Thank you!

like image 956
JerSchneid Avatar asked Jan 31 '12 02:01

JerSchneid


1 Answers

Try a CAST:

CREATE TABLE PhoneNumbers
(
    [PhoneNumberID] int identity(1,1) not null primary key clustered,
    [Number] varchar(20), /* Entire number, like (800) 555-5000 */
    [Digits] AS CAST(dbo.RegExReplace(Number, '[^0-9]', '') AS VARCHAR(20)) PERSISTED  /* Like 8005555000 */
)

I believe that the problem is your CLR function is returning SqlString which ends up being nvarchar(4000) or similar - not indexable.

It's kind of a known "problem" with computed columns that the datatype is inferred from the expression. Mainly an issue with strings and "helper functions" which take varchar(max) and also with decimal operations where precision changes due to the calculations.

I have a little rule where I always CAST - it makes it explicit and avoids any ambiguity. Generally, columns which are known to be small should be explicitly small - varchar(max) seems to have a lot of performance overhead - even if you pass through a function which returns varchar(max) and takes varchar(max), cast back to the size you know, because it will perform a lot better.

like image 106
Cade Roux Avatar answered Sep 24 '22 10:09

Cade Roux