I've read and heard several times that sql_variant
should be avoided. I think I have a great use case for it. I've used varchar(max)
in the past to store different types in the same column, but it seems sensible to avoid the de/serialization overhead when there's a built-in type that does exactly what I want.
So, what exactly are the pitfalls of using sql_variant
? Are they performance related, or easy-to-make programming mistakes, or something else? Incidentally, I'll be interacting with this column from client code and CLR functions, if that's something to consider.
Storing different types in the same column via SQL_VARIANT
is almost the same thing as casting everything to Object
in .NET. And sometimes there are valid reasons for using this type as it can certainly allow for a more generic programmatic structure.
However, as you were anticipating, there are some pitfalls to using SQL_VARIANT
that you should be aware of, especially as one of them might be a deal-breaker:
Just like casting everything to Object
in .NET (and possibly requiring boxing / unboxing depending on the base type), there is a definite performance hit when using SQL_VARIANT
. Depending on the use case, it might be acceptable to have reduced performance if the functionality really needs it and/or the usage is not very frequent (i.e. many times per second).
Unlike casting everything to Object
in .NET, the SQL_VARIANT
datatype has limitations on what base datatypes it can contain. The following datatypes cannot be stored as SQL_VARIANT
:
VARCHAR(MAX)
NVARCHAR(MAX)
VARBINARY(MAX)
XML
TIMESTAMP
/ ROWVERSION
TEXT
(you shouldn't be using this type anyway as of SQL Server 2005)NTEXT
(you shouldn't be using this type anyway as of SQL Server 2005)IMAGE
(you shouldn't be using this type anyway as of SQL Server 2005)This limitation can easily prevent the possibility of using SQL_VARIANT
if there is a requirement to store any of these datatypes. Please note that the issue here is the base datatype and not the size of the data, as the following test shows:
DECLARE @tmp1 TABLE (col1 SQL_VARIANT NOT NULL);
INSERT INTO @tmp1 (col1) VALUES (CONVERT(VARCHAR(MAX), 'g'));
Returns:
Msg 206, Level 16, State 2, Line 2
Operand type clash: varchar(max) is incompatible with sql_variant
To be fair, one benefit to using SQL_VARIANT
over casting everything to NVARCHAR
is that SQL_VARIANT
retains the underlying type info and enforces its usage so that you cannot easily misuse values in completely inappropriate contexts.
DECLARE @tmp2 TABLE (col1 SQL_VARIANT NOT NULL);
INSERT INTO @tmp2 (col1) VALUES (1);
SELECT CONVERT(DATETIME, col1) FROM @tmp2;
SELECT CONVERT(TIME, col1) FROM @tmp2;
Returns:
1900-01-02 00:00:00.000
Msg 529, Level 16, State 3, Line 6
Explicit conversion from data type int to time is not allowed.
Regarding not being able to use SQL_VARIANT
as a PK: this is really a non-issue since the very nature of a generic datatype pretty much excludes it from being desirable in the first place for such a use.
Regarding not being able to use SQL_VARIANT
with a LIKE
operator: this is mostly a non-issue due to being able to convert it to an appropriate type that does work with LIKE
, as in:
WHERE CONVERT(NVARCHAR(50), [sql_variant_field]) LIKE '%something%'
The above is certainly not the most efficient, but it is functional, and as mentioned above, efficiency was already ruled out as it was sacrificed in return for functionality when deciding to use the SQL_VARIANT
datatype.
It also makes it easier for programming errors to occur. A DBA/Programmer looks at a column and it looks like a integer, so he puts an integer in it, but farther down the line a process wants that to be a string. I've seen this with poorly written imports into sql_variant columns.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With