
Can I put a Clustered ColumnStore index on a table with NVARCHAR(MAX) fields?

I have a table with 200 GB of data in SQL Server 2016. I am planning to put a clustered columnstore index on that table for disk compression as well as better performance.

The problem is that the table has one column whose datatype is NVARCHAR(MAX), and columnstore indexes don't support that datatype.

So I am thinking of changing the datatype from NVARCHAR(MAX) to some other datatype that can hold at least 81,446 characters in the same column.

I have tried other datatypes available in SQL Server, like VARCHAR(8000), but that removed all the data after the first 8,000 characters.

I also tried TEXT, but a columnstore index can't be applied to that datatype either because of the same limitation.

So could you please give me an idea of what datatype I should use? Or is there any other way to apply a columnstore index to this table?

Pardeep Sharma asked Mar 07 '23 06:03



1 Answer

You have several different questions in here:

Q: Can SQL Server 2016 use (MAX) datatypes in columnstore indexes?

No. The documentation states:

Don't use a clustered columnstore index when the table requires varchar(max), nvarchar(max), or varbinary(max) data types.

I would normally just stop there - if the documentation tells you not to do something, you probably shouldn't.

Q: Can I store more than 8,000 characters in VARCHAR(8000)?

No. The number means what it says - it's the maximum number of characters the column can hold. Anything past that limit is not stored: an explicit conversion keeps only the first 8,000 characters and discards the rest.
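A quick way to see this for yourself is the T-SQL sketch below. It is only an illustration: the variable name is made up, and the length of 81,446 is taken from the question.

```sql
-- Build an 81,446-character test string. The inner CAST matters:
-- without a (MAX) input, REPLICATE itself stops at 8,000 bytes.
DECLARE @long NVARCHAR(MAX) = REPLICATE(CAST(N'x' AS NVARCHAR(MAX)), 81446);

-- Compare the length before and after converting to VARCHAR(8000).
SELECT LEN(@long)                        AS OriginalLength,   -- 81446
       LEN(CAST(@long AS VARCHAR(8000))) AS LengthAfterCast;  -- 8000
```

The conversion raises no error or warning, so it is worth checking the longest value in the column before changing its datatype.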

Q: Can I build a clustered columnstore without those (MAX) fields?

Yes, by changing your data model and breaking the table up. Say the table involved is called FactTable:

  1. Create a new table with the large text fields, plus the key that identifies each row - we'll call it FactTable_Text.
  2. Create a new table with the rest of the fields and the same key - we'll call it FactTable_Data. Put a clustered columnstore index on this, and you'll gain compression for it.
  3. Migrate the data from your old FactTable into these new tables.
  4. Drop the old table.
  5. Create a view called FactTable that joins FactTable_Data and FactTable_Text together on that key.
  6. Users go on querying FactTable without knowing anything has changed.
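
As a rough sketch of those steps in T-SQL: the table and view names come from the steps above, but the columns (Id, SaleDate, Amount, Notes) and the index name are hypothetical stand-ins for whatever your real table contains, and the sketch assumes Id uniquely identifies a row.

```sql
-- Step 1: the large text goes into an ordinary rowstore table.
CREATE TABLE dbo.FactTable_Text (
    Id    BIGINT        NOT NULL PRIMARY KEY CLUSTERED,
    Notes NVARCHAR(MAX) NULL
);

-- Step 2: everything else goes into a table that can take a columnstore index.
CREATE TABLE dbo.FactTable_Data (
    Id       BIGINT         NOT NULL,
    SaleDate DATE           NOT NULL,
    Amount   DECIMAL(18, 2) NOT NULL
);
CREATE CLUSTERED COLUMNSTORE INDEX CCI_FactTable_Data ON dbo.FactTable_Data;

-- Step 3: copy the rows across. For 200 GB you would likely do this in batches.
INSERT INTO dbo.FactTable_Text (Id, Notes)
SELECT Id, Notes FROM dbo.FactTable;

INSERT INTO dbo.FactTable_Data (Id, SaleDate, Amount)
SELECT Id, SaleDate, Amount FROM dbo.FactTable;

-- Step 4: remove the original table so the view can reuse its name.
DROP TABLE dbo.FactTable;
GO  -- batch separator (SSMS/sqlcmd); CREATE VIEW must start its own batch

-- Step 5: the view presents the two tables as the old one.
CREATE VIEW dbo.FactTable AS
SELECT d.Id, d.SaleDate, d.Amount, t.Notes
FROM dbo.FactTable_Data AS d
LEFT JOIN dbo.FactTable_Text AS t
    ON t.Id = d.Id;
GO
```

Existing SELECT statements against FactTable keep working, but anything that writes to it (your ETL) has to target the two underlying tables instead.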

Unfortunately, you're probably going to have to change your ETL processes, and depending on how much text is involved in the table, you might not get any compression. For example, say 90% of the table's size is all due to the text - then you haven't really saved anything here.

Now you start to see why the documentation advises you that this isn't a good idea.

Brent Ozar answered Mar 08 '23 23:03
