
Calculate Hash or Checksum for a table in SQL Server

I'm trying to compute a checksum or a hash for an entire table in SQL Server 2008. The problem I'm running into is that the table contains a column of the XML data type, which CHECKSUM cannot process directly; it has to be converted to nvarchar first. So I need to break it down into two problems:

  1. Calculate a checksum for a single row, where the schema is not known until runtime.
  2. Combine the checksums of all rows to get a checksum for the full table.
Gabe Brown asked Oct 13 '09 13:10


People also ask

How does SQL Server calculate checksum?

CHECKSUM computes a hash value, called the checksum, over its argument list. Use this hash value to build hash indexes. A hash index will result if the CHECKSUM function has column arguments, and an index is built over the computed CHECKSUM value. This can be used for equality searches over the columns.

How do I generate a hash key in SQL Server?

First, make sure that the column used to store the password hash is of data type varbinary. Then use the HASHBYTES function in the INSERT statement to generate the hash of the password and store it in that column.

Does SQL use hash tables?

SQL Server creates hash tables internally when it needs them. A hash table is not a structure you can build yourself the way you build an index. For example, SQL Server uses hash tables for a hash join.

What is row checksum?

CHECKSUM. Returns the checksum value computed over a row of a table, or over a list of expressions. CHECKSUM is intended for use in building hash indexes. BINARY_CHECKSUM. Returns the binary checksum value computed over a row of a table or over a list of expressions.
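
The practical difference between the two functions shows up with strings that differ only in case. The following is a small sketch, not taken from the answers; the sample strings and the explicit Latin1_General_CI_AS collation are assumptions chosen so the comparison does not depend on the server default. Under a case-insensitive collation, the two CHECKSUM values are expected to match, while the two BINARY_CHECKSUM values are expected to differ.

```sql
-- Sample strings and collation are illustrative assumptions.
-- CHECKSUM respects the collation, so a case-only change may go undetected
-- under a case-insensitive collation; BINARY_CHECKSUM works on the raw bytes.
SELECT CHECKSUM(N'McCavity' COLLATE Latin1_General_CI_AS) AS checksum_mixed_case,
       CHECKSUM(N'Mccavity' COLLATE Latin1_General_CI_AS) AS checksum_lower_case,
       BINARY_CHECKSUM(N'McCavity')                       AS binary_mixed_case,
       BINARY_CHECKSUM(N'Mccavity')                       AS binary_lower_case;
```

If case-only edits should count as a change to the table, this is worth checking before choosing CHECKSUM for the per-row value.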


2 Answers

You can use CHECKSUM_AGG. It only takes a single argument, so you could do CHECKSUM_AGG(CHECKSUM(*)) - but this doesn't work for your XML datatype, so you'll have to resort to dynamic SQL.
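
For a table whose columns are known in advance, the idea can be sketched by hand. The temporary table #Orders and its columns below are hypothetical and exist only for illustration; the XML column is converted to nvarchar(MAX) so that CHECKSUM accepts it.

```sql
-- Hypothetical table, used only to illustrate the technique.
CREATE TABLE #Orders (
    OrderId  int          NOT NULL PRIMARY KEY,
    Customer nvarchar(50) NOT NULL,
    Payload  xml          NULL
);

INSERT INTO #Orders (OrderId, Customer, Payload)
VALUES (1, N'Alice', N'<order><item>1</item></order>'),
       (2, N'Bob',   N'<order><item>2</item></order>');

-- CHECKSUM(*) is rejected because of the xml column, so the columns are
-- listed explicitly and the xml column is converted to nvarchar(MAX).
-- CHECKSUM produces one int per row; CHECKSUM_AGG folds them into one value.
SELECT CHECKSUM_AGG(CHECKSUM(OrderId, Customer, CONVERT(nvarchar(MAX), Payload))) AS table_checksum
FROM #Orders;

DROP TABLE #Orders;
```

Because the result is a single int, two different table contents can produce the same value, so a matching checksum is a strong hint that nothing changed rather than proof.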

You could generate the column list dynamically from INFORMATION_SCHEMA.COLUMNS and then insert it into a template:

DECLARE @schema_name NVARCHAR(MAX) = 'mySchemaName';
DECLARE @table_name NVARCHAR(MAX) = 'myTableName';
DECLARE @column_list NVARCHAR(MAX);

-- Build a comma-separated list of quoted column names
SELECT @column_list = COALESCE(@column_list + ', ', '')
        + /* Put your casting here from XML, text, etc columns */ QUOTENAME(COLUMN_NAME)
FROM    INFORMATION_SCHEMA.COLUMNS
WHERE   TABLE_NAME = @table_name
    AND TABLE_SCHEMA = @schema_name;

-- nvarchar keeps the template consistent with the nvarchar column list
DECLARE @template AS nvarchar(MAX);
SET @template = N'SELECT CHECKSUM_AGG(CHECKSUM({@column_list})) FROM {@schema_name}.{@table_name}';

-- Substitute the placeholders; schema and table names are quoted as identifiers
DECLARE @sql AS nvarchar(MAX);
SET @sql = REPLACE(REPLACE(REPLACE(@template,
    '{@column_list}', @column_list),
    '{@schema_name}', QUOTENAME(@schema_name)),
    '{@table_name}', QUOTENAME(@table_name));

EXEC ( @sql );

Cade Roux answered Sep 20 '22 00:09


I modified the script to generate a query for all relevant tables in a database.

USE myDatabase
GO

DECLARE @table_name sysname
DECLARE @schema_name sysname
SET @schema_name = 'dbo'

-- Loop over the user tables in the schema, skipping replication and temp tables
DECLARE myCursor CURSOR FOR
    SELECT TABLE_NAME
      FROM INFORMATION_SCHEMA.TABLES T
     WHERE T.TABLE_SCHEMA = @schema_name
       AND T.TABLE_TYPE = 'BASE TABLE'
       AND T.TABLE_NAME NOT LIKE 'MSmerge%'
       AND T.TABLE_NAME NOT LIKE 'sysmerge%'
       AND T.TABLE_NAME NOT LIKE 'tmp%'
     ORDER BY T.TABLE_NAME

OPEN myCursor

FETCH NEXT FROM myCursor INTO @table_name

WHILE @@FETCH_STATUS = 0
BEGIN
    DECLARE @column_list nvarchar(MAX)
    SET @column_list = ''

    -- Wrap xml, text, ntext, image and sql_variant columns in a conversion to nvarchar(MAX)
    SELECT @column_list = @column_list
         + CASE WHEN DATA_TYPE IN ('xml','text','ntext','image','sql_variant') THEN 'CONVERT(nvarchar(MAX),'
                ELSE ''
           END
         + QUOTENAME(COLUMN_NAME)
         + CASE WHEN DATA_TYPE IN ('xml','text','ntext','image','sql_variant') THEN ' /* ' + DATA_TYPE + ' */)'
                ELSE ''
           END
         + ', '
      FROM INFORMATION_SCHEMA.COLUMNS
     WHERE TABLE_NAME = @table_name
       AND TABLE_SCHEMA = @schema_name -- avoid mixing in same-named tables from other schemas
     ORDER BY ORDINAL_POSITION

    SET @column_list = LEFT(@column_list, LEN(@column_list)-1) -- remove trailing comma

    DECLARE @sql AS nvarchar(MAX)
    SET @sql = 'SELECT ''' + QUOTENAME(@schema_name) + '.' + QUOTENAME(@table_name) + ''' table_name,
       CHECKSUM_AGG(CHECKSUM(' + @column_list + ')) CHECKSUM
  FROM ' + QUOTENAME(@schema_name) + '.' + QUOTENAME(@table_name) + ' WITH (NOLOCK)'

    PRINT @sql

    FETCH NEXT FROM myCursor INTO @table_name

    -- Join the per-table queries into one result set
    IF @@FETCH_STATUS = 0
        PRINT 'UNION ALL'
END

CLOSE myCursor
DEALLOCATE myCursor
GO
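
To make the output concrete, here is a sketch of the text the script would print if the dbo schema contained only two tables. The tables dbo.Customers and dbo.Documents, their columns, and the sample rows are hypothetical; they are created here only so that the generated query can be run as shown.

```sql
-- Hypothetical tables, created only so the generated query below can run.
CREATE TABLE dbo.Customers (Id int NOT NULL PRIMARY KEY, Name nvarchar(50) NOT NULL);
CREATE TABLE dbo.Documents (Id int NOT NULL PRIMARY KEY, Body xml NULL);

INSERT INTO dbo.Customers (Id, Name) VALUES (1, N'Alice');
INSERT INTO dbo.Documents (Id, Body) VALUES (1, N'<doc>hello</doc>');

-- Text of the kind the script prints: one SELECT per table, joined by UNION ALL.
SELECT '[dbo].[Customers]' table_name,
       CHECKSUM_AGG(CHECKSUM([Id], [Name])) CHECKSUM
  FROM [dbo].[Customers] WITH (NOLOCK)
UNION ALL
SELECT '[dbo].[Documents]' table_name,
       CHECKSUM_AGG(CHECKSUM([Id], CONVERT(nvarchar(MAX),[Body] /* xml */))) CHECKSUM
  FROM [dbo].[Documents] WITH (NOLOCK);

-- Clean up the illustrative tables.
DROP TABLE dbo.Documents;
DROP TABLE dbo.Customers;
```

The printed query is meant to be copied into a query window and executed; running it against two copies of a database and comparing the rows gives a quick per-table indication of which tables differ.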
Jonathan Roberts answered Sep 20 '22 00:09