Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Sql Server Primary Key With Partition Issue

I am building a table that will be partitioned and contain a FILESTREAM column. The issue I am encountering is that it appears I have to have a composite primary key (FILE_ID and FILE_UPLOADED_DATE) because FILE_UPLOADED_DATE is part of my partition scheme. Is that correct? I would prefer not to have this be a composite key and simply just have FILE_ID being the primary key.....could this be just an user error?

Any suggestions would be appreciated.

Version: SQL Server 2008 R2

Partition Schemes and Function:

CREATE PARTITION FUNCTION DocPartFunction (datetime)
AS RANGE RIGHT FOR VALUES ('20101220')
GO
CREATE PARTITION SCHEME DocPartScheme AS
PARTITION DocPartFunction TO (DATA_FG_20091231, DATA_FG_20101231);
GO
CREATE PARTITION SCHEME DocFSPartScheme AS
PARTITION DocPartFunction TO (FS_FG_20091231,FS_FG_20101231);
GO

Create Statement:

CREATE TABLE [dbo].[FILE](
    [FILE_ID] [int] IDENTITY(1,1) NOT NULL,
    [DOCUMENT] [varbinary](max) FILESTREAM  NULL,
    [FILE_UPLOADED_DATE] [datetime] NOT NULL,
    [FILE_INT] [int] NOT NULL,
    [FILE_EXTENSION] [varchar](10) NULL,
    [DocGUID] [uniqueidentifier] ROWGUIDCOL  NOT NULL UNIQUE ON [PRIMARY],
CONSTRAINT [PK_File] PRIMARY KEY CLUSTERED
    (   [FILE_ID] ASC
    ) ON DocPartScheme ([FILE_UPLOADED_DATE])
)ON DocPartScheme ([FILE_UPLOADED_DATE])
FILESTREAM_ON DocFSPartScheme;

Error if I don't include FILE_UPLOADED_DATE:

Msg 1908, Level 16, State 1, Line 1
Column 'FILE_UPLOADED_DATE' is partitioning column of the index 'PK_File'. Partition columns for a unique index must be a subset of the index key.
Msg 1750, Level 16, State 0, Line 1
Could not create constraint. See previous errors.

Thanks!

like image 487
scarpacci Avatar asked Oct 04 '11 23:10

scarpacci


People also ask

Does partition key have to be PRIMARY KEY?

A primary key uniquely identifies a row. A composite key is a key formed from multiple columns. A partition key is the primary lookup to find a set of rows, i.e. a partition. A clustering key is the part of the primary key that isn't the partition key (and defines the ordering within a partition).

Does SQL Server support partition?

SQL Server provides a logical and physical framework for partitioning table and index data. SQL Server 2017 supports up to 15,000 partitions. Partitioning separates data into logical units. You can store these logical units in more than one file group.

Does the PRIMARY KEY have to be the first column?

No, the position of the column is not important. as there is only one clustered key possible, you'll have to drop the existing (and re-create it as unique, but non clustered index)

Does SQL Server 2016 Standard Edition support partitioning?

The latest SQL release from SQL Server 2016 SP1 onwards supports Partitioning on all editions.


2 Answers

You are confusing the primary key and the clustered index. There is no reason for the two to be one and the same. You can have a clustered index on FILE_UPLOADED_DATE and a separate, non-clustered, primary key on FILE_ID. In fact you already do something similar for the DocGUID column:

CREATE TABLE [dbo].[FILE](
    [FILE_ID] [int] IDENTITY(1,1) NOT NULL,
    [DOCUMENT] [varbinary](max) FILESTREAM  NULL,
    [FILE_UPLOADED_DATE] [datetime] NOT NULL,
    [FILE_INT] [int] NOT NULL,
    [FILE_EXTENSION] [varchar](10) NULL,
    [DocGUID] [uniqueidentifier] ROWGUIDCOL  NOT NULL,
    constraint UniqueDocGUID UNIQUE NONCLUSTERED ([DocGUID]) 
        ON [PRIMARY])
    ON DocPartScheme ([FILE_UPLOADED_DATE])
    FILESTREAM_ON DocFSPartScheme;

CREATE CLUSTERED INDEX cdx_File 
   ON [FILE] (FILE_UPLOADED_DATE)
   ON DocPartScheme ([FILE_UPLOADED_DATE])
   FILESTREAM_ON DocFSPartScheme;

ALTER TABLE [dbo].[FILE]
    ADD CONSTRAINT PK_File PRIMARY KEY NONCLUSTERED (FILE_ID)
    ON [PRIMARY];

However such a design will lead to non-aligned indexes which can cause very serious performance problems, and also block all fast partition switch operations. See Special Guidelines for Partitioned Indexes:

Each sort table requires a minimum amount of memory to build. When you are building a partitioned index that is aligned with its base table, sort tables are built one at a time, using less memory. However, when you are building a nonaligned partitioned index, the sort tables are built at the same time.

As a result, there must be sufficient memory to handle these concurrent sorts. The larger the number of partitions, the more memory required. The minimum size for each sort table, for each partition, is 40 pages, with 8 kilobytes per page. For example, a nonaligned partitioned index with 100 partitions requires sufficient memory to serially sort 4,000 (40 * 100) pages at the same time. If this memory is available, the build operation will succeed, but performance may suffer. If this memory is not available, the build operation will fail

Your design already has a non-aligned index for DocGUID, so the performance problems are likely already present. If you must keep your indexes aligned then you have to admit one of the side effects of choosing a partition scheme: you can no longer have a logical primary key, nor unique constraints enforcement, unless the key includes the partitioning key.

And finally, one must ask: why use a partitioned table? They are always slower than a non-partitioned alternative. Unless you need fast partition switch operations for ETL (which you are already punting due to the non-aligned index on DocGUID), there is basically no incentive to use a partitioned table. (Preemptive comment: clustered index on the FILE_UPLOADED_DATE is guaranteed a better alternative than 'partition elimination').

like image 98
Remus Rusanu Avatar answered Oct 04 '22 02:10

Remus Rusanu


The partitioning column must always be present in a partitioned table's clustered index. Any work-around you come up with has to factor this in.

like image 32
Philip Kelley Avatar answered Oct 04 '22 01:10

Philip Kelley