
Any disadvantages to bit flags in database columns?

Consider the following tables:

CREATE TABLE user_roles(
    pkey         SERIAL PRIMARY KEY,
    bit_id       BIGINT NOT NULL,
    name         VARCHAR(256) NOT NULL
);

INSERT INTO user_roles (bit_id,name) VALUES (1,'public');
INSERT INTO user_roles (bit_id,name) VALUES (2,'restricted');
INSERT INTO user_roles (bit_id,name) VALUES (4,'confidential');
INSERT INTO user_roles (bit_id,name) VALUES (8,'secret');

CREATE TABLE news(
    pkey          SERIAL PRIMARY KEY,
    title         VARCHAR(256),
    company_fk    INTEGER REFERENCES compaines(pkey), -- updated since asking the question
    body          VARCHAR(512),
    read_roles    BIGINT -- bit flag 
);

read_roles is a bit-flag column that specifies which combination of roles can read a news item. So if I am inserting a news item that can be read by restricted and confidential, I would set read_roles to 2 | 4, which is 6. When I want to get back the news posts that a particular user can see, I can use queries like these:

select * from news WHERE company_fk=2 AND ((read_roles & 2) != 0 OR (read_roles & 4) != 0); -- & tests a bit; the OR is parenthesized so the company filter applies to both
select * from news WHERE company_fk=2 AND read_roles = 6; 
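
To make the mask arithmetic concrete, here is a minimal Python sketch using the bit values from user_roles. The helper names (combine, readable_by_any, readable_by_all) are made up for illustration. It shows that flags are combined with | but tested with &, and that testing with | is an easy slip because the result is nonzero for any nonzero mask.

```python
# Bit values copied from the user_roles rows in the question.
PUBLIC, RESTRICTED, CONFIDENTIAL, SECRET = 1, 2, 4, 8


def combine(*roles):
    """OR the given role bits together into one read_roles value."""
    mask = 0
    for role in roles:
        mask |= role
    return mask


def readable_by_any(read_roles, wanted):
    """True if at least one bit in `wanted` is set in `read_roles`."""
    return (read_roles & wanted) != 0


def readable_by_all(read_roles, wanted):
    """True only if every bit in `wanted` is set in `read_roles`."""
    return (read_roles & wanted) == wanted


read_roles = combine(RESTRICTED, CONFIDENTIAL)
print(read_roles)                                # 6
print(readable_by_any(read_roles, RESTRICTED))   # True
print(readable_by_any(PUBLIC, RESTRICTED))       # False: bit 2 is not set in 1
print((PUBLIC | RESTRICTED) != 0)                # True: OR-ing is not a membership test
```

The same two tests carry over to SQL as (read_roles & mask) != 0 for "any of these roles" and (read_roles & mask) = mask for "all of these roles".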

What are disadvantages of using bit flags in database columns in general? I am assuming the answer to this question might be database specific so I am interested in learning about disadvantages with specific databases.

I am using Postgres 9.1 for my application.

UPDATE I understand the point that the database cannot use an index for bit operations, so those queries would need a full table scan, which would hurt performance. I have updated the question to reflect my situation more closely: each row in the database belongs to a specific company, so every query will have a WHERE clause that includes company_fk, which will have an index on it.

UPDATE I only have 6 roles right now, possibly more in the future.

UPDATE roles are not mutually exclusive and they inherit from each other, for example, restricted inherits all the permissions assigned to public.
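
As a rough sketch of how inheritance could interact with the bit values: if the roles form a strict chain (public, then restricted, then confidential, then secret), a user's effective mask is their own bit plus every lower bit. That chain is an assumption on my part, since only the restricted/public relationship is stated above, and the function names are hypothetical.

```python
# Bit values from user_roles. ASSUMPTION: the roles form a strict linear chain
# in which each role inherits every role with a smaller bit value.
ROLE_BITS = {"public": 1, "restricted": 2, "confidential": 4, "secret": 8}


def effective_mask(role_bit):
    """All bits at or below role_bit. Only valid for a single-bit value."""
    return (role_bit << 1) - 1


def can_read(user_role_bit, read_roles):
    """True if the item is flagged for the user's role or any inherited role."""
    return (read_roles & effective_mask(user_role_bit)) != 0


print(effective_mask(ROLE_BITS["restricted"]))   # 3 = public | restricted
print(can_read(ROLE_BITS["restricted"], 1))      # True: item flagged public
print(can_read(ROLE_BITS["restricted"], 4 | 8))  # False: confidential/secret only
```

If the hierarchy is not a simple chain, the effective mask would have to come from a lookup rather than from bit arithmetic.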

asked Sep 04 '12 19:09 by ams



1 Answer

Adding to the previous answers: in SQL Server's implementation, you wouldn't save any space by using a single bitfield integer instead of a set of BIT NOT NULL columns:

The SQL Server Database Engine optimizes storage of bit columns. If there are 8 or less bit columns in a table, the columns are stored as 1 byte. If there are from 9 up to 16 bit columns, the columns are stored as 2 bytes, and so on.

bit at docs.microsoft.com
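
The rule quoted above works out to one byte per group of up to eight bit columns. A small Python sketch of that arithmetic follows. The function name is made up, and the comparison looks only at the column data, ignoring row overhead and the null bitmap.

```python
def bit_columns_bytes(n_columns):
    """Bytes used for n BIT columns under the quoted rule: 1 byte per 8 columns."""
    if n_columns < 0:
        raise ValueError("n_columns must be non-negative")
    return (n_columns + 7) // 8  # ceiling division by 8


BIGINT_BYTES = 8  # size of a SQL Server bigint
INT_BYTES = 4     # size of a SQL Server int

for n in (4, 6, 8, 9, 16, 17):
    print(n, "bit columns ->", bit_columns_bytes(n), "byte(s)")

# The 4 role columns in this answer, and the 6 roles mentioned in the question,
# both fit in a single byte, which is less than one BIGINT bitfield.
print(bit_columns_bytes(4) < BIGINT_BYTES)  # True
```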

As JNK mentioned, partial comparisons on a bitfield integer would not be SARGable, so an index on a bitfield integer would be useless unless comparing the entire value at once.
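
The difference is easy to see in a query plan. The sketch below uses SQLite through Python's sqlite3 module purely because it is self-contained. It is not SQL Server or Postgres, and planners differ in detail, but the principle is the same: equality on the whole value can seek into the index, while a test on one bit has to visit every row (SQLite may still read the index as a narrower copy of the table, but it scans all of it). The table is a cut-down stand-in for news.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE news (pkey INTEGER PRIMARY KEY, title TEXT, read_roles INTEGER)"
)
conn.execute("CREATE INDEX news_read_roles ON news (read_roles)")
conn.executemany(
    "INSERT INTO news (title, read_roles) VALUES (?, ?)",
    [(f"item {mask}", mask) for mask in range(16)],
)


def plan(sql):
    """Return the planner's description lines for a query."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Comparing the entire value: the planner can seek directly in the index.
whole_value = plan("SELECT pkey FROM news WHERE read_roles = 6")

# Testing a single bit: the expression hides the column, so every row is visited.
single_bit = plan("SELECT pkey FROM news WHERE (read_roles & 2) != 0")

print(whole_value)  # a SEARCH step that names news_read_roles
print(single_bit)   # a SCAN step
```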

On-disk indexes in SQL Server are based on sorting, so finding the rows that have one particular bit set would require a separate index for each bit column. If you are only looking for 1s, one way to save space is to make them filtered indexes that store only the rows where the value is 1 (rows with 0 get no index entry at all).

CREATE TABLE news(
    pkey          INT IDENTITY PRIMARY KEY,
    title         VARCHAR(256),
    company_fk    INTEGER REFERENCES compaines(pkey), -- updated since asking the question
    body          VARCHAR(512),
    public_role BIT NOT NULL DEFAULT 0,
    restricted_role BIT NOT NULL DEFAULT 0,
    confidential_role BIT NOT NULL DEFAULT 0,
    secret_role BIT NOT NULL DEFAULT 0
);

CREATE UNIQUE INDEX ByPublicRole ON news(public_role, pkey) WHERE public_role=1;
CREATE UNIQUE INDEX ByRestrictedRole ON news(restricted_role, pkey) WHERE restricted_role=1;
CREATE UNIQUE INDEX ByConfidentialRole ON news(confidential_role, pkey) WHERE confidential_role=1;
CREATE UNIQUE INDEX BySecretRole ON news(secret_role, pkey) WHERE secret_role=1;

select * from news WHERE company_fk=2 AND (restricted_role=1 OR confidential_role=1); -- parentheses keep the company filter on both branches
select * from news WHERE company_fk=2 AND restricted_role=1 AND confidential_role=1;

Both of those queries produce a nice plan with the random test data I generated. (Image: bit plans)
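
For readers who want to try the idea without SQL Server, here is a rough stand-in using SQLite's partial indexes (SQLite 3.8.0 or later) through Python's sqlite3 module. It is not the setup that produced the plans mentioned above. The table is reduced to one role column and the index name is invented, but it shows the planner choosing an index that only contains the rows where the flag is 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE news (
        pkey            INTEGER PRIMARY KEY,
        title           TEXT,
        restricted_role INTEGER NOT NULL DEFAULT 0 CHECK (restricted_role IN (0, 1))
    )
    """
)
# Partial index: only rows with restricted_role = 1 get an index entry.
conn.execute(
    "CREATE INDEX by_restricted_role ON news (restricted_role) "
    "WHERE restricted_role = 1"
)
conn.executemany(
    "INSERT INTO news (title, restricted_role) VALUES (?, ?)",
    [(f"item {i}", 1 if i % 4 == 0 else 0) for i in range(100)],
)

query = "SELECT pkey FROM news WHERE restricted_role = 1"
plan = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]
matches = len(conn.execute(query).fetchall())

print(plan)     # the plan step names by_restricted_role
print(matches)  # 25
```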

As always, indexes should be based on actual query usage and balanced against maintenance cost.

answered Oct 22 '22 20:10 by Chris Smith