I want to improve my database performance. In one project, all tables went from int to bigint, which I think is a bad choice, not only regarding storage, since int requires 4 bytes and bigint requires 8 bytes, but also regarding performance.

So I created a small table with 10 million entries, using a Python script:
import uuid

rows = 10000000
output = 'insert_description_bigint.sql'

f = open(output, 'w')
set_schema = "SET search_path = norma;\n"
f.write(set_schema)
# range(1, rows + 1) so that exactly `rows` inserts are generated
for i in range(1, rows + 1):
    random_string = uuid.uuid4()
    query = "insert into description_bigint (description_id, description) values (%d, '%s');\n"
    f.write(query % (i, random_string))
f.close()
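The generated file can then be loaded from psql, for example with \i (a minimal sketch, assuming the table below has already been created); running ANALYZE afterwards keeps the planner statistics current:

-- inside psql, connected to the target database
\i insert_description_bigint.sql
ANALYZE description_bigint;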
And this is how I created my two tables:
-- BIGINT
DROP TABLE IF EXISTS description_bigint;
CREATE TABLE description_bigint
(
    description_id BIGINT PRIMARY KEY NOT NULL,
    description VARCHAR(200),
    CONSTRAINT description_id_positive CHECK (description_id >= 0)
);

select count(1) from description_bigint;
select * from description_bigint;
select * from description_bigint where description_id = 9999999;

-- INT
DROP TABLE IF EXISTS description_int;
CREATE TABLE description_int
(
    description_id INT PRIMARY KEY NOT NULL,
    description VARCHAR(200),
    CONSTRAINT description_id_positive CHECK (description_id >= 0)
);
After inserting all this data, I ran a query against both tables to measure the difference between them, and to my surprise they both have the same performance:
select * from description_bigint; -- 11m55s
select * from description_int; -- 11m55s
Am I doing something wrong with my benchmark? Shouldn't int be faster than bigint? In particular, since the primary key is by definition an index, creating an index for bigint should be slower than creating an index for int with the same amount of data, right?
I know this is just a small thing and will not make a huge impact on my database's performance by itself, but I want to ensure that we are using best practices and stay focused on performance here.
On a 64-bit system the two tables are stored nearly identically. The column description_id in description_int occupies 8 bytes (4 for the integer value plus 4 bytes of alignment padding). Try this test:
select
pg_relation_size('description_int')/10000000 as table_int,
pg_relation_size('description_bigint')/10000000 as table_bigint,
pg_relation_size('description_int_pkey')/10000000 as index_int,
pg_relation_size('description_bigint_pkey')/10000000 as index_bigint;
The average row size of both tables is virtually the same. This is because the integer column occupies 8 bytes (4 bytes for the value plus 4 bytes of alignment padding), exactly like bigint (8 bytes for the value, no padding). The same applies to index entries. This is a special case, however. If we add one more integer column to the first table:
CREATE TABLE two_integers
(
    description_id INT PRIMARY KEY NOT NULL,
    one_more_int INT,
    description VARCHAR(200),
    CONSTRAINT description_id_positive CHECK (description_id >= 0)
);
the average row size should remain the same, because the first 8 bytes will now hold the two integers (with no padding).
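One way to check that claim (a sketch only; here two_integers is filled with generate_series and md5 strings rather than the Python script, and reltuples is only accurate after VACUUM/ANALYZE) is to populate the new table and compare the average bytes per row:

-- md5() gives a 32-char string, close in size to the 36-char uuid strings used above
insert into two_integers (description_id, one_more_int, description)
select g, g, md5(g::text) from generate_series(1, 1000000) g;
vacuum analyze two_integers;

select relname,
       pg_relation_size(oid) / reltuples as bytes_per_row
from pg_class
where relname in ('description_int', 'description_bigint', 'two_integers')
  and reltuples > 0;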
Find more details in Calculating and saving space in PostgreSQL.
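As a small illustration of the effect that article describes (table and column names here are made up), placing the 8-byte column first avoids the padding that would otherwise follow a leading int:

-- illustrative only: same columns, different order
create table padded (a int, b bigint, c int);   -- 4 bytes of padding after a, before b
create table packed (b bigint, a int, c int);   -- a and c pack into one 8-byte slot

insert into padded select g, g, g from generate_series(1, 100000) g;
insert into packed select g, g, g from generate_series(1, 100000) g;

select pg_relation_size('padded') as padded_bytes,
       pg_relation_size('packed') as packed_bytes;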