Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Postgresql: optimizing columns size for numeric fields

I dont understand how Postgresql (9.2) calculate the column size (in kb), I have this tables:

Table d2:
 Column  |     Type      |
---------+---------------|
 id      | serial        |
 n       | numeric(17,2) |

Table d4:
 Column  |     Type      |
---------+---------------|
 id      | serial        |
 n       | numeric(19,4) |

Table d18:
 Column  |     Type      |
---------+---------------|
 id      | serial        |
 n       | numeric(35,18)|

Table b1:
 Column  |     Type      |
---------+---------------|
 id      | serial        |
 n       | numeric(16,2) |

Table b2:
 Column  |     Type      |
---------+---------------|
 id      | serial        |
 n       | numeric(4,2)  |

And i populate them with this code, in order to have 10000 rows for each table;

$tests = array(2, 4, 18);
foreach($tests AS $n)
{
  $m = number_format(999999999999999.66549865, $n, '.', '');
  $prp_name = "insert_$n";
  $prp = pg_prepare($db, $prp_name, "INSERT INTO d_$n (n) VALUES ($1)");
  for($i = 0; $i < 10000; $i++)
  {
    pg_execute($db, $prp_name, array($m));
  }
}

$prp = pg_prepare($db, 'insert_b1', "INSERT INTO b1 (n) VALUES ($1)");
$m = 16512.67;
for($i = 0; $i < 10000; $i++)
{
  pg_execute($db, 'insert_b1', array($m));
}
$prp = pg_prepare($db, 'insert_b2', "INSERT INTO b2 (n) VALUES ($1)");
$m = 99.36;
for($i = 0; $i < 10000; $i++)
{
  pg_execute($db, 'insert_b2', array($m));
}

Now, what I dont understand is how could be that:

SELECT pg_size_pretty(pg_total_relation_size('d2')) AS size_d2;
 size_d2 
---------
 752 kB

SELECT pg_size_pretty(pg_total_relation_size('d4')) AS size_d4;
 size_d4 
---------
 752 kB

SELECT pg_size_pretty(pg_total_relation_size('d18')) AS size_d18;
 size_d18 
----------
 752 kB

SELECT pg_size_pretty(pg_total_relation_size('b1')) AS size_b1;
 size_b1 
---------
 440 kB

SELECT pg_size_pretty(pg_total_relation_size('b2')) AS size_b2;
 size_b2 
---------
 680 kB

So, the d_* tables has the same size, even if the precision (and the lenght of the data stored) is very different;

Table b1 is smaller than b2 even if has a bigger precision.

All tables has been flushed (vacuum, analize) before the pg_total_relazion_size.

Im not been able to find an answer in Postgresql's datatypes documentations, so I'm gonna ask it here: How does the size in kb grow in relation of the precision of numeric columns?

Im doing this tests in order to decide what precision/scale to use to store monetary types in the database for a CMS, I would like to have only 1 precision/scale value for all item's prices (not for the totals, where the scale must be of 2 decimals).

The more decimals I can store the better is, for the user (so I dont have limits when a customer ask to store 12 decimals for a specific items), but I want to understand how this decision will affect database size and performance.

like image 784
Strae Avatar asked May 06 '13 15:05

Strae


People also ask

What is numeric precision in PostgreSQL?

In this syntax, the precision is the total number of digits and the scale is the number of digits in the fraction part. For example, the number 1234.567 has the precision 7 and scale 3 . The NUMERIC type can hold a value up to 131,072 digits before the decimal point 16,383 digits after the decimal point.

How do I change precision in PostgreSQL?

Try this: ALTER Table account_invoice ALTER COLUMN amount_total TYPE DECIMAL(10,5); DECIMAL(X, Y) -> X represents full length and Y represents precision of the number.


2 Answers

From the manual:

Numeric values are physically stored without any extra leading or trailing zeroes. Thus, the declared precision and scale of a column are maximums, not fixed allocations. (In this sense the numeric type is more akin to varchar(n) than to char(n).) The actual storage requirement is two bytes for each group of four decimal digits, plus three to eight bytes overhead.

The result of the pg_total_relation_size function includes indexes. The correct column size for each of the values you are inserting is:

select pg_column_size(a)
from (values
    (999999999999999.62::numeric(17,2)),
    (999999999999999.6250::numeric(19,4)),
    (999999999999999.625000000000000000::numeric(35,18)),
    (16512.67::numeric(16,2)),
    (99.36::numeric(4,2))
) s(a)
;
 pg_column_size 
----------------
             16
             16
             16
             12
             10

So if you want to let the user to have a maximum of n decimals just define it as numeric(35, n). It will only use the space up to the number of existing decimals as trailing zeroes are not stored.

like image 163
Clodoaldo Neto Avatar answered Sep 28 '22 00:09

Clodoaldo Neto


According to Postgresql doc, integer has 4 bytes, bigint - 8 bytes. The size of

NUMERIC (p, s)

11+(p/2) bytes, where p = precision, s = scale

Storage determined by size of numeric type, no padding. For Example:

SELECT pg_column_size('123'::numeric(21,7)); --8
SELECT pg_column_size('123.1'::numeric(21,7)); --10
SELECT pg_column_size('123.12'::numeric(21,7)); --10
SELECT pg_column_size('123.123'::numeric(21,7)); --10
SELECT pg_column_size('123.1234'::numeric(21,7)); --10
SELECT pg_column_size('123.12345'::numeric(21,7)); --12
SELECT pg_column_size('123.123456'::numeric(21,7)); --12
SELECT pg_column_size('123.1234567'::numeric(21,7)); --12 
like image 45
shcherbak Avatar answered Sep 28 '22 00:09

shcherbak