Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Postgres 9.6 parallel XPath

Tags:

I've set up Postgres 9.6 and checked on a large table of random integers that parallel queries are working. However, a simple XPath query on an XML column of another table is always sequential. Both XPath functions are marked as parallel safe in Postgres. I tried to alter XPath cost, so the expected cost skyrocketed, but it didn't change anything. What am I missing?

Example table DDL: CREATE TABLE "test_table" ("xml" XML );

Example query: SELECT xpath('/a', "xml") FROM "test_table";

Example data: <a></a>. Note that real data contains XMLs that are 10-1000kB in size.

> select pg_size_pretty(pg_total_relation_size('test_table'));
28 MB

> explain (analyze, verbose, buffers) select xpath('/a', "xml") from test_table;
Seq Scan on public.test_table  (cost=0.00..64042.60 rows=2560 width=32) (actual time=1.420..4527.061 rows=2560 loops=1)
  Output: xpath('/a'::text, xml, '{}'::text[])
  Buffers: shared hit=10588
Planning time: 0.058 ms
Execution time: 4529.503 ms
like image 561
Eugene Pakhomov Avatar asked Jun 07 '17 12:06

Eugene Pakhomov


1 Answers

The relevant point here is likely the distinction between "relation size" and "total relation size":

CREATE TABLE test_table AS
  SELECT ('<a>' || repeat('x', 1000000) || '</a>')::xml AS "xml"
  FROM generate_series(1, 2560);

SELECT
  pg_size_pretty(pg_relation_size('test_table')) AS relation_size,
  pg_size_pretty(pg_total_relation_size('test_table')) AS total_relation_size;

 relation_size | total_relation_size
---------------+---------------------
 136 kB        | 30 MB

Large column values like these are not stored within the main relation, but instead are pushed to its associated TOAST table. This external storage does not count towards pg_relation_size(), which is what the optimiser appears to be comparing against min_parallel_relation_size when evaluating a parallel plan:

SET parallel_setup_cost = 0;
SET parallel_tuple_cost = 0;
SET min_parallel_relation_size = '144kB';
EXPLAIN SELECT xpath('/a', "xml") FROM test_table;

                          QUERY PLAN
---------------------------------------------------------------
 Seq Scan on test_table  (cost=0.00..49.00 rows=2560 width=32)
SET min_parallel_relation_size = '136kB';
EXPLAIN SELECT xpath('/a', "xml") FROM test_table;

     QUERY PLAN
------------------------------------------------------------------------------
 Gather  (cost=0.00..38.46 rows=2560 width=32)
   Workers Planned: 1
   ->  Parallel Seq Scan on test_table  (cost=0.00..35.82 rows=1506 width=32)
like image 144
Nick Barnes Avatar answered Sep 29 '22 23:09

Nick Barnes