I'm running into this error when attempting to create a btree index on an XML data type column that uses an xpath expression on AuroraDB - PostgreSQL 9.6:
ERROR: could not identify a comparison function for type xml
SQL state: 42883
This 2009 thread without a clear resolution is the only one I've found discussing this error message in regards to creating an xpath based index for a much earlier version of PostgreSQL: https://www.postgresql-archive.org/Slow-select-times-on-select-with-xpath-td2074839.html
In my case I do also need to specify namespaces as well and the original poster in that thread cast the result of the xpath expression to text[] which does get by the error for me too - but why is that even needed? I also don't see PostgreSQL ever using my index even when I have thousands of rows to go through.
So I tried out a simpler case and the error still occurs - please shed some light as to why if you could:
CREATE TABLE test
(
id integer NOT NULL,
xml_data xml NOT NULL,
CONSTRAINT test_pkey PRIMARY KEY (id)
)
WITH (
OIDS = FALSE
)
TABLESPACE pg_default;
CREATE INDEX test_idx
ON test USING btree
(xpath('/book/title', xml_data))
and the resulting message is:
ERROR: could not identify a comparison function for type xml
SQL state: 42883
The database encoding is UTF8. The Collation and Character Type are en_US.UTF-8.
Some sample insert statements too:
insert into source_data.test(id, xml_data)
values(1, XMLPARSE (DOCUMENT '<?xml version="1.0"?><book><title>Manual</title><chapter>1</chapter><chapter>2</chapter></book>'))
insert into source_data.test(id, xml_data)
values(2, XMLPARSE (DOCUMENT '<?xml version="1.0"?><book><title>Apropos</title><chapter>1</chapter><chapter>2</chapter></book>'))
Your getting this error because the XML data type does not provide any comparison operators, hence you can't create an index on the result of xpath()
, because it returns an array of XML values.
Therefore you need to cast the XPath expression to a text array when creating the index:
CREATE INDEX test_idx
ON test USING BTREE
(cast(xpath('/book/title', xml_data) as text[])) ;
This index then gets used when querying the table:
EXPLAIN ANALYZE
SELECT * FROM test where
cast(xpath('/book/title', xml_data) as text[]) = '{<title>Apropos</title>}';
gives
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
Index Scan using test_idx on test (cost=0.13..8.15 rows=1 width=36) (actual time=0.034..0.038 rows=1 loops=1)
Index Cond: ((xpath('/book/title'::text, xml_data, '{}'::text[]))::text[] = '{<title>Apropos</title>}'::text[])
Planning time: 0.168 ms
Execution time: 0.073 ms (4 rows)
This works the same when using text()
:
CREATE INDEX test_idx
ON test USING BTREE
(cast(xpath('/book/title/text()', xml_data) as text[])) ;
explain analyze select * from test where
cast(xpath('/book/title/text()', xml_data) as text[]) = '{Apropos}';
gives
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Index Scan using test_idx on test (cost=0.13..8.15 rows=1 width=36) (actual time=0.034..0.038 rows=1 loops=1)
Index Cond: ((xpath('/book/title/text()'::text, xml_data, '{}'::text[]))::text[] = '{Apropos}'::text[])
Planning time: 0.166 ms
Execution time: 0.076 ms
(4 rows)
Note that I forced the use of the index via the following command, as I only had 4 rows in the test table I created.
SET enable_seqscan TO off;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With