Extracting value of xml tag in PostgreSQL

Below is a value from the response column of my Postgres table. I want to extract the status from every row. The status can vary in length (for example, SUCCESS), so I do not want to use the substring function. Is there a way to do this?

<?xml version="1.0" ?><response><status>ERROR_MISSING_DATA</status><responseType>COUNTRY_MISSING</responseType><country_info>USA</country_info><phone_country_code>1234</phone_country_code></response>

My table structure looks like this:

   Column    |            Type             |                        Modifiers                         
-------------+-----------------------------+----------------------------------------------------------
 id          | bigint                      | not null default nextval('events_id_seq'::regclass)
 hostname    | text                        | not null
 time        | timestamp without time zone | not null
 trn_type    | text                        | 
 db_ret_code | text                        | 
 request     | text                        | 
 response    | text                        | 
 wait_time   | text                        | 

I want to extract the status from each row. How do I do this?

Below is a sample row. Assume the table is named abc_events.

id          | 1870667
hostname    | abcd.local
time        | 2013-04-16 00:00:23.861
trn_type    | A
request     | <?xml version="1.0" ?><response><status>ERROR_MISSING_DATA</status><responseType>COUNTRY_MISSING</responseType><country_info>USA</country_info><phone_country_code>1234</phone_country_code></response>
response    | <?xml version="1.0" ?><response><status>ERROR_MISSING_DATA</status><responseType>COUNTRY_MISSING</responseType><country_info>USA</country_info><phone_country_code>1234</phone_country_code></response>
ronak asked Apr 15 '13 19:04


1 Answer

Use the xpath() function:

WITH x(col) AS (SELECT '<?xml version="1.0" ?><response><status>ERROR_MISSING_DATA</status></response>'::xml)
SELECT xpath('/response/status/text()', col) AS status
FROM   x;

/text() strips the surrounding <status> tag. The absolute path /response/status does not depend on how relative paths are resolved, which changed in PostgreSQL 11.
Returns an array of xml, with a single element in this case:

status
xml[]
-------
{ERROR_MISSING_DATA}

Applied to your table

In response to your question update, this can simply be:

SELECT id, xpath('/response/status/text()', response::xml) AS status
FROM   abc_events;

If you are certain there is only a single status tag per row, you can simply extract the first item from the array:

SELECT id, (xpath('/response/status/text()', response::xml))[1] AS status
FROM   abc_events;
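
The extracted item is still of type xml. If you want plain text for comparisons or grouping, you can cast the array element. A minimal sketch, assuming the table is named abc_events as in the question:

```sql
-- Assumption: table abc_events with a text column "response", as described in the question.
-- The subscript [1] picks the first match; ::text converts the xml value to text.
SELECT id,
       (xpath('/response/status/text()', response::xml))[1]::text AS status
FROM   abc_events;
```

Rows without a status element yield NULL, because subscripting an empty array returns NULL.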

If there can be multiple status items:

SELECT id, unnest(xpath('/response/status/text()', response::xml)) AS status
FROM   abc_events;

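This returns one row per status element, so 1-n rows per id. Rows whose response contains no status element are dropped from the result.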
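
To see the effect without touching your table, here is a self-contained sketch with made-up values (the ids and the status strings A, B, C are purely illustrative):

```sql
-- Hypothetical sample data: id 1 carries two status elements, id 2 carries one.
WITH x(id, col) AS (
   VALUES (1, '<response><status>A</status><status>B</status></response>'::xml)
        , (2, '<response><status>C</status></response>'::xml)
   )
-- unnest() expands the xml[] returned by xpath() into one row per element.
SELECT id, unnest(xpath('/response/status/text()', col)) AS status
FROM   x;
```

This should produce three rows: two for id 1 and one for id 2.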

Cast to xml

Since you defined your columns as type text (instead of xml), you need to cast to xml explicitly. The function xpath() expects its second parameter to be of type xml. An untyped string constant is coerced to xml automatically, but a text column is not.

This works without explicit cast:

SELECT xpath('/response/status/text()'
           , '<?xml version="1.0" ?><response><status>SUCCESS</status></response>');

A CTE like in my first example needs a type for every column in the "common table expression". If I had not cast to a specific type, the type unknown would have been used, which is not the same thing as an untyped string. There is no direct conversion implemented between unknown and xml. You'd have to cast to text first: unknown_type_col::text::xml. Better to cast to ::xml right away.

This was tightened in PostgreSQL 9.1 (I think). Older versions were more permissive.

Either way, with any of these methods the string has to be well-formed XML, or the cast (implicit or explicit) will raise an exception.
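
If some rows might hold broken XML, one way to avoid the exception is to test the string before casting. A sketch, assuming PostgreSQL 9.1 or later (where xml_is_well_formed_document() is available) and the table name abc_events from the question:

```sql
-- Assumption: table abc_events with a text column "response".
-- CASE only evaluates the cast for rows that pass the well-formedness check;
-- all other rows (including NULL responses) get a NULL status.
SELECT id,
       CASE WHEN xml_is_well_formed_document(response)
            THEN (xpath('/response/status/text()', response::xml))[1]::text
       END AS status
FROM   abc_events;
```

This trades an error for silent NULLs, so you may want to count or log the rows that fail the check separately.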

Erwin Brandstetter answered Sep 21 '22 15:09