Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using xpath to extract data from an XML column in postgres

I've made the following table:

create table temp.promotions_xml(id serial promotion_xml xml);

I've inserted the following data into temp.promotions:

<promotions xmlns="http://www.demandware.com/xml/impex/promotion/2008-01-31">
    <campaign campaign-id="2013-1st-semester-jet-giveaways">
        <description>2013 1st Semester Jet Giveaways</description>
        <enabled-flag>true</enabled-flag>
        <start-date>2013-01-01T05:00:00.000Z</start-date>
        <end-date>2013-07-01T04:00:00.000Z</end-date>
        <customer-groups>
            <customer-group group-id="Everyone"/>
        </customer-groups>
    </campaign>
</promotions>

The data is in the table.

I can't figure out how to get it out. I probably will want to be able to populate a relational model that I will build, so I want to get rid of all the tag's.

Below are a few queries that I've tried that don't work. I pretty sure that I am just dancing around the correct syntax. These queries return rows of empty sets.

FWIW, we are using Postgres 9.0.4.

Thanks, --sw

select xpath('/promotions/campaign/description/text()',promotion_xml) textcol from temp.promotions_xml
select xpath('./promotions/campaign/description/text()',promotion_xml) textcol from temp.promotions_xml
select xpath('promotions/campaign/description/text()',promotion_xml) textcol from temp.promotions_xml
select xpath('///description/text()',promotion_xml) textcol from temp.promotions_xml
select xpath('//description/text()',promotion_xml) textcol from temp.promotions_xml
select xpath('.//description/text()',promotion_xml) textcol from temp.promotions_xml
select xpath('./campaign/description/text()',promotion_xml) textcol from temp.promotions_xml
select xpath('//campaign/description/text()',promotion_xml) textcol from temp.promotions_xml
like image 622
Scott Wood Avatar asked Jul 23 '13 00:07

Scott Wood


1 Answers

This works:

WITH tbl(p_xml) AS (  -- CTE just to provide test table with xml value
   SELECT '<promotions xmlns="http://www.demandware.com/xml/impex/promotion/2008-01-31">
              <campaign campaign-id="2013-1st-semester-jet-giveaways">
                 <description>2013 1st Semester Jet Giveaways</description>
                 <enabled-flag>true</enabled-flag>
                 <start-date>2013-01-01T05:00:00.000Z</start-date>
                 <end-date>2013-07-01T04:00:00.000Z</end-date>
                 <customer-groups>
                    <customer-group group-id="Everyone"/>
                 </customer-groups>
              </campaign>
           </promotions>'::xml
    )  -- end of CTE, the rest is the solution
SELECT xpath('/n:promotions/n:campaign/n:description/text()', p_xml
           , '{{n,http://www.demandware.com/xml/impex/promotion/2008-01-31}}')
FROM   tbl;

Returns:

{"2013 1st Semester Jet Giveaways"}

Note how I assign the namespace alias n for your namespace in the third argument of xpath() and use it at every level of the xpath.

If you remove the XML namespace from the document, everything becomes much simpler:

WITH tbl(p_xml) AS (  -- not the missing namespace below
   SELECT '<promotions>
              <campaign campaign-id="2013-1st-semester-jet-giveaways">
                 <description>2013 1st Semester Jet Giveaways</description>
                 <enabled-flag>true</enabled-flag>
                 <start-date>2013-01-01T05:00:00.000Z</start-date>
                 <end-date>2013-07-01T04:00:00.000Z</end-date>
                 <customer-groups>
                    <customer-group group-id="Everyone"/>
                 </customer-groups>
              </campaign>
           </promotions>'::xml
   )
SELECT xpath('/promotions/campaign/description/text()', p_xml)
FROM   tbl;

<rant> Is it just me or is everybody happy about json and jsonb, so we don't have to deal with XML.</rant>

like image 151
Erwin Brandstetter Avatar answered Oct 20 '22 06:10

Erwin Brandstetter