
How to remove certain attributes from XML using XSLT

Tags: regex, xml, xslt

I have an XML document returned to me via a web service.

<Kronos_WFC encoding="ASCII" version="1.0" WFCVersion="6.1" TimeStamp="01/5/2011 8:38AM">
  <Response Status="Success" Timeout="1800" PersonKey="-1" Object="System" Username="1" Action="Logon" PersonNumber="1">
  </Response>
  <Response Status="Success" action="Load">
      <ScheduleGroup ScheduleGroupName="SomeName" AllowsInheritance="false" AllowContract="false" IsEmploymentTerm="false" />
      <ScheduleGroup ScheduleGroupName="GreatName" AllowsInheritance="true" AllowContract="false" IsEmploymentTerm="false" />
      <ScheduleGroup ScheduleGroupName="BestName" AllowsInheritance="true" AllowContract="false" IsEmploymentTerm="false" />
  </Response>
  <Response Status="Success" Object="System" Action="Logoff">
  </Response>
</Kronos_WFC>

The problem is that I turn the results into business objects generated (with xsd2code) from the XSD schema for this product, and the schema defines nothing for the following attributes of Response:

  • Timeout
  • PersonKey
  • Object
  • Username

I would like to do the following:

  • Remove the above-mentioned attributes
  • Turn all other attributes into elements, on every element at every depth (the children, the children's children, and so on)

How do I do this using XSLT? Would it be simpler to remove the unwanted attributes using Regex?

Jeremy asked May 30 '11 02:05



1 Answer

Would it be simpler to remove the unwanted attributes using Regex?

No, this is a very simple XSLT operation:

This transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

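 <!-- Identity rule: copy every node and attribute unchanged -->
 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <!-- Delete the unwanted Response attributes: the empty body produces no output -->
 <xsl:template match=
 "Response/@*[contains('|Timeout|PersonKey|Object|Username|',
                       concat('|',name(),'|')
                       )
             ]"/>

 <!-- Turn every remaining attribute into an element of the same name.
      The explicit priority ranks this rule above the identity rule (-0.5)
      and below the delete rule (0.5), so no two matching rules tie. -->
 <xsl:template match="@*" priority="0">
  <xsl:element name="{name()}">
   <xsl:value-of select="."/>
  </xsl:element>
 </xsl:template>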
</xsl:stylesheet>

when applied to the provided XML document:

<Kronos_WFC encoding="ASCII" version="1.0"
WFCVersion="6.1" TimeStamp="01/5/2011 8:38AM">
    <Response Status="Success" Timeout="1800" PersonKey="-1"
    Object="System" Username="1" Action="Logon"
    PersonNumber="1"></Response>
    <Response Status="Success" action="Load">
        <ScheduleGroup ScheduleGroupName="SomeName"
        AllowsInheritance="false" AllowContract="false"
        IsEmploymentTerm="false" />
        <ScheduleGroup ScheduleGroupName="GreatName"
        AllowsInheritance="true" AllowContract="false"
        IsEmploymentTerm="false" />
        <ScheduleGroup ScheduleGroupName="BestName"
        AllowsInheritance="true" AllowContract="false"
        IsEmploymentTerm="false" />
    </Response>
    <Response Status="Success" Object="System"
    Action="Logoff"></Response>
</Kronos_WFC>

produces exactly the wanted, correct result:

<Kronos_WFC>
   <encoding>ASCII</encoding>
   <version>1.0</version>
   <WFCVersion>6.1</WFCVersion>
   <TimeStamp>01/5/2011 8:38AM</TimeStamp>
   <Response>
      <Status>Success</Status>
      <Action>Logon</Action>
      <PersonNumber>1</PersonNumber>
   </Response>
   <Response>
      <Status>Success</Status>
      <action>Load</action>
      <ScheduleGroup>
         <ScheduleGroupName>SomeName</ScheduleGroupName>
         <AllowsInheritance>false</AllowsInheritance>
         <AllowContract>false</AllowContract>
         <IsEmploymentTerm>false</IsEmploymentTerm>
      </ScheduleGroup>
      <ScheduleGroup>
         <ScheduleGroupName>GreatName</ScheduleGroupName>
         <AllowsInheritance>true</AllowsInheritance>
         <AllowContract>false</AllowContract>
         <IsEmploymentTerm>false</IsEmploymentTerm>
      </ScheduleGroup>
      <ScheduleGroup>
         <ScheduleGroupName>BestName</ScheduleGroupName>
         <AllowsInheritance>true</AllowsInheritance>
         <AllowContract>false</AllowContract>
         <IsEmploymentTerm>false</IsEmploymentTerm>
      </ScheduleGroup>
   </Response>
   <Response>
      <Status>Success</Status>
      <Action>Logoff</Action>
   </Response>
</Kronos_WFC>

Explanation:

  1. The identity rule/template copies every node "as-is".

  2. The template that overrides the identity rule for any attribute of a Response element named Timeout, PersonKey, Object, or Username has an empty body, so these attributes are not copied (they are "deleted" from the output).

  3. The template matching any attribute creates an element whose name is the name of the matched attribute and whose text node-child is the value of the matched attribute.
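
A note on the test used in step 2: both the list of names and the result of name() are wrapped in '|' delimiters, so only whole names match. Without the delimiters, an attribute with a hypothetical name such as Time would pass the contains() test because it is a substring of Timeout. If the list of attributes is short and fixed, the same "delete" template can be written by naming each attribute explicitly. This is only a sketch of an equivalent form, not part of the transformation above:

```xml
<!-- Sketch: an equivalent "delete" template that lists each unwanted
     attribute by name instead of looking it up in a delimited string.
     Each alternative is a two-step pattern, so it still takes precedence
     over the generic @* template. -->
<xsl:template match="Response/@Timeout | Response/@PersonKey |
                     Response/@Object  | Response/@Username"/>
```

The union form is easier to read for a handful of names. The delimited-string form scales better when the list is long, and the list could be moved into a stylesheet parameter.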

Remember: Using and overriding the identity rule is the most fundamental and powerful XSLT design pattern.

Dimitre Novatchev answered Oct 12 '22 23:10
