How can I remove duplicate nodes in XQuery?

Question

I have an XML document I generate on the fly, and I need a function to eliminate any duplicate nodes from it.

My function looks like:

declare function local:start2() {
    let $data := local:scan_books()
    return <books>{$data}</books>
};

Sample output is:

<books>
  <book>
    <title>XML in 24 hours</title>
    <author>Some Guy</author>  
  </book>
  <book>
    <title>XML in 24 hours</title>
    <author>Some Guy</author>  
  </book>
</books>

I want just one entry under my books root element. Other elements, such as pamphlet, also appear there and need their duplicates removed too. Any ideas?

Update following comments: by removing duplicates, I mean removing multiple occurrences of nodes that have exactly the same content and structure.

Dimitre Novatchev · Accepted Answer

A simpler and more direct solution is a one-line XPath 2.0 expression, which can be used as-is in XQuery:

  /*/book
        [index-of(/*/book/title, 
                  title
                 )
                  [1]
        ]

When applied, for example, on the following XML document:

<books>
    <book>
        <title>XML in 24 hours</title>
        <author>Some Guy</author>
    </book>
    <book>
        <title>Food in Seattle</title>
        <author>Some Guy2</author>
    </book>
    <book>
        <title>XML in 24 hours</title>
        <author>Some Guy</author>
    </book>
    <book>
        <title>Food in Seattle</title>
        <author>Some Guy2</author>
    </book>
    <book>
        <title>How to solve XPAth Problems</title>
        <author>Me</author>
    </book>
</books>

the above XPath expression correctly selects the following nodes:

<book>
    <title>XML in 24 hours</title>
    <author>Some Guy</author>
</book>
<book>
    <title>Food in Seattle</title>
    <author>Some Guy2</author>
</book>
<book>
    <title>How to solve XPAth Problems</title>
    <author>Me</author>
</book>

The explanation is simple: for every book, select only one of its occurrences, namely the one whose position among all books is the same as the first index of its title among all titles.
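
The expression above compares titles only, so two books that share a title but differ elsewhere would be treated as duplicates, and other elements such as pamphlet are not covered. The question asks for nodes with the same content and structure, so here is a minimal sketch of a whole-node comparison, assuming the standard deep-equal() function matches that requirement. The function name local:distinct-deep and the inline sample data are hypothetical, introduced for illustration only, and are not part of the original answer:

```xquery
xquery version "1.0";

(: Hypothetical helper: keeps a node only if no earlier node in the
   sequence is deep-equal to it, so the first of each group survives. :)
declare function local:distinct-deep($nodes as node()*) as node()* {
  for $n at $i in $nodes
  where empty(subsequence($nodes, 1, $i - 1)[deep-equal(., $n)])
  return $n
};

(: Assumed sample data standing in for the result of local:scan_books() :)
let $data := (
  <book><title>XML in 24 hours</title><author>Some Guy</author></book>,
  <book><title>XML in 24 hours</title><author>Some Guy</author></book>,
  <pamphlet><title>XML in 24 hours</title></pamphlet>
)
return <books>{ local:distinct-deep($data) }</books>
```

With this data the result should contain one book and one pamphlet. In the function from the question, the same helper could be applied as <books>{ local:distinct-deep($data) }</books>. Two caveats: every node is compared against all earlier ones, so the cost grows quadratically with the number of nodes, and deep-equal() treats whitespace-only text nodes as content, so duplicates that differ only in indentation may not be recognised as equal.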

Travis Webb · Answer

You can use the built-in distinct-values() function...
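
The rest of this answer is not shown, so the following is only one plausible way distinct-values() might be applied here, not necessarily what the answerer had in mind. distinct-values() returns atomic values rather than nodes, so this sketch takes the distinct titles and then picks the first book for each one. The $books variable and its inline data are assumptions made for illustration:

```xquery
xquery version "1.0";

(: Assumed sample input; in the question this would come from local:scan_books() :)
let $books :=
  <books>
    <book><title>XML in 24 hours</title><author>Some Guy</author></book>
    <book><title>Food in Seattle</title><author>Some Guy2</author></book>
    <book><title>XML in 24 hours</title><author>Some Guy</author></book>
  </books>
return
  <books>{
    (: distinct-values() atomizes each title element to its string value :)
    for $t in distinct-values($books/book/title)
    (: keep only the first book carrying that title :)
    return ($books/book[title = $t])[1]
  }</books>
```

Like the accepted answer, this treats two books as duplicates when their titles match, regardless of their other content. The order in which distinct-values() returns its values is implementation-dependent, so the books may not come back in document order.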

Tags:

duplicates

xquery

brabster
