Get the most repeated element in a sequence with XQuery

I've got a sequence of values. They can all be equal... or not. So with XQuery I want to get the most frequent item in the sequence.

let $counter := 0, $index1 := 0 
for $value in $sequence 
if (count(index-of($value, $sequence))) 
then 
{ 
$counter := count(index-of($value, $sequence)) $index1 := index-of($value) 
} else {} 

I can't make this work, so I suppose I'm doing something wrong.

Thanks in advance for any help you could give me.

asked Jun 24 '10 15:06 by deb


2 Answers

Use:

  for $maxFreq in 
           max(for $val in distinct-values($sequence)
                     return count(index-of($sequence, $val))
               )
   return
      distinct-values($sequence)[count(index-of($sequence, .)) eq $maxFreq]
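
To try the expression above, it helps to bind $sequence to some sample data. The following is only a sketch: the integer values are made up for illustration and include a deliberate tie.

```xquery
(: Illustrative data: 2 and 3 both occur twice, 1 occurs once. :)
let $sequence := (1, 2, 2, 3, 3)
for $maxFreq in
         max(for $val in distinct-values($sequence)
             return count(index-of($sequence, $val)))
return
   (: keep every distinct value whose occurrence count equals the maximum :)
   distinct-values($sequence)[count(index-of($sequence, .)) eq $maxFreq]
```

With this data the result contains both 2 and 3, since they share the highest count. The order in which distinct-values() returns them is implementation-dependent.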

Update, Dec. 2015:

This is notably shorter, though it may not be very efficient:

$pSeq[index-of($pSeq,.)[max(for $item in $pSeq return count(index-of($pSeq,$item)))]]

The shortest expression can be constructed for XPath 3.1. Here it is in copyable form, with a one-character variable name to make it shorter still:

$s[index-of($s,.)[max($s ! count(index-of($s, .)))]]
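
The short forms rely on numeric predicates, which can be hard to read at first. As a rough sketch, here is the same logic unrolled into a FLWOR expression; the sample data and the names $max, $pos and $nth are introduced only for illustration.

```xquery
let $s := ("a", "b", "a", "c", "b")
(: highest number of occurrences of any item; 2 for this data :)
let $max := max(for $item in $s return count(index-of($s, $item)))
for $item at $pos in $s
(: position of the $max-th occurrence of $item; empty if it occurs fewer times :)
let $nth := index-of($s, $item)[$max]
(: a numeric predicate keeps the item whose own position equals that number :)
where $nth = $pos
return $item
```

For this data the result is ("a", "b"): each most frequent value is returned once, at the position of its last occurrence. Less frequent items produce an empty $nth and are dropped.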
answered Oct 17 '22 21:10 by Dimitre Novatchev


You are approaching this problem from too much of an imperative standpoint.

In XQuery you can set the values of variables, but you can never change them.
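
As a small illustration of that point (the sample sequence is made up), a counter is not incremented step by step; the finished value is computed in one expression and bound once:

```xquery
let $sequence := ("a", "b", "a", "c", "a", "b")
(: There is no "$counter := $counter + 1"; the name is bound once to the final result. :)
let $counter := count(index-of($sequence, "a"))
return $counter
```

This evaluates to 3, because "a" occurs at positions 1, 3 and 5.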

The correct way to do iterative-type algorithms is with a recursive function:

declare function local:most($sequence, $index, $value, $count)
{
  (: $value and $count carry the best candidate found so far :)
  let $current := $sequence[$index]
  return
    if (empty($current))
    then $value
    else
      (: index-of takes the sequence first, then the item to search for :)
      let $current-count := count(index-of($sequence, $current))
      return
        if ($current-count > $count)
        then local:most($sequence, $index + 1, $current, $current-count)
        else local:most($sequence, $index + 1, $value, $count)
};
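
To show how such a function might be called, here is a standalone sketch. It repeats the declaration so the query runs on its own. The type annotations, the sample data and the starting arguments (position 1, no value yet, count 0) are my assumptions rather than part of the original function.

```xquery
xquery version "1.0";

(: Same recursion as above; the type annotations are added for illustration only. :)
declare function local:most(
  $sequence as xs:anyAtomicType*,
  $index as xs:integer,
  $value as xs:anyAtomicType?,
  $count as xs:integer
) as xs:anyAtomicType?
{
  let $current := $sequence[$index]
  return
    if (empty($current))
    then $value  (: walked past the end: return the best value seen :)
    else
      let $current-count := count(index-of($sequence, $current))
      return
        if ($current-count > $count)
        then local:most($sequence, $index + 1, $current, $current-count)
        else local:most($sequence, $index + 1, $value, $count)
};

(: Assumed starting state: first position, no candidate yet, count 0. :)
local:most(("a", "b", "a", "c", "a", "b"), 1, (), 0)
```

For this data the call returns "a", which occurs three times. Because the comparison is a strict `>`, a tie resolves to the first value that reaches the maximum, and each step rescans the whole sequence with index-of, so the cost grows quadratically with the sequence length.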

But a better approach is to describe the problem in a non-iterative way. In this case: of all the distinct values in your sequence, you want the one that appears the maximum number of times.

The previous sentence translated into XQuery is:

let $max-count := max(for $value1 in distinct-values($sequence)
                      return count(index-of($sequence, $value1)))
for $value2 in distinct-values($sequence)
where count(index-of($sequence, $value2)) = $max-count
return $value2
answered Oct 17 '22 22:10 by Oliver Hallam