
xpath query returning wrong values when using greater and less than?

Tags: xml, xpath

This query is returning values less than 1000. It should only be returning values between 1000 and 1100. Why is that?

//results/Building[ 1 = 1 and (( Vacancy/sqft > 1000 ) and ( Vacancy/sqft < 1100 ) ) ]

The query will return the following building, which has vacancies less than 1000 square feet and greater than 1100 square feet:

<Building>
  <Vacancy><sqft>900</sqft></Vacancy>
  <Vacancy><sqft>1000</sqft></Vacancy>
  <Vacancy><sqft>2000</sqft></Vacancy>
  <Vacancy><sqft>500</sqft></Vacancy>
</Building>

Why is it included in the results?

Sample data:

<results>
  <Building><!--Shouldn't be selected.--></Building>

  <Building><!--Should be selected-->
    <Vacancy><sqft>1050</sqft></Vacancy>
  </Building>

  <Building><!--Should be selected-->
    <Vacancy><sqft>1025</sqft></Vacancy>
    <Vacancy><sqft>1075</sqft></Vacancy>
  </Building>

  <Building><!--Shouldn't be selected-->
    <Vacancy><sqft>10</sqft></Vacancy>
    <Vacancy><sqft>50</sqft></Vacancy>
  </Building>

  <Building><!--Should be selected.-->
    <Vacancy><sqft>1050</sqft></Vacancy>
    <Vacancy><sqft>2000</sqft></Vacancy>
  </Building>

  <Building><!--Should be selected.-->
    <Vacancy><sqft>900</sqft></Vacancy>
    <Vacancy><sqft>1040</sqft></Vacancy>
  </Building>

  <Building><!--Shouldn't be selected-->
    <Vacancy><sqft>10500</sqft></Vacancy>
  </Building>

  <Building><!--Shouldn't be selected-->
    <Vacancy><sqft>900</sqft></Vacancy>
    <Vacancy><sqft>1000</sqft></Vacancy>
    <Vacancy><sqft>2000</sqft></Vacancy>
    <Vacancy><sqft>500</sqft></Vacancy>
  </Building>

</results>

Thanks.

live-love asked Nov 30 '22 09:11


1 Answer

The sample Building has a Vacancy child with sqft of 2000, so Vacancy/sqft > 1000 succeeds. It has a child with sqft of 1000 (and 900 and 500), so Vacancy/sqft < 1100 succeeds. Thus the xpath selects the Building.

The comparison expressions (such as Vacancy/sqft > 1000) are implicitly quantified with "there exists", as in "there exists a Vacancy child that has a sqft child with value > 1000", because Vacancy/sqft is a set of nodes rather than a single node. Moreover, each comparison has its own quantifier, so the sqft in Vacancy/sqft > 1000 doesn't need to be the same sqft as in Vacancy/sqft < 1100. Note that //results/Building is also a node set, but the predicate [...] is applied separately to each item in the set, which is why there isn't an issue with quantifiers there. Translating your original xpath into English, we get:

Select the buildings (in the results) such that 1=1 and there exists a vacancy square footage > 1000 and there exists a vacancy square footage < 1100.
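To make the "there exists" reading concrete, here is a rough Python sketch that mimics the original predicate with two independent any() calls. It is only an emulation: the standard library's ElementTree does not evaluate numeric comparisons in XPath, so the helper names (sqft_values, matches_original) are made up for illustration, and the XML is a condensed copy of the sample data from the question.

```python
import xml.etree.ElementTree as ET

# Condensed copy of the question's sample data (comments dropped).
XML = """
<results>
  <Building/>
  <Building><Vacancy><sqft>1050</sqft></Vacancy></Building>
  <Building><Vacancy><sqft>1025</sqft></Vacancy><Vacancy><sqft>1075</sqft></Vacancy></Building>
  <Building><Vacancy><sqft>10</sqft></Vacancy><Vacancy><sqft>50</sqft></Vacancy></Building>
  <Building><Vacancy><sqft>1050</sqft></Vacancy><Vacancy><sqft>2000</sqft></Vacancy></Building>
  <Building><Vacancy><sqft>900</sqft></Vacancy><Vacancy><sqft>1040</sqft></Vacancy></Building>
  <Building><Vacancy><sqft>10500</sqft></Vacancy></Building>
  <Building><Vacancy><sqft>900</sqft></Vacancy><Vacancy><sqft>1000</sqft></Vacancy><Vacancy><sqft>2000</sqft></Vacancy><Vacancy><sqft>500</sqft></Vacancy></Building>
</results>
"""

def sqft_values(building):
    """All sqft values under a Building, i.e. the node set Vacancy/sqft."""
    return [float(s.text) for s in building.findall("Vacancy/sqft")]

def matches_original(building):
    """Emulates ( Vacancy/sqft > 1000 ) and ( Vacancy/sqft < 1100 ).

    Each comparison is its own "there exists" test, so the two any()
    calls may be satisfied by different sqft nodes.
    """
    values = sqft_values(building)
    return any(v > 1000 for v in values) and any(v < 1100 for v in values)

buildings = ET.fromstring(XML).findall("Building")
# 0-based positions of the selected buildings within <results>.
selected = [i for i, b in enumerate(buildings) if matches_original(b)]
print(selected)
```

The last building (position 7) ends up in the list because 2000 satisfies the first any() and 900 satisfies the second, which is exactly the behaviour the question describes.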

Let's take the English statement of the desired query and make it a little closer to a statement of logic, arriving at one of:

Select the buildings (in the results) such that there exists a vacancy with square footage such that it's > 1000 and it's < 1100

Select the buildings (in the results) such that there exists a vacancy such that the square footage > 1000 and the square footage < 1100

The former leads to jasso's solution, the latter to:

//results/Building[ Vacancy[1000 < sqft and sqft < 1100] ]
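The nested predicate can be mimicked in Python with a single any() over the Vacancy elements, so both bounds are checked against the same vacancy. This is a sketch under assumptions: it emulates the XPath rather than running it (ElementTree's XPath subset has no numeric comparisons), it assumes each Vacancy has exactly one sqft child as in the sample data, and matches_fixed is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Condensed copy of the question's sample data (comments dropped).
XML = """
<results>
  <Building/>
  <Building><Vacancy><sqft>1050</sqft></Vacancy></Building>
  <Building><Vacancy><sqft>1025</sqft></Vacancy><Vacancy><sqft>1075</sqft></Vacancy></Building>
  <Building><Vacancy><sqft>10</sqft></Vacancy><Vacancy><sqft>50</sqft></Vacancy></Building>
  <Building><Vacancy><sqft>1050</sqft></Vacancy><Vacancy><sqft>2000</sqft></Vacancy></Building>
  <Building><Vacancy><sqft>900</sqft></Vacancy><Vacancy><sqft>1040</sqft></Vacancy></Building>
  <Building><Vacancy><sqft>10500</sqft></Vacancy></Building>
  <Building><Vacancy><sqft>900</sqft></Vacancy><Vacancy><sqft>1000</sqft></Vacancy><Vacancy><sqft>2000</sqft></Vacancy><Vacancy><sqft>500</sqft></Vacancy></Building>
</results>
"""

def matches_fixed(building):
    """Emulates Building[ Vacancy[1000 < sqft and sqft < 1100] ].

    One any() over the vacancies: both bounds are tested on the same
    Vacancy, so a 2000 and a 900 can no longer combine into a match.
    """
    return any(
        1000 < float(vacancy.findtext("sqft")) < 1100
        for vacancy in building.findall("Vacancy")
    )

buildings = ET.fromstring(XML).findall("Building")
# 0-based positions of the selected buildings within <results>.
selected = [i for i, b in enumerate(buildings) if matches_fixed(b)]
print(selected)
```

With the sample data this picks the four buildings marked "Should be selected" (positions 1, 2, 4 and 5) and drops the building with 900, 1000, 2000 and 500.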

Original solution

(Note: this answered the original question, when it wasn't clear what the OP wanted. The technique may prove useful to others with a similar problem but different requirements, so I'm leaving it in.)

Try the logical double-negation of the condition:

//results/Building[ Vacancy and not (Vacancy/sqft <= 1000 or Vacancy/sqft >= 1100) ]

This predicate includes a test for Vacancy children to filter out cases that are otherwise trivially true, i.e. buildings with no vacancies. The English equivalent of this solution is:

Select buildings (in the results) such that the building has a vacancy and it's not the case that there exists a vacancy square footage <= 1000 or there exists a vacancy square footage >= 1100

In fewer words:

Select all buildings with vacancies where no vacancy has <= 1000 square feet or >= 1100 square feet.

In fewer words still:

Select all buildings with vacancies where all vacancies are between 1000 and 1100 square feet.
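For comparison, the "all vacancies" reading can be sketched in Python with all() plus a non-empty check, which plays the role of the leading Vacancy test in the predicate. As before this only emulates the XPath with the standard library, and matches_all is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Condensed copy of the question's sample data (comments dropped).
XML = """
<results>
  <Building/>
  <Building><Vacancy><sqft>1050</sqft></Vacancy></Building>
  <Building><Vacancy><sqft>1025</sqft></Vacancy><Vacancy><sqft>1075</sqft></Vacancy></Building>
  <Building><Vacancy><sqft>10</sqft></Vacancy><Vacancy><sqft>50</sqft></Vacancy></Building>
  <Building><Vacancy><sqft>1050</sqft></Vacancy><Vacancy><sqft>2000</sqft></Vacancy></Building>
  <Building><Vacancy><sqft>900</sqft></Vacancy><Vacancy><sqft>1040</sqft></Vacancy></Building>
  <Building><Vacancy><sqft>10500</sqft></Vacancy></Building>
  <Building><Vacancy><sqft>900</sqft></Vacancy><Vacancy><sqft>1000</sqft></Vacancy><Vacancy><sqft>2000</sqft></Vacancy><Vacancy><sqft>500</sqft></Vacancy></Building>
</results>
"""

def matches_all(building):
    """Emulates Vacancy and not (Vacancy/sqft <= 1000 or Vacancy/sqft >= 1100).

    bool(values) stands in for the bare "Vacancy" test; without it,
    all() over an empty list would be trivially true.
    """
    values = [float(s.text) for s in building.findall("Vacancy/sqft")]
    return bool(values) and all(1000 < v < 1100 for v in values)

buildings = ET.fromstring(XML).findall("Building")
# 0-based positions of the selected buildings within <results>.
selected = [i for i, b in enumerate(buildings) if matches_all(b)]
print(selected)
```

On the sample data this keeps only the buildings at positions 1 and 2, since every other building either has no vacancies or has at least one vacancy outside the range.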

outis answered Dec 09 '22 16:12