Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

XPath is primarily used for Html or XML or XHTML?

I am totally new to XPath concept and I have a very basic understanding of XPath. I started using XPath firstly for finding web elements on HTML page.

Now while searching over web (videos and text), I found that all XPath tutorials are related to XML (and not HTML pages).

Wiki says,

XPath (XML Path Language) is a query language for selecting nodes from an XML document.

This has confounded me a lot.

  1. Is XPath not used for HTML Document?
  2. Are there any fundamental/structural differences in writing XPath for HTML, XML, XHTML?

Please note that I understand that this question is below par, but only out of utter confuson I am asking it here.

like image 841
threeA's Avatar asked Feb 21 '17 17:02

threeA's


People also ask

Is XPath only for XML?

XPath expressions can be used in JavaScript, Java, XML Schema, PHP, Python, C and C++, and lots of other languages.

Can XPath be used on HTML?

XML and HTML Note that HTML and XML have a very similar structure, which is why XPath can be used almost interchangeably to navigate both HTML and XML documents.

What is XPath what is it used for?

The XML Path Language (XPath) is used to uniquely identify or address parts of an XML document. An XPath expression can be used to search through an XML document, and extract information from any part of the document, such as an element or attribute (referred to as a node in XML) in it.

Which type of documents use XPath?

XPath uses a path notation (as in URLs) for navigating through the hierarchical structure of an XML document. It uses a non-XML syntax so that it can be used in URIs and XML attribute values.


1 Answers

You have a right to be confused.

XPath operates against a data model that generally assumes that markup is well-formed. By definition, XML and XHTML are necessarily well-formed; HTML, not necessarily. However, HTML parsers can often successfully parse non-well-formed markup anyway, in the spirit of being liberal in what one accepts as input, into a data model suitable for XPath.

Therefore, you can usually also use XPath with HTML. Using XPath in this manner, in fact, is a common web page scraping technique.

like image 92
kjhughes Avatar answered Oct 15 '22 12:10

kjhughes