I have an input field whose value is used to build an XPath query. Which characters in the input string should I check for to minimise the possibility of XML injection?
This document describes in detail the concept of "Blind XPath Injection".
It provides concrete examples of XPath injections and discusses ways of preventing them.
The section "Defending against XPath Injection" says:
"Defending against XPath Injection is essentially similar to defending against SQL injection. The application must sanitize user input. Specifically, the single and double quote characters should be disallowed. This can be done either in the application itself, or in a third party product (e.g. application firewall.) Testing application susceptibility to XPath Injection can be easily performed by injecting a single quote or a double quote, and inspecting the response. If an error has occurred, then it’s likely that an XPath Injection is possible."
As others have said, one should also pay attention to the use of axes and the // abbreviation. If XPath 2.0 is being used, then the doc() function should not be allowed, as it gives access to any document with a known URI (or filename).
It is advisable to use an API that precompiles an XPath expression but still lets it work with dynamically defined parameters or variables. The user input then supplies only the values of these parameters and is never treated as a modification of the already compiled expression.
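As a rough illustration of that idea, here is a minimal Java sketch using the JDK's javax.xml.xpath API, where the expression is compiled once and the user input is passed in through an XPath variable. The class name UserLookup, the sample XML, and the variable name $name are all made up for this example, and the vulnerable string-concatenation variant is included only for comparison.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class UserLookup {
    // Hypothetical sample data, for illustration only.
    static final String USERS_XML =
        "<users>"
        + "<user><name>alice</name></user>"
        + "<user><name>bob</name></user>"
        + "</users>";

    // Values for XPath variables; the resolver reads this map at evaluation time.
    // Note: this simple sketch is not thread-safe.
    private final Map<String, Object> variables = new HashMap<>();
    private final XPathExpression findByName;

    UserLookup() throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Register the resolver before compile() so the compiled expression picks it up.
        xpath.setXPathVariableResolver(qname -> variables.get(qname.getLocalPart()));
        // Compiled once; the structure of the query is fixed from here on.
        findByName = xpath.compile("/users/user[name = $name]");
    }

    // Parameterized: the input is only ever the *value* of $name.
    int countUsersNamed(String userInput) throws XPathExpressionException {
        variables.put("name", userInput);
        NodeList hits = (NodeList) findByName.evaluate(
            new InputSource(new StringReader(USERS_XML)), XPathConstants.NODESET);
        return hits.getLength();
    }

    // VULNERABLE, for comparison: the input becomes part of the query text.
    static int countUsersNamedUnsafe(String userInput) throws XPathExpressionException {
        String query = "/users/user[name = '" + userInput + "']";
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList hits = (NodeList) xpath.evaluate(
            query, new InputSource(new StringReader(USERS_XML)), XPathConstants.NODESET);
        return hits.getLength();
    }
}
```

With the input `' or '1'='1`, the concatenated query becomes `/users/user[name = '' or '1'='1']` and matches every user, whereas the parameterized version compares the name against that literal string and matches nothing.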
Turn your tactics upside down.
Don't try to filter out unacceptable characters - a policy of "Assume it's OK unless I know it's bad".
Instead, filter in acceptable characters - a policy of "This stuff is OK, I'll assume everything else is bad".
In security terms, adopt a policy of "Default Deny" instead of "Default Accept".
For example ...
... if you're asking someone for a search term, say a person's first name, limit the input to only the characters you expect to find in names.
One way would be to limit input to A-Z and then ensure that your search technique is accent-aware (e.g. i = ì = í = î = ï and so on), though this falls down on non-European names.
... if you're asking for a number, limit to just digits and reject everything else.
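A "Default Deny" check along these lines might look like the following Java sketch. The class name InputAllowList, the length limits, and the exact character sets are assumptions chosen for illustration; in particular, using Unicode letters (`\p{L}`) rather than just A-Z is one possible way around the non-European naming problem, and the policy deliberately rejects apostrophes, so a name like O'Brien would need separate handling.

```java
import java.util.regex.Pattern;

final class InputAllowList {
    // Assumed policy: 1-64 characters, each a Unicode letter, combining mark,
    // space, or hyphen. Quotes, brackets, slashes, etc. are rejected by default.
    private static final Pattern NAME = Pattern.compile("[\\p{L}\\p{M} -]{1,64}");

    // Assumed policy: 1-10 ASCII digits and nothing else.
    private static final Pattern NUMBER = Pattern.compile("[0-9]{1,10}");

    static boolean isValidName(String input) {
        // matches() requires the WHOLE input to fit the pattern.
        return input != null && NAME.matcher(input).matches();
    }

    static boolean isValidNumber(String input) {
        return input != null && NUMBER.matcher(input).matches();
    }
}
```

Anything that fails the check is rejected outright rather than "cleaned up", so an input such as `' or '1'='1` never reaches the code that builds the query.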