Requirement: My application receives the request shown below.
1) How can I validate input XML like this, which is risky?
2) How can I disable XXE in libxml2, i.e. keep it from processing the ENTITY declaration?
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY foo SYSTEM "file:///etc/issue">
]><TRANSACTION>
<FUNCTION_TYPE>LINE_ITEM</FUNCTION_TYPE>
<COMMAND>ADD</COMMAND>
<COUNTER>3</COUNTER>
<MAC>qof2EtycqT9YMcmOfKowpyXVbRpgM/7rncS3liK4JOs=</MAC>
<MAC_LABEL>P_206</MAC_LABEL>
<RUNNING_TAX_AMOUNT>0.00</RUNNING_TAX_AMOUNT>
<RUNNING_TRANS_AMOUNT>1.00</RUNNING_TRANS_AMOUNT>
<LINE_ITEMS>
<MERCHANDISE>
<LINE_ITEM_ID>1</LINE_ITEM_ID>
<DESCRIPTION>&foo;</DESCRIPTION>
<QUANTITY>1</QUANTITY>
<UNIT_PRICE>5.00</UNIT_PRICE>
<EXTENDED_PRICE>5.00</EXTENDED_PRICE>
</MERCHANDISE>
</LINE_ITEMS>
</TRANSACTION>
I understand that starting with libxml2 version 2.9, XXE is disabled by default, but we are currently using version 2.7.7.
According to this link (XML_ENTITY_PROCESSING), the following xmlParserOption values should not be set in libxml2:
XML_PARSE_NOENT: Expands entities and substitutes them with replacement text
XML_PARSE_DTDLOAD: Load the external DTD
Until now I was using the xmlParseMemory function to parse an in-memory XML block and build a tree. This function does not take a parameter for setting xmlParserOption.
I then changed to the xmlReadMemory function, which does the same thing as xmlParseMemory but takes different parameters:
docPtr = xmlReadMemory(szXMLMsg, iLen, "noname.xml", NULL, XML_PARSE_RECOVER);
I still observe that the ENTITY is being parsed. Could anyone help me? Please let me know if you need any additional information.
Thank you for your time.
Regards
Praveen
If you don't specify XML_PARSE_NOENT, the ENTITY declaration is still parsed but the entity won't be replaced. Also, the file /etc/issue won't be opened, which you can verify with strace. So to protect against XXE, you simply don't pass the XML_PARSE_NOENT parser option.
The name of the option is a bit misleading: XML_PARSE_NOENT means that no entity reference nodes should be created in the parsed document. Consequently, every entity is expanded. A better name would be something like XML_PARSE_EXPAND_ENTITIES.
If you really want to make sure, or you want to expand entities with fine-grained control over which URLs to load, you can install your own external entity loader using xmlSetExternalEntityLoader. If your handler always returns NULL, you're on the safe side. But note that the external entity loader is used to load all kinds of external resources, so completely disabling it might break other stuff (XIncludes or XSLT stylesheets, for example).
EDIT: I have no idea why the entity is replaced in your case. Here's a test program:
#include <stdio.h>
#include <stdlib.h>
#include <libxml/parser.h>
#include <libxml/tree.h>

/* Return the first child element of parent with the given name.
 * Aborts if no such element exists. */
static xmlNodePtr
find_node(xmlNodePtr parent, const char *name) {
    for (xmlNodePtr cur = parent->children; cur != NULL; cur = cur->next) {
        if (cur->type == XML_ELEMENT_NODE
            && xmlStrcmp(cur->name, (const xmlChar*)name) == 0
        ) {
            return cur;
        }
    }
    fprintf(stderr, "Element '%s' not found\n", name);
    abort();
    return NULL;
}

int
main(int argc, char **argv) {
    static const char buf[] =
        "<?xml version=\"1.0\"?>\n"
        "<!DOCTYPE foo [\n"
        "<!ENTITY foo SYSTEM \"file:///etc/issue\">\n"
        "]><TRANSACTION>\n"
        "<FUNCTION_TYPE>LINE_ITEM</FUNCTION_TYPE>\n"
        "<COMMAND>ADD</COMMAND>\n"
        "<COUNTER>3</COUNTER>\n"
        "<MAC>qof2EtycqT9YMcmOfKowpyXVbRpgM/7rncS3liK4JOs=</MAC>\n"
        "<MAC_LABEL>P_206</MAC_LABEL>\n"
        "<RUNNING_TAX_AMOUNT>0.00</RUNNING_TAX_AMOUNT>\n"
        "<RUNNING_TRANS_AMOUNT>1.00</RUNNING_TRANS_AMOUNT>\n"
        "<LINE_ITEMS>\n"
        "<MERCHANDISE>\n"
        "<LINE_ITEM_ID>1</LINE_ITEM_ID>\n"
        "<DESCRIPTION>&foo;</DESCRIPTION>\n"
        "<QUANTITY>1</QUANTITY>\n"
        "<UNIT_PRICE>5.00</UNIT_PRICE>\n"
        "<EXTENDED_PRICE>5.00</EXTENDED_PRICE>\n"
        "</MERCHANDISE>\n"
        "</LINE_ITEMS>\n"
        "</TRANSACTION>\n";

    /* sizeof(buf) - 1 excludes the terminating NUL from the input.
     * XML_PARSE_NOENT is deliberately not passed. */
    xmlDocPtr doc = xmlReadMemory(buf, sizeof(buf) - 1, "noname.xml", NULL,
                                  XML_PARSE_RECOVER);
    if (doc == NULL) {
        fprintf(stderr, "Failed to parse document\n");
        return 1;
    }

    xmlNodePtr trans = find_node((xmlNodePtr)doc, "TRANSACTION");
    xmlNodePtr items = find_node(trans, "LINE_ITEMS");
    xmlNodePtr merch = find_node(items, "MERCHANDISE");
    xmlNodePtr desc = find_node(merch, "DESCRIPTION");

    /* Report what kind of nodes ended up inside <DESCRIPTION>. */
    for (xmlNodePtr cur = desc->children; cur != NULL; cur = cur->next) {
        if (cur->type == XML_ENTITY_REF_NODE) {
            printf("entity ref node\n");
        }
        else {
            printf("other node of type: %d\n", cur->type);
        }
    }

    xmlFreeDoc(doc);
    return 0;
}
If I compile with
gcc -std=c99 -O2 -I/usr/include/libxml2 so.c -lxml2 -o so
and run it, the result is
entity ref node