
How to disable XXE in libxml2 in C?

Requirement: when the following request is passed to my application, I need to know:

1) How can I validate input XML like this, which poses a security risk?

2) How can I disable XXE in libxml2, i.e. keep it from processing the ENTITY declaration?

<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY foo SYSTEM "file:///etc/issue">
]><TRANSACTION>
<FUNCTION_TYPE>LINE_ITEM</FUNCTION_TYPE>
<COMMAND>ADD</COMMAND>
<COUNTER>3</COUNTER>
<MAC>qof2EtycqT9YMcmOfKowpyXVbRpgM/7rncS3liK4JOs=</MAC>
<MAC_LABEL>P_206</MAC_LABEL>
<RUNNING_TAX_AMOUNT>0.00</RUNNING_TAX_AMOUNT>
<RUNNING_TRANS_AMOUNT>1.00</RUNNING_TRANS_AMOUNT>
<LINE_ITEMS>
<MERCHANDISE>
<LINE_ITEM_ID>1</LINE_ITEM_ID>
<DESCRIPTION>&foo;</DESCRIPTION>
<QUANTITY>1</QUANTITY>
<UNIT_PRICE>5.00</UNIT_PRICE>
<EXTENDED_PRICE>5.00</EXTENDED_PRICE>
</MERCHANDISE>
</LINE_ITEMS>
</TRANSACTION>

I understand that starting with libxml2 version 2.9, XXE is disabled by default, but we are currently using version 2.7.7.

According to this link (XML_ENTITY_PROCESSING), the following xmlParserOption values should not be set in libxml2:

XML_PARSE_NOENT: Expands entities and substitutes them with replacement text
XML_PARSE_DTDLOAD: Load the external DTD

Until now I was using the xmlParseMemory function to parse an in-memory XML block and build a tree. This function does not take a parameter for setting xmlParserOption values.

I then switched to the xmlReadMemory function, which does the same thing as xmlParseMemory but accepts parser options.

docPtr = xmlReadMemory(szXMLMsg, iLen, "noname.xml", NULL, XML_PARSE_RECOVER);

I still observe that the ENTITY declaration is being parsed. Could anyone help me? Please let me know if you need any additional information.

Thank you for your time.

Regards

Praveen

Praveen PVS asked Oct 21 '22 10:10


1 Answer

If you don't specify XML_PARSE_NOENT, the ENTITY declaration is still parsed, but the entity won't be replaced. Also, the file /etc/issue won't be opened, which you can verify with strace. So to protect against XXE, simply don't pass the XML_PARSE_NOENT parser option.

The name of the option is a bit misleading: XML_PARSE_NOENT means that no entity reference nodes should be created in the parsed document. Consequently, every entity is expanded. A better name would be something like XML_PARSE_EXPAND_ENTITIES.

If you really want to make sure or you want to expand entities with fine-grained control over which URLs to load, you can install your own external entity loader using xmlSetExternalEntityLoader. If your handler always returns NULL, you're on the safe side. But note that the external entity loader is used to load all kinds of external resources, so completely disabling it might break other stuff (XIncludes or XSLT stylesheets, for example).

EDIT: I have no idea why the entity is replaced in your case. Here's a test program:

#include <stdio.h>
#include <stdlib.h>
#include <libxml/parser.h>
#include <libxml/tree.h>

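/* Return the first child element of parent with the given name.
   Aborts if no such element exists. */
static xmlNodePtr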
find_node(xmlNodePtr parent, const char *name) {
    for (xmlNodePtr cur = parent->children; cur != NULL; cur = cur->next) {
        if (cur->type == XML_ELEMENT_NODE
            && xmlStrcmp(cur->name, (const xmlChar*)name) == 0
        ) {
            return cur;
        }
    }

    fprintf(stderr, "Element '%s' not found\n", name);
    abort();

    return NULL;
}

int
main(int argc, char **argv) {
    static const char buf[] =
        "<?xml version=\"1.0\"?>\n"
        "<!DOCTYPE foo [\n"
        "<!ENTITY foo SYSTEM \"file:///etc/issue\">\n"
        "]><TRANSACTION>\n"
        "<FUNCTION_TYPE>LINE_ITEM</FUNCTION_TYPE>\n"
        "<COMMAND>ADD</COMMAND>\n"
        "<COUNTER>3</COUNTER>\n"
        "<MAC>qof2EtycqT9YMcmOfKowpyXVbRpgM/7rncS3liK4JOs=</MAC>\n"
        "<MAC_LABEL>P_206</MAC_LABEL>\n"
        "<RUNNING_TAX_AMOUNT>0.00</RUNNING_TAX_AMOUNT>\n"
        "<RUNNING_TRANS_AMOUNT>1.00</RUNNING_TRANS_AMOUNT>\n"
        "<LINE_ITEMS>\n"
        "<MERCHANDISE>\n"
        "<LINE_ITEM_ID>1</LINE_ITEM_ID>\n"
        "<DESCRIPTION>&foo;</DESCRIPTION>\n"
        "<QUANTITY>1</QUANTITY>\n"
        "<UNIT_PRICE>5.00</UNIT_PRICE>\n"
        "<EXTENDED_PRICE>5.00</EXTENDED_PRICE>\n"
        "</MERCHANDISE>\n"
        "</LINE_ITEMS>\n"
        "</TRANSACTION>\n";

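    /* Pass the length without the terminating NUL. XML_PARSE_NOENT is not
       set, so entity references stay as nodes and are not expanded. */
    xmlDocPtr doc = xmlReadMemory(buf, sizeof(buf) - 1, "noname.xml", NULL,
                                  XML_PARSE_RECOVER);
    if (doc == NULL) {
        fprintf(stderr, "Failed to parse document\n");
        return 1;
    }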

    xmlNodePtr trans = find_node((xmlNodePtr)doc, "TRANSACTION");
    xmlNodePtr items = find_node(trans, "LINE_ITEMS");
    xmlNodePtr merch = find_node(items, "MERCHANDISE");
    xmlNodePtr desc  = find_node(merch, "DESCRIPTION");

    for (xmlNodePtr cur = desc->children; cur != NULL; cur = cur->next) {
        if (cur->type == XML_ENTITY_REF_NODE) {
            printf("entity ref node\n");
        }
        else {
            printf("other node of type: %d\n", cur->type);
        }
    }

    xmlFreeDoc(doc);

    return 0;
}

If I compile with

gcc -std=c99 -O2 -I/usr/include/libxml2 so.c -lxml2 -o so

and run it, the result is

entity ref node
nwellnhof answered Oct 27 '22 15:10