Address Out of bounds error when reading xml

I am getting a strange segfault when using libxml to parse a file. This code worked when I compiled it as a 32-bit application, but it stopped working after I switched to a 64-bit build.

The segfault occurs at "if (xmlStrcmp(cur->name, (const xmlChar *) "servers"))"

cur->name is a const xmlChar *, and the debugger reports the address it points to as out of bounds. But when I inspect that memory location in the debugger, the data there is correct.

int XmlGetServers()
{
    xmlDocPtr doc;
    xmlNodePtr cur;

    doc = xmlParseFile("Pin.xml");
    if (doc == NULL)
    {
        std::cout << "\n Pin.xml not parsed successfully." << std::endl;
        return -1;
    }
    cur = xmlDocGetRootElement(doc);

    if (cur == NULL)
    {
        std::cout << "\n Pin.xml is empty document." << std::endl;
        xmlFreeDoc(doc);
        return -1;
    }
    if (xmlStrcmp(cur->name, (const xmlChar *) "servers"))
    {
        std::cout << "\n ERROR: Pin.xml of the wrong type, root node != servers." << std::endl;
        xmlFreeDoc(doc);
        return -1;
    }
}

Before cur is initialized, the debugger shows the name member as:

Name : name
    Details:0xed11f72000007fff <Address 0xed11f72000007fff out of bounds>

After cur is initialized, the debugger shows the name member as:

Name : name
    Details:0x64c43000000000 <Address 0x64c43000000000 out of bounds> 

Referenced XML file

<?xml version="1.0"?>

<servers>

<server_info>

    <server_name>Server1</server_name>

    <server_ip>127.0.0.1</server_ip> 

    <server_data_port>9000</server_data_port> 

</server_info>

<server_info>

    <server_name>Server2</server_name> 

    <server_ip>127.0.0.1</server_ip> 

    <server_data_port>9001</server_data_port> 

</server_info>

</servers>

System:

OS: Redhat Enterprise Linux 6.4 64-bit

GCC: 4.4.7-3

packages: libxml2-2.7.6-8.el6_3.4.x86_64

user758114 asked Oct 20 '15 15:10


1 Answer

I took your code, as is, and added:

#include <libxml/parser.h>
#include <iostream>

then renamed the function to main() and compiled it on x86-64 Fedora 22, which has libxml2 2.9.2.

The resulting program ran successfully on the sample file, with no segfault, and valgrind reported no invalid memory accesses. As proof, here is the abbreviated strace log:

stat("Pin.xml", {st_mode=S_IFREG|0644, st_size=362, ...}) = 0
stat("Pin.xml", {st_mode=S_IFREG|0644, st_size=362, ...}) = 0
stat("Pin.xml", {st_mode=S_IFREG|0644, st_size=362, ...}) = 0
open("Pin.xml", O_RDONLY)               = 3
lseek(3, 0, SEEK_CUR)                   = 0
read(3, "<?xml version=\"1.0\"?>\n\n<servers>\n\n<server_info>\n\n    <server_name>Server1</server_name>\n\n    <server_ip>127.0.0.1</server_ip> \n\n    <server_data_port>9000</server_data_port> \n\n</server_info>\n\n<server_info>\n\n    <server_name>Server2</server_name> \n\n    <ser"..., 8192) = 362
read(3, "", 7830)                       = 0
getcwd("/tmp", 1024)                    = 5
close(3)                                = 0
exit_group(0)                           = ?
+++ exited with 0 +++

Although this is Fedora with a slightly newer libxml2 and gcc, that difference should not matter. The answer is that there is nothing wrong with the code shown here.

But it is presumably part of a much larger application. Your memory corruption is most likely happening in some other part of that application, and it only manifests itself when execution reaches this code.

The thing about C++ is that just because the code crashes at a particular point, it doesn't mean that this particular line of code is where the problem is. It shouldn't be too hard to come up with a simple example:

#include <iostream>
#include <cstring>

int main()
{

    char foo[3];

    strcpy(foo, "FoobarbazXXXXXXXXXXXXXXXXXXXXXX");

    for (int i=0; i<100; i++)
        std::cout << i << std::endl;
    return 0;
}

The bug here is in the strcpy line, which writes far past the end of foo. That is undefined behavior, but on a typical build the program will run, print the 100 numbers from 0 to 99, and crash only when main() returns. Yet "return 0" is clearly not where the bug is.

This is analogous to what's happening with your application. Some kind of memory corruption occurs at some point, which doesn't materially affect code execution until your code tries to parse your XML file.

Welcome to C++.

Sam Varshavchik answered Sep 30 '22 17:09