
Why do I get the error malloc(): invalid size (unsorted)?

I have web crawler code at https://github.com/JamesRead5737/webcrawler/blob/master/crawler.c that produces some strange errors I cannot explain. Most commonly it aborts with the error malloc(): invalid size (unsorted).

A backtrace shows:

(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007ffff760e859 in __GI_abort () at abort.c:79
#2  0x00007ffff76793ee in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7ffff77a3285 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#3  0x00007ffff768147c in malloc_printerr (str=str@entry=0x7ffff77a5a50 "malloc(): invalid size (unsorted)") at malloc.c:5347
#4  0x00007ffff7684234 in _int_malloc (av=av@entry=0x7ffff77d4b80 <main_arena>, bytes=bytes@entry=8200) at malloc.c:3736
#5  0x00007ffff7686419 in __GI___libc_malloc (bytes=8200) at malloc.c:3066
#6  0x00005555555578b3 in html_link_find (url=0x55555d0f8b08 "https://www.android.com/intl/en_us//security-center/", 
    html=0x55555f9e6c00 "<!DOCTYPE html>\n<html lang=\"en\" dir=\"ltr\">\n  <head>\n    <meta charset=\"utf-8\">\n    <title>Android Security Center</title>\n    <meta content=\"initial-scale=1, minimum-scale=1, width=device-width\" name="...) at crawler.c:455
#7  0x0000555555557d70 in html_parse (url=0x55555d0f8b08 "https://www.android.com/intl/en_us//security-center/", 
    html=0x55555f9e6c00 "<!DOCTYPE html>\n<html lang=\"en\" dir=\"ltr\">\n  <head>\n    <meta charset=\"utf-8\">\n    <title>Android Security Center</title>\n    <meta content=\"initial-scale=1, minimum-scale=1, width=device-width\" name="...) at crawler.c:536
#8  0x00005555555582cc in check_multi_info (g=0x7ffffffe0970) at crawler.c:678
#9  0x00005555555583db in event_cb (g=0x7ffffffe0970, fd=1164, revents=1) at crawler.c:706
#10 0x0000555555559829 in crawler_init () at crawler.c:1154
#11 0x0000555555559ae9 in main (argc=1, argv=0x7fffffffe018) at crawler.c:1207

This takes me to the line sql_current->next = (SqlNode *)malloc(sizeof(SqlNode)); which, as far as I can see, should not cause any errors at all.

Googling the error suggests that the problem could be anywhere in the code and completely unrelated to the line in question. Is that right?

The MySQL database is set up as follows:

USE crawl;
CREATE TABLE IF NOT EXISTS `crawled` (`id` int NOT NULL AUTO_INCREMENT, `url` varchar(768) DEFAULT NULL, `title` varchar(768) DEFAULT NULL, `date` varchar(128) DEFAULT NULL, `links` int DEFAULT 0, `backlinks` int DEFAULT 0, `frontier` int DEFAULT 1, PRIMARY KEY (`id`), UNIQUE KEY `url` (`url`), KEY `title` (`title`), KEY `frontier` (`frontier`)) ENGINE=InnoDB AUTO_INCREMENT=1;
CREATE TABLE IF NOT EXISTS `emails` (`email` varchar(2084) NOT NULL, `id` int NOT NULL AUTO_INCREMENT, PRIMARY KEY (`id`), UNIQUE KEY `email` (`email`)) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=latin1;
INSERT INTO crawled (url) VALUES ('http://www.bing.com'),('http://www.yahoo.com'),('http://www.google.com');

Any ideas how I could find the real problem if the line in question is fine?

EDIT

Here is some valgrind output:

==318618== Memcheck, a memory error detector
==318618== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==318618== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==318618== Command: ./a.out
==318618== Parent PID: 2591
==318618== 
==318618== Warning: ignored attempt to set SIGKILL handler in sigaction();
==318618==          the SIGKILL signal is uncatchable
==318618== Syscall param epoll_ctl(event) points to uninitialised byte(s)
==318618==    at 0x515AACE: epoll_ctl (syscall-template.S:78)
==318618==    by 0x10D052: crawler_init (crawler.c:987)
==318618==    by 0x10DAE8: main (crawler.c:1207)
==318618==  Address 0x1ffefe28ac is on thread 1's stack
==318618==  in frame #1, created by crawler_init (crawler.c:956)
==318618== 
==318618== Syscall param epoll_ctl(event) points to uninitialised byte(s)
==318618==    at 0x515AACE: epoll_ctl (syscall-template.S:78)
==318618==    by 0x10C61C: setsock (crawler.c:769)
==318618==    by 0x10C6BC: addsock (crawler.c:782)
==318618==    by 0x10CE70: sock_cb (crawler.c:921)
==318618==    by 0x48B70B1: singlesocket (multi.c:2593)
==318618==    by 0x48B7878: multi_socket (multi.c:2839)
==318618==    by 0x48B8053: curl_multi_socket_action (multi.c:2956)
==318618==    by 0x10C4E8: timer_cb (crawler.c:741)
==318618==    by 0x10D7C7: crawler_init (crawler.c:1152)
==318618==    by 0x10DAE8: main (crawler.c:1207)
==318618==  Address 0x1ffefe2584 is on thread 1's stack
==318618==  in frame #1, created by setsock (crawler.c:749)
==318618== 
==318618== Syscall param epoll_ctl(event) points to uninitialised byte(s)
==318618==    at 0x515AACE: epoll_ctl (syscall-template.S:78)
==318618==    by 0x10C61C: setsock (crawler.c:769)
==318618==    by 0x10CE8F: sock_cb (crawler.c:923)
==318618==    by 0x48B70B1: singlesocket (multi.c:2593)
==318618==    by 0x48B7878: multi_socket (multi.c:2839)
==318618==    by 0x48B8053: curl_multi_socket_action (multi.c:2956)
==318618==    by 0x10C3BA: event_cb (crawler.c:703)
==318618==    by 0x10D828: crawler_init (crawler.c:1154)
==318618==    by 0x10DAE8: main (crawler.c:1207)
==318618==  Address 0x1ffefe25a4 is on thread 1's stack
==318618==  in frame #1, created by setsock (crawler.c:749)
==318618== 
==318618== 
==318618== HEAP SUMMARY:
==318618==     in use at exit: 149,695,831 bytes in 27,400 blocks
==318618==   total heap usage: 2,198,504 allocs, 2,171,104 frees, 3,507,931,785 bytes allocated
==318618== 
==318618== LEAK SUMMARY:
==318618==    definitely lost: 1,889,627 bytes in 9,067 blocks
==318618==    indirectly lost: 0 bytes in 0 blocks
==318618==      possibly lost: 2,137,116 bytes in 27 blocks
==318618==    still reachable: 145,669,088 bytes in 18,306 blocks
==318618==         suppressed: 0 bytes in 0 blocks
==318618== Rerun with --leak-check=full to see details of leaked memory
==318618== 
==318618== Use --track-origins=yes to see where uninitialised values come from
==318618== ERROR SUMMARY: 295 errors from 3 contexts (suppressed: 0 from 0)
==318618== 
==318618== 1 errors in context 1 of 3:
==318618== Syscall param epoll_ctl(event) points to uninitialised byte(s)
==318618==    at 0x515AACE: epoll_ctl (syscall-template.S:78)
==318618==    by 0x10D052: crawler_init (crawler.c:987)
==318618==    by 0x10DAE8: main (crawler.c:1207)
==318618==  Address 0x1ffefe28ac is on thread 1's stack
==318618==  in frame #1, created by crawler_init (crawler.c:956)
==318618== 
==318618== 
==318618== 79 errors in context 2 of 3:
==318618== Syscall param epoll_ctl(event) points to uninitialised byte(s)
==318618==    at 0x515AACE: epoll_ctl (syscall-template.S:78)
==318618==    by 0x10C61C: setsock (crawler.c:769)
==318618==    by 0x10CE8F: sock_cb (crawler.c:923)
==318618==    by 0x48B70B1: singlesocket (multi.c:2593)
==318618==    by 0x48B7878: multi_socket (multi.c:2839)
==318618==    by 0x48B8053: curl_multi_socket_action (multi.c:2956)
==318618==    by 0x10C3BA: event_cb (crawler.c:703)
==318618==    by 0x10D828: crawler_init (crawler.c:1154)
==318618==    by 0x10DAE8: main (crawler.c:1207)
==318618==  Address 0x1ffefe25a4 is on thread 1's stack
==318618==  in frame #1, created by setsock (crawler.c:749)
==318618== 
==318618== 
==318618== 215 errors in context 3 of 3:
==318618== Syscall param epoll_ctl(event) points to uninitialised byte(s)
==318618==    at 0x515AACE: epoll_ctl (syscall-template.S:78)
==318618==    by 0x10C61C: setsock (crawler.c:769)
==318618==    by 0x10C6BC: addsock (crawler.c:782)
==318618==    by 0x10CE70: sock_cb (crawler.c:921)
==318618==    by 0x48B70B1: singlesocket (multi.c:2593)
==318618==    by 0x48B7878: multi_socket (multi.c:2839)
==318618==    by 0x48B8053: curl_multi_socket_action (multi.c:2956)
==318618==    by 0x10C4E8: timer_cb (crawler.c:741)
==318618==    by 0x10D7C7: crawler_init (crawler.c:1152)
==318618==    by 0x10DAE8: main (crawler.c:1207)
==318618==  Address 0x1ffefe2584 is on thread 1's stack
==318618==  in frame #1, created by setsock (crawler.c:749)
==318618== 
==318618== ERROR SUMMARY: 295 errors from 3 contexts (suppressed: 0 from 0)

EDIT

Here is some valgrind output from a crash:

==319842== Memcheck, a memory error detector
==319842== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==319842== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==319842== Command: ./a.out
==319842== Parent PID: 2591
==319842== 
==319842== Warning: ignored attempt to set SIGKILL handler in sigaction();
==319842==          the SIGKILL signal is uncatchable
==319842== Syscall param epoll_ctl(event) points to uninitialised byte(s)
==319842==    at 0x515AACE: epoll_ctl (syscall-template.S:78)
==319842==    by 0x10D052: crawler_init (crawler.c:987)
==319842==    by 0x10DAE8: main (crawler.c:1207)
==319842==  Address 0x1ffefe28ac is on thread 1's stack
==319842==  in frame #1, created by crawler_init (crawler.c:956)
==319842== 
==319842== Syscall param epoll_ctl(event) points to uninitialised byte(s)
==319842==    at 0x515AACE: epoll_ctl (syscall-template.S:78)
==319842==    by 0x10C61C: setsock (crawler.c:769)
==319842==    by 0x10C6BC: addsock (crawler.c:782)
==319842==    by 0x10CE70: sock_cb (crawler.c:921)
==319842==    by 0x48B70B1: singlesocket (multi.c:2593)
==319842==    by 0x48B7878: multi_socket (multi.c:2839)
==319842==    by 0x48B8053: curl_multi_socket_action (multi.c:2956)
==319842==    by 0x10C4E8: timer_cb (crawler.c:741)
==319842==    by 0x10D7C7: crawler_init (crawler.c:1152)
==319842==    by 0x10DAE8: main (crawler.c:1207)
==319842==  Address 0x1ffefe2584 is on thread 1's stack
==319842==  in frame #1, created by setsock (crawler.c:749)
==319842== 
==319842== Syscall param epoll_ctl(event) points to uninitialised byte(s)
==319842==    at 0x515AACE: epoll_ctl (syscall-template.S:78)
==319842==    by 0x10C61C: setsock (crawler.c:769)
==319842==    by 0x10CE8F: sock_cb (crawler.c:923)
==319842==    by 0x48B70B1: singlesocket (multi.c:2593)
==319842==    by 0x48B7878: multi_socket (multi.c:2839)
==319842==    by 0x48B8053: curl_multi_socket_action (multi.c:2956)
==319842==    by 0x10C3BA: event_cb (crawler.c:703)
==319842==    by 0x10D828: crawler_init (crawler.c:1154)
==319842==    by 0x10DAE8: main (crawler.c:1207)
==319842==  Address 0x1ffefe25a4 is on thread 1's stack
==319842==  in frame #1, created by setsock (crawler.c:749)
==319842== 
==319842== Invalid write of size 1
==319842==    at 0x48436E4: mempcpy (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==319842==    by 0x50CD1D8: _IO_default_xsputn (genops.c:386)
==319842==    by 0x50CD1D8: _IO_default_xsputn (genops.c:370)
==319842==    by 0x50B227B: __vfprintf_internal (vfprintf-internal.c:1688)
==319842==    by 0x50C0278: __vsprintf_internal (iovsprintf.c:95)
==319842==    by 0x509D047: sprintf (sprintf.c:30)
==319842==    by 0x10B88F: html_link_find (crawler.c:452)
==319842==    by 0x10BD6F: html_parse (crawler.c:536)
==319842==    by 0x10C2CB: check_multi_info (crawler.c:678)
==319842==    by 0x10C3DA: event_cb (crawler.c:706)
==319842==    by 0x10D828: crawler_init (crawler.c:1154)
==319842==    by 0x10DAE8: main (crawler.c:1207)
==319842==  Address 0xf107d18 is 0 bytes after a block of size 8,200 alloc'd
==319842==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==319842==    by 0x10B736: html_link_find (crawler.c:440)
==319842==    by 0x10BD6F: html_parse (crawler.c:536)
==319842==    by 0x10C2CB: check_multi_info (crawler.c:678)
==319842==    by 0x10C3DA: event_cb (crawler.c:706)
==319842==    by 0x10D828: crawler_init (crawler.c:1154)
==319842==    by 0x10DAE8: main (crawler.c:1207)
==319842== 

valgrind: m_mallocfree.c:305 (get_bszB_as_is): Assertion 'bszB_lo == bszB_hi' failed.
valgrind: Heap block lo/hi size mismatch: lo = 8272, hi = 3625731377157460067.
This is probably caused by your program erroneously writing past the
end of a heap block and corrupting heap metadata.  If you fix any
invalid writes reported by Memcheck, this assertion failure will
probably go away.  Please try that before reporting this as a bug.


host stacktrace:
==319842==    at 0x58046FFA: ??? (in /usr/lib/x86_64-linux-gnu/valgrind/memcheck-amd64-linux)
==319842==    by 0x58047127: ??? (in /usr/lib/x86_64-linux-gnu/valgrind/memcheck-amd64-linux)
==319842==    by 0x580472CB: ??? (in /usr/lib/x86_64-linux-gnu/valgrind/memcheck-amd64-linux)
==319842==    by 0x580514B4: ??? (in /usr/lib/x86_64-linux-gnu/valgrind/memcheck-amd64-linux)
==319842==    by 0x5803DE9A: ??? (in /usr/lib/x86_64-linux-gnu/valgrind/memcheck-amd64-linux)
==319842==    by 0x5803CD9F: ??? (in /usr/lib/x86_64-linux-gnu/valgrind/memcheck-amd64-linux)
==319842==    by 0x58041F04: ??? (in /usr/lib/x86_64-linux-gnu/valgrind/memcheck-amd64-linux)
==319842==    by 0x5803C1D8: ??? (in /usr/lib/x86_64-linux-gnu/valgrind/memcheck-amd64-linux)
==319842==    by 0x1002EC6B18: ???
==319842==    by 0x1002CADF2F: ???
==319842==    by 0x1002CADF17: ???
==319842==    by 0x1002CADF2F: ???
==319842==    by 0x1002CADF3F: ???

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable (lwpid 319842)
==319842==    at 0x50CD2B4: _IO_default_xsputn (genops.c:394)
==319842==    by 0x50CD2B4: _IO_default_xsputn (genops.c:370)
==319842==    by 0x50B2165: __vfprintf_internal (vfprintf-internal.c:1719)
==319842==    by 0x50C0278: __vsprintf_internal (iovsprintf.c:95)
==319842==    by 0x509D047: sprintf (sprintf.c:30)
==319842==    by 0x10B88F: html_link_find (crawler.c:452)
==319842==    by 0x10BD6F: html_parse (crawler.c:536)
==319842==    by 0x10C2CB: check_multi_info (crawler.c:678)
==319842==    by 0x10C3DA: event_cb (crawler.c:706)
==319842==    by 0x10D828: crawler_init (crawler.c:1154)
==319842==    by 0x10DAE8: main (crawler.c:1207)
client stack range: [0x1FFEFDB000 0x1FFF000FFF] client SP: 0x1FFEFDB600
valgrind stack range: [0x1002BAE000 0x1002CADFFF] top usage: 10344 of 1048576


Note: see also the FAQ in the source distribution.
It contains workarounds to several common problems.
In particular, if Valgrind aborted or crashed after
identifying problems in your program, there's a good chance
that fixing those problems will prevent Valgrind aborting or
crashing, especially if it happened in m_mallocfree.c.

If that doesn't help, please report this bug to: www.valgrind.org

In the bug report, send all the above text, the valgrind
version, and what OS and version you are using.  Thanks.
Asked by James Read, Jun 04 '20 22:06


1 Answer

Transferring comment to answer.

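The valgrind output from the crash shows that html_link_find() allocated 8,200 bytes at line 440 and then called sprintf() at line 452 to format something, and that call wrote at least one byte past the end of the allocation. That is enough to corrupt the heap's bookkeeping data, so a later, unrelated malloc() call (the one at line 455 in your backtrace) aborts when it finds the damage. So yes, the line where malloc() fails can be perfectly fine.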

One partial fix would be to use snprintf() instead of sprintf(), but you would also need to check the return value to detect truncated output. There's a chance that some data that was supposed to be null-terminated isn't, which may be causing part of the overflow.
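As a rough illustration of checking the snprintf() return value, here is a minimal sketch. The function names join_url() and format_link() and the "%s/%s" format are hypothetical and are not taken from crawler.c; the real format string at line 452 may look quite different.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from crawler.c): formats "base/path" into a
 * caller-supplied buffer of cap bytes. Returns 0 on success, or -1 if
 * the output was truncated or snprintf() reported an error. */
int format_link(char *dst, size_t cap, const char *base, const char *path)
{
    assert(dst != NULL && base != NULL && path != NULL);

    /* snprintf() never writes more than cap bytes, including the '\0'.
     * It returns the length the full string would have had, so a value
     * >= cap means the output did not fit. */
    int n = snprintf(dst, cap, "%s/%s", base, path);
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}

/* Hypothetical helper: the same formatting, but the buffer is sized
 * from the data instead of a fixed guess. Caller frees the result. */
char *join_url(const char *base, const char *path)
{
    assert(base != NULL && path != NULL);

    /* With a size of 0, snprintf() writes nothing and only reports the
     * length needed (excluding the terminating '\0'). */
    int needed = snprintf(NULL, 0, "%s/%s", base, path);
    if (needed < 0)
        return NULL;

    size_t cap = (size_t)needed + 1;   /* +1 for the '\0' */
    char *out = malloc(cap);
    if (out == NULL)
        return NULL;

    int written = snprintf(out, cap, "%s/%s", base, path);
    if (written < 0 || (size_t)written >= cap) {
        free(out);
        return NULL;
    }
    return out;
}
```

With a fixed-size buffer, format_link() turns a would-be overflow into an error the caller can handle (skip the link, log it, and so on). join_url() avoids the fixed size altogether, which may suit URLs of unpredictable length better. Neither helps if an input string is not null-terminated in the first place, so that still needs checking separately.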

Either way, it looks to me like that's where you need to start looking.

Answered by Jonathan Leffler, Nov 11 '22 10:11