
Using valgrind to spot an error in MPI code

Tags:

valgrind

mpi

I have a program that works perfectly in serial, but when run with mpirun -n 2 ./out it gives the following error:

*** Error in `./out': malloc(): smallbin double linked list corrupted: 0x00000000024aa090 ***

I tried running it under valgrind like this:

valgrind --leak-check=yes mpirun -n 2 ./out

I got the following output:

==3494== Memcheck, a memory error detector
==3494== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==3494== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==3494== Command: mpirun -n 2 ./out
==3494== 
Grid_0/NACA0012.msh
Grid_0/NACA0012.msh
>>> Number of cells: 7734
>>> Number of cells: 7734
0.000000  0         1.470622e-02
*** Error in `./out': malloc(): smallbin double linked list corrupted: 0x00000000024aa090 ***

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   PID 3497 RUNNING AT orhan
=   EXIT CODE: 134
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Aborted (signal 6)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
==3494== 
==3494== HEAP SUMMARY:
==3494==     in use at exit: 131,120 bytes in 2 blocks
==3494==   total heap usage: 1,064 allocs, 1,062 frees, 231,859 bytes allocated
==3494== 
==3494== LEAK SUMMARY:
==3494==    definitely lost: 0 bytes in 0 blocks
==3494==    indirectly lost: 0 bytes in 0 blocks
==3494==      possibly lost: 0 bytes in 0 blocks
==3494==    still reachable: 131,120 bytes in 2 blocks
==3494==         suppressed: 0 bytes in 0 blocks
==3494== Reachable blocks (those to which a pointer was found) are not shown.
==3494== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==3494== 
==3494== For counts of detected and suppressed errors, rerun with: -v
==3494== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

I am not experienced with valgrind, but as far as I can tell it reported no problems. Are there better valgrind options for locating the source of the specific error mentioned above?

asked Jan 18 '16 at 09:01 by Shibli



1 Answer

Note that with the invocation above,

valgrind --leak-check=yes mpirun -n 2 ./out

you are running valgrind on the program mpirun, which presumably has been extensively tested and works correctly, and not the program ./out, which you know to have a problem.

To run valgrind on your own program instead, swap the order so that mpirun launches valgrind:

mpirun -n 2 valgrind --leak-check=yes ./out

This uses mpirun to launch 2 processes, each running valgrind --leak-check=yes ./out.
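With two ranks writing valgrind reports to the same terminal, the output can interleave. A minimal sketch of one way to keep them apart is below. It assumes ./out was compiled with -g so valgrind can report line numbers. The RUN variable and the valgrind.%p.log file name are choices made for this illustration; --log-file and --track-origins are standard valgrind options, and valgrind expands %p to the process ID. The script only prints the command unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch: build a per-rank valgrind command line for an MPI program.
# NP and PROG mirror the question; RUN is a hypothetical dry-run switch.
NP=2
PROG=./out

# valgrind expands %p to the PID, so each rank writes its own log file.
# --track-origins=yes additionally reports where uninitialised values came from.
CMD="mpirun -n $NP valgrind --leak-check=yes --track-origins=yes --log-file=valgrind.%p.log $PROG"

# Show the command that would be executed.
echo "$CMD"

# Only launch when explicitly requested, e.g. RUN=1 sh run_valgrind.sh
if [ "${RUN:-0}" = "1" ]; then
    $CMD
fi
```

After a real run, each valgrind.<pid>.log holds the report for one rank. For a malloc() corruption message like the one in the question, an "Invalid write" entry in one of those logs is the first thing to look for.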

answered Sep 21 '22 at 06:09 by Jonathan Dursi