
`MPI_ERR_TRUNCATE: message truncated` error [closed]

I was facing a problem similar to the one discussed in this topic. I have a piece of MPI code that sums the lines of a vector with a specific number of lines. I attach the code here.

When I run it on a single core with mpirun -n 1 ./program I obtain:

500000
 sum    125000250000.00000       calculated by root process.
 The grand total is:    125000250000.00000

Because only one core computes the sum, this looks OK. But when I try to use multiple cores with mpirun -n 4 ./program I obtain:

please enter the number of numbers to sum:
500000
[federico-C660:9540] *** An error occurred in MPI_Recv
[federico-C660:9540] *** on communicator MPI_COMM_WORLD
[federico-C660:9540] *** MPI_ERR_TRUNCATE: message truncated
[federico-C660:9540] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
 sum    7812562500.0000000       calculated by root process.
--------------------------------------------------------------------------
mpirun has exited due to process rank 1 with PID 9539 on
node XXXXX1 exiting without calling "finalize".

I also read about a similar problem with a C program here. The same happens with 2 and 3 processors.

Could someone help me figure out what the problem is? My guess is that I made a mistake in the MPI_RECV call related to the sender.

Panichi Pattumeros PapaCastoro asked Feb 15 '26 12:02


1 Answer

There were a couple of problems in the code:

  1. The most obvious problem was the mix-up between the receive variables num_rows_to_receive and num_rows_to_received. You receive the row count sent by the root_process into num_rows_to_received, but then use the variable num_rows_to_receive as the count when actually receiving the vector. Both receives must use the same variable:

         CALL mpi_recv (num_rows_to_receive, 1, mpi_integer, root_process, mpi_any_tag, mpi_comm_world, STATUS, ierr)
         CALL mpi_recv (vector2, num_rows_to_receive, mpi_real8, root_process, mpi_any_tag, mpi_comm_world, STATUS, ierr)

This should resolve the error.

  2. The second problem (at least as I could see it on my system) is that the MPI_REAL datatype defaults to MPI_REAL4, and the values of the vector get truncated, so we won't receive the actual summation of all the elements. Changing mpi_real to MPI_REAL8 fixes the summation issue, and you get the exact summation value for any number of ranks:

         ~/temp$ mpirun -n 8 ./a.out
         please enter the number of numbers to sum:
         500000
          sum    1953156250.0000000       calculated by root process.
         partial sum    5859406250.0000000       returned from process 1
         partial sum    9765656250.0000000       returned from process 2
         partial sum    17578156250.000000       returned from process 4
         partial sum    21484406250.000000       returned from process 5
         partial sum    13671906250.000000       returned from process 3
         partial sum    25390656250.000000       returned from process 6
         partial sum    29296906250.000000       returned from process 7
         The grand total is:    125000250000.00000
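As a side note on why the 4-byte type loses the result: here is a minimal Python sketch (not part of the original answer, stdlib only) that simulates single-precision accumulation by rounding every partial sum to 32 bits, the way values stored as MPI_REAL4 would be, and compares it with double precision:

```python
import struct

def to_f32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float,
    mimicking storage in a REAL*4 / MPI_REAL4 variable."""
    return struct.unpack('f', struct.pack('f', x))[0]

N = 500000
exact = N * (N + 1) // 2          # 125000250000, the grand total in the question

# Double-precision accumulation (what MPI_REAL8 carries): every intermediate
# sum is an integer below 2**53, so the result is exact.
sum64 = 0.0
for i in range(1, N + 1):
    sum64 += i

# Single-precision accumulation: each partial sum is rounded to a 24-bit
# mantissa, so precision is lost long before the total is reached.
sum32 = 0.0
for i in range(1, N + 1):
    sum32 = to_f32(sum32 + i)

print(sum64 == exact)   # True: double precision holds the total exactly
print(sum32 == exact)   # False: 125000250000 is not even representable in float32
```

This is only an illustration of the precision argument; the MPI_ERR_TRUNCATE abort itself comes from the count/variable mismatch fixed in point 1.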
naveen-rn answered Feb 21 '26 14:02