question about mpich2 implementation
|Reported by:||michael <michael.bane@…>||Owned by:|
I believe the attached code should work correctly but I find that using the mpich2 implementation on one particular box it hangs for odd numbers of processors (not every time but frequently), whereas this code runs fine on another box I've tried (albeit with OpenMPI) Details are below and I'd welcome suggestions as to the cause of the problem. Note that if I add WRITE statements or use the debugger the problem appears to go away. Adding a FLUSH and BARRIER immediately after the WRITE stmt makes no difference. To confuse myself further, if I replace the MPI_Send() by MPI_SSend(), ie synchronous, sometimes the code completes whereas other times it appears to hang (see sync.out example at end) Thanks, Michael ~/RCS/myCourses/Intro_to_MPI/MPI_Intro_exercises$ cat mkb_ring_solution_send_and_recv_portable.f90 PROGRAM ring ! this program will work on all MPI implementations USE MPI IMPLICIT NONE ! since we're only sending a single message between any src/dest pair we can use a single tag INTEGER, PARAMETER :: myTag=101 INTEGER :: ierror, inputRank, myRank, size INTEGER :: sendTo, recvFrom INTEGER :: recv_status(MPI_STATUS_SIZE) ! initialise MPI CALL MPI_INIT(ierror) ! determine my rank and total size CALL MPI_COMM_RANK(MPI_COMM_WORLD, myRank, ierror) CALL MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierror) ! set up which process rank is to my right (ie clockwise) for sending sendTo = myRank + 1 IF (sendTo == size) sendTo = 0 recvFrom = myRank - 1 IF (recvFrom == -1) recvFrom = size-1 ! send my rank clockwise (from recvFrom to sendTo) ! to ensure nobody everybody is sending (and possibly waiting) at the same time, we split into even (send then recv) ! and odd (recv then send) if (mod(myRank,2)==0) then call mpi_send(myRank,1,MPI_INTEGER,sendTo, myTag, & MPI_COMM_WORLD,ierror) call mpi_recv(inputRank,1,MPI_INTEGER,recvFrom,myTag, & MPI_COMM_WORLD,recv_status,ierror) else call mpi_recv(inputRank,1,MPI_INTEGER,recvFrom,myTag, & MPI_COMM_WORLD,recv_status,ierror) call mpi_send(myRank,1,MPI_INTEGER,sendTo, myTag, & MPI_COMM_WORLD,ierror) endif write(*,*) 'i am #',myRank,' and I received a new rank=',inputRank CALL MPI_FINALIZE(ierror) END PROGRAM ring ~/RCS/myCourses/Intro_to_MPI/MPI_Intro_exercises$ mpif90 -O0 mkb_ring_solution_send_and_recv_portable.f90;mpif90 -show ifort: Command line warning: overriding '-O3' with '-O0' ifort -O3 -I/opt/mpi/mpibull2-0.9.7-2.t_RC4v4.3/include -L/home/horace/mccssmb2/.mpibull2/lib -L/opt/mpi/mpibull2-0.9.7-2.t_RC4v4.3/lib -lmpidev -lmpi -lrt -ldl -lelan -lelanctrl -lcpuset ~/RCS/myCourses/Intro_to_MPI/MPI_Intro_exercises$ date;prun -O -n 5 -p login ./a.out Sun Feb 15 19:06:56 GMT 2009 Using qxelan driver, build for MPIBull2 0.9.7-t (Ishtar) 20060726-1607 i am # 1 and I received a new rank= 0 i am # 2 and I received a new rank= 1 i am # 3 and I received a new rank= 2 i am # 4 and I received a new rank= 3 (CNTL-C) ~/RCS/myCourses/Intro_to_MPI/MPI_Intro_exercises$ date Sun Feb 15 19:23:47 GMT 2009 ~/RCS/myCourses/Intro_to_MPI/MPI_Intro_exercises$ prun -I -n 5 -p login ./sync.out Using qxelan driver, build for MPIBull2 0.9.7-t (Ishtar) 20060726-1607 i am # 1 and I received a new rank= 0 i am # 2 and I received a new rank= 1 [prun -O means over-commit if required, -I means no over-commit but fail without running if insufficient resources]
Change History (4)
Note: See TracTickets for help on using tickets.