Opened 9 years ago

Last modified 18 months ago

#122 new bug

Dynamic process context IDs

Reported by: "Rajeev Thakur" <thakur@…> Owned by: loden
Priority: long-term Milestone: future
Component: mpich Keywords:
Cc:

Description (last modified by balaji)

Hi Roberto,

We have done several rounds of checks and do not see any difference between MPICH2 1.0.7 and the TCP/IP interface of MVAPICH2 1.2. Both of these should perform exactly the same. We are continuing our investigation.

We are wondering whether you can send us a sample piece of code that reproduces the problem you are seeing across these two interfaces. That would help us debug this problem faster and help you solve it.

I've added other CCs to this email; other people may be interested in taking a look.

Attached you will find the test program I'm working on to reproduce the problem. I'm not completely sure it works perfectly, since I was never able to complete an execution, so please let me know if I got something wrong in the code. The testmaster is quite simple: you provide the number of jobs to simulate (say 50000) and the node file that the resource manager provides for its schedule. The node that matches the master is excluded from the slave nodes.

The testmaster creates a ring of threads from the assigned nodes. Walking the ring, a thread is started for each free node it finds, so you should end up with as many threads as assigned nodes, all running concurrently. To simulate work, each thread internally generates a random integer, sets some MPI_Info keys (host and pwd), spawns the testslave job, sends it the generated random number, and waits for the testslave to receive and echo that number back; the sent and received numbers are compared to verify their coherency. The slave then issues an empty MPI_Send() to signal its termination, the thread calls MPI_Comm_disconnect() to close the slave connection, and finally all the MPI_Info objects are cleared. At that point the thread terminates. When the requested number of jobs have been worked out correctly, the application should terminate ... but without cleaning up (too tired, sorry ;-), so it just waits a bit and finalizes MPI.
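
For reference, the per-thread cycle described above boils down to something like the following sketch. This is a hypothetical reconstruction, not code from the attached testmaster.c: run_one_job() and its argument are assumptions.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical sketch of one thread's spawn/echo cycle as described above.
 * Assumes MPI was initialized with MPI_THREAD_MULTIPLE so that concurrent
 * spawns from several threads are legal. */
static int run_one_job(const char *hostname)
{
    MPI_Comm slave;
    MPI_Info info;
    int sent, recvd;

    /* Direct the spawn to the free node taken from the ring. */
    MPI_Info_create(&info);
    MPI_Info_set(info, "host", (char *) hostname);

    /* Spawn a single testslave process. */
    MPI_Comm_spawn("testslave", MPI_ARGV_NULL, 1, info, 0,
                   MPI_COMM_SELF, &slave, MPI_ERRCODES_IGNORE);

    /* Simulated work: send a random integer and wait for the echo. */
    sent = rand();
    MPI_Send(&sent, 1, MPI_INT, 0, 0, slave);
    MPI_Recv(&recvd, 1, MPI_INT, 0, 0, slave, MPI_STATUS_IGNORE);

    /* The slave signals its termination with an empty send. */
    MPI_Recv(NULL, 0, MPI_INT, 0, 1, slave, MPI_STATUS_IGNORE);

    /* Close the connection and drop the info object. */
    MPI_Comm_disconnect(&slave);
    MPI_Info_free(&info);

    return sent == recvd;   /* coherency check */
}

The slave side would mirror this: MPI_Comm_get_parent(), receive the integer, echo it back, send the empty termination message, and MPI_Comm_disconnect() the parent communicator.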

So far I have not been able to complete any execution; the application is still crashing with the backtrace you will find below. Only once did I manage to reach 3500 jobs, and even then one thread was stuck on a mutex. Looking at the backtrace you can see the same race I'm getting in my applications.

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1087666512 (LWP 18231)]
0x00000000006a3902 in MPIDI_PG_Dup_vcr () from /home/roberto/.HRI/Proxy/HRI/External/mpich2/1.0.7/lib/linux-x86_64-gcc-glibc2.3.4/libmpich.so.1.1
Missing separate debuginfos, use: debuginfo-install glibc.x86_64
(gdb) info threads
  29 Thread 1121462608 (LWP 18232)  0x0000003465a0a8f9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
* 28 Thread 1087666512 (LWP 18231)  0x00000000006a3902 in MPIDI_PG_Dup_vcr () from /home/roberto/.HRI/Proxy/HRI/External/mpich2/1.0.7/lib/linux-x86_64-gcc-glibc2.3.4/libmpich.so.1.1
  27 Thread 1142442320 (LWP 18230)  0x0000003464ecbd66 in poll () from /lib64/libc.so.6
  26 Thread 1098156368 (LWP 18229)  0x0000003464e9ac61 in nanosleep () from /lib64/libc.so.6
  1 Thread 140135980537584 (LWP 18029)  main (argc=3, argv=0x7ffffb5992d8) at testmaster.c:437

(gdb) bt
#0  0x00000000006a3902 in MPIDI_PG_Dup_vcr () from /home/roberto/.HRI/Proxy/HRI/External/mpich2/1.0.7/lib/linux-x86_64-gcc-glibc2.3.4/libmpich.so.1.1
#1  0x0000000000668012 in SetupNewIntercomm () from /home/roberto/.HRI/Proxy/HRI/External/mpich2/1.0.7/lib/linux-x86_64-gcc-glibc2.3.4/libmpich.so.1.1
#2  0x00000000006682c8 in MPIDI_Comm_accept () from /home/roberto/.HRI/Proxy/HRI/External/mpich2/1.0.7/lib/linux-x86_64-gcc-glibc2.3.4/libmpich.so.1.1
#3  0x00000000006a6617 in MPID_Comm_accept () from /home/roberto/.HRI/Proxy/HRI/External/mpich2/1.0.7/lib/linux-x86_64-gcc-glibc2.3.4/libmpich.so.1.1
#4  0x000000000065ec5f in MPIDI_Comm_spawn_multiple () from /home/roberto/.HRI/Proxy/HRI/External/mpich2/1.0.7/lib/linux-x86_64-gcc-glibc2.3.4/libmpich.so.1.1
#5  0x00000000006a17e6 in MPID_Comm_spawn_multiple () from /home/roberto/.HRI/Proxy/HRI/External/mpich2/1.0.7/lib/linux-x86_64-gcc-glibc2.3.4/libmpich.so.1.1
#6  0x00000000006783fd in PMPI_Comm_spawn () from /home/roberto/.HRI/Proxy/HRI/External/mpich2/1.0.7/lib/linux-x86_64-gcc-glibc2.3.4/libmpich.so.1.1
#7  0x00000000004017de in NodeThread_threadMain (arg=0x120a790) at testmaster.c:314
#8  0x0000003465a06407 in start_thread () from /lib64/libpthread.so.0
#9  0x0000003464ed4b0d in clone () from /lib64/libc.so.6
(gdb) thread 29
[Switching to thread 29 (Thread 1121462608 (LWP 18232))]#0  0x0000003465a0a8f9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
(gdb) bt
#0  0x0000003465a0a8f9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x000000000065e2e7 in MPIDI_CH3I_Progress () from /home/roberto/.HRI/Proxy/HRI/External/mpich2/1.0.7/lib/linux-x86_64-gcc-glibc2.3.4/libmpich.so.1.1
#2  0x00000000006675ca in FreeNewVC () from /home/roberto/.HRI/Proxy/HRI/External/mpich2/1.0.7/lib/linux-x86_64-gcc-glibc2.3.4/libmpich.so.1.1
#3  0x0000000000668302 in MPIDI_Comm_accept () from /home/roberto/.HRI/Proxy/HRI/External/mpich2/1.0.7/lib/linux-x86_64-gcc-glibc2.3.4/libmpich.so.1.1
#4  0x00000000006a6617 in MPID_Comm_accept () from /home/roberto/.HRI/Proxy/HRI/External/mpich2/1.0.7/lib/linux-x86_64-gcc-glibc2.3.4/libmpich.so.1.1
#5  0x000000000065ec5f in MPIDI_Comm_spawn_multiple () from /home/roberto/.HRI/Proxy/HRI/External/mpich2/1.0.7/lib/linux-x86_64-gcc-glibc2.3.4/libmpich.so.1.1
#6  0x00000000006a17e6 in MPID_Comm_spawn_multiple () from /home/roberto/.HRI/Proxy/HRI/External/mpich2/1.0.7/lib/linux-x86_64-gcc-glibc2.3.4/libmpich.so.1.1
#7  0x00000000006783fd in PMPI_Comm_spawn () from /home/roberto/.HRI/Proxy/HRI/External/mpich2/1.0.7/lib/linux-x86_64-gcc-glibc2.3.4/libmpich.so.1.1
#8  0x00000000004017de in NodeThread_threadMain (arg=0x120d590) at testmaster.c:314
#9  0x0000003465a06407 in start_thread () from /lib64/libpthread.so.0
#10 0x0000003464ed4b0d in clone () from /lib64/libc.so.6
(gdb) thread 27
[Switching to thread 27 (Thread 1142442320 (LWP 18230))]#0  0x0000003464ecbd66 in poll () from /lib64/libc.so.6
(gdb) bt
#0  0x0000003464ecbd66 in poll () from /lib64/libc.so.6
#1  0x00000000006d63bf in MPIDU_Sock_wait () from /home/roberto/.HRI/Proxy/HRI/External/mpich2/1.0.7/lib/linux-x86_64-gcc-glibc2.3.4/libmpich.so.1.1
#2  0x000000000065e1e7 in MPIDI_CH3I_Progress () from /home/roberto/.HRI/Proxy/HRI/External/mpich2/1.0.7/lib/linux-x86_64-gcc-glibc2.3.4/libmpich.so.1.1
#3  0x00000000006cf87c in PMPI_Send () from /home/roberto/.HRI/Proxy/HRI/External/mpich2/1.0.7/lib/linux-x86_64-gcc-glibc2.3.4/libmpich.so.1.1
#4  0x0000000000401831 in NodeThread_threadMain (arg=0x120a6f0) at testmaster.c:480
#5  0x0000003465a06407 in start_thread () from /lib64/libpthread.so.0
#6  0x0000003464ed4b0d in clone () from /lib64/libc.so.6

(gdb) thread 26
[Switching to thread 26 (Thread 1098156368 (LWP 18229))]#0  0x0000003464e9ac61 in nanosleep () from /lib64/libc.so.6
(gdb) bt
#0  0x0000003464e9ac61 in nanosleep () from /lib64/libc.so.6
#1  0x0000003464e9aa84 in sleep () from /lib64/libc.so.6
#2  0x000000000040197c in NodeThread_threadMain (arg=0x120d630) at testmaster.c:505
#3  0x0000003465a06407 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003464ed4b0d in clone () from /lib64/libc.so.6
(gdb) 

Attachments (7)

Makefile.dat (428 bytes) - added by Rajeev Thakur 9 years ago.
Added by email2trac
part0001.html (9.3 KB) - added by Rajeev Thakur 9 years ago.
Added by email2trac
testmaster.c (10.5 KB) - added by Rajeev Thakur 9 years ago.
Added by email2trac
testmaster.pbs (788 bytes) - added by Rajeev Thakur 9 years ago.
Added by email2trac
testmaster.sh (171 bytes) - added by Rajeev Thakur 9 years ago.
Added by email2trac
testslave.c (1.6 KB) - added by Rajeev Thakur 9 years ago.
Added by email2trac
testslave.sh (119 bytes) - added by Rajeev Thakur 9 years ago.
Added by email2trac


Change History (22)

Changed 9 years ago by Rajeev Thakur

Added by email2trac

comment:1 Changed 9 years ago by Rajeev Thakur

  • id set to 122

This message has 7 attachment(s)

comment:2 Changed 9 years ago by balaji

  • Owner set to balaji

comment:3 Changed 9 years ago by thakur

  • Milestone set to mpich2-1.0.8

comment:4 Changed 9 years ago by balaji

  • Resolution set to fixed
  • Status changed from new to closed

This is fixed in the trunk. We decided against including the fixes in the 1.0.x series since they are too intrusive. Resolving.

comment:5 Changed 9 years ago by balaji

  • Description modified (diff)
  • Resolution fixed deleted
  • Status changed from closed to reopened
  • Summary changed from FW: [MPICH2 Req #4194] Re: [mvapich-discuss] Races with MPI_THREAD_MULTI to Dynamic process context IDs

Reopening this ticket as there seem to be a million bugs in the dynamic process code, and I just fixed one of them.

There's no synchronization between the connector and the acceptor on the context ID, so depending on the application, the context IDs used on each side can go out of sync. This probably needs to be fixed by adding more synchronization between the connector and the acceptor.
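
For context, the connector/acceptor pair in question is the one set up by the MPI dynamic process calls. A minimal illustration (an assumed example, not code from this ticket) where rank 0 accepts and rank 1 connects:

#include <mpi.h>

/* Minimal connect/accept pair (assumed illustration): rank 0 of
 * MPI_COMM_WORLD is the acceptor, rank 1 the connector.  Internally, the
 * two sides must also agree on a temporary context ID for the setup
 * communicator -- the synchronization described above as missing. */
int main(int argc, char **argv)
{
    int rank;
    char port[MPI_MAX_PORT_NAME];
    MPI_Comm inter;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        MPI_Open_port(MPI_INFO_NULL, port);
        /* Hand the port string to the connector out of band. */
        MPI_Send(port, MPI_MAX_PORT_NAME, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
        MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &inter);
        MPI_Close_port(port);
        MPI_Comm_disconnect(&inter);
    } else if (rank == 1) {
        MPI_Recv(port, MPI_MAX_PORT_NAME, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &inter);
        MPI_Comm_disconnect(&inter);
    }

    MPI_Finalize();
    return 0;
}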

comment:6 Changed 9 years ago by balaji

  • Milestone changed from mpich2-1.0.8 to mpich2-1.1a2

comment:7 Changed 9 years ago by Rajeev Thakur

Reopening this ticket as there seem to be a million bugs in the dynamic process code, and I just fixed one of them.

There's no synchronization between the connector and the acceptor on the context ID, so depending on the application, the context IDs used on each side can go out of sync. This probably needs to be fixed by adding more synchronization between the connector and the acceptor.

Just want to clarify that this issue only affects the temporary communicator set up between the two roots to exchange the information needed to create the real intercommunicator. The context IDs used in the intercommunicator are ok.

Rajeev

comment:8 Changed 9 years ago by balaji

  • Resolution set to fixed
  • Status changed from reopened to closed

This has been fixed in [aec13eba1787f57134e367cd272ea7c74436b82a] (reviewed by thakur). Resolving.

comment:9 Changed 9 years ago by balaji

  • Priority changed from major to long-term
  • Resolution fixed deleted
  • Status changed from closed to reopened

Reopening this ticket as it still doesn't fix the case where one slave connects to multiple masters.

comment:10 Changed 9 years ago by balaji

  • Milestone changed from mpich2-1.1a2 to mpich2-1.1b1
  • Owner changed from balaji to goodell
  • Status changed from reopened to new

comment:11 Changed 9 years ago by balaji

  • Milestone changed from mpich2-1.1b1 to mpich2-1.1

comment:12 Changed 9 years ago by balaji

  • Milestone changed from mpich2-1.1 to mpich2-1.1.1

comment:13 Changed 8 years ago by goodell

  • Milestone changed from mpich2-1.1.1 to mpich2-1.1.2

comment:14 Changed 8 years ago by balaji

  • Milestone changed from mpich2-1.1.2 to mpich2-1.2

Milestone mpich2-1.1.2 deleted

comment:15 Changed 8 years ago by goodell

  • Milestone changed from mpich2-1.2 to mpich2-1.2.1
  • Status changed from new to accepted

Replying to Rajeev Thakur:

Just want to clarify that this issue only affects the temporary communicator set up between the two roots to exchange the information needed to create the real intercommunicator. The context IDs used in the intercommunicator are ok.

It turns out that the context ID used for intercomm_merge is very wrong, though. See source:mpich2/trunk/src/mpi/intercomm_merge.c#212 and http://wiki.mcs.anl.gov/mpich2/index.php/Communicators_and_Context_IDs. Adding 2 just puts the context ID in the hierarchical communicator space.

-Dave
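
For reference, the code path Dave refers to can be exercised by a spawn followed by a merge. This is an assumed usage sketch, not code from the ticket; "testslave" stands in for any spawned binary.

#include <mpi.h>

/* Assumed usage sketch of the path comment:15 refers to: merging a spawn
 * intercommunicator makes the implementation derive the merged
 * intracommunicator's context ID from the intercommunicator's, which is
 * where the questionable "+ 2" offset is applied. */
int main(int argc, char **argv)
{
    MPI_Comm inter, merged;

    MPI_Init(&argc, &argv);

    /* Spawn one slave, then merge the resulting intercommunicator. */
    MPI_Comm_spawn("testslave", MPI_ARGV_NULL, 1, MPI_INFO_NULL, 0,
                   MPI_COMM_SELF, &inter, MPI_ERRCODES_IGNORE);
    MPI_Intercomm_merge(inter, 0 /* this group orders low */, &merged);

    MPI_Comm_free(&merged);
    MPI_Comm_disconnect(&inter);
    MPI_Finalize();
    return 0;
}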

comment:16 Changed 8 years ago by balaji

  • Description modified (diff)

Tweaked the description formatting.

comment:17 Changed 8 years ago by goodell

  • Milestone changed from mpich2-1.2.1 to mpich2-1.3

comment:18 Changed 7 years ago by thakur

  • Milestone changed from mpich2-1.3 to future

comment:19 Changed 5 years ago by balaji

  • Description modified (diff)
  • Status changed from accepted to new

comment:20 Changed 5 years ago by balaji

  • Owner goodell deleted

comment:21 Changed 18 months ago by raffenet

  • Owner set to loden
  • Owner set to loden