Opened 5 years ago

Closed 5 years ago

Last modified 5 years ago

#1836 closed bug (fixed)

Debugger changes from IBM.

Reported by: blocksom Owned by: raffenet
Priority: major Milestone: future
Component: mpich Keywords: ibm-integ


There are misc debugger changes that were not contributed to the mpich master branch. As there is no individual commit that contained these changes, these are likely leftovers from a goofed up merge.

Regardless, this code exists in the internal ibm repository and is not in the mpich repository. I would like to discuss these changes and perhaps find a solution that will allow this (or similar) code to be added to the mpich master branch so that it would no longer be necessary to maintain this code separately.

Attachments (1)

eagerdt.c (1.5 KB) - added by blocksom 5 years ago.

Download all attachments as: .zip

Change History (19)

comment:1 Changed 5 years ago by blocksom

See branch mpich-ibm/ticket-1835.

comment:2 Changed 5 years ago by blocksom

Bah! .. I meant branch mpich-master/ticket-1836.

comment:3 Changed 5 years ago by balaji

  • Milestone set to mpich-3.1
  • Owner set to gropp

Assigning this to Bill. Bill, can you look at this patch from IBM?

comment:4 Changed 5 years ago by gropp

Looks ok to me (the branch is mpich-ibm/ticket-1836). But someone with access to a debugger that implements the interface should build and test this against the default ch3:nemesis build.

Last edited 5 years ago by balaji (previous) (diff)

comment:5 Changed 5 years ago by blocksom

I rebased the mpich-ibm/ticket-1836 branch to mpich/master.

comment:6 Changed 5 years ago by balaji

  • Owner changed from gropp to raffenet

comment:7 Changed 5 years ago by raffenet

  • Resolution set to fixed
  • Status changed from new to closed

I verified that the mpich-ibm/ticket-1836 branch updates the message queue correctly in Totalview. Pushed changes to mpich/master.

comment:8 Changed 5 years ago by balaji

Adding commit hash for posterity: [08c22ebe].

comment:9 Changed 5 years ago by raffenet

  • Resolution fixed deleted
  • Status changed from closed to reopened

I'm not sure how I missed this, but this change seems to break the build in mpich/master and mpich-ibm/ticket-1836 with the below compiler error.

src/mpi/debugger/dbginit.c:337:18: error: no member named 'mpid' in 'struct MPID_Request'
            req-> = NULL;
            ~~~  ^
src/mpi/debugger/dbginit.c:349:10: error: no member named 'mpid' in 'struct MPID_Request'
    req-> = (MPID_Request *)p; /* overload 'next' for debugger SEND queue */
    ~~~  ^
src/mpi/debugger/dbginit.c:359:31: error: no member named 'mpid' in 'struct MPID_Request'
    p    = (MPIR_Sendq *)req->;
                         ~~~  ^
3 errors generated.

comment:10 Changed 5 years ago by balaji

This needs to be reverted. All tests are failing.

comment:11 Changed 5 years ago by raffenet

Reverted and pushed on commit [ef0c4526]

comment:12 Changed 5 years ago by raffenet


We haven't integrated these changes since the mpich-ibm/ticket-1836 branch fails to build. Are there additional changes that didn't make in yet?


comment:13 Changed 5 years ago by balaji

  • Milestone changed from mpich-3.1 to future

Pushing this to "future". Needs an updated patch from IBM.

comment:14 Changed 5 years ago by blocksom

It looks like this debugger code is using a structure field that is only defined for pamid. Both the pamid and ch3 devices have a struct MPID_Request * next; pointer in the mpid structure - it's just that ch3 doesn't define a mpid field as part of the request structure like pamid does.

$ git grep -n "mpid;"
src/mpid/pamid/include/mpidi_hooks.h:54:#define MPID_DEV_REQUEST_DECL    struct MPIDI_Request mpid;
src/mpid/pamid/include/mpidi_hooks.h:57:#define MPID_DEV_COMM_DECL       struct MPIDI_Comm    mpid;
src/mpid/pamid/include/mpidi_hooks.h:60:#define MPID_DEV_WIN_DECL        struct MPIDI_Win     mpid;
src/mpid/pamid/src/mpid_request.h:193:  struct MPIDI_Request* mpid = &req->mpid;
src/include/mpiimpl.h:1457:  saved within the device-specific fields provided by 'MPID_DEV_REQUEST_DECL'.
src/include/mpiimpl.h:1488:#ifdef MPID_DEV_REQUEST_DECL
src/include/mpiimpl.h:1489:    MPID_DEV_REQUEST_DECL
src/mpid/ch3/include/mpidpre.h:403:#define MPID_DEV_REQUEST_DECL                        \
src/mpid/ch3/include/mpidpre.h:407:#define MPID_DEV_REQUEST_DECL                        \
src/mpid/pamid/include/mpidi_hooks.h:54:#define MPID_DEV_REQUEST_DECL    struct MPIDI_Request mpid;

I'm wary of making changes to ch3, and also to common mpich code. Is there a way we can put this "next" pointer directly in the request somehow? maybe call it "debugger_next" instead of overloading the mpid "next" pointer (as documented in the code)?

FWIW, this code change caused a test that was taking 20 minutes to complete to finish in only 30 seconds. Here is some information on the original problem determination:

I'm seeing what I consider odd behavior with the testcase in the shared area. If I use MAX_MSGS from ~60K - 206K, the testcase stalls for a very long time - up to 20 minutes. Compare a 30K run to 206K, for example. From the debug output, it looks like the first ~50K isend/recvs complete almost immediately and then there's a long stall while the sender processes waitalls and the receiver seems to stall on the next recv. Then suddenly everything completes.

I'm not sure why it isn't smoother with async progress? I thought my testcase was hung and was originally trying to debug that. It normally completes in 30 seconds but took 20 minutes. 40x runtime isn't the sort of scaling I expected by increasing 7x the number of isends.

I've attached the "testcase in the shared work area" referenced above to this ticket. The test was run like so:

$ /bgusr/bobc/bgq/work/comm/gcc/bin/mpicc -o eagerdt /bgusr/bobc/bgq/comm/lib/dev/mpich2/test/mpi/pt2pt/eagerdt.c -g -DMAX_MSGS=206000

$ runjob --label --cwd `pwd` --exe eagerdt --ranks-per-node 16 --block R01-M0-N02

Later on, when this patch was created, the following was added to the discussion:

I think the WAIT behavior is because of the debugger code, which does queue SENDs and then has to unqueue them during completion. Because the queue is a simple, single-ended, linked list the entire list has to be scanned to find the one to remove, and since we complete in order that is always the last one on the list. The debugger SEND queue is single-ended because it uses the same field(s) in the request as RECVs use for posted/unexpected queues, which don't need to be double-ended. It may be sufficient to keep another "tail - 1" pointer for the debugger SEND queue, since the SENDs are almost always completed in the order they were posted.

Turns out, the new MPICH2 debugger SEND queue uses a meta-data structure wrapped around the request, so we can make it a DE queue without much difficulty. Then, SENDs can be removed anywhere in the queue.

The fix I used to test was to over-load the MPID_Request-> field to save a pointer to the debugger meta-structure, since that was the difficulty when removing a SEND from the queue, in addition to making the meta-structure use a DE queue. Performance for WAIT then "normalizes", effectively flat and quick.

Changed 5 years ago by blocksom

comment:15 Changed 5 years ago by blocksom

I rebased this to master, pushed as branch mpich-ibm/ticket-1836-alt, and deleted branch mpich-ibm/ticket-1836.

comment:16 Changed 5 years ago by blocksom

I found the original "debugger" commits, fixed up the code so that it doesn't use a pamid-specific field in the request struct, and pushed everything to the branch mpich-ibm/ticket-1836.

It compiles for pamid:bgq and also ch3 (I think - I just did a configure without any options).

Please review and comment. Thanks!

comment:17 Changed 5 years ago by raffenet

  • Resolution set to fixed
  • Status changed from reopened to closed

I pulled in these changes and was able to successfully view the message queue using Totalview. Commits [e0a1cf6], [4e00ba3], and [92a7e41] pushed to master.

comment:18 Changed 5 years ago by balaji

Here are the commits with 8-character hashes (trac seems to have trouble creating links for the default 7-character hashes): [e0a1cf61], [4e00ba38], and [92a7e411].

Note: See TracTickets for help on using tickets.