Opened 5 years ago

Closed 5 years ago

#451 closed bug (invalid)

Hydra failures on octagon

Reported by: Pavan Balaji <balaji@…>
Owned by: balaji
Priority: major
Milestone: mpich2-1.1rc1
Component: mpich
Keywords:
Cc:

Description


The build fails on octagon when configured with hydra and --enable-strict.

make[5]: Leaving directory `/sandbox/thakur/tmp/src/pm/hydra/utils/launch'
make[5]: Entering directory `/sandbox/thakur/tmp/src/pm/hydra/utils/signals'
   CC /homes/thakur/cvs/mpich2/src/pm/hydra/utils/signals/signals.c
/homes/thakur/cvs/mpich2/src/pm/hydra/utils/signals/signals.c: In function ‘HYDU_Set_signal’:
/homes/thakur/cvs/mpich2/src/pm/hydra/utils/signals/signals.c:13: error: storage size of ‘act’ isn’t known
/homes/thakur/cvs/mpich2/src/pm/hydra/utils/signals/signals.c:21: warning: implicit declaration of function ‘sigaction’
/homes/thakur/cvs/mpich2/src/pm/hydra/utils/signals/signals.c:13: warning: unused variable ‘act’
make[5]: *** [signals.o] Error 1
make[5]: Leaving directory `/sandbox/thakur/tmp/src/pm/hydra/utils/signals'
make[4]: *** [all-redirect] Error 2
make[4]: Leaving directory `/sandbox/thakur/tmp/src/pm/hydra/utils'
make[3]: *** [all-redirect] Error 2
make[3]: Leaving directory `/sandbox/thakur/tmp/src/pm/hydra'
make[2]: *** [all-redirect] Error 1
make[2]: Leaving directory `/sandbox/thakur/tmp/src/pm'
make[1]: *** [all-redirect] Error 2
make[1]: Leaving directory `/sandbox/thakur/tmp/src'
make: *** [all-redirect] Error 2

Change History (10)

comment:1 Changed 5 years ago by Pavan Balaji

  • id set to 451

comment:2 Changed 5 years ago by balaji

  • Milestone set to mpich2-1.1b2
  • Summary changed from MPICH2 failures on octagon to Hydra failures on octagon

comment:3 Changed 5 years ago by balaji

  • Owner set to balaji

comment:4 Changed 5 years ago by Rajeev Thakur

All tests fail with the following error. I am using
setenv HYDRA_USE_LOCALHOST 1

octagon:/sandbox/thakur/tmp% mpiexec -n 1 examples/cpi
HYD_PMCU_pmi_get_appnum (346): could not find the process structure
HYD_PMCD_Central_cb (169): PMI server function returned an error
HYD_DMX_Wait_for_event (169): callback returned error status
HYD_CSI_Wait_for_completion (28): demux engine returned error when waiting for event
main (111): control system returned error when waiting for process' completion

comment:5 Changed 5 years ago by balaji

Is this a different problem than the above --strict-compile issue, or does it occur within the same configuration?

I'll look at the strict compile stuff for 1.1b2. I'm looking into the above type of failures for 1.1b1 based on the new nightly tests (though I can't seem to reproduce them).

comment:6 Changed 5 years ago by Rajeev Thakur

I replied to the wrong mail. This should have gone in ticket #453 (Warnings
with hydra on octagon with thread-multiple).


> -----Original Message-----
> From: mpich2-bugs-bounces@mcs.anl.gov [mailto:mpich2-bugs-bounces@mcs.anl.gov] On Behalf Of mpich2
> Sent: Thursday, March 12, 2009 3:47 PM
> To: undisclosed-recipients:
> Subject: Re: [mpich2-maint] #451: Hydra failures on octagon
>
> -----------------------------------------------+----------------------------
>  Reporter:  Pavan Balaji <balaji@mcs.anl.gov>  |      Owner:  balaji
>      Type:  bug                                |     Status:  new
>  Priority:  major                              |  Milestone:  mpich2-1.1b2
> Component:  mpich2                             | Resolution:
>  Keywords:                                     |
> -----------------------------------------------+----------------------------
>
> Comment (by balaji):
>
>  Is this a different problem than the above --strict-compile issue or
>  within the same configuration?
>
>  I'll look at the strict compile stuff for 1.1b2. I'm looking into the
>  above type of failures for 1.1b1 based on the new nightly tests (though I
>  can't seem to be able to reproduce them).
>
> --
> Ticket URL: <https://trac.mcs.anl.gov/projects/mpich2/ticket/451#comment:5>
>

comment:7 Changed 5 years ago by Rajeev Thakur

Do you want to move even a build failure to b2?

>  I'll look at the strict compile stuff for 1.1b2. I'm looking into the
>  above type of failures for 1.1b1 based on the new nightly tests (though I
>  can't seem to be able to reproduce them).

comment:8 Changed 5 years ago by balaji

It seems to compile fine on the newer machines (e.g., breadboard or my laptop). It's probably some old compiler weirdness, which I can look into after this release.

comment:9 Changed 5 years ago by Rajeev Thakur

I get the same error on bblogin when configuring with --with-pm=hydra --enable-strict.


make[5]: Leaving directory `/sandbox/thakur/mpich2/src/pm/hydra/utils/env'
make[5]: Entering directory `/sandbox/thakur/mpich2/src/pm/hydra/utils/launch'
  CC              allocate.c
  CC              launch.c
  AR cr ../../lib/libhydra.a allocate.o launch.o
ranlib ../../lib/libhydra.a
date > .libstamp0
make[5]: Leaving directory `/sandbox/thakur/mpich2/src/pm/hydra/utils/launch'
make[5]: Entering directory `/sandbox/thakur/mpich2/src/pm/hydra/utils/signals'
  CC              signals.c
signals.c: In function ‘HYDU_Set_signal’:
signals.c:13: error: storage size of ‘act’ isn’t known
signals.c:21: warning: implicit declaration of function ‘sigaction’
signals.c:13: warning: unused variable ‘act’
make[5]: *** [signals.o] Error 1
make[5]: Leaving directory `/sandbox/thakur/mpich2/src/pm/hydra/utils/signals'
make[4]: *** [all-redirect] Error 2
make[4]: Leaving directory `/sandbox/thakur/mpich2/src/pm/hydra/utils'
make[3]: *** [all-redirect] Error 2
make[3]: Leaving directory `/sandbox/thakur/mpich2/src/pm/hydra'
make[2]: *** [all-redirect] Error 1
make[2]: Leaving directory `/sandbox/thakur/mpich2/src/pm'
make[1]: *** [all-redirect] Error 2
make[1]: Leaving directory `/sandbox/thakur/mpich2/src'
make: *** [all-redirect] Error 2

> -----Original Message-----
> From: mpich2-bugs-bounces@mcs.anl.gov [mailto:mpich2-bugs-bounces@mcs.anl.gov] On Behalf Of mpich2
> Sent: Thursday, March 12, 2009 4:04 PM
> To: undisclosed-recipients:
> Subject: Re: [mpich2-maint] #451: Hydra failures on octagon
>
> -----------------------------------------------+----------------------------
>  Reporter:  Pavan Balaji <balaji@mcs.anl.gov>  |      Owner:  balaji
>      Type:  bug                                |     Status:  new
>  Priority:  major                              |  Milestone:  mpich2-1.1b2
> Component:  mpich2                             | Resolution:
>  Keywords:                                     |
> -----------------------------------------------+----------------------------
>
> Comment (by balaji):
>
>  It seems to compile fine on the newer machines (e.g., breadboard or my
>  laptop). It's probably some old compiler weirdness, which I can look into
>  after this release.
>
> --
> Ticket URL: <https://trac.mcs.anl.gov/projects/mpich2/ticket/451#comment:8>
>

comment:10 Changed 5 years ago by balaji

  • Resolution set to invalid
  • Status changed from new to closed

This seems to be a problem with --enable-strict that Dave pointed out a long time ago. We should use --enable-strict=posix if we want configure to find sigaction and friends. I'll add this information to the release notes. Resolving this ticket.
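
For reference, the kind of failure seen in signals.c can be illustrated outside the MPICH2 build. The sketch below is a minimal example, assuming a glibc-style C library where strict ISO C modes (for example gcc -std=c89) hide POSIX declarations such as struct sigaction unless a feature-test macro like _POSIX_C_SOURCE is defined before the first system header. The names set_signal and record_signal are hypothetical and are not Hydra's HYDU_Set_signal; the exact compiler flags that --enable-strict=posix selects are not shown in this ticket.

```c
/* Assumption: a strict ISO C mode may be in effect, so POSIX declarations
 * are requested explicitly. This must precede every system header. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <signal.h>
#include <string.h>

/* Last signal number seen by the handler. sig_atomic_t is the type that
 * may safely be written from inside a signal handler. */
static volatile sig_atomic_t last_signal = 0;

/* Hypothetical handler: only records which signal arrived. */
static void record_signal(int signum)
{
    last_signal = signum;
}

/* Hypothetical helper (not Hydra's HYDU_Set_signal): installs `handler`
 * for `signum`. Returns 0 on success and -1 on failure, as sigaction does. */
static int set_signal(int signum, void (*handler)(int))
{
    /* Without the POSIX declarations this type is incomplete, which is
     * what "storage size of 'act' isn't known" refers to. */
    struct sigaction act;

    memset(&act, 0, sizeof(act));
    act.sa_handler = handler;
    sigemptyset(&act.sa_mask);   /* block no additional signals in the handler */
    act.sa_flags = 0;

    return sigaction(signum, &act, NULL);
}
```

On such a system, compiling this with a strict ISO C flag and without the feature-test macro should produce errors of the same kind as in the logs above (unknown storage size of act, implicit declaration of sigaction), while defining the macro should let it compile. A caller would use it as set_signal(SIGUSR1, record_signal) and check the return value.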
