Opened 7 years ago

Closed 5 years ago

#1539 closed feature (wontfix)

Embedded mpiexec within mpi process fails with errors

Reported by: Pramod Chandraiah <pramodc@…>
Owned by: balaji
Priority: major
Milestone: future
Component: mpich
Keywords:
Cc:

Description (last modified by balaji)

I have an application that needs to call mpiexec from within a child
process that was itself launched by mpiexec. The child process invokes
the inner mpiexec with "system()". I am using mpich2-1.4.1 with the Hydra
process manager. The errors I see are below, and the source file main.c
is attached. Please let me know what I am doing wrong and whether you
need more information.
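
For readers without access to the attachment, the pattern described above can be sketched roughly as follows. This is not the attached main.c, only a minimal illustration of it under stated assumptions: the helper name run_nested, the USE_MPI macro, and the inner command "mpiexec -n 2 hostname" are all hypothetical. The helper decodes the status returned by system(), which may help tell a non-zero exit apart from a child that was killed by a signal.

```c
/* Sketch of the pattern in this ticket: an MPI rank shells out to a
 * nested mpiexec via system(). Define USE_MPI to compile main(). */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#ifdef USE_MPI
#include <mpi.h>
#endif

/* Hypothetical helper: run cmd through system() and decode the status.
 * Returns the command's exit code, 128 + signal number if the shell was
 * terminated by a signal, or -1 if system() itself failed. */
static int run_nested(const char *cmd)
{
    assert(cmd != NULL);
    fflush(NULL);               /* flush stdio buffers before the fork */
    int status = system(cmd);
    if (status == -1)
        return -1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    return -1;
}

#ifdef USE_MPI
int main(int argc, char **argv)
{
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Assumed inner command; the one used in main.c is not shown here. */
    int rc = run_nested("mpiexec -n 2 hostname");
    printf("rank %d: nested mpiexec returned %d\n", rank, rc);

    MPI_Finalize();
    return 0;
}
#endif
```

Building with mpicc and -DUSE_MPI, then launching the result under an outer mpiexec, should exercise the same nested-launch path that the report describes.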

To compile:

/home/install/mpich/mpich2-1.4.1/linux_x86_64/bin/mpicc main.c -I/home/install/mpich/mpich2-1.4.1/linux_x86_64/include

When I run the test on multiple nodes I get the following errors:

mpiexec -n 3 -f hosts.list a.out

[proxy:0:0@machine3] HYDU_create_process
(/home/install/mpich/src/mpich2-1.4.1/src/pm/hydra/utils/launch/launch.c:36):
dup2 error (Bad file descriptor)
[proxy:0:0@machine3] launch_procs
(/home/install/mpich/src/mpich2-1.4.1/src/pm/hydra/pm/pmiserv/pmip_cb.c:751):
create process returned error
[proxy:0:0@machine3] HYD_pmcd_pmip_control_cmd_cb
(/home/install/mpich/src/mpich2-1.4.1/src/pm/hydra/pm/pmiserv/pmip_cb.c:935):
launch_procs returned error
[proxy:0:0@machine3] HYDT_dmxu_poll_wait_for_event
(/home/install/mpich/src/mpich2-1.4.1/src/pm/hydra/tools/demux/demux_poll.c:77):
callback returned error status
[proxy:0:0@machine3] main
(/home/install/mpich/src/mpich2-1.4.1/src/pm/hydra/pm/pmiserv/pmip.c:226):
demux engine error waiting for event
[mpiexec@machine1.abc.com] control_cb
(/home/install/mpich/src/mpich2-1.4.1/src/pm/hydra/pm/pmiserv/pmiserv_cb.c:215):
assert (!closed) failed
[mpiexec@machine1.abc.com] HYDT_dmxu_poll_wait_for_event
(/home/install/mpich/src/mpich2-1.4.1/src/pm/hydra/tools/demux/demux_poll.c:77):
callback returned error status
[mpiexec@machine1.abc.com] HYD_pmci_wait_for_completion
(/home/install/mpich/src/mpich2-1.4.1/src/pm/hydra/pm/pmiserv/pmiserv_pmci.c:181):
error waiting for event
[mpiexec@machine1.abc.com] main
(/home/install/mpich/src/mpich2-1.4.1/src/pm/hydra/ui/mpich/mpiexec.c:405):
process manager error waiting for completion

On a single node I get the following.

mpiexec -n 3 a.out
[proxy:0:0@machine1.abc.com] [proxy:0:0@machine1.abc.com] Killed

Attachments (1)

main.c (1.4 KB) - added by Pramod Chandraiah <pramodc@…> 7 years ago.


Change History (5)

Changed 7 years ago by Pramod Chandraiah <pramodc@…>

comment:1 Changed 7 years ago by balaji

  • Description modified (diff)

Reformatted trac entry.

comment:2 Changed 7 years ago by balaji

  • Milestone set to mpich2-1.5
  • Owner set to balaji
  • Status changed from new to assigned
  • Type changed from bug to feature

comment:3 Changed 6 years ago by goodell

  • Milestone changed from mpich2-1.5 to future

comment:4 Changed 5 years ago by balaji

  • Description modified (diff)
  • Resolution set to wontfix
  • Status changed from assigned to closed