Opened 2 years ago

Last modified 4 weeks ago

#1782 new bug

stdin returns EAGAIN after system("mpiexec ...")

Reported by: goodell
Owned by: raffenet
Priority: minor
Milestone: mpich-3.2
Component: mpich
Keywords:
Cc:

Description (last modified by balaji)

Originally reported on stackoverflow.com: http://stackoverflow.com/questions/14167620/stdin-seems-to-be-broken-after-call-to-system-invoking-mpiexec

Basically, after a parent process runs mpiexec in a subshell (via system()), a read syscall on fd 0 unexpectedly returns EAGAIN. For higher-level interfaces like std::getline or libc's getline, this usually looks as though stdin has been closed.
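To make the symptom concrete, here is a minimal Python sketch of what a read on a nonblocking descriptor with no data available looks like. It is only an illustration: an empty pipe stands in for an idle stdin, and the helper names (read_errno, demo_eagain) are made up for this example. It says nothing about what mpiexec actually does.

```python
import errno
import fcntl
import os


def read_errno(fd, nbytes=16):
    """Attempt one read(2) on fd; return the errno on failure, else None."""
    try:
        os.read(fd, nbytes)
        return None
    except OSError as exc:  # BlockingIOError is a subclass of OSError
        return exc.errno


def demo_eagain():
    # Assumption: an empty pipe stands in for an idle stdin (fd 0).
    rfd, wfd = os.pipe()
    try:
        # Put the read end into nonblocking mode.
        flags = fcntl.fcntl(rfd, fcntl.F_GETFL)
        fcntl.fcntl(rfd, fcntl.F_SETFL, flags | os.O_NONBLOCK)
        # Nothing has been written, so the read fails instead of blocking.
        return read_errno(rfd)
    finally:
        os.close(rfd)
        os.close(wfd)


if __name__ == "__main__":
    print(demo_eagain() == errno.EAGAIN)
```

A caller that treats any failed read as end of input would then behave as if stdin had been closed, which matches the description above.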

I poked at it for a while and couldn't figure out what's going on. The strace output does not show mpiexec doing anything surprising. For example, we are not explicitly setting fd 0 to O_NONBLOCK, AFAICS.
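One way to check the O_NONBLOCK question directly, rather than reading it out of strace, is to compare the file status flags on the descriptor before and after the child runs. File status flags belong to the open file description, so a change made by a child on an inherited descriptor is visible in the parent. The sketch below is a diagnostic aid under stated assumptions, not a statement about mpiexec: CHILD is a hypothetical stand-in command that sets O_NONBLOCK on its own fd 0, and all function names are made up for this example.

```python
import fcntl
import os
import subprocess
import sys


def is_nonblocking(fd):
    """Return True if O_NONBLOCK is set in fd's file status flags."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFL) & os.O_NONBLOCK)


def flag_before_and_after(argv, fd):
    """Run argv with fd as its stdin; report O_NONBLOCK before and after."""
    before = is_nonblocking(fd)
    subprocess.run(argv, stdin=fd, check=True)
    return before, is_nonblocking(fd)


def clear_nonblocking(fd):
    """Possible workaround: put fd back into blocking mode."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags & ~os.O_NONBLOCK)


# Hypothetical stand-in for the child process (NOT mpiexec): it sets
# O_NONBLOCK on its inherited fd 0 and exits.
CHILD = [
    sys.executable,
    "-c",
    "import fcntl, os;"
    "f = fcntl.fcntl(0, fcntl.F_GETFL);"
    "fcntl.fcntl(0, fcntl.F_SETFL, f | os.O_NONBLOCK)",
]


def demo():
    # A pipe's read end stands in for the parent's stdin.
    rfd, wfd = os.pipe()
    try:
        return flag_before_and_after(CHILD, rfd)
    finally:
        os.close(rfd)
        os.close(wfd)


if __name__ == "__main__":
    print(demo())
```

To run the same check against the real case, one could call flag_before_and_after with the actual mpiexec command line and fd 0. If the flag does turn out to be set afterwards, clear_nonblocking(0) in the parent may serve as a stopgap.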

For debugging, I intentionally "broke" the proxy so that mpiexec could not exec it, and the problem still occurs. The problem also occurs regardless of whether the proxy is launched locally with fork+exec or with SSH. So whatever is inducing the problem lives entirely in the mpiexec executable.

Change History (4)

comment:1 Changed 2 years ago by buntinas

May be related to #1622???
-d

comment:2 Changed 19 months ago by balaji

  • Description modified (diff)
  • Milestone changed from mpich-3.1 to mpich-3.1.1

comment:3 Changed 11 months ago by balaji

  • Owner changed from balaji to raffenet

comment:4 Changed 4 weeks ago by balaji

  • Milestone changed from mpich-3.1.4 to mpich-3.2

Milestone mpich-3.1.4 deleted
