stdin returns EAGAIN after system("mpiexec ...")
Reported by: goodell | Owned by: raffenet
Description (last modified by balaji)
Originally reported on stackoverflow.com: http://stackoverflow.com/questions/14167620/stdin-seems-to-be-broken-after-call-to-system-invoking-mpiexec
Basically, a read(2) syscall on fd 0 returns EAGAIN after a parent process runs mpiexec in a subshell via system(). Higher-level interfaces such as std::getline or libc's getline usually surface this as a failed read, which makes it look as if stdin has been closed.
I poked at it for a while and couldn't figure out what's going on. The strace output does not show mpiexec doing anything surprising. For example, we are not explicitly setting O_NONBLOCK on fd 0, AFAICS.
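One way to test the O_NONBLOCK theory from the caller's side, rather than from strace, is to inspect the file status flags on fd 0 before and after the system() call. The sketch below assumes only POSIX fcntl(2); the helper names fd_is_nonblocking and fd_clear_nonblocking are hypothetical. This ticket does not establish that O_NONBLOCK is the cause, so treat it as a diagnostic. It is worth checking because file status flags belong to the open file description, which a child process shares with its parent, so a change made in a child is visible in the parent.

```c
#include <fcntl.h>

/* Returns 1 if O_NONBLOCK is set on fd, 0 if it is not, -1 if fcntl fails. */
int fd_is_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return (flags & O_NONBLOCK) ? 1 : 0;
}

/* Clears O_NONBLOCK on fd, leaving the other status flags untouched.
 * Returns 0 on success, -1 if fcntl fails. */
int fd_clear_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    if (fcntl(fd, F_SETFL, flags & ~O_NONBLOCK) == -1)
        return -1;
    return 0;
}
```

If fd_is_nonblocking(0) returns 0 before the system() call and 1 after it, the flag was changed by something in the child. In that case, calling fd_clear_nonblocking(0) and then resetting the stream's error state (clearerr(stdin) in C, std::cin.clear() in C++) might serve as a workaround in the parent, though that is untested here.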
For debugging, I intentionally "broke" the proxy so that mpiexec could not exec it, and the problem still occurs. The problem also occurs regardless of whether the proxy is launched locally with fork+exec or with SSH. So whatever is inducing the problem lives entirely in the mpiexec executable.
Change History (4)
comment:2 Changed 18 months ago by balaji
- Description modified (diff)
- Milestone changed from mpich-3.1 to mpich-3.1.1