Opened 9 years ago

Last modified 5 years ago

#304 new bug

Mem leak during error condns in MPIR/MPIC* funcs

Reported by: jayesh Owned by:
Priority: major Milestone: future
Component: mpich Keywords:
Cc:

Description (last modified by balaji)

This is a placeholder to remind us to cleanup memory in error cases for MPIR/MPIC* functions.
eg: In bcast.c we have the following code,

MPIR_Bcast(){
...
  if (!is_contig || !is_homogeneous)
  {
      tmp_buf = MPIU_Malloc(nbytes);
      ...
  }
  ...
  if ((nbytes < MPIR_BCAST_SHORT_MSG) || (comm_size < MPIR_BCAST_MIN_PROCS))
  {
      ...
      while (mask < comm_size)
      {
          if (relative_rank & mask)
          {
              ...
              if (mpi_errno != MPI_SUCCESS) {
                  /* FIXME: tmp_buf NOT FREED IN THIS CASE */
                  MPIU_ERR_POP(mpi_errno);
              }
              ...
          }
          mask <<= 1;
      }
      ...
 }

  if (!is_contig || !is_homogeneous)
  {
      ...
      MPIU_Free(tmp_buf);
  }

fn_exit: ...
fn_fail: ...
}

There are many cases like these in the MPIR/MPIC* funcs.
A good fix would be to get rid of MPIU_Malloc() and use MPIU_CHKLMEM_MALLOC()/MPIU_CHKLMEM_FREEALL() instead.

Regards,
Jayesh

Change History (12)

comment:1 Changed 9 years ago by balaji

  • Milestone changed from mpich2-1.1b1 to mpich2-1.1b2

comment:2 Changed 9 years ago by balaji

  • Owner set to jayesh

This had come up a few years back and one of the issues that was brought up is that allocation from the stack might fail if the allocated size is large (e.g., in the collectives). Bill had added CHKLBIGMEM at that time, which should probably be used here. CHKPMEM might also work, but we don't use that for memory local to a routine.

Jayesh: I've assigned this to you. Can you give it a shot?

comment:3 Changed 9 years ago by goodell

Can't we just make the CHKLMEM macro use malloc/free instead of alloca when the size is larger than some reasonable threshold (4kiB?). I don't like the idea of embedding memory allocation decisions between CHKLMEM and CHKLBIGMEM at several places in higher level code like the collectives. I'd much prefer a single knob in the CHKLMEM routines or an environment variable somewhere.

-Dave

comment:4 Changed 9 years ago by jayesh

  • Milestone changed from mpich2-1.1rc1 to mpich2-1.1.1

I won't have bandwidth to fix this for 1.1rc1. Moving to 1.1.1

Regards,
Jayesh

comment:5 Changed 8 years ago by jayesh

  • Milestone changed from mpich2-1.1.1 to mpich2-1.1.2

comment:6 Changed 8 years ago by balaji

  • Milestone changed from mpich2-1.1.2 to mpich2-1.2

Milestone mpich2-1.1.2 deleted

comment:7 Changed 8 years ago by jayesh

  • Milestone changed from mpich2-1.2.1 to mpich2-1.3

comment:8 Changed 8 years ago by jayesh

  • Milestone changed from mpich2-1.3 to mpich2-1.3.1

Moving to 1.3.1

comment:9 Changed 7 years ago by jayesh

  • Milestone changed from mpich2-1.3.2 to mpich2-1.3.3

comment:10 Changed 7 years ago by balaji

  • Milestone changed from mpich2-1.3.3 to mpich2-1.4

Milestone mpich2-1.3.3 deleted

comment:11 Changed 7 years ago by balaji

  • Description modified (diff)
  • Milestone changed from mpich2-1.4 to future

comment:12 Changed 5 years ago by balaji

  • Description modified (diff)
  • Owner jayesh deleted
Note: See TracTickets for help on using tickets.