Opened 2 years ago

Closed 16 months ago

#1742 closed bug (fixed)

MPI_SUCCESS returned but erroneous write done when trying to write between [2^31 - 4096, 2^31[ bytes

Reported by: ericch Owned by: robl
Priority: major Milestone: future
Component: mpich Keywords:
Cc: robl@…

Description (last modified by robl)

MPI-IO does not correctly write more than (2^31 - 4096) but fewer than 2^31 bytes, even though MPI_SUCCESS is returned.

Also, since there IS a limit of 2^31 bytes, I propose that MPI provide a #define describing this limit in bytes, something like MPIIO_MAX_BYTES_PER_TRANSACTION, so that programmers can write code based on it.

Moreover, I suggest raising this limit to 2^64, since writing code that reads in chunks of MPIIO_MAX_BYTES_PER_TRANSACTION bytes is not a worthwhile exercise.

Here is the output of the test code included:

---------------------------------------------------------------

----------------------------------------
We try to write 268435455 long int(2147483640 bytes)
----------------------------------------

Wrote everything with and MPI_file_write returned OK...

Readed everything with and MPI_file_write returned OK...

***********************************************
ERROR! array is WRONG at indice:268434944, the wrong value is: -1

This is indice 511 from the END of the array
  or offset 4088 bytes from the END of the array
***********************************************
---------------------------------------------------------------

and attached is a very simple test code that demonstrates this problem.
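The numbers in that output can be cross-checked with a little arithmetic. The sketch below only uses values copied from the output; the 8-byte element size is inferred from 2147483640 / 268435455, and the helper names are made up for illustration. It shows that the first wrong element sits at byte offset exactly 2^31 - 4096 (0x7ffff000), which, as far as I know, is also the most a single write(2) call will transfer on Linux.

```cpp
#include <cstdint>

// Values copied from the test output above. The element size is inferred:
// 2147483640 bytes / 268435455 elements = 8 bytes per long int.
constexpr std::int64_t kElemBytes = 8;
constexpr std::int64_t kNumElems = 268435455;
constexpr std::int64_t kFirstBadIndex = 268434944;

// Byte offset of element `index` from the start of the array.
constexpr std::int64_t byte_offset(std::int64_t index) {
    return index * kElemBytes;
}

// Number of elements between `index` and the end of the array.
constexpr std::int64_t elems_from_end(std::int64_t index) {
    return kNumElems - index;
}

// Total size matches the "2147483640 bytes" line of the output.
static_assert(byte_offset(kNumElems) == 2147483640, "total size");
// "indice 511 from the END" and "offset 4088 bytes from the END".
static_assert(elems_from_end(kFirstBadIndex) == 511, "elements from end");
static_assert(elems_from_end(kFirstBadIndex) * kElemBytes == 4088, "bytes from end");
// The first wrong byte is exactly at 2^31 - 4096.
static_assert(byte_offset(kFirstBadIndex) == (std::int64_t{1} << 31) - 4096,
              "first bad offset");
```

In other words, everything up to 2^31 - 4096 bytes is correct and only the last 4088 bytes are wrong, which is what one would expect if a single system call stopped short and nothing wrote the remainder.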

Thanks,

Eric

Attachments (2)

write_3Gb.cc (3.2 KB) - added by ericch 2 years ago.
The test code to clearly demonstrate the wrong behavior
write_by2GB.cc (5.2 KB) - added by ericch 2 years ago.
Code for writing by blocks of ~2GB to bypass the MPI limitation


Change History (8)

Changed 2 years ago by ericch

The test code to clearly demonstrate the wrong behavior

comment:1 Changed 2 years ago by ericch

  • Milestone changed from mpich-3.0 to future

I have to add this information:

The test code is clearly showing that there is something wrong with either the read OR write.

If you take a look at the binary file, you will see that the numbers written after (2^31 - 4096 bytes) are *wrong*. So the writing *is* wrong.

If you write that many bytes correctly, you will see that the reading is wrong too!

Here is a second example (write_by2GB.cc) in which I write by blocks of (2^31 - 4096) bytes to bypass this. You can verify that the reading is really wrong with that test.
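The block-wise workaround can be sketched roughly as follows. This is not the code from write_by2GB.cc, only an illustration of the splitting step. `MPIIO_MAX_BYTES_PER_TRANSACTION` is the name proposed in the description, not a symbol MPI actually defines, and `Chunk` and `plan_chunks` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical constant: MPI does not define this symbol. The value is the
// block size used in the workaround described above.
constexpr std::int64_t MPIIO_MAX_BYTES_PER_TRANSACTION =
    (std::int64_t{1} << 31) - 4096;

// One block of a larger transfer (hypothetical helper type).
struct Chunk {
    std::int64_t offset;  // byte offset from the start of the buffer/file
    std::int64_t length;  // number of bytes in this block
};

// Split `total_bytes` into consecutive blocks of at most `max_bytes` each.
std::vector<Chunk> plan_chunks(
    std::int64_t total_bytes,
    std::int64_t max_bytes = MPIIO_MAX_BYTES_PER_TRANSACTION) {
    assert(total_bytes >= 0 && max_bytes > 0);
    std::vector<Chunk> chunks;
    for (std::int64_t off = 0; off < total_bytes; off += max_bytes) {
        chunks.push_back({off, std::min(max_bytes, total_bytes - off)});
    }
    return chunks;
}
```

Each planned block would then be handed to its own I/O call (for example one MPI_File_write_at per block), so that no single call is asked to move more than the limit.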

Eric

Changed 2 years ago by ericch

Code for writing by blocks of ~2GB to bypass the MPI limitation

comment:2 Changed 2 years ago by thakur

  • Cc robl@… added
  • Owner set to robl
  • Status changed from new to assigned

comment:3 Changed 2 years ago by goodell

It's unfortunately not surprising that this is currently broken. We recently enabled -Wshorten-64-to-32, which should go a long way towards helping us correct this bug (once we fix all instances), but further work may also be needed.
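For readers unfamiliar with the class of bug that -Wshorten-64-to-32 targets, here is a small sketch of what happens when a 64-bit byte count is squeezed into 32 bits. The helper names are hypothetical and this is not MPICH code. Note that the size in this ticket (2147483640) still fits in a 32-bit int, so this kind of truncation would only bite for larger requests.

```cpp
#include <cstdint>
#include <limits>

// True when a 64-bit byte count can be stored in a 32-bit int without loss.
constexpr bool fits_in_int32(std::int64_t n) {
    return n >= std::numeric_limits<std::int32_t>::min() &&
           n <= std::numeric_limits<std::int32_t>::max();
}

// The low 32 bits of a 64-bit count, i.e. what survives an unchecked
// narrowing on typical platforms. Both casts are well-defined modular
// conversions to unsigned types.
constexpr std::uint32_t low_32_bits(std::int64_t n) {
    return static_cast<std::uint32_t>(static_cast<std::uint64_t>(n));
}

// The request size from this ticket still fits in an int ...
static_assert(fits_in_int32(2147483640), "ticket size fits");
// ... but 2^31 itself does not,
static_assert(!fits_in_int32(std::int64_t{1} << 31), "2^31 overflows int");
// and a 5 GiB count silently becomes 1 GiB if only the low 32 bits are kept.
static_assert(low_32_bits(5368709120LL) == 1073741824u, "5 GiB truncates to 1 GiB");
```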

comment:4 Changed 16 months ago by robl

  • Description modified (diff)

Sorry it took me so long to look at this. The fix is actually pretty easy: write(2) and read(2) don't *have* to read or write "count" bytes. I added looping to the syscalls.

Still to do:

  • The status object is almost definitely not being updated correctly.
  • I have only fixed this in ADIOI_GEN_ReadContig and WriteContig. Other file systems will need to be updated.
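The idea of looping around the system call can be sketched like this. It is a minimal illustration under POSIX assumptions, not the code that went into ROMIO; `write_fully` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <unistd.h>

// Hypothetical helper (not ROMIO's implementation): keep calling write(2)
// until `count` bytes are written, because a single call may write fewer.
// Returns the number of bytes written, or -1 with errno set on error.
// The result is less than `count` only if write(2) made no progress.
ssize_t write_fully(int fd, const void* buf, std::size_t count) {
    assert(buf != nullptr || count == 0);
    const char* p = static_cast<const char*>(buf);
    std::size_t done = 0;
    while (done < count) {
        ssize_t n = ::write(fd, p + done, count - done);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted: retry the same span
            return -1;                     // real error: report it
        }
        if (n == 0) break;                 // no progress: avoid spinning forever
        done += static_cast<std::size_t>(n);
    }
    return static_cast<ssize_t>(done);
}
```

A read-side counterpart would follow the same pattern with read(2), stopping when it returns 0 at end of file. The running total from such a loop is also the value a status object would need to report.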

comment:5 Changed 16 months ago by robl

  • Description modified (diff)

comment:6 Changed 16 months ago by robl

  • Resolution set to fixed
  • Status changed from assigned to closed

I pushed a fix in [7d44307f] -- the comment says "partial" but only because I had not looked at the Blue Gene drivers yet, and they do not exhibit this specific bug.
