(Local Copy) Fault Tolerance Research @ Open Systems Laboratory

Transparent Checkpoint/Restart in Open MPI

Configure Options

MCA Parameters

C/R API MPI Extension

Deprecated Options

--with-ft

This configure option specifies the type of fault tolerance to enable in the Open MPI build. By default no fault tolerance is enabled, which is the same as if the option --without-ft was specified. Currently only the cr option is supported.

./configure --with-ft=cr

--enable-ft-thread

This option enables a concurrent thread to assist the application in making progress on a checkpoint operation when not inside the MPI library. To enable this feature you must enable MPI threads in addition to the checkpointing thread. By default this is disabled.

./configure --with-ft=cr --enable-ft-thread --enable-mpi-threads

After r22841, the --enable-mpi-threads option was replaced by --enable-opal-multi-threads, so use the following instead:

./configure --enable-ft-thread --with-ft=cr --enable-opal-multi-threads

--with-blcr

This option specifies the path to the installation of the BLCR library. It is strongly suggested that users specify this option to ensure that the proper BLCR installation is selected.

./configure --with-ft=cr --with-blcr=/opt/blcr/

--with-blcr-libdir

This option specifies the path to the library directory of the BLCR installation.

./configure --with-ft=cr --with-blcr=/opt/blcr/ --with-blcr-libdir=/opt/blcr/lib64

--enable-crdebug

Introduced in r23587. Included in v1.5.1 and later releases.

This option activates the Checkpoint/Restart-enabled debugging support. See C/R Enabled Debugging.

./configure --with-ft=cr --enable-crdebug

-am ft-enable-cr

To enable checkpoint/restart fault tolerance for an MPI application you must use the Aggregate MCA parameter ft-enable-cr. This selects the best checkpoint/restart fault tolerance components currently available.

shell$ mpirun -am ft-enable-cr my-app

-am ft-enable-cr-recovery

Introduced in r23629. Included in v1.5.1 and later releases.

To enable checkpoint/restart fault tolerance with automatic recovery and/or process migration for an MPI application you must use the Aggregate MCA parameter ft-enable-cr-recovery. This selects the best checkpoint/restart fault tolerance components currently available.

shell$ mpirun -am ft-enable-cr-recovery my-app

--mca ompi_cr_verbose

Verbose output for the OMPI layer Checkpoint/Restart functionality.
Default: 0 (off)

shell$ mpirun --mca ompi_cr_verbose 10 -am ft-enable-cr my-app

--mca orte_cr_verbose

Verbose output for the ORTE layer Checkpoint/Restart functionality.
Default: 0 (off)

shell$ mpirun --mca orte_cr_verbose 10 -am ft-enable-cr my-app

--mca opal_cr_verbose

Verbose output for the OPAL layer Checkpoint/Restart functionality.
Default: 0 (off)

shell$ mpirun --mca opal_cr_verbose 10 -am ft-enable-cr my-app

--mca ft_cr_enabled

Enable fault tolerance for this program.
Default: 0 (disabled)
Automatically enabled by ft-enable-cr. The user should never need to set this parameter.

--mca opal_cr_enable_timer

Enable checkpoint timer
Default: 0 (disabled)

shell$ mpirun --mca opal_cr_enable_timer 1 -am ft-enable-cr my-app

--mca opal_cr_enable_timer_barrier

Enable checkpoint timer barrier between stages to control for process skew.
Default: 0 (disabled)

shell$ mpirun --mca opal_cr_enable_timer_barrier 1 --mca opal_cr_enable_timer 1 \
              -am ft-enable-cr my-app

--mca opal_cr_timer_target_rank

MPI rank that should display the checkpoint timer.
Default: 0

shell$ mpirun --mca opal_cr_timer_target_rank 2 \
              --mca opal_cr_enable_timer 1 \
              -am ft-enable-cr my-app

--mca opal_cr_use_thread

Use an asynchronous thread to checkpoint this program.
Default: 0 (off)
Automatically enabled by ft-enable-cr when built with --enable-ft-thread. The user should never need to set this parameter.

--mca opal_cr_thread_sleep_check

Time for the checkpoint thread to sleep between checking for a checkpoint.
Default: 0 microseconds

shell$ mpirun --mca opal_cr_thread_sleep_check 10 -am ft-enable-cr my-app

--mca opal_cr_thread_sleep_wait

Time for the checkpoint thread to sleep when waiting for a process to exit the MPI library.
Default: 1000 microseconds (changed from 0 for v1.5 and later)

shell$ mpirun --mca opal_cr_thread_sleep_wait 10 -am ft-enable-cr my-app

--mca opal_cr_is_tool

Whether this is a tool program, that is, whether it requires a fully operational OPAL or just enough to exec.
Default: 0 (false)
Automatically enabled when needed. The user should never need to set this parameter.

--mca opal_cr_signal

Checkpoint/Restart signal used to initiate an OPAL-only checkpoint of a program.
Default: SIGUSR1

shell$ mpirun --mca opal_cr_signal 14 -am ft-enable-cr my-app

--mca opal_cr_debug_sigpipe

Activate a signal handler for debugging SIGPIPE Errors that can happen on restart.
Default: 0 (disabled)

shell$ mpirun --mca opal_cr_debug_sigpipe 1 -am ft-enable-cr my-app

--mca opal_cr_tmp_dir

Temporary directory in which to place rendezvous files for a checkpoint. Note that this is not the checkpoint storage directory; it should be on a file system local to the machine.
Default: "/tmp"

shell$ mpirun --mca opal_cr_tmp_dir /tmp/ramdisk/ -am ft-enable-cr my-app

--mca crs

Which CRS component to use
Default: NULL (auto-select)

shell$ mpirun --mca crs blcr -am ft-enable-cr my-app

--mca crs_base_verbose

Set the verbose level for the CRS framework.
Default: 0 (off)

shell$ mpirun --mca crs_base_verbose 10 -am ft-enable-cr my-app

--mca crs_blcr_priority

Set the Priority of the CRS BLCR component. The component with the highest priority wins.
Default: 50

shell$ mpirun --mca crs_blcr_priority 100 -am ft-enable-cr my-app
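
The "highest priority wins" rule used by the *_priority parameters can be pictured with a small sketch. This is not Open MPI's selection code: the struct, its fields, and the function name are hypothetical, and only the rule itself (the highest priority among usable components is chosen) comes from the text above.

```c
#include <stddef.h>

/* Hypothetical descriptor for one candidate component. */
struct component {
    const char *name;      /* e.g. "blcr" or "self" */
    int priority;          /* e.g. the value of crs_blcr_priority */
    int available;         /* nonzero if the component can be used */
};

/* Return the available component with the highest priority,
 * or NULL when no component is available. */
const struct component *select_component(const struct component *c, size_t n)
{
    const struct component *best = NULL;
    size_t i;

    for (i = 0; i < n; ++i) {
        if (!c[i].available) {
            continue;      /* unusable components never win */
        }
        if (best == NULL || c[i].priority > best->priority) {
            best = &c[i];
        }
    }
    return best;
}
```

With the documented defaults (blcr at 50, self at 20) the sketch picks blcr; raising the self priority to 100 would make it pick self instead.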

--mca crs_blcr_verbose

Set the verbose level of the CRS BLCR component.
Default: 0 (set to match crs_base_verbose)

shell$ mpirun --mca crs_blcr_verbose 10 -am ft-enable-cr my-app

--mca crs_blcr_dev_null

Save the local checkpoint to /dev/null. Note: This is not for general use. It is a benchmarking and debugging option that should be used with care.
Default: 0 (disabled)

shell$ mpirun --mca crs_blcr_dev_null 1 -am ft-enable-cr my-app

--mca crs_self_priority

Set the Priority of the CRS SELF component. This component is only selected if lt_dlsym can find functions in the user program with the correct signatures. The component with the highest priority wins.
Default: 20

shell$ mpirun --mca crs_self_priority 100 -am ft-enable-cr my-app

--mca crs_self_verbose

Set the verbose level of the CRS SELF component.
Default: 0 (set to match crs_base_verbose)

shell$ mpirun --mca crs_self_verbose 10 -am ft-enable-cr my-app

--mca crs_self_prefix

Prefix for the user defined callback functions.
Default: "opal_crs_self_user"

shell$ mpirun --mca crs_self_prefix my_foo -am ft-enable-cr my-app
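
As a rough illustration of what the prefix controls, the sketch below builds callback symbol names from a prefix. It assumes the names follow the pattern prefix_checkpoint, prefix_continue and prefix_restart, which is inferred from the default callback names (such as opal_crs_self_user_checkpoint) and is not confirmed here; the helper name callback_symbol is hypothetical.

```c
#include <stdio.h>

/* Write "<prefix>_<suffix>" into buf.
 * Returns the length of the name, or -1 if it does not fit in len bytes.
 * Assumption: callback names are formed as prefix + "_" + suffix. */
int callback_symbol(const char *prefix, const char *suffix,
                    char *buf, size_t len)
{
    int n = snprintf(buf, len, "%s_%s", prefix, suffix);

    if (n < 0 || (size_t)n >= len) {
        return -1;         /* encoding error or truncated name */
    }
    return n;
}
```

Under that assumption, running with --mca crs_self_prefix my_foo would mean the application provides my_foo_checkpoint, my_foo_continue and my_foo_restart.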

--mca crs_self_do_restart

Start execution by calling the restart callback during MPI_INIT.
Default: 0 (disabled)
Automatically enabled when needed. The user should never need to set this parameter.

--mca compress

Which Compress component to use
Default: NULL (auto-select)

shell$ mpirun --mca compress gzip \
              --mca sstore stage \
              --mca sstore_stage_compress 1 \
              -am ft-enable-cr my-app

--mca compress_base_verbose

Set the verbose level for the Compress framework.
Default: 0 (off)

shell$ mpirun --mca compress_base_verbose 10 -am ft-enable-cr my-app

--mca compress_gzip_priority

Set the Priority of the Compress gzip component. The component with the highest priority wins.
Default: 15

shell$ mpirun --mca compress gzip \
              --mca compress_gzip_priority 100 \
              --mca sstore stage \
              --mca sstore_stage_compress 1 \
              -am ft-enable-cr my-app

--mca compress_gzip_verbose

Set the verbose level for the Compress gzip component.
Default: 0 (off)

shell$ mpirun --mca compress gzip \
              --mca compress_gzip_verbose 10 \
              --mca sstore stage \
              --mca sstore_stage_compress 1 \
              -am ft-enable-cr my-app

--mca compress_bzip_priority

Set the Priority of the Compress bzip component. The component with the highest priority wins.
Default: 10

shell$ mpirun --mca compress bzip \
              --mca compress_bzip_priority 100 \
              --mca sstore stage \
              --mca sstore_stage_compress 1 \
              -am ft-enable-cr my-app

--mca compress_bzip_verbose

Set the verbose level for the Compress bzip component.
Default: 0 (off)

shell$ mpirun --mca compress bzip \
              --mca compress_bzip_verbose 10 \
              --mca sstore stage \
              --mca sstore_stage_compress 1 \
              -am ft-enable-cr my-app

--mca filem

Which FileM component to use
Default: NULL (auto-select)

shell$ mpirun --mca filem rsh -am ft-enable-cr my-app

--mca filem_base_verbose

Set the verbose level for the FileM framework.
Default: 0 (off)

shell$ mpirun --mca filem_base_verbose 10 -am ft-enable-cr my-app

--mca filem_rsh_priority

Set the Priority of the FileM RSH component. The component with the highest priority wins.
Default: 50

shell$ mpirun --mca filem_rsh_priority 100 -am ft-enable-cr my-app

--mca filem_rsh_verbose

Set the verbose level of the FileM RSH component.
Default: 0 (set to match filem_base_verbose)

shell$ mpirun --mca filem_rsh_verbose 10 -am ft-enable-cr my-app

--mca filem_rsh_rcp

The remote copy command to use.
Default: "scp"

shell$ mpirun --mca filem_rsh_rcp rcp -am ft-enable-cr my-app

--mca filem_rsh_rsh

The remote shell command to use.
Default: "ssh"

shell$ mpirun --mca filem_rsh_rsh rsh -am ft-enable-cr my-app

--mca filem_rsh_cp

The UNIX cp command to use for local copy operations. Useful when moving files from a local file system to a globally mounted file system (see sstore_stage_global_is_shared for more information).
Default: "cp"

shell$ mpirun --mca filem_rsh_cp my_cp -am ft-enable-cr my-app

--mca filem_rsh_max_incomming

Maximum number of incoming connections (0 = any)
Default: 10

shell$ mpirun --mca filem_rsh_max_incomming 50 -am ft-enable-cr my-app

--mca filem_rsh_progress_meter

Introduced in r23587. Included in v1.5.1 and later releases.

Display progress every X percent completed.
Default: 0 (off)

shell$ mpirun --mca filem_rsh_progress_meter 10 \
              -am ft-enable-cr-recovery my-app

--mca snapc

Which SnapC component to use
Default: NULL (auto-select)

shell$ mpirun --mca snapc full -am ft-enable-cr my-app

--mca snapc_base_verbose

Set the verbose level for the SnapC framework.
Default: 0 (off)

shell$ mpirun --mca snapc_base_verbose 10 -am ft-enable-cr my-app

--mca snapc_base_only_one_seq

Only store one sequence number (reusing the checkpoint directory)
Default: 0 (disabled)

shell$ mpirun --mca snapc_base_only_one_seq 1 -am ft-enable-cr my-app

--mca snapc_full_priority

Set the Priority of the SnapC FULL component. The component with the highest priority wins.
Default: 20

shell$ mpirun --mca snapc_full_priority 100 -am ft-enable-cr my-app

--mca snapc_full_verbose

Set the verbose level of the SnapC FULL component.
Default: 0 (set to match snapc_base_verbose)

shell$ mpirun --mca snapc_full_verbose 10 -am ft-enable-cr my-app

--mca snapc_full_skip_app

Shortcut the application level coordination (do not start the INC or checkpoint operations in the local processes, just pretend to do so). Note: This is not for general use. It is a benchmarking and debugging option that should be used with care.
Default: 0 (disabled)

shell$ mpirun --mca snapc_full_skip_app 1 -am ft-enable-cr my-app

--mca snapc_full_enable_timing

Enable checkpoint timing information
Default: 0 (disabled)

shell$ mpirun --mca snapc_full_enable_timing 1 -am ft-enable-cr my-app

--mca snapc_full_max_wait_time

Maximum time to wait before the daemon gives up on the checkpoint operation. Values less than or equal to 0 mean wait indefinitely.
Default: 20 seconds

shell$ mpirun --mca snapc_full_max_wait_time 60 -am ft-enable-cr my-app
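
The special meaning of non-positive values can be written down as a small predicate. This is only a sketch of the rule described above, not the daemon's code: the function name is hypothetical, and treating an elapsed time equal to the limit as expired is an assumption.

```c
/* Returns 1 if the wait should be abandoned, 0 if waiting continues.
 * max_wait_time mirrors snapc_full_max_wait_time, in seconds. */
int wait_expired(int max_wait_time, int elapsed_seconds)
{
    if (max_wait_time <= 0) {
        return 0;          /* 0 or negative: never give up */
    }
    /* Assumption: reaching the limit exactly counts as expired. */
    return elapsed_seconds >= max_wait_time;
}
```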

--mca snapc_full_progress_meter

Introduced in r23587. Included in v1.5.1 and later releases.

Display progress every X percent completed.
Default: 0 (off)

shell$ mpirun --mca snapc_full_progress_meter 10 \
              -am ft-enable-cr-recovery my-app

--mca sstore

Introduced in r23587. Included in v1.5.1 and later releases.

Which SStore component to use
Default: NULL (auto-select)

shell$ mpirun --mca sstore stage -am ft-enable-cr my-app

--mca sstore_base_verbose

Introduced in r23587. Included in v1.5.1 and later releases.

Set the verbose level for the SStore framework.
Default: 0 (off)

shell$ mpirun --mca sstore_base_verbose 10 -am ft-enable-cr my-app

--mca sstore_base_global_snapshot_dir

Introduced in r23587. Included in v1.5.1 and later releases.

The base directory to use when storing global snapshots. This is the directory where all checkpoint files will be gathered during a checkpoint operation. Usually this is a globally mounted file system, but it does not need to be if using the stage SStore component.
Default: $HOME

shell$ mpirun --mca sstore_base_global_snapshot_dir /home/me/ckpts \
              -am ft-enable-cr my-app

--mca sstore_base_global_snapshot_ref

Introduced in r23587. Included in v1.5.1 and later releases.

Specify the global snapshot reference that should be used for this job.
Default: "ompi_global_snapshot_PID.ckpt" (where PID is the PID of the mpirun process)

shell$ mpirun --mca sstore_base_global_snapshot_ref my_ref \
              -am ft-enable-cr my-app
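
To make the default naming pattern concrete, the sketch below formats a reference from a process ID in the documented form ompi_global_snapshot_PID.ckpt. The function name is hypothetical and the code is not taken from Open MPI; it only reproduces the pattern stated above.

```c
#include <stdio.h>

/* Write the default global snapshot reference for the given mpirun PID
 * into buf. Returns the length of the reference, or -1 if it does not
 * fit in len bytes. */
int default_snapshot_ref(long pid, char *buf, size_t len)
{
    int n = snprintf(buf, len, "ompi_global_snapshot_%ld.ckpt", pid);

    if (n < 0 || (size_t)n >= len) {
        return -1;         /* encoding error or truncated reference */
    }
    return n;
}
```

For an mpirun process with PID 1234 this yields ompi_global_snapshot_1234.ckpt.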

--mca sstore_central_priority

Introduced in r23587. Included in v1.5.1 and later releases.

Set the Priority of the SStore Central component. The component with the highest priority wins.
Default: 20

shell$ mpirun --mca sstore_central_priority 100 -am ft-enable-cr my-app

--mca sstore_central_verbose

Introduced in r23587. Included in v1.5.1 and later releases.

Set the verbose level of the SStore Central component.
Default: 0 (set to match sstore_base_verbose)

shell$ mpirun --mca sstore_central_verbose 10 -am ft-enable-cr my-app

--mca sstore_stage_priority

Introduced in r23587. Included in v1.5.1 and later releases.

Set the Priority of the SStore Stage component. The component with the highest priority wins.
Default: 10

shell$ mpirun --mca sstore_stage_priority 100 -am ft-enable-cr my-app

--mca sstore_stage_verbose

Introduced in r23587. Included in v1.5.1 and later releases.

Set the verbose level of the SStore Stage component.
Default: 0 (set to match sstore_base_verbose)

shell$ mpirun --mca sstore_stage_verbose 10 -am ft-enable-cr my-app

--mca sstore_stage_global_is_shared

Introduced in r23587. Included in v1.5.1 and later releases.

If the sstore_base_global_snapshot_dir is on a shared file system that all nodes can access, then the checkpoint files can be copied more efficiently when FileM is used in conjunction with the stage SStore.
Default: 0 (disabled)

shell$ mpirun --mca sstore_stage_global_is_shared 1 -am ft-enable-cr my-app

--mca sstore_stage_skip_filem

Introduced in r23587. Included in v1.5.1 and later releases.

Only pretend to move files using FileM. Note: This is not for general use. It is a benchmarking and debugging option that should be used with care.
Default: 0 (disabled)

shell$ mpirun --mca sstore_stage_skip_filem 1 -am ft-enable-cr my-app

--mca sstore_stage_caching

Introduced in r23587. Included in v1.5.1 and later releases.

Maintain a node-local cache of the last checkpoint.
Default: 0 (disabled)

shell$ mpirun --mca sstore_stage_caching 1 \
              -am ft-enable-cr-recovery my-app

--mca sstore_stage_compress

Introduced in r23587. Included in v1.5.1 and later releases.

Compress local snapshots.
Default: 0 (disabled)

shell$ mpirun --mca sstore_stage_compress 1 \
              -am ft-enable-cr-recovery my-app

--mca sstore_stage_compress_delay

Introduced in r23587. Included in v1.5.1 and later releases.

Seconds to delay the start of compression on sync()
Default: 0

shell$ mpirun --mca sstore_stage_compress_delay 5 \
              -am ft-enable-cr-recovery my-app

--mca sstore_stage_progress_meter

Introduced in r23587. Included in v1.5.1 and later releases.

Display progress every X percent completed.
Default: 0 (off)

shell$ mpirun --mca sstore_stage_progress_meter 10 \
              -am ft-enable-cr-recovery my-app

--mca errmgr_hnp_autor_enable

Introduced in r23629. Included in v1.5.1 and later releases.

Enable Automatic Recovery feature.
Default: 0 (disabled)
Automatically enabled when using the -am ft-enable-cr-recovery parameter, so the two command lines below are equivalent.

shell$ mpirun --mca errmgr_hnp_autor_enable 1 \
              -am ft-enable-cr-recovery my-app
shell$ mpirun -am ft-enable-cr-recovery my-app

--mca errmgr_hnp_autor_timing

Introduced in r23629. Included in v1.5.1 and later releases.

Enable Automatic Recovery timing information.
Default: 0 (disabled)

shell$ mpirun --mca errmgr_hnp_autor_timing 1 \
              -am ft-enable-cr-recovery my-app

--mca errmgr_hnp_autor_recovery_delay

Introduced in r23629. Included in v1.5.1 and later releases.

Number of seconds to wait before starting to recover the job after a failure.
Default: 1

shell$ mpirun --mca errmgr_hnp_autor_recovery_delay 10 \
              -am ft-enable-cr-recovery my-app

--mca errmgr_hnp_autor_skip_oldnode

Introduced in r23629. Included in v1.5.1 and later releases.

Skip the node on which the failed process was running, even if it is still available.
Default: 1 (Enabled)

shell$ mpirun --mca errmgr_hnp_autor_skip_oldnode 0 \
              -am ft-enable-cr-recovery my-app

--mca errmgr_hnp_crmig_enable

Introduced in r23629. Included in v1.5.1 and later releases.

Enable C/R migration feature.
Default: 0 (disabled)
Automatically enabled when using the -am ft-enable-cr-recovery parameter, so the two command lines below are equivalent.

shell$ mpirun --mca errmgr_hnp_crmig_enable 1 \
              -am ft-enable-cr-recovery my-app
shell$ mpirun -am ft-enable-cr-recovery my-app

--mca errmgr_hnp_crmig_timing

Introduced in r23629. Included in v1.5.1 and later releases.

Enable C/R migration timing information.
Default: 0 (disabled)

shell$ mpirun --mca errmgr_hnp_crmig_timing 1 \
              -am ft-enable-cr-recovery my-app

--mca crcp

Which CRCP component to use
Default: NULL (auto-select)

shell$ mpirun --mca crcp bkmrk -am ft-enable-cr my-app

--mca crcp_base_verbose

Set the verbose level for the CRCP framework.
Default: 0 (off)

shell$ mpirun --mca crcp_base_verbose 10 -am ft-enable-cr my-app

--mca crcp_bkmrk_priority

Set the Priority of the CRCP BKMRK component. The component with the highest priority wins.
Default: 20

shell$ mpirun --mca crcp_bkmrk_priority 100 -am ft-enable-cr my-app

--mca crcp_bkmrk_verbose

Set the verbose level of the CRCP BKMRK component.
Default: 0 (set to match crcp_base_verbose)

shell$ mpirun --mca crcp_bkmrk_verbose 10 -am ft-enable-cr my-app

--mca crcp_bkmrk_timing

Enable performance timing for the Bookmark Exchange.
Default: 0 (disabled)

shell$ mpirun --mca crcp_bkmrk_timing 1 -am ft-enable-cr my-app



Deprecated Options

--mca crs_base_snapshot_dir

This option has been deprecated as of r23587. v1.5.0 is the last release containing this option. All later releases should use the following:
sstore_stage_local_snapshot_dir.

Directory to use when storing local snapshots. Note that this is only used if you disable snapc_base_store_in_place.
Default: "/tmp"

shell$ mpirun --mca crs_base_snapshot_dir /tmp/ramdisk \
              --mca snapc_base_store_in_place 0 \
              -am ft-enable-cr my-app

--mca snapc_base_global_snapshot_dir

This option has been deprecated as of r23587. v1.5.0 is the last release containing this option. All later releases should use the following:
sstore_base_global_snapshot_dir.

The base directory to use when storing global snapshots. This is the directory where all checkpoint files will be gathered during a checkpoint operation. Usually this is a globally mounted file system, but it does not need to be if using the FileM framework.
Default: $HOME

shell$ mpirun --mca snapc_base_global_snapshot_dir /home/me/ckpts \
              -am ft-enable-cr my-app

--mca snapc_base_global_shared

This option has been deprecated as of r23587. v1.5.0 is the last release containing this option. All later releases should use the following:
sstore_stage_global_is_shared.

If the snapc_base_global_snapshot_dir is on a shared file system that all nodes can access, then the checkpoint files can be copied more efficiently when FileM is used.
Default: 0 (disabled)

shell$ mpirun --mca snapc_base_global_shared 1 -am ft-enable-cr my-app

--mca snapc_base_store_in_place

This option has been deprecated as of r23587. v1.5.0 is the last release containing this option. All later releases should use the following:
The 'stage' component of SStore.

If the snapc_base_global_snapshot_dir is on a shared file system that all nodes can access, then the checkpoint files can be stored in place instead of incurring a remote copy.
Default: 1 (enabled)

shell$ mpirun --mca snapc_base_store_in_place 0 -am ft-enable-cr my-app

--mca snapc_base_global_snapshot_ref

This option has been deprecated as of r23587. v1.5.0 is the last release containing this option. All later releases should use the following:
sstore_base_global_snapshot_ref.

Specify the global snapshot reference that should be used for this job.
Default: "ompi_global_snapshot_PID.ckpt" (where PID is the PID of the mpirun process)

shell$ mpirun --mca snapc_base_global_snapshot_ref my_ref -am ft-enable-cr my-app

--mca snapc_base_establish_global_snapshot_dir

This option has been deprecated as of r23587. v1.5.0 is the last release containing this option. This option was removed since it was never well supported.

Establish the global snapshot directory on job startup, instead of on the first checkpoint operation. Note that this is currently only lightly tested, and may not work properly.
Default: 0 (disabled)

shell$ mpirun --mca snapc_base_establish_global_snapshot_dir 1 -am ft-enable-cr my-app

--mca snapc_full_skip_filem

This option has been deprecated as of r23587. v1.5.0 is the last release containing this option. All later releases should use the following:
sstore_stage_skip_filem.

Only pretend to move files using FileM. Note: This is not for general use. It is a benchmarking and debugging option that should be used with care.
Default: 0 (disabled)

shell$ mpirun --mca snapc_full_skip_filem 1 -am ft-enable-cr my-app



C/R API MPI Extension

--enable-mpi-ext=cr

Introduced in r23587. Included in v1.5.1 and later releases.

This configure option enables the optional C/R MPI Extension APIs.

./configure --with-ft=cr --enable-mpi-ext=cr

OMPI_CR_Checkpoint

Introduced in r23587. Included in v1.5.1 and later releases.

All processes must call this function.

OMPI_CR_CHECKPOINT(handle, seq, info)
 OUT    handle     Global snapshot reference (string)
 OUT    seq        Sequence number (int)
 INOUT  info       A set of key-value pairs providing additional
                   information to the MPI implementation regarding how
                   to continue after quiescence (handle, significant on
                   all ranks)

int OMPI_CR_Checkpoint(char **handle, int *seq, MPI_Info info);

#include <mpi.h>
#ifdef OPEN_MPI
  #include <mpi-ext.h>
#endif

int main(int argc, char *argv[])
{
  char *handle = NULL;       // Global snapshot reference (set by the call)
  int seq = 0;               // Sequence number (set by the call)
  int i, max_iter = 10;      // Iteration count is an example value

  MPI_Init(&argc, &argv);
  for(i=0; i < max_iter; ++i) {
#ifdef OMPI_HAVE_MPI_EXT_CR
    // Request a checkpoint before every step
    OMPI_CR_Checkpoint(&handle, &seq, MPI_INFO_NULL);
#endif
    // Resume normal operation.
  }
  MPI_Finalize();
  return 0;
}

OMPI_CR_Restart

Introduced in r23587. Included in v1.5.1 and later releases.

Not all processes must call this function.

OMPI_CR_RESTART(handle, seq, info)
 IN     handle     Global snapshot reference (string)
 IN     seq        Sequence number (int)
 INOUT  info       A set of key-value pairs providing additional
                   information to the MPI implementation regarding how
                   to continue after quiescence (handle, significant on
                   all ranks)

int OMPI_CR_Restart(char *handle, int seq, MPI_Info info);

#include <mpi.h>
#ifdef OPEN_MPI
  #include <mpi-ext.h>
#endif

int main(int argc, char *argv[])
{
  char *handle = NULL;       // Global snapshot reference
  int seq = 0;               // Sequence number
  int i, max_iter = 10;      // Iteration count is an example value

  MPI_Init(&argc, &argv);
  // Return errors to the caller instead of aborting the job
  MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);
  for(i=0; i < max_iter; ++i) {
#ifdef OMPI_HAVE_MPI_EXT_CR
    // Request a checkpoint before every step
    OMPI_CR_Checkpoint(&handle, &seq, MPI_INFO_NULL);
#endif
    // Resume normal operation.
    if( MPI_SUCCESS != MPI_Send(...) ) {
#ifdef OMPI_HAVE_MPI_EXT_CR
      // Restart from the last checkpoint, and keep processing
      OMPI_CR_Restart(handle, seq, MPI_INFO_NULL);
#else
      MPI_Abort(MPI_COMM_WORLD, -1);
#endif
    }
  }
  MPI_Finalize();
  return 0;
}

OMPI_CR_Migrate

Introduced in r23587. Included in v1.5.1 and later releases.

This is a collective operation.

OMPI_CR_MIGRATE(comm, hostname, rank, info)
 IN     comm      Communicator of processes to migrate
 IN     hostname  Name of the machine to move this rank onto.
                  May be NULL. (string)
 IN     rank      Process rank to move this rank close to.
                  May be negative, indicating NULL. (int)
 INOUT  info      A set of key-value pairs providing hints to the MPI
                  implementation regarding how this function should
                  behave (handle, significant on all ranks)

int OMPI_CR_Migrate(MPI_Comm comm, char *hostname, int rank, MPI_Info info);

#include <mpi.h>
#ifdef OPEN_MPI
  #include <mpi-ext.h>
#endif

int main(int argc, char *argv[])
{
  MPI_Info qinfo;
  int i, max_iter = 10;      // Iteration count is an example value

  MPI_Init(&argc, &argv);
  MPI_Info_create(&qinfo);   // The info object must exist before it is set

  for(i=0; i < max_iter; ++i) {
    // Receive notification that this node is going to fail
#ifdef OMPI_HAVE_MPI_EXT_CR
    // Ask to be migrated anywhere else in the system,
    //   except this node.
    MPI_Info_set(qinfo, "CR_OFF_NODE", "true");
    OMPI_CR_Migrate(MPI_COMM_SELF, NULL, -1, qinfo);
#endif
    // Resume normal operation.
  }

  MPI_Info_free(&qinfo);
  MPI_Finalize();
  return 0;
}

#include <mpi.h>
#ifdef OPEN_MPI
  #include <mpi-ext.h>
#endif

int main(int argc, char *argv[])
{
  MPI_Init(&argc, &argv);
  ...
  // Stage 1: Communication Pattern A
  for(i=0; i < max_iter; ++i) {
    ...
  }

#ifdef OMPI_HAVE_MPI_EXT_CR
  // Since the communication pattern is changing,
  // re-position my processes by using process migration.
  // get_best_neighbor() is an application-defined helper.
  neighbor_rank = get_best_neighbor(my_rank);
  OMPI_CR_Migrate(MPI_COMM_WORLD, NULL, neighbor_rank,
                  MPI_INFO_NULL);
#endif

  // Stage 2: Communication Pattern B
  for(i=0; i < max_iter; ++i) {
    ...
  }

  MPI_Finalize();
  return 0;
}

OMPI_CR_INC_register_callback

Introduced in r23587. Included in v1.5.1 and later releases.

This is a collective operation.

// INC Registration Function
int OMPI_CR_INC_register_callback(OMPI_CR_INC_callback_event_t event,
                                  OMPI_CR_INC_callback_function function,
                                  OMPI_CR_INC_callback_function *prev_function);

// INC Callback Function Signature
typedef int (*OMPI_CR_INC_callback_function)(OMPI_CR_INC_callback_event_t event,
                                             OMPI_CR_INC_callback_state_t state);

OMPI_CR_INC_callback_event_t
  OMPI_CR_INC_PRE_CRS_PRE_MPI    Pre-checkpoint, before OMPI INC.
  OMPI_CR_INC_PRE_CRS_POST_MPI   Pre-checkpoint, after OMPI INC.
  OMPI_CR_INC_POST_CRS_PRE_MPI   Continue/Restart, before OMPI INC.
  OMPI_CR_INC_POST_CRS_POST_MPI  Continue/Restart, after OMPI INC.

OMPI_CR_INC_callback_state_t
  OMPI_CR_INC_STATE_PREPARE      Pre-checkpoint
  OMPI_CR_INC_STATE_CONTINUE     Continue
  OMPI_CR_INC_STATE_RESTART      Restart
  OMPI_CR_INC_STATE_ERROR        Error
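
A possible way to use the registration function is sketched below: install a callback for one event and chain to whatever callback was registered before. So that the pattern compiles without Open MPI, the sketch carries its own stand-in declarations and a stand-in registration function; in a real program those would come from mpi-ext.h instead. The stand-in behavior (returning the previous callback through prev_function), the enum ordering, and the names prepare_count, my_inc_callback and install_inc_callback are assumptions made for illustration.

```c
#include <stddef.h>

/* --- Stand-ins so the sketch compiles without Open MPI (assumptions). --- */
typedef enum {
    OMPI_CR_INC_PRE_CRS_PRE_MPI,
    OMPI_CR_INC_PRE_CRS_POST_MPI,
    OMPI_CR_INC_POST_CRS_PRE_MPI,
    OMPI_CR_INC_POST_CRS_POST_MPI
} OMPI_CR_INC_callback_event_t;

typedef enum {
    OMPI_CR_INC_STATE_PREPARE,
    OMPI_CR_INC_STATE_CONTINUE,
    OMPI_CR_INC_STATE_RESTART,
    OMPI_CR_INC_STATE_ERROR
} OMPI_CR_INC_callback_state_t;

typedef int (*OMPI_CR_INC_callback_function)(OMPI_CR_INC_callback_event_t event,
                                             OMPI_CR_INC_callback_state_t state);

static OMPI_CR_INC_callback_function registered[4];

/* Stand-in: store the new callback and hand back the previous one. */
int OMPI_CR_INC_register_callback(OMPI_CR_INC_callback_event_t event,
                                  OMPI_CR_INC_callback_function function,
                                  OMPI_CR_INC_callback_function *prev_function)
{
    if (prev_function != NULL) {
        *prev_function = registered[event];
    }
    registered[event] = function;
    return 0;
}

/* --- Application side. --- */
static OMPI_CR_INC_callback_function prev_cb = NULL;
static int prepare_count = 0;   /* times the app prepared for a checkpoint */

static int my_inc_callback(OMPI_CR_INC_callback_event_t event,
                           OMPI_CR_INC_callback_state_t state)
{
    if (state == OMPI_CR_INC_STATE_PREPARE) {
        ++prepare_count;        /* e.g. flush application buffers here */
    }
    /* Chain to the callback that was registered before this one. */
    return (prev_cb != NULL) ? prev_cb(event, state) : 0;
}

/* Register my_inc_callback for the earliest pre-checkpoint event. */
int install_inc_callback(void)
{
    return OMPI_CR_INC_register_callback(OMPI_CR_INC_PRE_CRS_PRE_MPI,
                                         my_inc_callback, &prev_cb);
}
```

Keeping the previous callback and calling it from the new one lets several layers of an application react to the same event without overwriting each other.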

OMPI_CR_Quiesce_start

Introduced in r23587. Included in v1.5.1 and later releases.

This is a collective operation.

OMPI_CR_QUIESCE_START(comm, info)
 IN       comm        communicator (handle)
 INOUT    info        A set of key-value pairs providing hints to the MPI
                      implementation regarding how this function should
                      behave (handle, significant on all ranks)

int OMPI_CR_Quiesce_start(MPI_Comm comm, MPI_Info info);

#include <mpi.h>
#ifdef OPEN_MPI
  #include <mpi-ext.h>
#endif

int main(int argc, char *argv[])
{
  MPI_Init(&argc, &argv);
#ifdef OMPI_HAVE_MPI_EXT_CR
  OMPI_CR_Quiesce_start(MPI_COMM_WORLD, MPI_INFO_NULL);
  // Prepare application for application-level checkpoint.
  // Wait on any important outstanding receives
  // Save application state
  OMPI_CR_Quiesce_end(MPI_COMM_WORLD, MPI_INFO_NULL);
#endif
  // Resume normal operation.
  MPI_Finalize();
  return 0;
}

OMPI_CR_Quiesce_checkpoint

Introduced in r23587. Included in v1.5.1 and later releases.

This is a collective operation.

OMPI_CR_QUIESCE_CHECKPOINT(comm, handle, seq, info)
 IN     comm       communicator (handle)
 OUT    handle     Global snapshot reference (string)
 OUT    seq        Sequence number (int)
 INOUT  info       A set of key-value pairs providing hints to the MPI
                   implementation regarding how this function should
                   behave (handle, significant on all ranks)

int OMPI_CR_Quiesce_checkpoint(MPI_Comm comm, char **handle, int *seq,
                               MPI_Info info);

#include <mpi.h>
#ifdef OPEN_MPI
  #include <mpi-ext.h>
#endif

int main(int argc, char *argv[])
{
  char *handle = NULL;       // Global snapshot reference (set by the call)
  int seq = 0;               // Sequence number (set by the call)

  MPI_Init(&argc, &argv);
#ifdef OMPI_HAVE_MPI_EXT_CR
  OMPI_CR_Quiesce_start(MPI_COMM_WORLD, MPI_INFO_NULL);
  // Prepare application for checkpoint.
  // Wait on any important outstanding receives
  // Mark some memory regions for exclusion
  OMPI_CR_Quiesce_checkpoint(MPI_COMM_WORLD, &handle, &seq,
                             MPI_INFO_NULL);
  OMPI_CR_Quiesce_end(MPI_COMM_WORLD, MPI_INFO_NULL);
#endif
  // Resume normal operation.
  MPI_Finalize();
  return 0;
}

OMPI_CR_Quiesce_end

Introduced in r23587. Included in v1.5.1 and later releases.

This is a collective operation.

OMPI_CR_QUIESCE_END(comm, info)
 IN       comm        communicator (handle)
 INOUT    info        A set of key-value pairs providing additional
                      information to the MPI implementation regarding how
                      to continue after quiescence (handle, significant on
                      all ranks)

int OMPI_CR_Quiesce_end(MPI_Comm comm, MPI_Info info);

#include <mpi.h>
#ifdef OPEN_MPI
  #include <mpi-ext.h>
#endif

int main(int argc, char *argv[])
{
  MPI_Init(&argc, &argv);
#ifdef OMPI_HAVE_MPI_EXT_CR
  OMPI_CR_Quiesce_start(MPI_COMM_WORLD, MPI_INFO_NULL);
  // Prepare application for application-level checkpoint.
  // Wait on any important outstanding receives
  // Save application state
  OMPI_CR_Quiesce_end(MPI_COMM_WORLD, MPI_INFO_NULL);
#endif
  // Resume normal operation.
  MPI_Finalize();
  return 0;
}

OMPI_CR_self_register_checkpoint_callback

Introduced in r23587. Included in v1.5.1 and later releases.

The self CRS must be used for these functions to work.

// Default Checkpoint Callback
int opal_crs_self_user_checkpoint(char **restart_cmd);

// SELF CRS Checkpoint Registration Function
int OMPI_CR_self_register_checkpoint_callback(OMPI_CR_self_checkpoint_fn function);
// SELF CRS Callback Function Signature
typedef int (*OMPI_CR_self_checkpoint_fn)(char **restart_cmd);

OMPI_CR_self_register_restart_callback

Introduced in r23587. Included in v1.5.1 and later releases.

This is a collective operation.

// Default Restart Callback
int opal_crs_self_user_restart(void);

// SELF CRS Restart Registration Function
int OMPI_CR_self_register_restart_callback(OMPI_CR_self_restart_fn function);
// SELF CRS Callback Function Signature
typedef int (*OMPI_CR_self_restart_fn)(void);

OMPI_CR_self_register_continue_callback

Introduced in r23587. Included in v1.5.1 and later releases.

This is a collective operation.

// Default Continue Callback
int opal_crs_self_user_continue(void);

// SELF CRS Continue Registration Function
int OMPI_CR_self_register_continue_callback(OMPI_CR_self_continue_fn function);
// SELF CRS Callback Function Signature
typedef int (*OMPI_CR_self_continue_fn)(void);
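
The three SELF CRS callbacks could look roughly like the sketch below, which uses the default callback names and signatures listed in this section. The bodies are illustrative assumptions: the state file name, the app_iteration variable, and the choice to leave restart_cmd as NULL are not taken from Open MPI, and this section does not describe what restart_cmd is expected to hold.

```c
#include <stdio.h>

/* Hypothetical application state and the file it is saved to. */
static const char *state_file = "app_state.dat";
int app_iteration = 0;

/* Checkpoint: write the application state to the state file. */
int opal_crs_self_user_checkpoint(char **restart_cmd)
{
    FILE *fp = fopen(state_file, "w");

    if (fp == NULL) {
        return -1;
    }
    fprintf(fp, "%d\n", app_iteration);
    fclose(fp);

    if (restart_cmd != NULL) {
        *restart_cmd = NULL;   /* assumption: no custom restart command */
    }
    return 0;
}

/* Continue: nothing to do after a checkpoint in this sketch. */
int opal_crs_self_user_continue(void)
{
    return 0;
}

/* Restart: read the application state back from the state file. */
int opal_crs_self_user_restart(void)
{
    FILE *fp = fopen(state_file, "r");
    int ok;

    if (fp == NULL) {
        return -1;
    }
    ok = (fscanf(fp, "%d", &app_iteration) == 1);
    fclose(fp);
    return ok ? 0 : -1;
}
```

Callbacks with other names would be passed to the corresponding OMPI_CR_self_register_*_callback functions instead of relying on the default names.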
