FMS  2025.04
Flexible Modeling System
mpp.F90
1 !***********************************************************************
2 !* Apache License 2.0
3 !*
4 !* This file is part of the GFDL Flexible Modeling System (FMS).
5 !*
6 !* Licensed under the Apache License, Version 2.0 (the "License");
7 !* you may not use this file except in compliance with the License.
8 !* You may obtain a copy of the License at
9 !*
10 !* http://www.apache.org/licenses/LICENSE-2.0
11 !*
12 !* FMS is distributed in the hope that it will be useful, but WITHOUT
13 !* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied;
14 !* without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
15 !* PARTICULAR PURPOSE. See the License for the specific language
16 !* governing permissions and limitations under the License.
17 !***********************************************************************
18 !-----------------------------------------------------------------------
19 ! Communication for message-passing codes
20 !
21 ! AUTHOR: V. Balaji (V.Balaji@noaa.gov)
22 ! SGI/GFDL Princeton University
23 !
24 !-----------------------------------------------------------------------
25 
26 !> @defgroup mpp_mod mpp_mod
27 !> @ingroup mpp
28 !> @brief This module defines interfaces for common operations using message-passing libraries.
29 !! Any type-less arguments in the documentation are MPP_TYPE_ which is defined by the pre-processor
30 !! to create multiple subroutines out of one implementation for use in an interface. See the note
31 !! below for more information
32 !!
33 !> @author V. Balaji <"V.Balaji@noaa.gov">
34 !!
35 !! A set of simple calls to provide a uniform interface
36 !! to different message-passing libraries. It currently can be
37 !! implemented either in the SGI/Cray native SHMEM library or in the MPI
38 !! standard. Other libraries (e.g MPI-2, Co-Array Fortran) can be
39 !! incorporated as the need arises.
40 !!
41 !! The data transfer between a processor and its own memory is based
42 !! on <TT>load</TT> and <TT>store</TT> operations upon
43 !! memory. Shared-memory systems (including distributed shared memory
44 !! systems) have a single address space and any processor can acquire any
45 !! data within the memory by <TT>load</TT> and
46 !! <TT>store</TT>. The situation is different for distributed
47 !! parallel systems. Specialized MPP systems such as the T3E can simulate
48 !! shared-memory by direct data acquisition from remote memory. But if
49 !! the parallel code is distributed across a cluster, or across the Net,
50 !! messages must be sent and received using the protocols for
51 !! long-distance communication, such as TCP/IP. This requires a
52 !! ``handshaking'' between nodes of the distributed system. One can think
53 !! of the two different methods as involving <TT>put</TT>s or
54 !! <TT>get</TT>s (e.g the SHMEM library), or in the case of
55 !! negotiated communication (e.g MPI), <TT>send</TT>s and
56 !! <TT>recv</TT>s.
57 !!
58 !! The difference between SHMEM and MPI is that SHMEM uses one-sided
59 !! communication, which can have very low-latency high-bandwidth
60 !! implementations on tightly coupled systems. MPI is a standard
61 !! developed for distributed computing across loosely-coupled systems,
62 !! and therefore incurs a software penalty for negotiating the
63 !! communication. It is however an open industry standard whereas SHMEM
64 !! is a proprietary interface. Besides, the <TT>put</TT>s or
65 !! <TT>get</TT>s on which it is based cannot currently be implemented in
66 !! a cluster environment (there are recent announcements from Compaq that
67 !! occasion hope).
68 !!
69 !! The message-passing requirements of climate and weather codes can be
70 !! reduced to a fairly simple minimal set, which is easily implemented in
71 !! any message-passing API. <TT>mpp_mod</TT> provides this API.
72 !!
73 !! Features of <TT>mpp_mod</TT> include:
74 !! <ol>
75 !! <li> Simple, minimal API, with free access to underlying API for </li>
76 !! more complicated stuff.<BR/>
77 !! <li> Design toward typical use in climate/weather CFD codes. </li>
78 !! <li> Performance to be not significantly lower than any native API. </li>
79 !! </ol>
80 !!
81 !! This module is used to develop higher-level calls for
82 !! domain decomposition (@ref mpp_domains) and parallel I/O (@ref fms2_io)
83 !! <br/>
84 !! Parallel computing is initially daunting, but it soon becomes
85 !! second nature, much the way many of us can now write vector code
86 !! without much effort. The key insight required while reading and
87 !! writing parallel code is in arriving at a mental grasp of several
88 !! independent parallel execution streams through the same code (the SPMD
89 !! model). Each variable you examine may have different values for each
90 !! stream, the processor ID being an obvious example. Subroutines and
91 !! function calls are particularly subtle, since it is not always obvious
92 !! from looking at a call what synchronization between execution streams
93 !! it implies. An example of erroneous code would be a global barrier
94 !! call (see @ref mpp_sync below) placed
95 !! within a code block that not all PEs will execute, e.g:
96 !!
97 !! <PRE>
98 !! if( pe.EQ.0 )call mpp_sync()
99 !! </PRE>
100 !!
101 !! Here only PE 0 reaches the barrier, where it will wait
102 !! indefinitely. While this is a particularly egregious example to
103 !! illustrate the coding flaw, more subtle versions of the same are
104 !! among the most common errors in parallel code.
105 !! <br/>
106 !! It is therefore important to be conscious of the context of a
107 !! subroutine or function call, and the implied synchronization. There
108 !! are certain calls here (e.g <TT>mpp_declare_pelist, mpp_init,
109 !! mpp_set_stack_size</TT>) which must be called by all
110 !! PEs. There are others which must be called by a subset of PEs (here
111 !! called a <TT>pelist</TT>) which must be called by all the PEs in the
112 !! <TT>pelist</TT> (e.g <TT>mpp_max, mpp_sum, mpp_sync</TT>). Still
113 !! others imply no synchronization at all. I will make every effort to
114 !! highlight the context of each call in the MPP modules, so that the
115 !! implicit synchronization is spelt out.
116 !! <br/>
117 !! For performance it is necessary to keep synchronization as limited
118 !! as the algorithm being implemented will allow. For instance, a single
119 !! message between two PEs should only imply synchronization across the
120 !! PEs in question. A <I>global</I> synchronization (or <I>barrier</I>)
121 !! is likely to be slow, and is best avoided. But codes first
122 !! parallelized on a Cray T3E tend to have many global syncs, as very
123 !! fast barriers were implemented there in hardware.
124 !! <br/>
125 !! Another reason to use pelists is to run a single program in MPMD
126 !! mode, where different PE subsets work on different portions of the
127 !! code. A typical example is to assign an ocean model and atmosphere
128 !! model to different PE subsets, and couple them concurrently instead of
129 !! running them serially. The MPP module provides the notion of a
130 !! <I>current pelist</I>, which is set when a group of PEs branch off
131 !! into a subset. Subsequent calls that omit the <TT>pelist</TT> optional
132 !! argument (seen below in many of the individual calls) assume that the
133 !! implied synchronization is across the current pelist. The calls
134 !! <TT>mpp_root_pe</TT> and <TT>mpp_npes</TT> also return the values
135 !! appropriate to the current pelist. The <TT>mpp_set_current_pelist</TT>
136 !! call is provided to set the current pelist.
137 !! </DESCRIPTION>
138 !! <br/>
139 !!
140 !! @note F90 is a strictly-typed language, and the syntax pass of the
141 !! compiler requires matching of type, kind and rank (TKR). Most calls
142 !! listed here use a generic type, shown here as <TT>MPP_TYPE_</TT>. This
143 !! is resolved in the pre-processor stage to any of a variety of
144 !! types. In general the MPP operations work on 4-byte and 8-byte
145 !! variants of <TT>integer, real, complex, logical</TT> variables, of
146 !! rank 0 to 5, leading to 48 specific module procedures under the same
147 !! generic interface. Any of the variables below shown as
148 !! <TT>MPP_TYPE_</TT> is treated in this way.
149 
150 module mpp_mod
151 
152 ! Define rank(X) for PGI compiler
153 #if defined( __PGI) || defined (__FLANG)
154 #define rank(X) size(shape(X))
155 #endif
156 
157 
158 #if defined(use_libMPI)
159  use mpi
160 #endif
161 
162  use iso_fortran_env, only : input_unit, output_unit, error_unit
163  use mpp_parameter_mod, only : mpp_verbose, mpp_debug, all_pes, any_pe, null_pe
164  use mpp_parameter_mod, only : note, warning, fatal, mpp_clock_detailed,mpp_clock_sync
165  use mpp_parameter_mod, only : clock_component, clock_subcomponent, clock_module_driver
166  use mpp_parameter_mod, only : clock_module, clock_routine, clock_loop, clock_infra
167  use mpp_parameter_mod, only : max_events, max_bins, max_event_types, max_clocks
168  use mpp_parameter_mod, only : maxpes, event_wait, event_allreduce, event_broadcast
169  use mpp_parameter_mod, only : event_alltoall
170  use mpp_parameter_mod, only : event_type_create, event_type_free
171  use mpp_parameter_mod, only : event_recv, event_send, mpp_ready, mpp_wait
172  use mpp_parameter_mod, only : mpp_parameter_version=>version
173  use mpp_parameter_mod, only : default_tag
174  use mpp_parameter_mod, only : comm_tag_1, comm_tag_2, comm_tag_3, comm_tag_4
175  use mpp_parameter_mod, only : comm_tag_5, comm_tag_6, comm_tag_7, comm_tag_8
176  use mpp_parameter_mod, only : comm_tag_9, comm_tag_10, comm_tag_11, comm_tag_12
177  use mpp_parameter_mod, only : comm_tag_13, comm_tag_14, comm_tag_15, comm_tag_16
178  use mpp_parameter_mod, only : comm_tag_17, comm_tag_18, comm_tag_19, comm_tag_20
179  use mpp_parameter_mod, only : mpp_fill_int,mpp_fill_double
180  use mpp_data_mod, only : stat, mpp_stack, ptr_stack, status, ptr_status, sync, ptr_sync
181  use mpp_data_mod, only : mpp_from_pe, ptr_from, remote_data_loc, ptr_remote
182  use mpp_data_mod, only : mpp_data_version=>version
183  use platform_mod
184 
185 implicit none
186 private
187 
188  !--- public parameters -----------------------------------------------
189  public :: mpp_verbose, mpp_debug, all_pes, any_pe, null_pe, note, warning, fatal
190  public :: mpp_clock_sync, mpp_clock_detailed, clock_component, clock_subcomponent
191  public :: clock_module_driver, clock_module, clock_routine, clock_loop, clock_infra
192  public :: maxpes, event_recv, event_send
193  public :: comm_tag_1, comm_tag_2, comm_tag_3, comm_tag_4
194  public :: comm_tag_5, comm_tag_6, comm_tag_7, comm_tag_8
195  public :: comm_tag_9, comm_tag_10, comm_tag_11, comm_tag_12
196  public :: comm_tag_13, comm_tag_14, comm_tag_15, comm_tag_16
197  public :: comm_tag_17, comm_tag_18, comm_tag_19, comm_tag_20
198  public :: mpp_fill_int,mpp_fill_double,mpp_info_null,mpp_comm_null
199  public :: mpp_init_test_full_init, mpp_init_test_init_true_only, mpp_init_test_peset_allocated
200  public :: mpp_init_test_clocks_init, mpp_init_test_datatype_list_init, mpp_init_test_logfile_init
201  public :: mpp_init_test_read_namelist, mpp_init_test_etc_unit, mpp_init_test_requests_allocated
202 
203  !--- public interface from mpp_util.h ------------------------------
204  public :: stdin, stdout, stderr, stdlog, warnlog, lowercase, uppercase, mpp_error, mpp_error_state
205  public :: mpp_set_warn_level, mpp_sync, mpp_sync_self, mpp_pe
206  public :: mpp_npes, mpp_root_pe, mpp_commid, mpp_set_root_pe, mpp_declare_pelist
207  public :: mpp_get_current_pelist, mpp_set_current_pelist, mpp_get_current_pelist_name
208  public :: mpp_clock_id, mpp_clock_set_grain, mpp_record_timing_data, get_unit
209  public :: read_ascii_file, read_input_nml, mpp_clock_begin, mpp_clock_end
210  public :: get_ascii_file_num_lines, get_ascii_file_num_lines_and_length
211  public :: mpp_record_time_start, mpp_record_time_end
212 
213  !--- public interface from mpp_comm.h ------------------------------
215  public :: mpp_sum_ad
216  public :: mpp_broadcast, mpp_init, mpp_exit
218  public :: mpp_type, mpp_byte, mpp_type_create, mpp_type_free
219 
220  !*********************************************************************
221  !
222  ! public data type
223  !
224  !*********************************************************************
225  !> Communication information for message passing libraries
226  !!
227  !> peset hold communicators as SHMEM-compatible triads (start, log2(stride), num)
228  !> @ingroup mpp_mod
229  type :: communicator
230  private
231  character(len=32) :: name
232  integer, pointer :: list(:) =>null()
233  integer :: count
234  integer :: start, log2stride !< dummy variables when libMPI is defined.
235  integer :: id, group !< MPI communicator and group id for this PE set.
236  end type communicator
237 
238  !> Communication event profile
239  !> @ingroup mpp_mod
240  type :: event
241  private
242  character(len=16) :: name
243  integer(i8_kind), dimension(MAX_EVENTS) :: ticks, bytes
244  integer :: calls
245  end type event
246 
247  !> a clock contains an array of event profiles for a region
248  !> @ingroup mpp_mod
249  type :: clock
250  private
251  character(len=32) :: name
252  integer(i8_kind) :: hits
253  integer(i8_kind) :: tick
254  integer(i8_kind) :: total_ticks
255  integer :: peset_num
256  logical :: sync_on_begin, detailed
257  integer :: grain
258  type(event), pointer :: events(:) =>null() !> if needed, allocate to MAX_EVENT_TYPES
259  logical :: is_on !> initialize to false. set true when calling mpp_clock_begin
260  !! set false when calling mpp_clock_end
261  end type clock
262 
263  !> Summary of information from a clock run
264  !> @ingroup mpp_mod
266  private
267  character(len=16) :: name
268  real(r8_kind) :: msg_size_sums(MAX_BINS)
269  real(r8_kind) :: msg_time_sums(MAX_BINS)
270  real(r8_kind) :: total_data
271  real(r8_kind) :: total_time
272  integer(i8_kind) :: msg_size_cnts(MAX_BINS)
273  integer(i8_kind) :: total_cnts
274  end type clock_data_summary
275 
276  !> holds name and clock data for use in @ref mpp_util.h
277  !> @ingroup mpp_mod
279  private
280  character(len=16) :: name
281  type (Clock_Data_Summary) :: event(MAX_EVENT_TYPES)
282  end type summary_struct
283 
284  !> Data types for generalized data transfer (e.g. MPI_Type)
285  !> @ingroup mpp_mod
286  type :: mpp_type
287  private
288  integer :: counter !> Number of instances of this type
289  integer :: ndims
290  integer, allocatable :: sizes(:)
291  integer, allocatable :: subsizes(:)
292  integer, allocatable :: starts(:)
293  integer :: etype !> Elementary data type (e.g. MPI_BYTE)
294  integer :: id !> Identifier within message passing library (e.g. MPI)
295 
296  type(mpp_type), pointer :: prev => null()
297  type(mpp_type), pointer :: next => null()
298  end type mpp_type
299 
300  !> Persisent elements for linked list interaction
301  !> @ingroup mpp_mod
303  private
304  type(mpp_type), pointer :: head => null()
305  type(mpp_type), pointer :: tail => null()
306  integer :: length
307  end type mpp_type_list
308 
309 !***********************************************************************
310 !
311 ! public interface from mpp_util.h
312 !
313 !***********************************************************************
314  !> @brief Error handler.
315  !!
316  !> It is strongly recommended that all error exits pass through
317  !! <TT>mpp_error</TT> to assure the program fails cleanly. An individual
318  !! PE encountering a <TT>STOP</TT> statement, for instance, can cause the
319  !! program to hang. The use of the <TT>STOP</TT> statement is strongly
320  !! discouraged.
321  !!
322  !! Calling mpp_error with no arguments produces an immediate error
323  !! exit, i.e:
324  !! <PRE>
325  !! call mpp_error
326  !! call mpp_error()
327  !! </PRE>
328  !! are equivalent.
329  !!
330  !! The argument order
331  !! <PRE>
332  !! call mpp_error( routine, errormsg, errortype )
333  !! </PRE>
334  !! is also provided to support legacy code. In this version of the
335  !! call, none of the arguments may be omitted.
336  !!
337  !! The behaviour of <TT>mpp_error</TT> for a <TT>WARNING</TT> can be
338  !! controlled with an additional call <TT>mpp_set_warn_level</TT>.
339  !! <PRE>
340  !! call mpp_set_warn_level(ERROR)
341  !! </PRE>
342  !! causes <TT>mpp_error</TT> to treat <TT>WARNING</TT>
343  !! exactly like <TT>FATAL</TT>.
344  !! <PRE>
345  !! call mpp_set_warn_level(WARNING)
346  !! </PRE>
347  !! resets to the default behaviour described above.
348  !!
349  !! <TT>mpp_error</TT> also has an internal error state which
350  !! maintains knowledge of whether a warning has been issued. This can be
351  !! used at startup in a subroutine that checks if the model has been
352  !! properly configured. You can generate a series of warnings using
353  !! <TT>mpp_error</TT>, and then check at the end if any warnings has been
354  !! issued using the function <TT>mpp_error_state()</TT>. If the value of
355  !! this is <TT>WARNING</TT>, at least one warning has been issued, and
356  !! the user can take appropriate action:
357  !!
358  !! <PRE>
359  !! if( ... )call mpp_error( WARNING, '...' )
360  !! if( ... )call mpp_error( WARNING, '...' )
361  !! if( ... )call mpp_error( WARNING, '...' )
362  !! ...
363  !! if( mpp_error_state().EQ.WARNING )call mpp_error( FATAL, '...' )
364  !! </PRE>
365  !! </DESCRIPTION>
366  !! <br> Example usage:
367  !! @code{.F90}
368  !! call mpp_error( errortype, routine, errormsg )
369  !! @endcode
370  !! @param errortype
371  !! One of <TT>NOTE</TT>, <TT>WARNING</TT> or <TT>FATAL</TT>
372  !! (these definitions are acquired by use association).
373  !! <TT>NOTE</TT> writes <TT>errormsg</TT> to <TT>STDOUT</TT>.
374  !! <TT>WARNING</TT> writes <TT>errormsg</TT> to <TT>STDERR</TT>.
375  !! <TT>FATAL</TT> writes <TT>errormsg</TT> to <TT>STDERR</TT>,
376  !! and induces a clean error exit with a call stack traceback.
377  !! @param routine Calling routine name
378  !! @param errmsg Message to output
379  !! </IN>
380  !> @ingroup mpp_mod
381  interface mpp_error
382  module procedure mpp_error_basic
383  module procedure mpp_error_mesg
384  module procedure mpp_error_noargs
385  module procedure mpp_error_is
386  module procedure mpp_error_rs
387  module procedure mpp_error_ia
388  module procedure mpp_error_ra
389  module procedure mpp_error_ia_ia
390  module procedure mpp_error_ia_ra
391  module procedure mpp_error_ra_ia
392  module procedure mpp_error_ra_ra
393  module procedure mpp_error_ia_is
394  module procedure mpp_error_ia_rs
395  module procedure mpp_error_ra_is
396  module procedure mpp_error_ra_rs
397  module procedure mpp_error_is_ia
398  module procedure mpp_error_is_ra
399  module procedure mpp_error_rs_ia
400  module procedure mpp_error_rs_ra
401  module procedure mpp_error_is_is
402  module procedure mpp_error_is_rs
403  module procedure mpp_error_rs_is
404  module procedure mpp_error_rs_rs
405  end interface
406  !> Takes a given integer or real array and returns it as a string
407  !> @param[in] array An array of integers or reals
408  !> @returns string equivalent of given array
409  !> @ingroup mpp_mod
410  interface array_to_char
411  module procedure iarray_to_char
412  module procedure rarray_to_char
413  end interface
414 
415 !***********************************************************************
416 !
417 ! public interface from mpp_comm.h
418 !
419 !***********************************************************************
420 
421 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
422  ! !
423  ! ROUTINES TO INITIALIZE/FINALIZE MPP MODULE: mpp_init, mpp_exit !
424  ! !
425 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
426 
427 !> @fn mpp_mod::mpp_init::mpp_init( flags, localcomm, test_level)
428 !> @ingroup mpp_mod
429 !> @brief Initialize @ref mpp_mod
430 !!
431 !> Called to initialize the <TT>mpp_mod</TT> package. It is recommended
432 !! that this call be the first executed line in your program. It sets the
433 !! number of PEs assigned to this run (acquired from the command line, or
434 !! through the environment variable <TT>NPES</TT>), and associates an ID
435 !! number to each PE. These can be accessed by calling @ref mpp_npes and
436 !! @ref mpp_pe.
437 !! <br> Example usage:
438 !!
439 !! call mpp_init( flags )
440 !!
441 !! @param flags
442 !! <TT>flags</TT> can be set to <TT>MPP_VERBOSE</TT> to
443 !! have <TT>mpp_mod</TT> keep you informed of what it's up to.
444 !! @param test_level
445 !! Debugging flag to set amount of initialization tasks performed
446 
447 !> @fn mpp_mod::mpp_exit()
448 !> @brief Exit <TT>@ref mpp_mod</TT>.
449 !!
450 !> Called at the end of the run, or to re-initialize <TT>mpp_mod</TT>,
451 !! should you require that for some odd reason.
452 !!
453 !! This call implies synchronization across all PEs.
454 !!
455 !! <br>Example usage:
456 !!
457 !! call mpp_exit()
458 !> @ingroup mpp_mod
459 
460  !#####################################################################
461 
462  !> @fn subroutine mpp_set_stack_size(n)
463  !> @brief Allocate module internal workspace.
464  !> @param Integer to set stack size to(in words)
465  !> <TT>mpp_mod</TT> maintains a private internal array called
466  !! <TT>mpp_stack</TT> for private workspace. This call sets the length,
467  !! in words, of this array.
468  !!
469  !! The <TT>mpp_init</TT> call sets this
470  !! workspace length to a default of 32768, and this call may be used if a
471  !! longer workspace is needed.
472  !!
473  !! This call implies synchronization across all PEs.
474  !!
475  !! This workspace is symmetrically allocated, as required for
476  !! efficient communication on SGI and Cray MPP systems. Since symmetric
477  !! allocation must be performed by <I>all</I> PEs in a job, this call
478  !! must also be called by all PEs, using the same value of
479  !! <TT>n</TT>. Calling <TT>mpp_set_stack_size</TT> from a subset of PEs,
480  !! or with unequal argument <TT>n</TT>, may cause the program to hang.
481  !!
482  !! If any MPP call using <TT>mpp_stack</TT> overflows the declared
483  !! stack array, the program will abort with a message specifying the
484  !! stack length that is required. Many users wonder why, if the required
485  !! stack length can be computed, it cannot also be specified at that
486  !! point. This cannot be automated because there is no way for the
487  !! program to know if all PEs are present at that call, and with equal
488  !! values of <TT>n</TT>. The program must be rerun by the user with the
489  !! correct argument to <TT>mpp_set_stack_size</TT>, called at an
490  !! appropriate point in the code where all PEs are known to be present.
491  !! @verbose call mpp_set_stack_size(n)
492  !!
493  !> @ingroup mpp_mod
494  public :: mpp_set_stack_size
495  ! from mpp_util.h
496 
497 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
498 ! !
499 ! DATA TRANSFER TYPES: mpp_type_create !
500 ! !
501 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
502 
503  !> @brief Create a mpp_type variable
504  !> @param[in] field A field of any numerical or logical type
505  !> @param[in] array_of_subsizes Integer array of subsizes
506  !> @param[in] array_of_starts Integer array of starts
507  !> @param[out] dtype_out Output variable for created @ref mpp_type
508  !> @ingroup mpp_mod
509  interface mpp_type_create
510  module procedure mpp_type_create_int4
511  module procedure mpp_type_create_int8
512  module procedure mpp_type_create_real4
513  module procedure mpp_type_create_real8
514  module procedure mpp_type_create_cmplx4
515  module procedure mpp_type_create_cmplx8
516  module procedure mpp_type_create_logical4
517  module procedure mpp_type_create_logical8
518  end interface mpp_type_create
519 
520 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
521  ! !
522  ! GLOBAL REDUCTION ROUTINES: mpp_max, mpp_sum, mpp_min !
523  ! !
524 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
525 
526  !> @brief Reduction operations.
527  !> Find the max of scalar a from the PEs in pelist
528  !! result is also automatically broadcast to all PEs
529  !! @code{.F90}
530  !! call mpp_max( a, pelist )
531  !! @endcode
532  !> @param a <TT>real</TT> or <TT>integer</TT>, of 4-byte of 8-byte kind.
533  !> @param pelist If <TT>pelist</TT> is omitted, the context is assumed to be the
534  !! current pelist. This call implies synchronization across the PEs in
535  !! <TT>pelist</TT>, or the current pelist if <TT>pelist</TT> is absent.
536  !> @ingroup mpp_mod
537  interface mpp_max
538  module procedure mpp_max_real8_0d
539  module procedure mpp_max_real8_1d
540  module procedure mpp_max_int8_0d
541  module procedure mpp_max_int8_1d
542  module procedure mpp_max_real4_0d
543  module procedure mpp_max_real4_1d
544  module procedure mpp_max_int4_0d
545  module procedure mpp_max_int4_1d
546  end interface
547 
548  !> @brief Reduction operations.
549  !> Find the min of scalar a from the PEs in pelist
550  !! result is also automatically broadcast to all PEs
551  !! @code{.F90}
552  !! call mpp_min( a, pelist )
553  !! @endcode
554  !> @param a <TT>real</TT> or <TT>integer</TT>, of 4-byte of 8-byte kind.
555  !> @param pelist If <TT>pelist</TT> is omitted, the context is assumed to be the
556  !! current pelist. This call implies synchronization across the PEs in
557  !! <TT>pelist</TT>, or the current pelist if <TT>pelist</TT> is absent.
558  !> @ingroup mpp_mod
559  interface mpp_min
560  module procedure mpp_min_real8_0d
561  module procedure mpp_min_real8_1d
562  module procedure mpp_min_int8_0d
563  module procedure mpp_min_int8_1d
564  module procedure mpp_min_real4_0d
565  module procedure mpp_min_real4_1d
566  module procedure mpp_min_int4_0d
567  module procedure mpp_min_int4_1d
568  end interface
569 
570 
571  !> @brief Reduction operation.
572  !!
573  !> <TT>MPP_TYPE_</TT> corresponds to any 4-byte and 8-byte variant of
574  !! <TT>integer, real, complex</TT> variables, of rank 0 or 1. A
575  !! contiguous block from a multi-dimensional array may be passed by its
576  !! starting address and its length, as in <TT>f77</TT>.
577  !!
578  !! Library reduction operators are not required or guaranteed to be
579  !! bit-reproducible. In any case, changing the processor count changes
580  !! the data layout, and thus very likely the order of operations. For
581  !! bit-reproducible sums of distributed arrays, consider using the
582  !! <TT>mpp_global_sum</TT> routine provided by the
583  !! @ref mpp_domains module.
584  !!
585  !! The <TT>bit_reproducible</TT> flag provided in earlier versions of
586  !! this routine has been removed.
587  !!
588  !!
589  !! If <TT>pelist</TT> is omitted, the context is assumed to be the
590  !! current pelist. This call implies synchronization across the PEs in
591  !! <TT>pelist</TT>, or the current pelist if <TT>pelist</TT> is absent.
592  !! Example usage:
593  !! call mpp_sum( a, length, pelist )
594  !!
595  !> @ingroup mpp_mod
596  interface mpp_sum
597  module procedure mpp_sum_int8
598  module procedure mpp_sum_int8_scalar
599  module procedure mpp_sum_int8_2d
600  module procedure mpp_sum_int8_3d
601  module procedure mpp_sum_int8_4d
602  module procedure mpp_sum_int8_5d
603  module procedure mpp_sum_real8
604  module procedure mpp_sum_real8_scalar
605  module procedure mpp_sum_real8_2d
606  module procedure mpp_sum_real8_3d
607  module procedure mpp_sum_real8_4d
608  module procedure mpp_sum_real8_5d
609 #ifdef OVERLOAD_C8
610  module procedure mpp_sum_cmplx8
611  module procedure mpp_sum_cmplx8_scalar
612  module procedure mpp_sum_cmplx8_2d
613  module procedure mpp_sum_cmplx8_3d
614  module procedure mpp_sum_cmplx8_4d
615  module procedure mpp_sum_cmplx8_5d
616 #endif
617  module procedure mpp_sum_int4
618  module procedure mpp_sum_int4_scalar
619  module procedure mpp_sum_int4_2d
620  module procedure mpp_sum_int4_3d
621  module procedure mpp_sum_int4_4d
622  module procedure mpp_sum_int4_5d
623  module procedure mpp_sum_real4
624  module procedure mpp_sum_real4_scalar
625  module procedure mpp_sum_real4_2d
626  module procedure mpp_sum_real4_3d
627  module procedure mpp_sum_real4_4d
628  module procedure mpp_sum_real4_5d
629 #ifdef OVERLOAD_C4
630  module procedure mpp_sum_cmplx4
631  module procedure mpp_sum_cmplx4_scalar
632  module procedure mpp_sum_cmplx4_2d
633  module procedure mpp_sum_cmplx4_3d
634  module procedure mpp_sum_cmplx4_4d
635  module procedure mpp_sum_cmplx4_5d
636 #endif
637  end interface
638 
639  !> Calculates sum of a given numerical array across pe's for adjoint domains
640  !> @ingroup mpp_mod
641  interface mpp_sum_ad
642  module procedure mpp_sum_int8_ad
643  module procedure mpp_sum_int8_scalar_ad
644  module procedure mpp_sum_int8_2d_ad
645  module procedure mpp_sum_int8_3d_ad
646  module procedure mpp_sum_int8_4d_ad
647  module procedure mpp_sum_int8_5d_ad
648  module procedure mpp_sum_real8_ad
649  module procedure mpp_sum_real8_scalar_ad
650  module procedure mpp_sum_real8_2d_ad
651  module procedure mpp_sum_real8_3d_ad
652  module procedure mpp_sum_real8_4d_ad
653  module procedure mpp_sum_real8_5d_ad
654 #ifdef OVERLOAD_C8
655  module procedure mpp_sum_cmplx8_ad
656  module procedure mpp_sum_cmplx8_scalar_ad
657  module procedure mpp_sum_cmplx8_2d_ad
658  module procedure mpp_sum_cmplx8_3d_ad
659  module procedure mpp_sum_cmplx8_4d_ad
660  module procedure mpp_sum_cmplx8_5d_ad
661 #endif
662  module procedure mpp_sum_int4_ad
663  module procedure mpp_sum_int4_scalar_ad
664  module procedure mpp_sum_int4_2d_ad
665  module procedure mpp_sum_int4_3d_ad
666  module procedure mpp_sum_int4_4d_ad
667  module procedure mpp_sum_int4_5d_ad
668  module procedure mpp_sum_real4_ad
669  module procedure mpp_sum_real4_scalar_ad
670  module procedure mpp_sum_real4_2d_ad
671  module procedure mpp_sum_real4_3d_ad
672  module procedure mpp_sum_real4_4d_ad
673  module procedure mpp_sum_real4_5d_ad
674 #ifdef OVERLOAD_C4
675  module procedure mpp_sum_cmplx4_ad
676  module procedure mpp_sum_cmplx4_scalar_ad
677  module procedure mpp_sum_cmplx4_2d_ad
678  module procedure mpp_sum_cmplx4_3d_ad
679  module procedure mpp_sum_cmplx4_4d_ad
680  module procedure mpp_sum_cmplx4_5d_ad
681 #endif
682  end interface
683 
684  !> @brief Gather data sent from pelist onto the root pe
685  !! Wrapper for MPI_gather, can be used with and without indices
686  !> @ingroup mpp_mod
687  !!
688  !> @param sbuf MPP_TYPE_ data buffer to send
689  !> @param rbuf MPP_TYPE_ data buffer to receive
690  !> @param pelist integer(:) optional pelist to gather from, defaults to current
691  !>
692  !> <BR> Example usage:
693  !!
694  !! call mpp_gather(send_buffer,recv_buffer, pelist)
695  !! call mpp_gather(is, ie, js, je, pelist, array_seg, data, is_root_pe)
696  !!
697  interface mpp_gather
698  module procedure mpp_gather_logical4
699  module procedure mpp_gatherv_logical4
700  module procedure mpp_gather_logical_1d
701  module procedure mpp_gather_int4
702  module procedure mpp_gather_int8
703  module procedure mpp_gatherv_int4
704  module procedure mpp_gatherv_int8
705  module procedure mpp_gather_int4_1d
706  module procedure mpp_gather_int8_1d
707  module procedure mpp_gather_real4
708  module procedure mpp_gather_real8
709  module procedure mpp_gatherv_real4
710  module procedure mpp_gatherv_real8
711  module procedure mpp_gather_real4_1d
712  module procedure mpp_gather_real8_1d
713  module procedure mpp_gather_logical_1dv
714  module procedure mpp_gather_int4_1dv
715  module procedure mpp_gather_int8_1dv
716  module procedure mpp_gather_real4_1dv
717  module procedure mpp_gather_real8_1dv
718  module procedure mpp_gather_pelist_logical_2d
719  module procedure mpp_gather_pelist_logical_3d
720  module procedure mpp_gather_pelist_int4_2d
721  module procedure mpp_gather_pelist_int4_3d
722  module procedure mpp_gather_pelist_int8_2d
723  module procedure mpp_gather_pelist_int8_3d
724  module procedure mpp_gather_pelist_real4_2d
725  module procedure mpp_gather_pelist_real4_3d
726  module procedure mpp_gather_pelist_real8_2d
727  module procedure mpp_gather_pelist_real8_3d
728  end interface
729 
730  !> @brief Scatter (ie - is) * (je - js) contiguous elements of array data from the designated root pe
731  !! into contigous members of array segment in each pe that is included in the pelist argument.
732  !> @ingroup mpp_mod
733  !!
734  !> @param is, ie integer start and end index of the first dimension of the segment array
735  !> @param je, js integer start and end index of the second dimension of the segment array
736  !> @param pelist integer(:) the PE list of target pes, needs to be monotonically increasing
737  !> @param array_seg MPP_TYPE_ 2D array that the data is to be copied into
738  !> @param data MPP_TYPE_ the source array
739  !> @param is_root_pe logical true if calling from root pe
740  !> @param ishift integer offsets specifying the first elelement in the data array
741  !> @param nk integer size of third dimension for 3D calls
742  !!
743  !> <BR> Example usage:
744  !!
745  !! call mpp_scatter(is, ie, js, je, pelist, segment, data, .true.)
746  !!
747  interface mpp_scatter
748  module procedure mpp_scatterv_int4
749  module procedure mpp_scatter_pelist_int4_2d
750  module procedure mpp_scatter_pelist_int4_3d
751  module procedure mpp_scatterv_int8
752  module procedure mpp_scatter_pelist_int8_2d
753  module procedure mpp_scatter_pelist_int8_3d
754  module procedure mpp_scatterv_real4
755  module procedure mpp_scatter_pelist_real4_2d
756  module procedure mpp_scatter_pelist_real4_3d
757  module procedure mpp_scatterv_real8
758  module procedure mpp_scatter_pelist_real8_2d
759  module procedure mpp_scatter_pelist_real8_3d
760  end interface
761 
762  !#####################################################################
763  !> @brief Scatter a vector across all PEs
764  !!
765  !> Transpose the vector and PE index
766  !! Wrapper for the MPI_alltoall function, includes more generic _V and _W
767  !! versions if given displacements/data types
768  !!
769  !! Generic MPP_TYPE_ implentations:
770  !! <li> @ref mpp_alltoall_ </li>
771  !! <li> @ref mpp_alltoallv_ </li>
772  !! <li> @ref mpp_alltoallw_ </li>
773  !!
774  !> @ingroup mpp_mod
775  interface mpp_alltoall
776  module procedure mpp_alltoall_int4
777  module procedure mpp_alltoall_int8
778  module procedure mpp_alltoall_real4
779  module procedure mpp_alltoall_real8
780 #ifdef OVERLOAD_C4
781  module procedure mpp_alltoall_cmplx4
782 #endif
783 #ifdef OVERLOAD_C8
784  module procedure mpp_alltoall_cmplx8
785 #endif
786  module procedure mpp_alltoall_logical4
787  module procedure mpp_alltoall_logical8
788  module procedure mpp_alltoall_int4_v
789  module procedure mpp_alltoall_int8_v
790  module procedure mpp_alltoall_real4_v
791  module procedure mpp_alltoall_real8_v
792 #ifdef OVERLOAD_C4
793  module procedure mpp_alltoall_cmplx4_v
794 #endif
795 #ifdef OVERLOAD_C8
796  module procedure mpp_alltoall_cmplx8_v
797 #endif
798  module procedure mpp_alltoall_logical4_v
799  module procedure mpp_alltoall_logical8_v
800  module procedure mpp_alltoall_int4_w
801  module procedure mpp_alltoall_int8_w
802  module procedure mpp_alltoall_real4_w
803  module procedure mpp_alltoall_real8_w
804 #ifdef OVERLOAD_C4
805  module procedure mpp_alltoall_cmplx4_w
806 #endif
807 #ifdef OVERLOAD_C8
808  module procedure mpp_alltoall_cmplx8_w
809 #endif
810  module procedure mpp_alltoall_logical4_w
811  module procedure mpp_alltoall_logical8_w
812  end interface
813 
814 
815  !#####################################################################
816  !> @brief Basic message-passing call.
817  !!
818  !> <TT>MPP_TYPE_</TT> corresponds to any 4-byte and 8-byte variant of
819  !! <TT>integer, real, complex, logical</TT> variables, of rank 0 or 1. A
820  !! contiguous block from a multi-dimensional array may be passed by its
821  !! starting address and its length, as in <TT>f77</TT>.
822  !!
823  !! <TT>mpp_transmit</TT> is currently implemented as asynchronous
824  !! outward transmission and synchronous inward transmission. This follows
825  !! the behaviour of <TT>shmem_put</TT> and <TT>shmem_get</TT>. In MPI, it
826  !! is implemented as <TT>mpi_isend</TT> and <TT>mpi_recv</TT>. For most
827  !! applications, transmissions occur in pairs, and are here accomplished
828  !! in a single call.
829  !!
830  !! The special PE designations <TT>NULL_PE</TT>,
831  !! <TT>ANY_PE</TT> and <TT>ALL_PES</TT> are provided by use
832  !! association.
833  !!
834  !! <TT>NULL_PE</TT>: is used to disable one of the pair of
835  !! transmissions.<BR/>
836  !! <TT>ANY_PE</TT>: is used for unspecific remote
837  !! destination. (Please note that <TT>put_pe=ANY_PE</TT> has no meaning
838  !! in the MPI context, though it is available in the SHMEM invocation. If
839  !! portability is a concern, it is best avoided).<BR/>
840  !! <TT>ALL_PES</TT>: is used for broadcast operations.
841  !!
842  !! It is recommended that
843  !! @ref mpp_broadcast be used for
844  !! broadcasts.
845  !!
846  !! The following example illustrates the use of
847  !! <TT>NULL_PE</TT> and <TT>ALL_PES</TT>:
848  !!
849  !! <PRE>
850  !! real, dimension(n) :: a
851  !! if( pe.EQ.0 )then
852  !! do p = 1,npes-1
853  !! call mpp_transmit( a, n, p, a, n, NULL_PE )
854  !! end do
855  !! else
856  !! call mpp_transmit( a, n, NULL_PE, a, n, 0 )
857  !! end if
858  !!
859  !! call mpp_transmit( a, n, ALL_PES, a, n, 0 )
860  !! </PRE>
861  !!
862  !! The do loop and the broadcast operation above are equivalent.
863  !!
864  !! Two overloaded calls <TT>mpp_send</TT> and
865  !! <TT>mpp_recv</TT> have also been
866  !! provided. <TT>mpp_send</TT> calls <TT>mpp_transmit</TT>
867  !! with <TT>get_pe=NULL_PE</TT>. <TT>mpp_recv</TT> calls
868  !! <TT>mpp_transmit</TT> with <TT>put_pe=NULL_PE</TT>. Thus
869  !! the do loop above could be written more succinctly:
870  !!
871  !! <PRE>
872  !! if( pe.EQ.0 )then
873  !! do p = 1,npes-1
874  !! call mpp_send( a, n, p )
875  !! end do
876  !! else
877  !! call mpp_recv( a, n, 0 )
878  !! end if
879  !! </PRE>
880  !! <br>Example call:
881  !! @code{.F90}
882  !! call mpp_transmit( put_data, put_len, put_pe, get_data, get_len, get_pe )
883  !! @endcode
884  !> @ingroup mpp_mod
885  interface mpp_transmit
886  module procedure mpp_transmit_real8
887  module procedure mpp_transmit_real8_scalar
888  module procedure mpp_transmit_real8_2d
889  module procedure mpp_transmit_real8_3d
890  module procedure mpp_transmit_real8_4d
891  module procedure mpp_transmit_real8_5d
892 #ifdef OVERLOAD_C8
893  module procedure mpp_transmit_cmplx8
894  module procedure mpp_transmit_cmplx8_scalar
895  module procedure mpp_transmit_cmplx8_2d
896  module procedure mpp_transmit_cmplx8_3d
897  module procedure mpp_transmit_cmplx8_4d
898  module procedure mpp_transmit_cmplx8_5d
899 #endif
900  module procedure mpp_transmit_int8
901  module procedure mpp_transmit_int8_scalar
902  module procedure mpp_transmit_int8_2d
903  module procedure mpp_transmit_int8_3d
904  module procedure mpp_transmit_int8_4d
905  module procedure mpp_transmit_int8_5d
906  module procedure mpp_transmit_logical8
907  module procedure mpp_transmit_logical8_scalar
908  module procedure mpp_transmit_logical8_2d
909  module procedure mpp_transmit_logical8_3d
910  module procedure mpp_transmit_logical8_4d
911  module procedure mpp_transmit_logical8_5d
912 
913  module procedure mpp_transmit_real4
914  module procedure mpp_transmit_real4_scalar
915  module procedure mpp_transmit_real4_2d
916  module procedure mpp_transmit_real4_3d
917  module procedure mpp_transmit_real4_4d
918  module procedure mpp_transmit_real4_5d
919 
920 #ifdef OVERLOAD_C4
921  module procedure mpp_transmit_cmplx4
922  module procedure mpp_transmit_cmplx4_scalar
923  module procedure mpp_transmit_cmplx4_2d
924  module procedure mpp_transmit_cmplx4_3d
925  module procedure mpp_transmit_cmplx4_4d
926  module procedure mpp_transmit_cmplx4_5d
927 #endif
928  module procedure mpp_transmit_int4
929  module procedure mpp_transmit_int4_scalar
930  module procedure mpp_transmit_int4_2d
931  module procedure mpp_transmit_int4_3d
932  module procedure mpp_transmit_int4_4d
933  module procedure mpp_transmit_int4_5d
934  module procedure mpp_transmit_logical4
935  module procedure mpp_transmit_logical4_scalar
936  module procedure mpp_transmit_logical4_2d
937  module procedure mpp_transmit_logical4_3d
938  module procedure mpp_transmit_logical4_4d
939  module procedure mpp_transmit_logical4_5d
940  end interface
941  !> @brief Receive data from another PE
942  !!
943  !> @param[out] get_data scalar or array to get written with received data
944  !> @param get_len size of array to recv from get_data
945  !> @param from_pe PE number to receive from
946  !> @param block true for blocking, false for non-blocking. Defaults to true
947  !> @param tag communication tag
948  !> @param[out] request MPI request handle
949  !> @ingroup mpp_mod
950  interface mpp_recv
951  module procedure mpp_recv_real8
952  module procedure mpp_recv_real8_scalar
953  module procedure mpp_recv_real8_2d
954  module procedure mpp_recv_real8_3d
955  module procedure mpp_recv_real8_4d
956  module procedure mpp_recv_real8_5d
957 #ifdef OVERLOAD_C8
958  module procedure mpp_recv_cmplx8
959  module procedure mpp_recv_cmplx8_scalar
960  module procedure mpp_recv_cmplx8_2d
961  module procedure mpp_recv_cmplx8_3d
962  module procedure mpp_recv_cmplx8_4d
963  module procedure mpp_recv_cmplx8_5d
964 #endif
965  module procedure mpp_recv_int8
966  module procedure mpp_recv_int8_scalar
967  module procedure mpp_recv_int8_2d
968  module procedure mpp_recv_int8_3d
969  module procedure mpp_recv_int8_4d
970  module procedure mpp_recv_int8_5d
971  module procedure mpp_recv_logical8
972  module procedure mpp_recv_logical8_scalar
973  module procedure mpp_recv_logical8_2d
974  module procedure mpp_recv_logical8_3d
975  module procedure mpp_recv_logical8_4d
976  module procedure mpp_recv_logical8_5d
977 
978  module procedure mpp_recv_real4
979  module procedure mpp_recv_real4_scalar
980  module procedure mpp_recv_real4_2d
981  module procedure mpp_recv_real4_3d
982  module procedure mpp_recv_real4_4d
983  module procedure mpp_recv_real4_5d
984 
985 #ifdef OVERLOAD_C4
986  module procedure mpp_recv_cmplx4
987  module procedure mpp_recv_cmplx4_scalar
988  module procedure mpp_recv_cmplx4_2d
989  module procedure mpp_recv_cmplx4_3d
990  module procedure mpp_recv_cmplx4_4d
991  module procedure mpp_recv_cmplx4_5d
992 #endif
993  module procedure mpp_recv_int4
994  module procedure mpp_recv_int4_scalar
995  module procedure mpp_recv_int4_2d
996  module procedure mpp_recv_int4_3d
997  module procedure mpp_recv_int4_4d
998  module procedure mpp_recv_int4_5d
999  module procedure mpp_recv_logical4
1000  module procedure mpp_recv_logical4_scalar
1001  module procedure mpp_recv_logical4_2d
1002  module procedure mpp_recv_logical4_3d
1003  module procedure mpp_recv_logical4_4d
1004  module procedure mpp_recv_logical4_5d
1005  end interface
1006  !> Send data to a receiving PE.
1007  !!
1008  !> @param put_data scalar or array to get sent to a receiving PE
1009  !> @param put_len size of data to send from put_data
1010  !> @param to_pe PE number to send to
1011  !> @param block true for blocking, false for non-blocking. Defaults to true
1012  !> @param tag communication tag
1013  !> @param[out] request MPI request handle
1014  !! <br> Example usage:
1015  !! @code{.F90} call mpp_send(data, ie, pe) @endcode
1016  !> @ingroup mpp_mod
1017  interface mpp_send
1018  module procedure mpp_send_real8
1019  module procedure mpp_send_real8_scalar
1020  module procedure mpp_send_real8_2d
1021  module procedure mpp_send_real8_3d
1022  module procedure mpp_send_real8_4d
1023  module procedure mpp_send_real8_5d
1024 #ifdef OVERLOAD_C8
1025  module procedure mpp_send_cmplx8
1026  module procedure mpp_send_cmplx8_scalar
1027  module procedure mpp_send_cmplx8_2d
1028  module procedure mpp_send_cmplx8_3d
1029  module procedure mpp_send_cmplx8_4d
1030  module procedure mpp_send_cmplx8_5d
1031 #endif
1032  module procedure mpp_send_int8
1033  module procedure mpp_send_int8_scalar
1034  module procedure mpp_send_int8_2d
1035  module procedure mpp_send_int8_3d
1036  module procedure mpp_send_int8_4d
1037  module procedure mpp_send_int8_5d
1038  module procedure mpp_send_logical8
1039  module procedure mpp_send_logical8_scalar
1040  module procedure mpp_send_logical8_2d
1041  module procedure mpp_send_logical8_3d
1042  module procedure mpp_send_logical8_4d
1043  module procedure mpp_send_logical8_5d
1044 
1045  module procedure mpp_send_real4
1046  module procedure mpp_send_real4_scalar
1047  module procedure mpp_send_real4_2d
1048  module procedure mpp_send_real4_3d
1049  module procedure mpp_send_real4_4d
1050  module procedure mpp_send_real4_5d
1051 
1052 #ifdef OVERLOAD_C4
1053  module procedure mpp_send_cmplx4
1054  module procedure mpp_send_cmplx4_scalar
1055  module procedure mpp_send_cmplx4_2d
1056  module procedure mpp_send_cmplx4_3d
1057  module procedure mpp_send_cmplx4_4d
1058  module procedure mpp_send_cmplx4_5d
1059 #endif
1060  module procedure mpp_send_int4
1061  module procedure mpp_send_int4_scalar
1062  module procedure mpp_send_int4_2d
1063  module procedure mpp_send_int4_3d
1064  module procedure mpp_send_int4_4d
1065  module procedure mpp_send_int4_5d
1066  module procedure mpp_send_logical4
1067  module procedure mpp_send_logical4_scalar
1068  module procedure mpp_send_logical4_2d
1069  module procedure mpp_send_logical4_3d
1070  module procedure mpp_send_logical4_4d
1071  module procedure mpp_send_logical4_5d
1072  end interface
1073 
1074 
1075  !> @brief Perform parallel broadcasts
1076  !!
1077  !> The <TT>mpp_broadcast</TT> call has been added because the original
1078  !! syntax (using <TT>ALL_PES</TT> in <TT>mpp_transmit</TT>) did not
1079  !! support a broadcast across a pelist.
1080  !!
1081  !! <TT>MPP_TYPE_</TT> corresponds to any 4-byte and 8-byte variant of
1082  !! <TT>integer, real, complex, logical</TT> variables, of rank 0 or 1. A
1083  !! contiguous block from a multi-dimensional array may be passed by its
1084  !! starting address and its length, as in <TT>f77</TT>.
1085  !!
1086  !! Global broadcasts through the <TT>ALL_PES</TT> argument to
1087  !! @ref mpp_transmit are still provided for
1088  !! backward-compatibility.
1089  !!
1090  !! If <TT>pelist</TT> is omitted, the context is assumed to be the
1091  !! current pelist. <TT>from_pe</TT> must belong to the current
1092  !! pelist. This call implies synchronization across the PEs in
1093  !! <TT>pelist</TT>, or the current pelist if <TT>pelist</TT> is absent.
1094  !!
1095  !! <br>Example usage:
1096  !!
1097  !! call mpp_broadcast( data, length, from_pe, pelist )
1098  !!
1099  !> @param[inout] data Data to broadcast
1100  !> @param length Length of data to broadcast
1101  !> @param from_pe PE to send the data from
1102  !> @param pelist List of PE's to broadcast across, if not provided uses current list
1103  !> @ingroup mpp_mod
1104  interface mpp_broadcast
1105  module procedure mpp_broadcast_char
1106  module procedure mpp_broadcast_real8
1107  module procedure mpp_broadcast_real8_scalar
1108  module procedure mpp_broadcast_real8_2d
1109  module procedure mpp_broadcast_real8_3d
1110  module procedure mpp_broadcast_real8_4d
1111  module procedure mpp_broadcast_real8_5d
1112 #ifdef OVERLOAD_C8
1113  module procedure mpp_broadcast_cmplx8
1114  module procedure mpp_broadcast_cmplx8_scalar
1115  module procedure mpp_broadcast_cmplx8_2d
1116  module procedure mpp_broadcast_cmplx8_3d
1117  module procedure mpp_broadcast_cmplx8_4d
1118  module procedure mpp_broadcast_cmplx8_5d
1119 #endif
1120  module procedure mpp_broadcast_int8
1121  module procedure mpp_broadcast_int8_scalar
1122  module procedure mpp_broadcast_int8_2d
1123  module procedure mpp_broadcast_int8_3d
1124  module procedure mpp_broadcast_int8_4d
1125  module procedure mpp_broadcast_int8_5d
1126  module procedure mpp_broadcast_logical8
1127  module procedure mpp_broadcast_logical8_scalar
1128  module procedure mpp_broadcast_logical8_2d
1129  module procedure mpp_broadcast_logical8_3d
1130  module procedure mpp_broadcast_logical8_4d
1131  module procedure mpp_broadcast_logical8_5d
1132 
1133  module procedure mpp_broadcast_real4
1134  module procedure mpp_broadcast_real4_scalar
1135  module procedure mpp_broadcast_real4_2d
1136  module procedure mpp_broadcast_real4_3d
1137  module procedure mpp_broadcast_real4_4d
1138  module procedure mpp_broadcast_real4_5d
1139 
1140 #ifdef OVERLOAD_C4
1141  module procedure mpp_broadcast_cmplx4
1142  module procedure mpp_broadcast_cmplx4_scalar
1143  module procedure mpp_broadcast_cmplx4_2d
1144  module procedure mpp_broadcast_cmplx4_3d
1145  module procedure mpp_broadcast_cmplx4_4d
1146  module procedure mpp_broadcast_cmplx4_5d
1147 #endif
1148  module procedure mpp_broadcast_int4
1149  module procedure mpp_broadcast_int4_scalar
1150  module procedure mpp_broadcast_int4_2d
1151  module procedure mpp_broadcast_int4_3d
1152  module procedure mpp_broadcast_int4_4d
1153  module procedure mpp_broadcast_int4_5d
1154  module procedure mpp_broadcast_logical4
1155  module procedure mpp_broadcast_logical4_scalar
1156  module procedure mpp_broadcast_logical4_2d
1157  module procedure mpp_broadcast_logical4_3d
1158  module procedure mpp_broadcast_logical4_4d
1159  module procedure mpp_broadcast_logical4_5d
1160  end interface
1161 
1162  !#####################################################################
1163 
1164  !> @brief Calculate parallel checksums
1165  !!
1166  !> \e mpp_chksum is a parallel checksum routine that returns an
1167  !! identical answer for the same array irrespective of how it has been
1168  !! partitioned across processors. \e int_kind is the KIND
1169  !! parameter corresponding to long integers (see discussion on
1170  !! OS-dependent preprocessor directives) defined in
1171  !! the file platform.F90. \e MPP_TYPE_ corresponds to any
1172  !! 4-byte and 8-byte variant of \e integer, \e real, \e complex, \e logical
1173  !! variables, of rank 0 to 5.
1174  !!
1175  !! Integer checksums on FP data use the F90 <TT>TRANSFER()</TT>
1176  !! intrinsic.
1177  !!
1178  !! This provides identical results on a single-processor job, and to perform
1179  !! serial checksums on a single processor of a parallel job, you only
1180  !! need to use the optional <TT>pelist</TT> argument.
1181  !! <PRE>
1182  !! use mpp_mod
1183  !! integer :: pe, chksum
1184  !! real :: a(:)
1185  !! pe = mpp_pe()
1186  !! chksum = mpp_chksum( a, (/pe/) )
1187  !! </PRE>
1188  !!
1189  !! The additional functionality of <TT>mpp_chksum</TT> over
1190  !! serial checksums is to compute the checksum across the PEs in
1191  !! <TT>pelist</TT>. The answer is guaranteed to be the same for
1192  !! the same distributed array irrespective of how it has been
1193  !! partitioned.
1194  !!
1195  !! If <TT>pelist</TT> is omitted, the context is assumed to be the
1196  !! current pelist. This call implies synchronization across the PEs in
1197  !! <TT>pelist</TT>, or the current pelist if <TT>pelist</TT> is absent.
1198  !! <br> Example usage:
1199  !!
1200  !! mpp_chksum( var, pelist )
1201  !!
1202  !! @param var Data to calculate checksum of
1203  !! @param pelist Optional list of PE's to include in checksum calculation if not using
1204  !! current pelist
1205  !! @return Parallel checksum of var across given or implicit pelist
1206  !!
1207  !! Generic MPP_TYPE_ implentations:
1208  !! <li> @ref mpp_chksum_</li>
1209  !! <li> @ref mpp_chksum_int_</li>
1210  !! <li> @ref mpp_chksum_int_rmask_</li>
1211  !!
1212  !> @ingroup mpp_mod
1213  interface mpp_chksum
1214  module procedure mpp_chksum_i8_1d
1215  module procedure mpp_chksum_i8_2d
1216  module procedure mpp_chksum_i8_3d
1217  module procedure mpp_chksum_i8_4d
1218  module procedure mpp_chksum_i8_5d
1219  module procedure mpp_chksum_i8_1d_rmask
1220  module procedure mpp_chksum_i8_2d_rmask
1221  module procedure mpp_chksum_i8_3d_rmask
1222  module procedure mpp_chksum_i8_4d_rmask
1223  module procedure mpp_chksum_i8_5d_rmask
1224 
1225  module procedure mpp_chksum_i4_1d
1226  module procedure mpp_chksum_i4_2d
1227  module procedure mpp_chksum_i4_3d
1228  module procedure mpp_chksum_i4_4d
1229  module procedure mpp_chksum_i4_5d
1230  module procedure mpp_chksum_i4_1d_rmask
1231  module procedure mpp_chksum_i4_2d_rmask
1232  module procedure mpp_chksum_i4_3d_rmask
1233  module procedure mpp_chksum_i4_4d_rmask
1234  module procedure mpp_chksum_i4_5d_rmask
1235 
1236  module procedure mpp_chksum_r8_0d
1237  module procedure mpp_chksum_r8_1d
1238  module procedure mpp_chksum_r8_2d
1239  module procedure mpp_chksum_r8_3d
1240  module procedure mpp_chksum_r8_4d
1241  module procedure mpp_chksum_r8_5d
1242 
1243  module procedure mpp_chksum_r4_0d
1244  module procedure mpp_chksum_r4_1d
1245  module procedure mpp_chksum_r4_2d
1246  module procedure mpp_chksum_r4_3d
1247  module procedure mpp_chksum_r4_4d
1248  module procedure mpp_chksum_r4_5d
1249 #ifdef OVERLOAD_C8
1250  module procedure mpp_chksum_c8_0d
1251  module procedure mpp_chksum_c8_1d
1252  module procedure mpp_chksum_c8_2d
1253  module procedure mpp_chksum_c8_3d
1254  module procedure mpp_chksum_c8_4d
1255  module procedure mpp_chksum_c8_5d
1256 #endif
1257 #ifdef OVERLOAD_C4
1258  module procedure mpp_chksum_c4_0d
1259  module procedure mpp_chksum_c4_1d
1260  module procedure mpp_chksum_c4_2d
1261  module procedure mpp_chksum_c4_3d
1262  module procedure mpp_chksum_c4_4d
1263  module procedure mpp_chksum_c4_5d
1264 #endif
1265  end interface
1266 
1267 !> @addtogroup mpp_mod
1268 !> @{
1269 !***********************************************************************
1270 !
1271 ! module variables
1272 !
1273 !***********************************************************************
1274  integer, parameter :: PESET_MAX = 10000
1275  integer :: current_peset_max = 32
1276  type(communicator), allocatable :: peset(:) !< Will be allocated starting from 0, 0 is a dummy used
1277  !! to hold single-PE "self" communicator
1278  logical :: module_is_initialized = .false.
1279  logical :: debug = .false.
1280  integer :: npes=1, root_pe=0, pe=0
1281  integer(i8_kind) :: tick, ticks_per_sec, max_ticks, start_tick, end_tick, tick0=0
1282  integer :: mpp_comm_private
1283  logical :: first_call_system_clock_mpi=.true.
1284  real(r8_kind) :: mpi_count0=0 !< use to prevent integer overflow
1285  real(r8_kind) :: mpi_tick_rate=0.d0 !< clock rate for mpi_wtick()
1286  logical :: mpp_record_timing_data=.true.
1287  type(clock),save :: clocks(max_clocks)
1288  integer :: log_unit, etc_unit
1289  integer :: warn_unit !< unit number of the warning log
1290  character(len=32), parameter :: configfile='logfile'
1291  character(len=32), parameter :: warnfile='warnfile' !< base name for warninglog (appends ".<PE>.out")
1292  integer :: peset_num=0, current_peset_num=0
1293  integer :: world_peset_num !<the world communicator
1294  integer :: error
1295  integer :: clock_num=0, num_clock_ids=0,current_clock=0, previous_clock(max_clocks)=0
1296  real :: tick_rate
1297 
1298  type(mpp_type_list) :: datatypes
1299  type(mpp_type), target :: mpp_byte
1300 
1301  integer :: cur_send_request = 0
1302  integer :: cur_recv_request = 0
1303  integer, allocatable :: request_send(:)
1304  integer, allocatable :: request_recv(:)
1305  integer, allocatable :: size_recv(:)
1306  integer, allocatable :: type_recv(:)
1307 ! if you want to save the non-root PE information uncomment out the following line
1308 ! and comment out the assigment of etcfile to '/dev/null'
1309 #ifdef NO_DEV_NULL
1310  character(len=32) :: etcfile='._mpp.nonrootpe.msgs'
1311 #else
1312  character(len=32) :: etcfile='/dev/null'
1313 #endif
1314 
1315 !> Use the intrinsics in iso_fortran_env
1316  integer :: in_unit=input_unit, out_unit=output_unit, err_unit=error_unit
1317  integer :: stdout_unit
1318 
1319  !--- variables used in mpp_util.h
1320  type(summary_struct) :: clock_summary(max_clocks)
1321  logical :: warnings_are_fatal = .false.
1322  integer :: error_state=0
1323  integer :: clock_grain=clock_loop-1
1324 
1325  !--- variables used in mpp_comm.h
1326  integer :: clock0 !<measures total runtime from mpp_init to mpp_exit
1327  integer :: mpp_stack_size=0, mpp_stack_hwm=0
1328  logical :: verbose=.false.
1329 
1330  integer :: get_len_nocomm = 0 !< needed for mpp_transmit_nocomm.h
1331 
1332  !--- variables used in mpp_comm_mpi.inc
1333  integer, parameter :: mpp_init_test_full_init = -1
1334  integer, parameter :: mpp_init_test_init_true_only = 0
1335  integer, parameter :: mpp_init_test_peset_allocated = 1
1336  integer, parameter :: mpp_init_test_clocks_init = 2
1337  integer, parameter :: mpp_init_test_datatype_list_init = 3
1338  integer, parameter :: mpp_init_test_logfile_init = 4
1339  integer, parameter :: mpp_init_test_read_namelist = 5
1340  integer, parameter :: mpp_init_test_etc_unit = 6
1341  integer, parameter :: mpp_init_test_requests_allocated = 7
1342 
1343 !> MPP_INFO_NULL acts as an analagous mpp-macro for MPI_INFO_NULL to share with fms2_io NetCDF4
1344 !! mpi-io. The default value for the no-mpi case comes from Intel MPI and MPICH. OpenMPI sets
1345 !! a default value of '0'
1346 #if defined(use_libMPI)
1347  integer, parameter :: mpp_info_null = mpi_info_null
1348 #else
1349  integer, parameter :: mpp_info_null = 469762048
1350 #endif
1351 
1352 !> MPP_COMM_NULL acts as an analagous mpp-macro for MPI_COMM_NULL to share with fms2_io NetCDF4
1353 !! mpi-io. The default value for the no-mpi case comes from Intel MPI and MPICH. OpenMPI sets
1354 !! a default value of '2'
1355 #if defined(use_libMPI)
1356  integer, parameter :: mpp_comm_null = mpi_comm_null
1357 #else
1358  integer, parameter :: mpp_comm_null = 67108864
1359 #endif
1360 
1361 !***********************************************************************
1362 ! variables needed for subroutine read_input_nml (include/mpp_util.inc)
1363 !
1364 ! public variable needed for reading input nml file from an internal file
1365  character(len=:), dimension(:), allocatable, target, public :: input_nml_file
1366  logical :: read_ascii_file_on = .false.
1367 !***********************************************************************
1368 
1369 ! Include variable "version" to be written to log file.
1370 #include<file_version.h>
1371  public version
1372 
1373  integer, parameter :: max_request_min = 10000
1374  integer :: request_multiply = 20
1375 
1376  logical :: etc_unit_is_stderr = .false.
1377  integer :: max_request = 0
1378  logical :: sync_all_clocks = .false.
1379  namelist /mpp_nml/ etc_unit_is_stderr, request_multiply, mpp_record_timing_data, sync_all_clocks
1380 
1381  contains
1382 #include <system_clock.fh>
1383 #include <mpp_util.inc>
1384 #include <mpp_comm.inc>
1385 
1386  end module mpp_mod
1387 !> @}
1388 ! close documentation grouping
subroutine mpp_sync_self(pelist, check, request, msg_size, msg_type)
This is to check if current PE's outstanding puts are complete but we can't use shmem_fence because w...
integer warn_unit
unit number of the warning log
Definition: mpp.F90:1289
subroutine mpp_error_basic(errortype, errormsg)
A very basic error handler uses ABORT and FLUSH calls, may need to use cpp to rename.
integer function stdout()
This function returns the current standard fortran unit numbers for output.
Definition: mpp_util.inc:42
subroutine read_ascii_file(FILENAME, LENGTH, Content, PELIST)
Reads any ascii file into a character array and broadcasts it to the non-root mpi-tasks....
Definition: mpp_util.inc:1446
subroutine mpp_error_mesg(routine, errormsg, errortype)
overloads to mpp_error_basic, support for error_mesg routine in FMS
Definition: mpp_util.inc:174
subroutine mpp_set_current_pelist(pelist, no_sync)
Set context pelist.
Definition: mpp_util.inc:498
integer, parameter, public mpp_comm_null
MPP_COMM_NULL acts as an analagous mpp-macro for MPI_COMM_NULL to share with fms2_io NetCDF4 mpi-io....
Definition: mpp.F90:1358
subroutine mpp_init(flags, localcomm, test_level, alt_input_nml_path)
Initialize the mpp_mod module. Must be called before any usage.
integer get_len_nocomm
needed for mpp_transmit_nocomm.h
Definition: mpp.F90:1330
integer function stderr()
This function returns the current standard fortran unit numbers for error messages.
Definition: mpp_util.inc:50
subroutine read_input_nml(pelist_name_in, alt_input_nml_path)
Reads an existing input nml file into a character array and broadcasts it to the non-root mpi-tasks....
Definition: mpp_util.inc:1227
character(len=32), parameter warnfile
base name for warninglog (appends ".<PE>.out")
Definition: mpp.F90:1291
type(communicator), dimension(:), allocatable peset
Will be allocated starting from 0, 0 is a dummy used to hold single-PE "self" communicator.
Definition: mpp.F90:1276
subroutine mpp_type_free(dtype)
Deallocates memory for mpp_type objects @TODO This should probably not take a pointer,...
integer, parameter, public mpp_info_null
MPP_INFO_NULL acts as an analagous mpp-macro for MPI_INFO_NULL to share with fms2_io NetCDF4 mpi-io....
Definition: mpp.F90:1349
integer clock0
measures total runtime from mpp_init to mpp_exit
Definition: mpp.F90:1326
subroutine mpp_clock_set_grain(grain)
Set the level of granularity of timing measurements.
Definition: mpp_util.inc:649
real(r8_kind) mpi_tick_rate
clock rate for mpi_wtick()
Definition: mpp.F90:1285
subroutine mpp_declare_pelist(pelist, name, commID)
Declare a pelist.
Definition: mpp_util.inc:469
integer world_peset_num
the world communicator
Definition: mpp.F90:1293
subroutine mpp_exit()
Finalizes process termination. To be called at the end of a run. Certain mpi implementations(openmpi)...
integer function stdlog()
This function returns the current standard fortran unit numbers for log messages. Log messages,...
Definition: mpp_util.inc:58
integer in_unit
Use the intrinsics in iso_fortran_env.
Definition: mpp.F90:1316
integer function mpp_npes()
Returns processor count for current pelist.
Definition: mpp_util.inc:420
subroutine mpp_set_stack_size(n)
Set the mpp_stack variable to be at least n LONG words long.
integer function, dimension(2) get_ascii_file_num_lines_and_length(FILENAME, PELIST)
Function to determine the maximum line length and number of lines from an ascii file.
Definition: mpp_util.inc:1353
real(r8_kind) mpi_count0
use to prevent integer overflow
Definition: mpp.F90:1284
integer function mpp_pe()
Returns processor ID.
Definition: mpp_util.inc:406
subroutine mpp_sync(pelist, do_self)
Synchronize PEs in list.
integer function mpp_clock_id(name, flags, grain)
Return an ID for a new or existing clock.
Definition: mpp_util.inc:713
integer function warnlog()
This function returns unit number for the warning log if on the root pe, otherwise returns the etc_un...
Definition: mpp_util.inc:140
subroutine mpp_broadcast_char(char_data, length, from_pe, pelist)
Broadcasts a character string from the given pe to it's pelist.
integer function stdin()
This function returns the current standard fortran unit numbers for input.
Definition: mpp_util.inc:35
Takes a given integer or real array and returns it as a string.
Definition: mpp.F90:410
Scatter a vector across all PEs.
Definition: mpp.F90:775
Perform parallel broadcasts.
Definition: mpp.F90:1104
Calculate parallel checksums.
Definition: mpp.F90:1213
Error handler.
Definition: mpp.F90:381
Gather data sent from pelist onto the root pe Wrapper for MPI_gather, can be used with and without in...
Definition: mpp.F90:697
Reduction operations. Find the max of scalar a from the PEs in pelist result is also automatically br...
Definition: mpp.F90:537
Reduction operations. Find the min of scalar a from the PEs in pelist result is also automatically br...
Definition: mpp.F90:559
Receive data from another PE.
Definition: mpp.F90:950
Scatter (ie - is) * (je - js) contiguous elements of array data from the designated root pe into cont...
Definition: mpp.F90:747
Send data to a receiving PE.
Definition: mpp.F90:1017
Reduction operation.
Definition: mpp.F90:596
Calculates sum of a given numerical array across pe's for adjoint domains.
Definition: mpp.F90:641
Basic message-passing call.
Definition: mpp.F90:885
Create a mpp_type variable.
Definition: mpp.F90:509
a clock contains an array of event profiles for a region
Definition: mpp.F90:249
Summary of information from a clock run.
Definition: mpp.F90:265
Communication information for message passing libraries.
Definition: mpp.F90:229
Communication event profile.
Definition: mpp.F90:240
Data types for generalized data transfer (e.g. MPI_Type)
Definition: mpp.F90:286
Persisent elements for linked list interaction.
Definition: mpp.F90:302
holds name and clock data for use in mpp_util.h
Definition: mpp.F90:278