Prokopalo Y., Kaliberda N.V.

DNU, the Faculty of Applied Mathematics

The basic concepts and working principles of MPI technology.

         Many of those who are seriously engaged in programming have certainly run into tasks whose solution by known algorithms takes an enormous amount of time. Even the most modern and powerful computers need weeks, months, or even years to perform the enormous number of calculations required to solve such problems.

         In such cases, so-called parallel programming comes to the programmer's aid. The methods of parallel programming allow the necessary calculations to be distributed among processes running on different computers, thereby combining their computing power to work on a single problem. MPI is currently the most widespread programming technology for parallel computers with distributed memory. The basic method of interprocess communication in such systems is the passing of messages between processes, which is reflected in the name of the technology: Message Passing Interface. The MPI standard fixes an interface that must be observed both by the programming system on every computing platform and by the user when creating programs. Most modern implementations conform to version 1.1 of the MPI standard. The MPI-2.0 standard, which considerably extends the functionality of the previous version, appeared in 1997-1998; however, this variant of MPI has not yet gained wide distribution and has not been fully implemented in any system.

         MPI supports work with the Fortran and C languages. However, knowledge of one language or the other is not fundamental here, since the basic ideas of MPI and the design of its individual constructions are largely similar in both languages. The complete version of the interface contains descriptions of more than 125 procedures and functions. My task is to describe the idea behind the technology.

         The MPI interface supports the creation of parallel programs in the MIMD (Multiple Instruction Multiple Data) style, which implies combining processes that have different source codes. However, writing and debugging such programs is very difficult, so in practice programmers much more frequently use the SPMD (Single Program Multiple Data) model of parallel programming, within which the same code is used by all parallel processes, as the sketch below illustrates. At present, more and more implementations of MPI also support work with threads.
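
         A minimal sketch of the SPMD style in C, assuming a standard MPI implementation: every process runs the same source code, and behavior is differentiated by the process rank (the rank query MPI_Comm_rank and the communicator MPI_COMM_WORLD are explained below).

         #include <stdio.h>
         #include "mpi.h"

         int main(int argc, char *argv[])
         {
             int rank;
             MPI_Init(&argc, &argv);                /* begin the parallel part */
             MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's number */
             if (rank == 0)
                 printf("I am the master process\n");
             else
                 printf("I am worker process %d\n", rank);
             MPI_Finalize();                        /* end the parallel part */
             return 0;
         }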

         As MPI is a library, the corresponding library modules must be linked in when the program is compiled. This can be done on the command line, or one can take advantage of the commands or scripts that are usually provided, for example mpiCC for programs in the C++ language. The compiler option "-o name" allows the name of the resulting executable file to be set; by default the executable file is named a.out. For example:

         mpiCC -o program program.cpp

After the executable file has been obtained, it must be started on the required number of processors. For this purpose the MPI application launch command, usually called mpirun, is used, for example:

         mpirun -np N <program with arguments>,

where N is the number of processes, which must not exceed the number of processes allowed in the system for a single task. After the start, the same program is executed by all launched processes; the result of execution, depending on the system, is either printed to the terminal or written to a file with a predefined name. A hypothetical invocation is sketched below.
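
For instance, assuming the executable built above is named program and takes one hypothetical argument input.dat, the following command starts it on four processes:

         mpirun -np 4 program input.dat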

         All additional objects used in MPI (the names of procedures, constants, predefined data types, and so on) carry the prefix MPI_. If the user does not use names with this prefix in the program, conflicts with MPI objects will certainly not arise. In the C language, in addition, the case of characters in function names is significant. Usually the first letter after the MPI_ prefix in the name of an MPI function is written in upper case and the subsequent letters in lower case, while the names of MPI constants are written entirely in upper case. All MPI interface specifications are collected in the file mpif.h for Fortran (mpi.h for C); therefore the directive #include "mpi.h" must stand at the beginning of an MPI program. These conventions are illustrated in the sketch below.
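
         A minimal sketch of the naming conventions, assuming a standard MPI implementation (the user variable my_buffer_size is a hypothetical name chosen for illustration):

         #include <stdio.h>
         #include "mpi.h"                    /* all MPI interface specifications */

         int my_buffer_size = 64;            /* user name without the MPI_ prefix: no conflict */

         int main(int argc, char *argv[])
         {
             int flag;
             if (MPI_Init(&argc, &argv) == MPI_SUCCESS)  /* constant: entirely upper case */
                 printf("MPI started\n");
             MPI_Initialized(&flag);         /* function: first letter after MPI_ in upper case */
             MPI_Finalize();
             return 0;
         }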

         An MPI program is a set of parallel interacting processes. All processes are generated once, forming the parallel part of the program. During the execution of an MPI program, the generation of additional processes or the elimination of existing ones is forbidden (such a possibility appeared only in MPI-2.0). Every process works in its own address space; there are no shared variables or data in MPI. The basic method of cooperation between processes is the explicit sending of messages, sketched below.
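
         A minimal sketch of explicit message passing in C, assuming at least two processes have been started (the payload value 42 and the message tag 0 are arbitrary choices for illustration): process 0 sends one integer, and process 1 receives it.

         #include <stdio.h>
         #include "mpi.h"

         int main(int argc, char *argv[])
         {
             int rank, value;
             MPI_Status status;
             MPI_Init(&argc, &argv);
             MPI_Comm_rank(MPI_COMM_WORLD, &rank);
             if (rank == 0) {
                 value = 42;
                 MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);          /* to process 1, tag 0 */
             } else if (rank == 1) {
                 MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status); /* from process 0 */
                 printf("process 1 received %d\n", value);
             }
             MPI_Finalize();
             return 0;
         }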

         To localize interprocess communication, groups of processes can be created, each of which is provided with its own separate environment for communication, called a communicator. The composition of the formed groups is arbitrary: groups may fully coincide, be contained one in another, not intersect at all, or intersect partially. Processes can cooperate only within some communicator; messages sent in different communicators do not intersect and do not interfere with each other. In the C language communicators have the predefined type MPI_Comm. A sketch of creating such groups follows.
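
         A hedged sketch of forming groups with the standard routine MPI_Comm_split, which here divides MPI_COMM_WORLD into two communicators by the parity of the rank (the parity criterion is an arbitrary choice for illustration):

         #include <stdio.h>
         #include "mpi.h"

         int main(int argc, char *argv[])
         {
             int world_rank, sub_rank;
             MPI_Comm sub_comm;                  /* communicators have the type MPI_Comm */
             MPI_Init(&argc, &argv);
             MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
             /* processes with the same "color" (rank parity) land in the same group */
             MPI_Comm_split(MPI_COMM_WORLD, world_rank % 2, world_rank, &sub_comm);
             MPI_Comm_rank(sub_comm, &sub_rank); /* rank inside the new communicator */
             printf("world rank %d has rank %d in its subgroup\n", world_rank, sub_rank);
             MPI_Comm_free(&sub_comm);
             MPI_Finalize();
             return 0;
         }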

         At the start of the program it is always assumed that all spawned processes work within the framework of the all-embracing communicator with the predefined name MPI_COMM_WORLD. This communicator always exists and serves for communication among all started processes of the MPI program. Besides it, at the start of the program there is the communicator MPI_COMM_SELF, containing only the current process, as well as the communicator MPI_COMM_NULL, containing no processes at all. All interprocess communication takes place within the framework of a certain communicator; messages passed in different communicators do not interfere with each other in any way. A small sketch of querying the predefined communicators follows.
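
         A minimal sketch comparing the predefined communicators, assuming a standard MPI implementation: MPI_Comm_size reports the number of all started processes for MPI_COMM_WORLD, while for MPI_COMM_SELF it always reports one.

         #include <stdio.h>
         #include "mpi.h"

         int main(int argc, char *argv[])
         {
             int world_size, self_size;
             MPI_Init(&argc, &argv);
             MPI_Comm_size(MPI_COMM_WORLD, &world_size);  /* all started processes */
             MPI_Comm_size(MPI_COMM_SELF, &self_size);    /* always 1 */
             printf("MPI_COMM_WORLD: %d processes, MPI_COMM_SELF: %d\n",
                    world_size, self_size);
             MPI_Finalize();
             return 0;
         }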

         Such are the basic concepts and working principles of MPI technology, which, given a sufficient number of computers, allow practically any such task to be solved quickly.