Prokopalo Y., Kaliberda N.V.
DNU, the Faculty of Applied Mathematics
The basic concepts and working principles of MPI technology
Many of those who are seriously engaged in programming have certainly run into tasks whose solution by known algorithms takes an enormous amount of time. Even the most modern and powerful computers need weeks, months, or even years to carry out the enormous number of calculations required to solve such tasks.
In these cases so-called concurrent programming comes to the programmer's aid. The methods of concurrent programming allow us to distribute the necessary calculations among processes on different computers, thereby uniting their computing power to work on one problem. MPI is now the most widespread programming technology for parallel computers with distributed memory. The basic method of interprocess communication in such systems is the passing of messages between processes, which is reflected in the name of the technology: Message Passing Interface. The MPI standard fixes an interface that must be observed both by the programming system on every computing platform and by the user when creating programs. Most modern implementations correspond to version 1.1 of the MPI standard. The MPI-2.0 standard, which considerably extends the functionality of the previous version, appeared in 1997-1998. However, this variant of MPI has not yet gained wide distribution and is not fully implemented in any system.
MPI supports work with the Fortran and C languages. However, knowledge of one or the other language is not fundamental, since the basic ideas of MPI and the design rules of its constructions are largely similar in both languages. The complete version of the interface contains descriptions of more than 125 procedures and functions. My task is to describe the idea of the technology.
The MPI interface supports the creation of parallel programs in the MIMD (Multiple Instruction Multiple Data) style, which implies combining processes with different source code. However, writing and debugging such programs is very difficult, so in practice programmers much more frequently use the SPMD (Single Program Multiple Data) model of concurrent programming, in which the same code is used for all parallel processes. At present, more and more implementations of MPI also support work with threads.
Since MPI is a library, the corresponding library modules must be linked when the program is compiled. This can be done on the command line, or by using the compilation commands or scripts usually provided, for example mpiCC for programs in the C++ language. The compiler option "-o name" sets the name of the resulting executable file; by default the executable file is named a.out. For example:
mpiCC -o program program.cpp
After the executable file has been obtained, it must be started on the required number of processors. For this purpose the command for starting MPI applications, usually mpirun, is used, for example:
mpirun -np N <program with arguments>
where N is the number of processes, which must not exceed the number of processes permitted in the system for one task. After the start, the same program will be executed by all started processes; the result of execution, depending on the system, will be printed to the terminal or written to a file with a predefined name.
All auxiliary objects used in MPI, such as the names of procedures, constants, predefined data types and so on, have the prefix MPI_. If a user does not use names with this prefix in the program, there will certainly be no conflicts with MPI objects. In the C language, in addition, the case of letters is significant in function names. Usually, in the names of MPI functions the first letter after the prefix MPI_ is upper case and the subsequent letters are lower case, while the names of MPI constants are written entirely in upper case. All MPI interface specifications are collected in the file mpi.h (mpif.h for Fortran), so an MPI program must begin with the directive #include "mpi.h".
An MPI program is a set of parallel, interacting processes. All processes are spawned once, forming the parallel part of the program. During execution of an MPI program, the creation of additional processes or the termination of existing ones is forbidden (in MPI-2.0 such a possibility has appeared). Every process works in its own address space; there are no shared variables or data in MPI. The basic method of cooperation between processes is the explicit sending of messages.
To localize interprocess communication, it is possible to create groups of processes and provide them with a separate environment for communication, a communicator. The composition of the formed groups is arbitrary: groups can fully coincide, be contained one in another, not intersect, or intersect partially. Processes can cooperate only inside some communicator; messages sent in different communicators do not intersect and do not interfere with each other. In the C language communicators have the predefined type MPI_Comm.
At the start of the program it is always assumed that all spawned processes work within an all-embracing communicator with the predefined name MPI_COMM_WORLD. This communicator always exists and serves for communication among all started processes of the MPI program. Besides it, at the start of the program there is the communicator MPI_COMM_SELF, containing only the current process, and the communicator MPI_COMM_NULL, containing no processes at all. All interprocess communications take place within some certain communicator.
Such are the basic concepts and working principles of MPI technology, which allow you to solve quickly many a task, given a sufficient number of computers.