Modern information technology
/ Computer engineering
Shvachych G.G.,
Tkach M.A., Shcherbyna P.A.
National Metallurgical Academy of Ukraine
On the problems of designing a high-performance integrated environment based on computing clusters
The computational use of computers has always remained the basic engine of progress in computer technology. It is therefore not surprising that the basic characteristic of a computer is its performance, the quantity showing how many arithmetic operations it can execute per unit of time. This parameter demonstrates most vividly the scale of the progress achieved in computer technology. For example, the performance of one of the very first computers, EDSAC, was only about 100 operations per second, whereas the peak performance of today's most powerful supercomputer, the Earth Simulator, is estimated at 40 trillion operations per second: an increase in speed of 400 billion times. It is hard to name another sphere of human activity where such rapid progress is observed.
It is obvious that the need to increase the performance of computing resources is beyond doubt. At present, the growth of performance of computing systems is provided in two ways:
- creation of VLSI circuits working at ultra-high frequencies with a high degree of pipelining;
- creation of multiprocessor computing systems.
The first way is connected with rather significant capital investment and does not guarantee the creation of a super-high-performance system. The experience of Cray, which created a supercomputer on the basis of gallium arsenide, showed that the development of an essentially new element base for high-performance computing systems is an overwhelming problem even for such an eminent corporation.
The second way began to dominate after the announcement in the USA of the so-called "Strategic Supercomputer Initiative". Over the last few years the largest computer companies (IBM, Intel, Compaq), within the framework of the ASCI program, have been engaged in the development of highly parallel computing systems based on commercially available processors (Alpha, Pentium, and so forth). Some firms (Hitachi, Fujitsu) continue the development of vector systems on VLSI of their own manufacture. This testifies that the performance of commercial processors is approaching the limits imposed by VLSI technology.
Considering the above, it may be noted that the cluster approach to the construction of high-performance systems has recently become widespread. The essential achievements obtained in the creation of multiprocessor computing systems provide new opportunities for the application of parallel computations to the solution of complex multivariate problems.
Nevertheless, it is necessary to note that, up to now, the application of parallel computing systems has not become as widespread as many researchers repeatedly predicted. Let us analyse the reasons for this.
At the first stage we shall consider the objective reasons. The first and perhaps principal cause restraining the mass spread of parallelism is that carrying out parallel computations requires a "parallel" generalization of the traditional sequential technology of solving problems on a computer. Thus, numerical methods for multiprocessor systems should be designed as systems of parallel processes cooperating among themselves, with the computation admitting execution on independent processors. The algorithmic languages and system software used should provide for the creation of parallel programs and organize the synchronization and mutual exclusion of asynchronous processes, etc.
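The requirement to organize synchronization and mutual exclusion can be illustrated with a minimal sketch (Python is used here only for brevity; the variable names are illustrative): several asynchronous threads increment a shared counter, and a lock guarantees that their updates do not interleave.

```python
import threading

counter = 0
lock = threading.Lock()  # provides mutual exclusion for the shared counter

def worker(increments):
    global counter
    for _ in range(increments):
        with lock:  # only one thread may update the counter at a time
            counter += 1

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # synchronization point: wait for all asynchronous workers

print(counter)  # 40000: no updates are lost thanks to the lock
```

Without the lock, the four threads could interleave their read-modify-write sequences and lose updates; coordinating exactly this kind of access is one of the tasks the system software of a multiprocessor system must support.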
The second reason consists in the primary orientation of the thinking of modern programmers toward a strict sequence of computations. Not even the recognized reality of the wide application of parallelism in the execution of micro-instructions by modern microprocessors counteracts this. This factor is explained by the corresponding preparation of specialists: many higher schools offer, on the one hand, an only theoretical (if not scholastic) slant in their coursework on parallel data processing and, on the other, a reduced volume of academic coursework on the technologies of parallel programming.
The third reason is that automatic parallelization procedures are at present insufficiently effective. Indeed, there are fundamental reasons whose consequence is the practical impossibility of high-quality automatic parallelization of sequential programs written in conventional programming languages.
Besides, it may be noted that a number of researchers have expressed doubts about the possibility of wide practical application of parallel computations by virtue of the following financial and technical problems.
1. High cost of parallel systems. This thesis is supported by Grosch's law, confirmed in practice: the performance of a computer grows in proportion to the square of its cost. As a result, it is much more advantageous to obtain the required computing capacity by purchasing one high-performance processor than by using several slower processors.
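The economic argument can be made concrete with a small sketch. Assuming, per Grosch's law, performance = k * cost^2 (k is an arbitrary constant introduced here purely for illustration), one processor bought for the whole budget beats several cheaper processors sharing the same budget:

```python
def perf(cost, k=1.0):
    # Grosch's law: performance grows as the square of cost
    return k * cost ** 2

budget = 100.0
one_fast = perf(budget)            # one processor consuming the whole budget
four_slow = 4 * perf(budget / 4)   # four processors sharing the same budget
print(one_fast / four_slow)        # -> 4.0: the single processor is 4x faster
```

In general, under this law, n processors of equal total cost deliver only 1/n of the single processor's performance, which is exactly the objection raised against clusters of cheap nodes.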
Here it is necessary to take into account that the growth in speed of sequential computers cannot proceed indefinitely; besides, computers are subject to fast obsolescence, and the performance of the most powerful processor at any given moment is quickly overtaken by new models of computers (while upgrading an obsolete computer demands, as a rule, large financial expenditure). On the other hand, we note that essential interest in the construction of multiprocessor parallel computing systems (clusters) on the basis of standard, popular technologies and components has recently emerged. This is caused by a number of factors; we note the basic ones. First, the growth, in response to the needs of the market, of the performance of such standard network technologies as Ethernet has made it possible to consider them as the communication environment for multiprocessor computing systems. Second, an important factor was the increase in popularity of the freely distributed operating system Linux. This operating system was originally positioned as a UNIX variant for platforms based on the Intel architecture, but versions for other popular microprocessors, including the Alpha microprocessors, the performance leaders of the last few years, appeared quickly enough.
In view of the economic realities of our country, the use of systems constructed on the basis of standard technologies becomes all the more relevant. Depending on the problems and the budget of the project, quite varied configurations are possible. In the most affordable configuration, standard motherboards for Intel Pentium III processors and Fast Ethernet network adapters are used. The nodes of the cluster are united among themselves by means of a Fast Ethernet switch with the corresponding number of ports. The quantity of nodes and their configuration depend on the computing requirements of the specific problems and on the available financial resources.
2. Losses of performance in the organization of parallelism, according to Minsky's hypothesis. The question here is the claim that the speedup achievable with a parallel system is proportional to the binary logarithm of the number of processors (i.e., with 1000 processors the achievable speedup turns out to be about 10).
Certainly, a similar estimate of speedup may take place in the parallelization of certain algorithms. At the same time, there is a great number of problems whose parallel solution achieves 100% utilization of all available processors of the parallel computing system.
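Minsky's pessimistic estimate from point 2 is easy to evaluate directly (a Python one-liner, included here for illustration):

```python
import math

processors = 1000
minsky_speedup = math.log2(processors)  # speedup ~ binary logarithm of processor count
print(round(minsky_speedup, 2))  # -> 9.97, i.e. roughly the 10-fold figure cited above
```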
3. Constant improvement of sequential computers. In conformity with the widely known Moore's law, the capacity of sequential processors roughly doubles every 18-24 months (a historical digression shows that the speed of computers increased by an order of magnitude every 5 years), and as a result the necessary performance may also be achieved on "usual" sequential computers.
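The two growth figures quoted in point 3 are consistent with each other, as a quick compound-growth check shows (doubling every 18 months over 5 years; a worked calculation, not a claim from the source):

```python
# doubling every 18 months, compounded over 60 months (5 years)
growth = 2 ** (60 / 18)
print(round(growth, 1))  # -> 10.1: about one order of magnitude per 5 years
```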
This remark is close in essence to the first of the objections to parallelism given here. In addition to the answer already available to that remark, it may be noted that similar development is also characteristic of parallel systems. Besides, the application of parallelism allows the desired speedup of computations to be obtained without any waiting for new, faster processors.
4. Existence of sequential computations, according to Amdahl's law. The speedup of computation when using p processors is limited by the quantity

S(p) <= 1 / (f + (1 - f) / p),

where f is the share of sequential computations in the data-processing algorithm used (i.e., for example, in the presence of only 10% sequential commands in the computations being carried out, the effect of using parallelism cannot exceed a 10-fold speedup of data processing).
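Amdahl's bound is easily tabulated; the following sketch (the function name is ours) confirms the 10-fold ceiling for f = 0.1:

```python
def amdahl_speedup(f, p):
    """Upper bound on speedup with sequential share f on p processors."""
    return 1.0 / (f + (1.0 - f) / p)

print(round(amdahl_speedup(0.10, 10), 2))    # -> 5.26
print(round(amdahl_speedup(0.10, 1000), 2))  # -> 9.91
# as p grows without bound, the speedup approaches, but never exceeds, 1/f = 10
```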
This remark characterizes one of the most serious problems in the field of parallel programming (algorithms without a certain share of sequential commands practically do not exist). However, the share of sequential actions frequently characterizes not the possibility of a parallel solution of the problem, but the sequential properties of the algorithms used. As a result, the share of sequential computations can be essentially reduced by the choice of algorithms more suitable for parallelization.
5. Dependence of the efficiency of parallelism on accounting for the characteristic properties of parallel systems. In contrast to the uniqueness of the classical von Neumann scheme underlying sequential computers, parallel systems are distinguished by an essential variety of architectural principles of construction. The maximal effect from the use of parallelism can be obtained only with full use of all features of the equipment; as a result, porting parallel algorithms and programs between different types of systems becomes difficult (if it is possible at all).
In answer to the given remark it is necessary to note that the "uniformity" of sequential computers is also only apparent, and their effective utilization likewise demands accounting for the properties of the equipment. On the other hand, for all the variety of architectures of parallel systems, there nevertheless exist certain "settled" ways of supporting parallelism (pipelined computations, multiprocessor systems, etc.). Besides, invariance of the created software can be provided by means of standard software supporting parallel computations (such as the MPI and PVM program libraries).
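The "settled" split-compute-combine pattern behind libraries such as MPI can be sketched with Python's standard multiprocessing module (used here instead of MPI so the example stays self-contained; all names are illustrative). Each worker sums its own slice of a range, and the partial results are combined, mirroring MPI's scatter/reduce pattern:

```python
from multiprocessing import Pool

def partial_sum(bounds):
    # each worker processes its own independent slice
    lo, hi = bounds
    return sum(range(lo, hi))

def parallel_sum(n, workers=4):
    step = n // workers
    # split [0, n) into one contiguous chunk per worker
    chunks = [(i * step, (i + 1) * step if i < workers - 1 else n)
              for i in range(workers)]
    with Pool(workers) as pool:
        return sum(pool.map(partial_sum, chunks))  # combine partial results

if __name__ == "__main__":
    print(parallel_sum(1_000_000) == sum(range(1_000_000)))  # -> True
```

The same decomposition, expressed through MPI calls instead of a process pool, would run unchanged on a cluster of nodes, which is precisely the invariance argument made above.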
6. The existing software is focused basically on sequential computers. The given objection holds that for a great number of problems ready software already exists, and all these programs are focused mainly on sequential computers; as a result, reworking such a quantity of programs for parallel systems hardly appears possible.
The answer to the given objection is rather simple: if the existing programs provide the solution of the tasks at hand, then, certainly, reworking these programs is not necessary. However, if sequential programs do not allow the solution of problems to be obtained in acceptable time, or if there is a necessity of solving new problems, then the development of new software becomes necessary, and these programs can be realized for parallel execution.
Summing up all the listed remarks, it is possible to conclude that parallel computations are a promising (and attractive) field of application of computer facilities and represent a complex scientific and technical problem. Thus the problems arising in the development of parallel computing systems adequate to the set characteristics are, as a rule, of paramount importance and demand deep study and research. Indeed, distributed (parallel) computer modelling covers the whole spectrum of modern computer facilities: supercomputers, cluster computing systems, local and global networks, etc. Besides, distributed modelling allows the solution of problems demanding a great deal of processor time and the integration of mathematical models processed on various (including geographically remote) computing systems. Thus knowledge of modern trends in computer development, of the hardware for the achievement of parallelism, and the skill to develop models, methods, and programs for the parallel solution of data-processing problems must be counted among the important qualifying characteristics of the modern specialist in applied mathematics, computer science, and computer engineering.