Modern information technology / Computer engineering

Shvachych G.G., Tkach M.A., Shcherbyna P.A.

National Metallurgical Academy of Ukraine

On the problems of designing a high-performance integrated environment based on computing clusters

The computational use of computers has always been the main driver of progress in computer technology. It is therefore not surprising that the basic characteristic of a computer is its performance, that is, the number of arithmetic operations it can execute per unit of time. This parameter demonstrates most clearly the scale of the progress achieved in computer technology. For example, the performance of one of the very first computers, EDSAC, was only about 100 operations per second, whereas the peak performance of the most powerful supercomputer of today, the Earth Simulator, is estimated at 40 trillion operations per second, an increase in speed by a factor of 400 billion. It is hard to name any other sphere of human activity in which such rapid progress has been observed.

It is obvious that the need to increase the performance of computing resources is beyond doubt. At present, the performance of computing systems is increased in two ways:

      - by building large-scale integrated (LSI) circuits operating at ultra-high clock frequencies with a high degree of pipelining;

      - by building multiprocessor computing systems.

The first way requires quite significant capital investment and does not guarantee the creation of an ultra-high-performance system. The experience of Cray, which built a supercomputer based on gallium arsenide, showed that developing a fundamentally new component base for high-performance computing systems is an overwhelming task even for such an eminent corporation.

The second way began to dominate after the announcement in the USA of the so-called «strategic supercomputer initiative». Over the last few years the largest computer companies (IBM, Intel, Compaq) have been developing, within the framework of the ASCI program, highly parallel computing systems based on commercially available processors (Alpha, Pentium and so forth). Some firms (Hitachi, Fujitsu) continue to develop vector systems based on LSI circuits of their own manufacture. This indicates that the performance of commercial processors is approaching the limits imposed by LSI technology.

Considering the above, it can be noted that the cluster approach to building high-performance systems has recently become widespread. The significant achievements in creating multiprocessor computing systems open new opportunities for applying parallel computations to the solution of complex multivariate problems.

Nevertheless, it should be noted that so far the application of parallel computing systems has not become as widespread as many researchers have repeatedly predicted. Let us analyse the reasons for this.

Let us first consider the objective reasons. The first and perhaps the main reason restraining the mass spread of parallelism is that parallel computations require a "parallel" generalization of the traditional, sequential, technology of solving problems on a computer. Numerical methods for multiprocessor systems must therefore be designed as systems of parallel processes that cooperate with each other and allow the computation to proceed on independent processors. The algorithmic languages and system software used must support the creation of parallel programs and organize the synchronization and mutual exclusion of asynchronous processes, etc.
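The following minimal sketch (our own illustration in Python, not part of the work described in the paper) shows what such a restructuring means in practice: several worker processes compute independent parts of a sum, a lock provides mutual exclusion on the shared accumulator, and joining the processes provides the synchronization point.

```python
# A minimal sketch: cooperating parallel processes with explicit
# synchronization and mutual exclusion. Worker processes sum disjoint slices
# of a list; a Lock protects the shared accumulator; join() synchronizes.
from multiprocessing import Process, Lock, Value

def partial_sum(data, lo, hi, acc, lock):
    s = sum(data[lo:hi])              # independent work for this process
    with lock:                        # mutual exclusion on the shared value
        acc.value += s

if __name__ == "__main__":
    data = list(range(1_000_000))
    acc, lock = Value("d", 0.0), Lock()
    n_proc = 4
    step = len(data) // n_proc
    procs = [Process(target=partial_sum,
                     args=(data, i * step,
                           len(data) if i == n_proc - 1 else (i + 1) * step,
                           acc, lock))
             for i in range(n_proc)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()                      # wait for all asynchronous processes
    print("parallel sum =", acc.value)
```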

The second reason lies in the fact that the thinking of modern programmers is oriented primarily towards a strictly sequential order of computation. Even the well-known fact that modern microprocessors make wide use of parallelism when executing micro-instructions does not change this. This factor is explained by the way specialists are trained: in many higher education institutions, courses on parallel data processing have a purely theoretical (if not scholastic) bias on the one hand, while the overall volume of academic instruction in parallel programming technologies remains small on the other.

The third reason is that automatic parallelization procedures are still insufficiently effective. Moreover, there are fundamental reasons that make high-quality automatic parallelization of a sequential program written in a conventional programming language practically impossible, as the sketch below illustrates.
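As a hedged illustration (our own example, not from the paper), the first loop below has independent iterations and could be distributed over processors mechanically, while the second carries a dependence between iterations that no automatic tool can remove without redesigning the algorithm itself.

```python
# Our own illustration of why automatic parallelization of ordinary
# sequential code is hard.
def independent(a, b):
    return [a[i] + b[i] for i in range(len(a))]       # iterations independent

def recurrence(a):
    x = [0.0] * len(a)
    for i in range(1, len(a)):
        x[i] = 0.5 * x[i - 1] + a[i]                  # loop-carried dependence
    return x
```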

In addition, a number of researchers have expressed doubts about the possibility of wide practical application of parallel computations because of the following financial and technical problems.

1. The high cost of parallel systems. This thesis is supported by Grosch's law, confirmed in practice: the performance of a computer grows in proportion to the square of its cost. As a result, it is much more advantageous to obtain the required computing capacity by purchasing one powerful processor than by using several slower ones.
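A small sketch of this arithmetic (our own illustration, with an arbitrary proportionality constant) shows the consequence of the law as stated:

```python
# Grosch's law as stated above: performance ~ cost**2
# (k is an arbitrary proportionality constant, our assumption).
def grosch_performance(cost, k=1.0):
    return k * cost ** 2

budget = 8.0
one_fast = grosch_performance(budget)               # one expensive processor
four_slow = 4 * grosch_performance(budget / 4)      # four cheaper ones, same budget
print(one_fast, four_slow)                          # 64.0 vs 16.0
```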

Here it should be taken into account that the growth of speed of a sequential computer cannot continue indefinitely; moreover, computers become obsolete quickly, and the performance of the most powerful processor of the moment is soon overtaken by newer models (while upgrading an obsolete computer, as a rule, requires large financial expenditure). On the other hand, significant interest in building multiprocessor parallel computing systems (clusters) on the basis of standard, widely available technologies and components has emerged recently. This is caused by a number of factors, the main ones being the following. First, the performance of such standard network technologies as Ethernet has grown, following the needs of the market, to the point where they can be considered a communication environment for multiprocessor computing systems. Second, an important factor has been the growing popularity of the freely distributed operating system Linux. This operating system was originally positioned as a UNIX variant for platforms based on the Intel architecture, but versions for other popular microprocessors quickly appeared, including the Alpha microprocessors, the performance leaders of recent years.

Given the economic realities of our country, the use of systems built on standard technologies becomes all the more relevant. Depending on the tasks and the budget of the project, quite varied configurations are possible. In the most affordable configuration, standard motherboards for Intel Pentium III processors and Fast Ethernet network adapters are used. The cluster nodes are interconnected by a Fast Ethernet switch with the appropriate number of ports. The number of nodes and their configuration depend on the computing resources required by the specific tasks and on the available financial means.

2. Performance losses when organizing parallelism, according to Minsky's hypothesis. The point here is that the speedup achievable with a parallel system is proportional to the binary logarithm of the number of processors (i.e. with 1000 processors the possible speedup turns out to be only about 10).
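To make the scale of this claim concrete, the following sketch (our own illustration) compares Minsky's logarithmic estimate with the ideal linear speedup:

```python
# Minsky's hypothesis as stated above: speedup ~ log2(p),
# compared with the ideal linear speedup of p.
import math

for p in (2, 8, 64, 1000):
    print(f"{p} processors: Minsky ~{math.log2(p):.1f}x, ideal {p}x")
```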

Of course, such an estimate of the speedup may indeed hold when certain algorithms are parallelized. At the same time, there is a large number of problems whose parallel solution achieves 100% utilization of all the processors available in the parallel computing system.

3. Constant improvement of sequential computers. According to the widely known Moore's law, the power of sequential processors roughly doubles every 18-24 months (a historical digression shows that computer speed has increased by an order of magnitude every 5 years), so the required performance can eventually be reached even on "ordinary" sequential computers.
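A simple arithmetic check (our own, under the figures quoted above) shows how these two formulations of the law relate:

```python
# Doubling every 18 months over 5 years gives about a tenfold increase,
# while doubling every 24 months gives only about 5.7x.
months = 5 * 12
for period_months in (18, 24):
    print(period_months, round(2 ** (months / period_months), 1))  # 10.1, 5.7
```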

This remark is essentially close to the first of the objections against parallelism considered here. In addition to the answer already given, it can be noted that parallel systems develop in the same way. Moreover, the use of parallelism makes it possible to obtain the desired speedup of computations without waiting for new, faster processors.

4. The existence of sequential computations, according to Amdahl's law. The speedup of a computation when p processors are used is limited by the value

S_p ≤ 1 / (f + (1 - f) / p),

where f is the share of sequential computations in the data-processing algorithm used (so that, for example, if only 10% of the executed computations are sequential commands, the effect of using parallelism cannot exceed a 10-fold speedup of the data processing).

This remark characterizes one of the most serious problems in the field of parallel programming (algorithms without some share of sequential commands practically do not exist). However, the share of sequential operations often characterizes not the possibility of solving the problem in parallel, but the sequential properties of the particular algorithm used. As a result, the share of sequential computations can be substantially reduced by choosing algorithms better suited to parallelization.
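The bound can be checked directly; the following sketch (our own illustration of the formula above) evaluates it for a 10% sequential share:

```python
# Amdahl's bound as written above: speedup <= 1 / (f + (1 - f) / p),
# where f is the sequential share of the algorithm.
def amdahl_speedup(f, p):
    return 1.0 / (f + (1.0 - f) / p)

for p in (10, 100, 1000, 10**6):
    print(p, round(amdahl_speedup(0.1, p), 2))   # approaches 10x for f = 10%
```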

5. The dependence of the efficiency of parallelism on taking the specific properties of parallel systems into account. Unlike sequential computers, which follow the single classical von Neumann scheme, parallel systems exhibit a considerable variety of architectural principles. The maximum effect from parallelism can be obtained only by fully exploiting all the features of the hardware; as a result, porting parallel algorithms and programs between different types of systems becomes difficult (if it is possible at all).

In answer to this remark it should be noted that the "uniformity" of sequential computers is also only apparent, and their efficient use likewise requires taking the properties of the hardware into account. On the other hand, for all the variety of parallel architectures there nevertheless exist certain well-established ways of providing parallelism (pipelined computations, multiprocessor systems, etc.). Moreover, the portability of the software being created can be ensured by using standard tools for supporting parallel computations (such as the program libraries MPI, PVM, etc.).
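As a hedged illustration of such portability (assuming the mpi4py Python binding to MPI is available; the paper itself names only the MPI library), the sketch below distributes a summation over the processes of an MPI job. The same source runs unchanged on a cluster or on a multiprocessor workstation, typically launched with a command such as: mpirun -n 4 python sum.py

```python
# Each process sums its own slice of the range and MPI combines the
# partial results on process 0.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

n = 1_000_000
lo = rank * n // size
hi = (rank + 1) * n // size
partial = float(sum(range(lo, hi)))                 # this process's share

total = comm.reduce(partial, op=MPI.SUM, root=0)    # combine on the root
if rank == 0:
    print("sum =", total)
```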

6. The existing software is oriented mainly towards sequential computers. This objection states that ready-made software already exists for a great many problems, and all these programs are oriented mainly towards sequential computers; as a result, reworking such a quantity of programs for parallel systems hardly seems feasible.

The answer to this objection is rather simple: if the existing programs solve the problems at hand, then of course there is no need to rework them. However, if sequential programs do not allow the problems to be solved in acceptable time, or if new problems must be solved, then new software has to be developed anyway, and these programs can be implemented for parallel execution.

Summing up all the remarks listed above, it can be concluded that parallel computations form a promising (and attractive) field of application of computer technology and represent a complex scientific and technical problem. The problems that arise in developing parallel computing systems with the required characteristics are, as a rule, of primary importance and demand deep study and research. Indeed, distributed (parallel) computer modelling covers the whole spectrum of modern computing facilities: supercomputers, cluster computing systems, local and global networks, etc. Moreover, distributed modelling makes it possible to solve problems that demand large amounts of processor time and to integrate mathematical models processed on various (including geographically remote) computing systems. Thus, knowledge of modern trends in computer development, of the hardware means of achieving parallelism, and the ability to develop models, methods and programs for the parallel solution of data-processing problems must be counted among the important qualifications of the modern specialist in applied mathematics, computer science and computer engineering.