IMPLEMENTATION OF PARALLEL COMPUTING
IN A MULTIPROCESSOR SYSTEM
Zhoranova N. Zh., Tazhibai A
Many tasks require a large number of computing
operations and consume considerable resources even on modern hardware.
Moreover, it can be said with confidence that no matter what speed computer
technology reaches, there will always be problems whose solution takes
considerable time. Many of these complex tasks require that the result be
obtained in the least possible time, or within a strictly limited time. Such
problems include, for example, weather forecasting and image processing and
recognition in control systems. On the other hand, reducing the execution time
of each individual operation in a microprocessor is itself a great technical
challenge.
The obvious way to increase computing speed
is to use not just one computing device but several working together to solve a
problem. This approach is known as parallel computing. Despite the seeming
simplicity of the idea, it is often a nontrivial task both for the design of
computer hardware and for the development of algorithms. The first problem is
that, for a task to be solved with the help of parallel computing, the
algorithm for its solution must allow parallelization; moreover, not every
problem can be solved by a parallel algorithm. The other, equally important
problem is the construction of a system on which the implementation of
parallel computing would be possible.
There are two basic approaches to parallel
computing in microprocessor systems, called single-threaded and multithreaded
parallelism. The difference lies in the use of one or of several execution
threads for parallel computing.
Single-threaded
parallelism is the parallel execution of operations within a single thread of
execution. The possibility of single-threaded parallelism is determined by the
architecture of the microprocessor, namely its ability to fetch from memory
and perform multiple operations at once.
Multithreaded
parallelism is the use of multiple threads to achieve parallel execution of
operations. To ensure multithreaded parallelism, a system with multiple
processors or processor cores must be created.
The principles
of organization of single-threaded and multithreaded parallelism, their
features, advantages and disadvantages, are very different and have little in
common, both in the implementation of the computer system and in the
construction of software.
Single-threaded parallelism
As
described above, single-threaded parallelism is implemented within a single
thread of execution. The possibility of single-threaded parallel execution
depends almost entirely on the microprocessor architecture, which determines
the microprocessor's ability to perform instructions in parallel.
Single-threaded
parallelism has its advantages and disadvantages.
Advantages:
·
No need for synchronization: all
operations are performed within a single thread, and therefore in a strictly
defined sequence.
·
No need for parallelism support at
the operating system level.
·
No need for shared-resource
management (arbitration).
Disadvantages:
·
Difficulty of using algorithms with conditional branches.
·
The need to adapt the program to use the resources of the
microprocessor effectively, for example, when switching from one model to another.
The
following methods exist to achieve parallel computing:
·
Packing of data for batch processing by a single instruction using
special techniques. For example, it is possible to add two pairs of 8-bit
values with one 16-bit operation, provided the possibility of overflow between
the halves is eliminated. The method has very limited application.
·
Superscalar architecture.
The microprocessor control unit independently analyzes the instruction stream
for opportunities for parallel execution and manages several functional units.
·
Vector processing. The microprocessor has instructions that perform the
same operation on a group of operands; multiple operands are packed into one
vector register. This method is similar to the first, but the support for
parallelism lies in the microprocessor architecture, and vector registers tend
to have a greater capacity. The program must be adapted to use vector
instructions, or an optimizing compiler must be used.
·
Microprocessors with explicit
parallelism. The method is similar to the second, but the program for such a
processor contains explicit directions on which operations should be carried
out in parallel. Parallelization of computations in this case is the sole
responsibility of the programmer or the optimizing compiler.
Therefore,
special technologies for creating multiprocessor systems have been developed.
They make it possible to process data in parallel and, therefore, faster.
Correspondingly, operating systems supporting multiprocessor technology were
created, such as Solaris (Sun Microsystems), other Unix-like OSes such as Irix
(SGI) and AIX (IBM), Red Hat Linux, and Windows XP. Consider the Solaris
operating system, version 2.4. Solaris 2.4 is a Unix-like system developed by
Sun Microsystems.
In
the Solaris 2.4 operating system there is the concept of a thread. A thread is
a sequence of instructions executed within the context of a process. This
operating system supports multithreaded processes. The word
"multithreaded" means that a process contains a set of threads of
control. A traditional UNIX process contains a single thread of control, while
a multithreaded process comprises a number of threads that run independently.
Because each thread executes program code independently, parallelization leads
to:
·
improved application responsiveness,
·
more efficient use of
multiprocessors,
·
an improved program structure,
·
lower consumption of system resources,
·
improved performance.
Definitions
of concurrent and parallel execution: concurrency occurs when at least two
threads are in progress in a process at one time; parallelism occurs when at
least two threads are executing simultaneously. In a multithreaded process on
a single processor, the processor can switch its resources between the
threads, resulting in concurrent execution. In a similar multithreaded process
on a shared-memory multiprocessor system, each thread in the process can run
on a separate processor at the same time, and the result is parallel
execution. When the process has as many threads as there are processors, or
fewer, the thread support system together with the operating system ensures
that each thread runs on its own processor.
Threads
are visible only inside the process, where they share the process resources
such as the address space, open files, and so on. Each thread has a unique ID,
a register state, a stack, a signal mask, and a priority. Because threads
share the process's instructions and most of its data, a change made to data
by one thread can be seen by the other threads of the process. When a thread
needs to interact with the other threads of its process, it can do so without
involving the operating system. Threads are the main programming interface in
multithreaded programming. User-level threads are managed in user space and
can therefore avoid kernel context switches. An application can have thousands
of threads and still not consume many kernel resources; how many kernel
resources the application consumes is largely determined by the application
itself.
By default, threads are "lightweight". For more control over a thread, the
application can bind it; when the application binds a thread to an execution
resource, the thread becomes a kernel resource. Functions for working with
threads, such as thr_create(), thr_self(), thr_join(), and so on, are provided
by the libthread library. The thr_create function creates a thread according
to the given parameters; the thr_join function waits for a thread of the
process to terminate; the thr_self function returns the identifier of the
calling thread.
Using a larger number of processors speeds
up the program but greatly complicates the programmer's work. Therefore,
multiprocessor systems are best used when working with large amounts of data.
According to the observation that was
later called "Moore's law", the number of transistors on a chip
doubles roughly every two years, and until about 2000-2001 the clock frequency
grew along with it. Since the early 2000s, however, production technology has
changed: the development of affordable server processors proceeded not by
raising the clock frequency but by adding processors. Now not only the number
of CPUs is increasing, but also the number of cores per CPU. This happened
because chip fabrication now uses nanometer-scale technology and has run into
physical limits, in particular the size of the atomic lattice.
Today,
only a small number of programs use parallel computing. This is due primarily
to the fact that developing applications that use parallel computations is
more complex and more time-consuming than developing with traditional
sequential computing.
Modern
DBMSs use multithreaded schemes of work, and at first glance it seems that
parallel computing is provided automatically: each user corresponds to a
connection that runs in a separate thread.
Users
can send their requests to the database fully in parallel. The problem is that
not all requests (including those that perform change operations) can be
processed in parallel. This is due to transaction logic and the problems
associated with "dirty" reads. In MS SQL, an inefficiently
configured locking mechanism does not allow all users to work with the system
as much in parallel as possible; full parallelism is possible only for systems
in which all requests are read-only. It is also necessary to single out
regular scheduled procedures with long execution times. In some cases they can
be parallelized well and thus accelerated, but for this they must be
appropriately programmed. Otherwise, such routine procedures will be executed
in one thread, use only a limited share of the server resources and,
accordingly, run much longer.
An
example can be given from simulation tasks, for instance the problem of
calculating the trajectories of interacting physical bodies. Such problems are
solved, as a rule, by computer modeling: it is assumed that if the time
interval is divided into small steps, new coordinates can be calculated
iteratively for each of the bodies. The smaller the time step, the higher the
accuracy of the calculations. Previously, on single-processor servers, there
was no point in parallelizing such a data-parallel problem: the new
coordinates were calculated for each body in a loop over the set of all
bodies. Another example is document posting in 1C: a pool of 1C sessions is
needed, together with a task coordinator that gives each session its own
document to post. Once a session has posted its document and is released, the
coordinator gives it a new one to post. What about ordering and exact
chronology? The coordinator is responsible for this with locks: if a document
is related at the data level to documents being posted at that moment, for
example through an item of the nomenclature or a client, it will wait for its
turn; if the document is not related, it will be posted in parallel. Each
individual processor posts its own document in parallel, thereby providing
acceleration. If the data are not closely related to each other, the
acceleration can be very significant, sometimes tens of times. Of course, this
depends on the server resources; on weak servers the acceleration will be
negligible.
Today,
many languages support multithreading and the possibility of parallel
computing. But, in my opinion, it is implemented inconveniently for
developers: the complexity of the design outweighs the advantages of the
system's scalability features.