O.G. Berestneva, D.V. Devjatyh, V.V. Parubets

Tomsk Polytechnic University, Russia

USING NVIDIA CUDA TECHNOLOGY FOR NEURAL NETWORK OPTIMIZATION

E-mail: ogb6@yandex.ru

Rapidly developing technologies inevitably seek new areas of life in which they can be applied. The demands of modern medicine lead it to collaborate with many other sciences. Mathematics and computer science, in particular, offer data analysis methods that are well suited to diagnostic problems. Generally speaking, from the point of view of computer technology, all problems that people solve can be divided into two groups:

1.       Tasks with a well-defined set of conditions, for which a specific algorithm yields a clear, precise answer.

2.       Tasks in which not all circumstances affecting the answer can be taken into account. In such tasks only the most influential data are used as approximate input, and because of this generalized approach the answer is fuzzy.

Tasks in the first group can be solved with classical methods of data analysis. No matter how complex the algorithm is, a fixed set of input parameters still allows a precise answer to be obtained.

Medicine, however, faces problems of the second group, for example identifying the nature and cause of a disease or assisting doctors in interpreting medical images. These needs led to the emergence of Computer-Aided Diagnosis (CAD). CAD uses various algorithms, in particular Artificial Neural Networks (ANN), which are commonly used for classification and clustering. A known limitation of neural network algorithms is their high computational cost [3]. The traditional way to address this problem is to organize parallel and distributed computing in hardware.

A distinguishing feature of hardware that supports CUDA (Compute Unified Device Architecture) is its unified architecture, which provides memory bandwidth roughly an order of magnitude higher than that of a conventional CPU [4].

Starting with the GeForce 8 series, NVIDIA graphics accelerators implement the CUDA parallel computing architecture, which provides a specialized programming interface for non-graphics computations [5].

A CUDA-capable GPU can be regarded as a set of multi-core processors. The basic computational units of the GPU are multiprocessors, each consisting of eight cores, several thousand 32-bit registers, 16 KB of shared memory, and texture and constant caches [6].
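To give a feel for how tight the 16 KB figure is, the following sketch works out how many 32-bit values fit in that memory and how large a square data tile can be. The 4-byte element size and the idea of keeping two square tiles resident at once (as in tiled matrix multiplication) are assumptions made for illustration, not details from the text.

```python
SHARED_MEMORY_BYTES = 16 * 1024  # per multiprocessor, as stated in the text
FLOAT_BYTES = 4                  # assumption: 32-bit single-precision values


def floats_in_shared_memory(shared_bytes=SHARED_MEMORY_BYTES):
    """Number of 32-bit values that fit in the given amount of memory."""
    return shared_bytes // FLOAT_BYTES


def largest_square_tile(num_tiles=2, shared_bytes=SHARED_MEMORY_BYTES):
    """Largest side n such that num_tiles tiles of n x n floats fit in memory."""
    n = 0
    while num_tiles * (n + 1) ** 2 * FLOAT_BYTES <= shared_bytes:
        n += 1
    return n


if __name__ == "__main__":
    print(floats_in_shared_memory())  # 4096 values
    print(largest_square_tile(1))     # one 64 x 64 tile fits exactly
    print(largest_square_tile(2))     # two tiles: side 45 at most
```

In practice a tile side would usually be rounded down to a power of two (for example 32), which leaves room for other per-block data.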

We isolate a stream-processing kernel that runs on the graphics card and restructure the computation so that data blocks of the required size can be processed and the processing elements are loaded heavily enough to hide the latency of accessing the card's global memory. This also speeds up processing, because the computations are carried out as vector operations on the graphics card.
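The idea of isolating a kernel can be illustrated with a small sketch in plain Python. It assumes that the work is split so that each thread computes the weighted sum for one neuron; this decomposition, and the names `neuron_kernel` and `launch`, are hypothetical and chosen only for illustration. The loop in `launch` runs sequentially here, whereas on a GPU the same per-thread function would execute for many thread indices at once.

```python
def neuron_kernel(thread_id, weights, inputs, outputs):
    """Work done by one hypothetical GPU thread: one neuron's weighted sum."""
    acc = 0.0
    for w, x in zip(weights[thread_id], inputs):
        acc += w * x
    # Each thread writes only its own output element, so threads do not conflict.
    outputs[thread_id] = acc


def launch(kernel, num_threads, *args):
    """Sequential stand-in for a kernel launch (parallel on real hardware)."""
    for thread_id in range(num_threads):
        kernel(thread_id, *args)


def layer_weighted_sums(weights, inputs):
    """Weighted sums of a whole layer, one thread per neuron (matrix row)."""
    outputs = [0.0] * len(weights)
    launch(neuron_kernel, len(weights), weights, inputs, outputs)
    return outputs


if __name__ == "__main__":
    print(layer_weighted_sums([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0]))
```

Because every thread touches a separate output element, the iterations are independent, which is the property that lets a GPU schedule many of them concurrently while some wait on memory.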

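Thus, the network is trained as follows:

1.     the signal is fed to the hidden layer

[formula not available]

2.     the output signal is computed

[formula not available]

3.     the elements of the matrix [symbol not available] are computed at the new step of network tuning

[formula not available]

4.     an auxiliary matrix [symbol not available] is computed

[formula not available]

5.     the matrix [symbol not available] is computed for the new learning step

[formula not available]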
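Since the formulas for these steps are not shown in the text, the following is only a generic sketch of one training step for a perceptron with a single hidden layer. It assumes sigmoid activations, a squared-error loss, and plain gradient descent; the update rule, the learning rate `lr`, and all function names are assumptions for illustration and are not the authors' algorithm, which also involves an auxiliary matrix not modeled here.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Feed the signal to the hidden layer, then compute the output signal."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_out]
    return hidden, output


def squared_error(x, target, w_hidden, w_out):
    _, output = forward(x, w_hidden, w_out)
    return 0.5 * sum((o - t) ** 2 for o, t in zip(output, target))


def train_step(x, target, w_hidden, w_out, lr=0.5):
    """Return new weight matrices after one gradient-descent step (assumed rule)."""
    hidden, output = forward(x, w_hidden, w_out)
    # Error terms of the output layer (squared error, sigmoid units).
    delta_out = [(o - t) * o * (1.0 - o) for o, t in zip(output, target)]
    # Error terms of the hidden layer, propagated through the old output weights.
    delta_hidden = [
        h * (1.0 - h) * sum(delta_out[k] * w_out[k][j] for k in range(len(w_out)))
        for j, h in enumerate(hidden)
    ]
    new_w_out = [
        [w - lr * delta_out[k] * hidden[j] for j, w in enumerate(row)]
        for k, row in enumerate(w_out)
    ]
    new_w_hidden = [
        [w - lr * delta_hidden[j] * x[i] for i, w in enumerate(row)]
        for j, row in enumerate(w_hidden)
    ]
    return new_w_hidden, new_w_out


if __name__ == "__main__":
    x, target = [1.0, 0.5], [1.0]
    w_hidden = [[0.1, -0.2], [0.3, 0.4]]
    w_out = [[0.2, -0.1]]
    before = squared_error(x, target, w_hidden, w_out)
    w_hidden, w_out = train_step(x, target, w_hidden, w_out)
    after = squared_error(x, target, w_hidden, w_out)
    print(before, after)  # the error should decrease after the step
```

Each list comprehension over neurons or weights consists of independent element-wise computations, which is what makes this kind of training step a candidate for execution in parallel threads.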

Treating the diagnosis problem as a classification problem, we created two artificial neural networks.

The first neural network was created with the NeuroPro package and was a multilayer perceptron with the following parameters:

The second neural network had the same characteristics, but it was implemented with optimized learning algorithms, and the network computations were executed in multithreaded form using CUDA.

 

References

1.     Galushkin A.I. On a methodology for solving problems in the neural network logical basis // Neirokomp'yuter. 2006. No. 2. pp. 49-70. (In Russian)

2.     Myznikov A.V., Rossiev D.A., Lokhman V.F. A neural network expert system for optimizing the treatment of thromboangiitis obliterans and predicting its immediate outcomes // Angiologiya i sosudistaya khirurgiya. 1995. No. 2. p. 100. (In Russian)

3.     Gorban A.N., Rossiev D.A. Neural Networks on a Personal Computer. Novosibirsk: Nauka, 1996. 276 p. (In Russian)

4.     Harris M. Mapping Computational Concepts to GPUs // GPU Gems 2. Addison-Wesley, 2006. pp. 493-508.

5.     Hall J.D., Carr N.A., Hart J.C. Cache and Bandwidth Aware Matrix Multiplication on the GPU. Technical Report UIUCDCS-R-2003-2328. URL: http://graphics.cs.uiuc.edu/~jch/papers/UIUCDCS-R-2003-2328.pdf

6.     Horn D. Stream Reduction Operations for GPGPU Applications // GPU Gems 2. Addison-Wesley, 2006. pp. 573-589.

7.     Haykin S. Neural Networks: A Comprehensive Foundation. 2nd ed., revised. Translated from English. Moscow: Williams Publishing House, 2006. 1104 p. (In Russian)

8.     Lazarev I.A., Rusakov V.I., Tkacheva T.N. Immunopathological changes in patients with gastric and duodenal ulcer disease and their immunocorrection // 4th All-Union Congress of Gastroenterologists: Abstracts. Moscow, Leningrad, 1990. Vol. 1. pp. 702-703. (In Russian)

9.     Gorban A.N., Dunin-Barkovsky V.L., Kardin A.N. Neuroinformatics. Ed. Novikov E.A. Russian Academy of Sciences, Siberian Branch, Institute of Computational Modeling. Novosibirsk: Nauka, 1998. (In Russian)

10. NVIDIA CUDA Programming Guide Version 2.3.1 // NVIDIA – World Leader in Visual Computing Technologies. 2011. URL: http://www.nvidia.com/object/cuda_develop.html