Data Warehouse and OLAP solutions
within the decision support systems

 

 

Professor Adrian COJOCARIU, PhD

Assistant Cristina Ofelia STANCIU, PhD Candidate

 

 

Tibiscus University of Timişoara

Faculty of Economic Science

1/A Daliei Street, 300558, Timisoara, Romania

Phone: +40-256-202931, Fax: +40-256-202930

E-mail: a_cojocariu@yahoo.com, ofelia_stanciu@yahoo.com

 

 

ABSTRACT

A Data Warehouse is a complex system that contains operational and historical data about an organization, provided by sources both internal and external to it. The Data Warehouse takes over data from the operational databases; the data are then processed and analyzed in order to support decision making. The main means of exploiting the data in the Data Warehouse are on-line analytical processing (OLAP) solutions and Data Mining techniques.

 

KEY WORDS: Data Warehouse, Data Mining, OLAP, decisions

 

 

INTRODUCTION

Management activity within organizations has undergone significant changes with the development of the information society, and new information technologies have positively influenced the most important area of this activity, decision making. Decision tasks become increasingly hard to fulfill if the human decision maker is not assisted by computer-based tools, also known as decision support systems (DSS).

Nowadays, the components of a decision support system are similar to the ones Sprague identified in 1982: the user interface, the knowledge-based subsystems, the data management module and the model management module.

 

 

 

 

 

 

 

 

 


Figure no. 1. Data management module

(From Turban, 2001)

 

The data management module is a subsystem of the computer-based decision support system and has a number of subcomponents of its own (Figure no. 1):

•  the integrated decision support system database, which includes data extracted from internal and external sources; these data can be maintained in the database or accessed only when needed;

•  the database management system; the database can be relational or multidimensional;

•  a data dictionary, that is, a catalog containing the definitions of all the data in the database; it is used in the identification and definition phase of the decision process;

•  query tools, which assume the existence of languages for querying databases.

 

PROBLEM DEFINITION

The Data Warehouse is a complex system that holds operational and historical data for an organization and is a separate entity from the operational databases. The huge quantity of data maintained in a data warehouse is collected from sources both internal and external to the organization. The data warehouse fetches data from the operational databases; the data can then be analyzed in various ways in order to help the decision maker in the decision process.

W. H. Inmon, one of the most prominent authors in the area of building data warehouses, describes them as “a subject-oriented, integrated, historical and non-volatile data collection designed to assist in the decisional process”. Hence the primary properties of a data warehouse: it is subject-oriented and integrated, and its data are historical (time-variant) and persistent (non-volatile).

The process of building and using data warehouses is known as data warehousing; it involves integrating, filtering and consolidating the data.
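The three steps named above (integration, filtering, consolidation) can be illustrated with a minimal sketch. The two source systems, their field names and the monthly grain are all assumptions made for the example, not something prescribed by the literature cited here.

```python
from collections import defaultdict

# Hypothetical rows from two operational sources (names and fields are illustrative).
sales_system = [
    {"customer": "ACME", "date": "2007-01-15", "amount": 120.0},
    {"customer": "ACME", "date": "2007-01-20", "amount": 80.0},
    {"customer": "Beta", "date": "2007-02-03", "amount": None},  # incomplete record
]
web_shop = [
    {"client_name": "acme", "order_day": "2007-01-28", "total": 50.0},
    {"client_name": "beta", "order_day": "2007-02-10", "total": 200.0},
]


def integrate(sales_rows, web_rows):
    """Integration: bring both sources to one common schema."""
    unified = [dict(row, source="sales") for row in sales_rows]
    for row in web_rows:
        unified.append({
            "customer": row["client_name"],
            "date": row["order_day"],
            "amount": row["total"],
            "source": "web",
        })
    for row in unified:
        row["customer"] = row["customer"].upper()  # one spelling per customer
    return unified


def filter_rows(rows):
    """Filtering: drop records that lack the value to be analyzed."""
    return [row for row in rows if row["amount"] is not None]


def consolidate(rows):
    """Consolidation: total amount per (customer, month)."""
    totals = defaultdict(float)
    for row in rows:
        month = row["date"][:7]  # "YYYY-MM"
        totals[(row["customer"], month)] += row["amount"]
    return dict(totals)


warehouse = consolidate(filter_rows(integrate(sales_system, web_shop)))
```

In this sketch the incomplete record is simply discarded; a real loading process might instead repair it or log it for review.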

The objectives of a data warehouse are the following:

➢  providing the user with persistent data access – the computer is the tool that permits easier access to the data warehouse;

➢  providing a single version of the data – ambiguous data are not supplied to the user, so there are no debates about the truthfulness of the data used;

➢  recording and accurately playing back past events – historical data can be extremely important to the user, because present data are often meaningless unless compared to past data;

➢  allowing both high-level and detailed access to data – information can be collected and formatted more easily using the data within data warehouses;

➢  separating operational processing from analytical processing – maintaining an information system in which decision-related and operational information must be kept together raises many issues.

Data warehouses can hold different types of data: detailed data, aggregated data and metadata. Metadata describe the data structure, the data sources and the transformation rules; they are used when loading data and thus play an important role in populating the data warehouse. The data warehouse architecture is presented in Figure no. 2.
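As a rough illustration of how metadata can drive the loading step, the sketch below stores, for each warehouse column, the source field it comes from and the transformation rule applied to it. The column names, the source fields and the `load_row` helper are hypothetical and serve only this example.

```python
# Hypothetical metadata: for each warehouse column, the source field it is
# loaded from and the transformation rule applied during loading.
METADATA = {
    "customer": {"source_field": "client_name", "transform": str.upper},
    "amount_eur": {"source_field": "total_cents", "transform": lambda cents: cents / 100},
}


def load_row(source_row, metadata):
    """Build one warehouse row by applying the metadata rules to a source row."""
    return {
        target: spec["transform"](source_row[spec["source_field"]])
        for target, spec in metadata.items()
    }


row = load_row({"client_name": "acme", "total_cents": 12550}, METADATA)
```

Keeping the rules in a data structure rather than in the loading code means that a change of source or of transformation only requires a change in the metadata.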

Using data warehouses presents a series of advantages, among which are the following:

•  decision makers can easily be provided with a series of reports that assist in the decision process;

•  data consistency and data “productivity” increase, while computational costs decrease;

•  users are provided with access to a large variety of data;

•  the structure of the data warehouse makes it easily adaptable to data changes and capable of transmitting the changed data to the operational system.

 

[Figure labels: OPERATIONAL DATABASES; DATA ACCESS; DATA WAREHOUSE; DATA ACCESS; INFORMATION ACCESS; USERS]

 

Figure no. 2. General architecture of the data warehouse

 

An entity similar to the data warehouse, often encountered in the specialized literature, is the Data Mart, which has generated long debates on whether or not it is equivalent to a data warehouse. A Data Mart is not equivalent to a data warehouse: it is a data collection specialized on an area of interest, according to the needs of a particular department in the organization. There can be a financial Data Mart, a marketing one, and so on, these being almost totally independent of one another. Each department is considered the owner of the hardware and software components that make up its Data Mart.

Data Marts come in two forms: dependent and independent. A dependent Data Mart is sourced from the data warehouse, while an independent one is sourced directly by its own applications.

A dependent Data Mart is formed by loading data from the operational systems into the organization’s data warehouse, which is then divided into smaller units, the Data Marts; their dependency consists precisely in this derivation from the data warehouse.

An independent Data Mart is less stable than a dependent one, and its shortcomings tend not to show until several independent Data Marts exist within the organization. Because organizations develop over time, situations arise in which many Data Marts have grown large and each of them needs to collect data from the operational databases; this can be costly and also inefficient for those operational databases, whose working time is reduced in favor of supplying data to the Data Marts.
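The derivation of dependent Data Marts from the warehouse can be pictured with a small sketch. The rows, the `area` field used to assign a row to a department, and the `build_dependent_marts` function are all illustrative assumptions.

```python
# Hypothetical warehouse rows; the "area" field names the department of interest.
warehouse_rows = [
    {"area": "finance", "item": "invoice", "value": 1000},
    {"area": "marketing", "item": "campaign", "value": 300},
    {"area": "finance", "item": "payment", "value": 700},
]


def build_dependent_marts(rows):
    """Split the warehouse content into one Data Mart per area of interest."""
    marts = {}
    for row in rows:
        marts.setdefault(row["area"], []).append(row)
    return marts


marts = build_dependent_marts(warehouse_rows)
finance_total = sum(row["value"] for row in marts["finance"])
```

Because every mart is filled from the warehouse, the operational systems are read only once, whatever the number of marts.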

Data warehouses can be very useful to various categories of decision makers, and the most important ways to benefit from the data within them are online analytical processing (OLAP) and Data Mining techniques. OLAP technology refers to the ability to aggregate the data in a warehouse and to filter the large amount of data in order to obtain information useful for the decision process within an organization. According to specialists, an alternative term for describing the OLAP concept is FASMI (Fast Analysis of Shared Multidimensional Information). The essence of any OLAP system is the OLAP cube, also known as the multidimensional cube, composed of numeric facts called measures, categorized by dimensions [8]. These measures are obtained from records in the tables of relational databases. User requests can be answered by dynamically traversing the dimensions of the data cube, at a high level or at a detailed level.
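A minimal sketch of the cube idea follows: a measure is summed over the records and grouped by a chosen set of dimensions, so that a high-level view and a detailed view are obtained from the same facts. The fact records, the dimension names and the `aggregate` function are assumptions made for the example; OLAP products usually precompute and store such aggregates rather than scanning the records on each request.

```python
from collections import defaultdict

# Hypothetical fact records: three dimensions (year, region, product)
# and one measure (sales).
facts = [
    {"year": 2006, "region": "West", "product": "A", "sales": 10},
    {"year": 2006, "region": "West", "product": "B", "sales": 5},
    {"year": 2006, "region": "East", "product": "A", "sales": 7},
    {"year": 2007, "region": "West", "product": "A", "sales": 12},
    {"year": 2007, "region": "East", "product": "B", "sales": 3},
]


def aggregate(rows, dimensions, measure="sales"):
    """Sum the measure over all rows, grouped by the chosen dimensions."""
    cells = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        cells[key] += row[measure]
    return dict(cells)


by_year = aggregate(facts, ["year"])                    # high-level view
by_year_region = aggregate(facts, ["year", "region"])   # more detailed view
```

Moving from `by_year` to `by_year_region` corresponds to traversing the cube toward a more detailed level; the reverse direction returns to the summary.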

OLAP systems have the following properties [7]:

➢  multidimensional data view;

➢  intensive evaluation capabilities;

➢  timeline orientation (time intelligence).
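Time orientation can be illustrated by comparing each period with the one before it. The sketch below is only one possible reading of "time intelligence"; the monthly figures and the `period_over_period` function are hypothetical.

```python
# Hypothetical monthly totals taken from a warehouse, keyed by "YYYY-MM".
monthly_sales = {"2007-01": 100.0, "2007-02": 110.0, "2007-03": 99.0}


def period_over_period(series):
    """Percentage change of each period relative to the previous one."""
    periods = sorted(series)  # "YYYY-MM" keys sort chronologically
    changes = {}
    for previous, current in zip(periods, periods[1:]):
        changes[current] = (series[current] - series[previous]) / series[previous] * 100
    return changes


changes = period_over_period(monthly_sales)
```

The first period has no predecessor and therefore does not appear in the result.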

 

[Figure label: EXTERNAL DATA]

Figure no. 3. The components of a decision support system in an organization using information technologies

 

The available technologies for managing data and information must lead to a better understanding of past events and to the prediction of future ones, thereby making decisions more efficient; Data Mining is also involved here. Data Mining techniques integrated with decision support systems yield a decision support tool that still relies on man-machine interaction; taken together, the two represent a spectrum of computer-based analytic technologies, providing a platform for an optimal combination of data-driven analysis and human control [4].

A decision support system in an organization using information technologies has the components shown in Figure no. 3. However, depending on the complexity and functionality of the system, some of these elements may be absent.

 

RESULTS

Nowadays, decision makers within an organization could hardly fulfil all their tasks without decision support systems. Decision support systems are a great help for the human decision maker: they analyze and process data, yielding information, from which knowledge is derived. The operational and historical data of an organization are usually held in a data warehouse. The main techniques for exploiting the data within data warehouses are OLAP solutions and Data Mining. Data Mining techniques have greatly evolved, being capable of “learning” from the previous behaviour of the elements and, based on the acquired knowledge, of setting hypotheses to be tested.

 

SOURCES

1.     Ackoff, R. L., From Data to Wisdom, Journal of Applied Systems Analysis, Volume 16, 1989, pages 3-9

2.     Filip F.Gh., Sisteme suport pentru decizii, Editura Tehnică, Bucureşti, 2004

3.     Filip F.Gh., Decizie asistată de calculator: decizii, decidenţi – metode de bază şi instrumente informatice asociate, Editura Tehnică, Bucureşti, 2005

4.     Ganguly, A. R., Gupta A., Data Mining Technologies and Decision Support Systems for Business and Scientific Applications, Encyclopedia of Data Warehousing and Mining, Blackwell Publishing, 2005

5.     Gray, P., Watson, H., Decision Support in the Data Warehouse, Prentice Hall, Upper Saddle River, 1998

6.     Turban E., Aronson J., Decision Support Systems and Intelligent Systems, Prentice Hall, USA, 2001

7.     Zaharie D., Albulescu F., Bojan I., Ivacenco V., Vasilescu C., Sisteme informatice pentru asistarea deciziei, Editura Dual Tech, 2001

8.     http://en.wikipedia.org