Alexander N. Podobry

Ulyanovsk State Technical University, Russia

Integration Method Design for Common Corporate Data Warehouses

 

Any manufacturing enterprise is mainly concerned with the timely output of finished products. Achieving this goal requires tracking the product lifecycle from the conclusion of a development contract to the testing and transfer of finished products to the customer. The cycle itself includes tracking requests for material procurement, acceptance of design and engineering documentation, etc.

A problem that arises when designing a computer-aided system for tracking the product lifecycle is that the data sources are dispersed. Corporate data warehouses play the role of these sources [2].

A corporate data warehouse is an integrated collection of data that are structured and gathered from diverse information sources. It makes it possible to analyze the accumulated data and forms the basis for designing decision-making systems.

There are many ready-made solutions that address this task to some degree, e.g., PDM (Product Data Management) and PLM (Product Lifecycle Management) systems [3,4]. As a rule, such systems are expensive and solve the task only in a general way. Moreover, for these systems to be used to their full potential, all corporate data warehouses must be integrated into one information space of the company.

The main approaches are the integration of data, business processes, applications and user interactions.

Data integration gives a common representation of all information objects within a company, at all production and business-process levels.

Integration at the level of distributed applications allows the control of event streams and the management of applications within the context of transactions, messages or data.

Business-process integration defines and implements the processes for exchanging and using corporate data between corporate data warehouses.

User-interaction integration gives a common interface for access to corporate data warehouses, taking into account each user's personal and secure access level. This type of integration unifies the way users interact with the full range of available data.

Data integration forms the basis for all the integration approaches mentioned and determines the success of integration at all other levels of an information system. The main methods for integrating corporate data are consolidation, federalization (also called federation) and dissemination (also called propagation) [1].

With the consolidation method, data are collected from several primary systems and integrated into one permanent warehouse. With the federalization method, a common virtual information space is created in which data may be stored in different sources, while the requesting side has no information about where the data are located. Finally, with the dissemination method, data are transferred from one system to another.
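The difference between the three methods can be illustrated with a minimal Python sketch. It is not taken from the paper: the two primary systems (`procurement`, `design_docs`), their fields and all function names are hypothetical, and plain in-memory dictionaries stand in for real warehouses.

```python
# Hypothetical primary systems, modelled as in-memory tables keyed by order id.
procurement = {1: {"item": "steel sheet", "status": "ordered"}}
design_docs = {1: {"drawing": "DWG-001", "approved": True}}
sources = {"procurement": procurement, "design": design_docs}


def consolidate(sources: dict) -> dict:
    """Consolidation: copy data from all primary systems into one permanent store."""
    warehouse: dict = {}
    for rows in sources.values():
        for key, row in rows.items():
            warehouse.setdefault(key, {}).update(row)
    return warehouse


class FederatedView:
    """Federalization: answer queries on demand without copying the data.

    The caller only sees the merged record, not which source holds which field.
    """

    def __init__(self, sources: dict) -> None:
        self._sources = sources

    def get(self, key: int) -> dict:
        merged: dict = {}
        for rows in self._sources.values():
            merged.update(rows.get(key, {}))
        return merged


def disseminate(source: dict, target: dict, key: int) -> None:
    """Dissemination: push one record from one system to another."""
    target.setdefault(key, {}).update(source[key])


snapshot = consolidate(sources)          # taken before the update below
view = FederatedView(sources)
procurement[1]["status"] = "delivered"   # change in a primary system

remote_system: dict = {}
disseminate(procurement, remote_system, 1)
```

In this sketch `snapshot` still holds the old status until consolidation is run again, whereas `view.get(1)` and `remote_system` already reflect the change.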

Each of these methods has its advantages and disadvantages. With the consolidation method, a time delay may occur between the moment data are updated in the primary systems and the moment these changes appear in the final storage location. The advantage of this method is that data are aggregated (aligned) while being transferred to the final warehouse. The advantage of the federalization method is that it gives access to the required data without the need to transfer data from one warehouse to another. Its disadvantage is the performance cost of accessing multiple data warehouses. A major advantage of the dissemination method is that it can be used for data transfer in real time or close to it. Other advantages include guaranteed data delivery and two-way data dissemination.

There is also a hybrid data-integration method, which combines several of the methods mentioned above. It is applied when no single method can be used on its own.

Within a manufacturing enterprise, using only one of the methods mentioned does not fully resolve the integration problem for independent corporate data warehouses. The outcome depends on such factors as the independence, performance and distribution of the corporate warehouses. As a result, a method is required that combines all the methods mentioned and allows the integration of corporate warehouses that are physically located in separate networks.

One way to resolve this issue is to use a hybrid method of consolidation, federalization and dissemination (Fig. 1). The method is based on a metadata structure that ties together the different independent corporate warehouses.

Fig. 1. Hybrid Method for Integration of Corporate Data Warehouses

The consolidation method is used to collect data from child warehouses, and the dissemination method is used to integrate with warehouses that are geographically or physically located in different networks. To communicate with the latter, the XML-upload facility may be used (Fig. 2).
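As a rough illustration of how a metadata structure could drive the choice of method for each warehouse, consider the following Python sketch. The `WarehouseMeta` fields, the warehouse names and the `choose_method` rule are assumptions made for this example. In particular, the fallback to federalization for warehouses that are neither child nor remote is a guess, since the paper does not state when that method is applied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WarehouseMeta:
    """One entry of a hypothetical metadata structure describing a warehouse."""
    name: str
    is_child: bool       # subordinate warehouse feeding the central one
    same_network: bool   # directly reachable from the central warehouse


def choose_method(meta: WarehouseMeta) -> str:
    """Pick an integration method for one warehouse."""
    if not meta.same_network:
        # Warehouses in other networks are reached by dissemination.
        return "dissemination"
    if meta.is_child:
        # Child warehouses are collected by consolidation.
        return "consolidation"
    # Assumption: everything else is queried in place.
    return "federalization"


registry = [
    WarehouseMeta("procurement", is_child=True, same_network=True),
    WarehouseMeta("design_archive", is_child=False, same_network=True),
    WarehouseMeta("remote_plant", is_child=True, same_network=False),
]
plan = {meta.name: choose_method(meta) for meta in registry}
```

Here the network check comes first, so a child warehouse located in another network is still handled by dissemination.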

Fig. 2. Structure of Data Representation to Web-Resources

The XML-upload facility is based on an algorithm for uploading data in XML format, which allows the data to be updated and the structure of a corporate warehouse to be synchronized. The XML file itself includes the following:

- list of tables;

- list of table fields;

- key fields of tables;

- data loading method;

- data set.
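The paper does not give the actual file layout, so the following Python sketch only shows one possible shape of such a file and a simple loader for it. All element and attribute names (`upload`, `table`, `field`, `key`, `load`, `row`), the two loading methods (`merge`, `replace`) and the sample rows are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical upload file: table list, field list, key fields,
# loading method and data set, as enumerated above.
UPLOAD_XML = """
<upload>
  <table name="orders" load="merge">
    <fields>
      <field name="id" key="true"/>
      <field name="status"/>
    </fields>
    <rows>
      <row id="1" status="delivered"/>
      <row id="2" status="ordered"/>
    </rows>
  </table>
</upload>
"""


def apply_upload(xml_text: str, warehouse: dict) -> None:
    """Apply an upload file to `warehouse` (table name -> {key tuple -> row dict})."""
    root = ET.fromstring(xml_text.strip())
    for table in root.findall("table"):
        field_elems = table.findall("fields/field")
        fields = [f.get("name") for f in field_elems]
        keys = [f.get("name") for f in field_elems if f.get("key") == "true"]
        method = table.get("load", "merge")
        if method not in ("merge", "replace"):
            raise ValueError(f"unknown loading method: {method}")
        if method == "replace":
            warehouse[table.get("name")] = {}   # drop existing rows first
        target = warehouse.setdefault(table.get("name"), {})
        for row in table.findall("rows/row"):
            record = {name: row.get(name) for name in fields}
            # Rows with the same key overwrite existing ones; new keys are added.
            target[tuple(record[k] for k in keys)] = record


warehouse = {"orders": {("1",): {"id": "1", "status": "ordered"}}}
apply_upload(UPLOAD_XML, warehouse)
```

The sketch covers only the data-update side. Synchronizing the structure of the receiving warehouse would additionally compare the field list in the file with the existing table definition.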

Thus, the suggested method makes it possible to design a common data-storage structure by integrating independent data warehouses into one information space within a company. The XML-upload facility supports integration both with information systems where access to the database is denied and with remote data warehouses, e.g. over the Internet. This helps to track the product lifecycle and to respond promptly to possible delays and problems with product output.

 

BIBLIOGRAPHY:

1. Tanenbaum, Andrew S.; Steen, Maarten van. Distributed Systems: Principles and Paradigms. Saint-Petersburg: “Piter” Publishers, 2003. 877 p.

2. Kudinov, Alexander. Data Warehouse as the Basis of Corporate Integration. PC Week/RE, 2006. Intersoft Lab.

3. Dubova, Natalia. Systems for Control of Production Data. “Otkrytiye Systemy”, 1996, No. 3.

4. Kevorkov, Sergey. Support of the Product Lifecycle. “Otkrytiye Systemy”, 2005, No. 12.