No organization can run without data. Data can be viewed as tangible assets of an organization, just like any physical asset, so they need to be stored and made available to those who need them when they need them. However, data by themselves are of little use; they must be put together to produce useful information.
In turn, information becomes the basis for rational decision making. To facilitate the decision-making process, a new kind of database system was developed, called the “data warehouse”.
The data warehouse can be generally described as a decision-support tool that collects its data from operational databases and various external sources, transforms them into information, and makes that information available to decision-makers (top managers) in a consolidated and consistent manner. (2:64)(4:82)
The data warehouse is no more than a database, but one that is kept separate from other databases such as the operational database, the distributed database and the text database. When did management start to use this powerful tool, and why did they seek to use it?
The data warehouse was developed at the beginning of the 1980s.
It was optimized to transform unorganized and lightly summarized data from the operational database into an analytical tool that supports intelligent decision-making. (6:19)
The term DSS (decision support system) database is used interchangeably with the term data warehouse. The operational database, for its part, is also known as the transactional database or the production database.
The data warehouse can be defined very simply as an integrated, subject-oriented, time-variant and non-volatile database that provides support for decision-making (5:39) (6:19). The following four sections will explain what this definition means.
The data warehouse is a centralized database that integrates data from different sources (6:19) with diverse formats.
This integration of the data provides a unified view of the overall organizational situation. Data integration enhances decision-making and helps the manager to better understand the operations of the organization (6:19).
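As a minimal sketch of what integrating sources with diverse formats can involve, the Python snippet below maps records from two sources into one common layout. The source names, field names, date formats and values are all assumptions made for illustration; they do not describe any particular system.

```python
from datetime import datetime

# Hypothetical records from two sources that describe sales differently.
billing_rows = [{"cust": "C001", "amount": "120.50", "date": "03/15/2004"}]
web_rows = [{"customer_id": "C002", "total_cents": 9900, "day": "2004-03-16"}]


def from_billing(row):
    """Map a billing-system row (assumed layout) to the common layout."""
    return {
        "customer_id": row["cust"],
        "amount": float(row["amount"]),
        "sale_date": datetime.strptime(row["date"], "%m/%d/%Y").date(),
        "source": "billing",
    }


def from_web(row):
    """Map a web-shop row (assumed layout) to the common layout."""
    return {
        "customer_id": row["customer_id"],
        "amount": row["total_cents"] / 100,  # cents to currency units
        "sale_date": datetime.strptime(row["day"], "%Y-%m-%d").date(),
        "source": "web",
    }


def integrate(billing, web):
    """Return one list of records that all share the same fields and units."""
    unified = [from_billing(r) for r in billing] + [from_web(r) for r in web]
    return sorted(unified, key=lambda r: r["sale_date"])


warehouse_rows = integrate(billing_rows, web_rows)
```

After this step every record carries the same field names, the same amount unit and a real date value, which is what allows a single query to cover both sources.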
The data in the DSS database are organized to provide answers to questions coming from different areas within the organization. They are arranged by topic, such as sales, marketing, finance and so on. For each topic, the DSS database contains specific subjects such as customer, product, region and so on. This form of data organization differs from the more process-oriented organization of the operational database system.
The data warehouse contains historical data covering a long period of time. Those data reflect what happened last week, last month, over the past five years and the like. (6:19)
Once the data enter the data warehouse, they are never removed or changed. Because the data warehouse represents the entire history of the organization, data from the operational database are always added to it. Since DSS data are never deleted and new data are periodically added, the data warehouse is always growing. That is why the data warehouse must run on hardware that supports databases of gigabyte and even terabyte size.
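The non-volatile property can be illustrated with a toy, in-memory sketch in Python. The class name `AppendOnlyWarehouse`, its methods and the sample rows are hypothetical; a real data warehouse enforces this behavior through its loading procedures, not through a class like this one.

```python
class AppendOnlyWarehouse:
    """Toy in-memory store that accepts new rows but offers no update or delete."""

    def __init__(self):
        self._rows = []

    def load(self, batch):
        """Append a periodic batch of rows taken from the operational database."""
        # Copy each row so that later changes in the source do not leak in.
        self._rows.extend(dict(row) for row in batch)
        return len(self._rows)

    def rows(self):
        """Return copies of all stored rows; editing them leaves the store intact."""
        return tuple(dict(row) for row in self._rows)


# Hypothetical periodic loads; the values are made up for illustration.
warehouse = AppendOnlyWarehouse()
size_after_first = warehouse.load([{"period": "2004-01", "sales": 1200}])
size_after_second = warehouse.load([{"period": "2004-02", "sales": 1350}])
```

Because the only write operation is `load`, the store can only grow, which mirrors the reason the hardware must be sized for ever-larger databases.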
THE DIFFERENCE BETWEEN THE OPERATIONAL DATABASE AND THE DATA WAREHOUSE
The operational database and the DSS database differ in the roles they play as well as in the characteristics of their data.
The transactional database is optimized to support transactions that represent daily operations (2:67). For example, during the registration period at KFUPM, each time a student adds or drops courses or changes sections, the change must be recorded by the operational database system of the university. So, student data and course data are updated frequently.
On the other hand, the data warehouse is optimized to support data analysis and decision-making (2:64). Basically, it takes summarized data from the operational database and filters them for the analysis and decision-making processes (2:64).
For instance, the manager of the admission and registration department may ask for the number of students at KFUPM who took ENGL-214 last summer. The data warehouse answers this query for him. He can then decide whether or not to increase the number of sections of this particular course.
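The kind of query described here can be sketched with Python's built-in `sqlite3` module. The table `enrollment_history`, its columns, the term labels and the sample rows are all hypothetical and serve only to show the shape of an analytical query over historical data.

```python
import sqlite3

# In-memory database standing in for the warehouse (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE enrollment_history ("
    "student_id TEXT, course TEXT, term TEXT)"
)
conn.executemany(
    "INSERT INTO enrollment_history VALUES (?, ?, ?)",
    [
        ("S1", "ENGL-214", "summer-2003"),
        ("S2", "ENGL-214", "summer-2003"),
        ("S3", "ENGL-214", "fall-2003"),
        ("S1", "MATH-101", "summer-2003"),
    ],
)


def count_students(course, term):
    """Count the distinct students recorded for one course in one term."""
    row = conn.execute(
        "SELECT COUNT(DISTINCT student_id) FROM enrollment_history "
        "WHERE course = ? AND term = ?",
        (course, term),
    ).fetchone()
    return row[0]


engl_214_last_summer = count_students("ENGL-214", "summer-2003")
```

The query only reads historical rows and aggregates them, in contrast to the frequent updates that the operational database handles during registration.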
Operational Data Vs. Warehoused Data
Transactional data and DSS data differ in summarization level, transaction type, query activity and dimensionality.
DSS data are summarized to a much higher degree than operational data (5:39).
For example, rather than storing thousands of sales transactions for a given store on a given day, the data warehouse may store only summary totals for that store and day.
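The store-and-day example can be sketched in Python as follows. The row layout (store, day, units, amount) and the sample values are assumptions made for illustration, and the choice of summary fields is only one possible design.

```python
from collections import defaultdict

# Hypothetical operational rows: (store, day, units sold, sale amount).
transactions = [
    ("store-1", "2004-03-15", 2, 40.0),
    ("store-1", "2004-03-15", 1, 15.5),
    ("store-1", "2004-03-16", 3, 60.0),
    ("store-2", "2004-03-15", 5, 99.5),
]


def summarize_daily(rows):
    """Collapse individual sales into one summary row per (store, day)."""
    totals = defaultdict(lambda: {"units": 0, "amount": 0.0, "transactions": 0})
    for store, day, units, amount in rows:
        entry = totals[(store, day)]
        entry["units"] += units
        entry["amount"] += amount
        entry["transactions"] += 1
    return dict(totals)


daily_summary = summarize_daily(transactions)
```

Four transaction rows become three summary rows here; with thousands of transactions per store per day, the reduction is correspondingly larger.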