The Methods & Architecture Behind Data Warehousing

How a data warehousing vendor delivers its service is a key consideration. The two common options are vendors that license software for you to run on your own servers and vendors that host the software tools on their own servers.

There are several options for the architecture of a data warehousing system. The most popular is the hub-and-spoke architecture, which consists of a centralized data warehouse that feeds dependent data marts. This design is often described as a corporate information factory.

The other choices are the data mart bus architecture, which links dimensional data marts; a centralized data warehouse with no dependent marts; and a federated architecture.

Some organizations instead develop data marts that are independent of one another. Such marts tend to have inconsistent data definitions and different dimensions, which makes it difficult to analyze data across marts.

The data mart bus architecture starts with a first mart whose dimensions and measures are then reused by the marts developed afterward, so that all the marts are logically integrated. The data is organized in a star schema, which provides a dimensional view of the data.
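To make the idea of shared dimensions concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (dim_product, fact_sales, fact_inventory) are hypothetical and chosen only for illustration. The sketch assumes a first mart that defines a product dimension and a second mart, built later, that reuses it.

```python
import sqlite3

# All table and column names are hypothetical, chosen for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Shared dimension: defined once with the first mart, reused afterward.
    CREATE TABLE dim_product (
        product_key INTEGER PRIMARY KEY,
        name        TEXT
    );

    -- First mart: a sales fact table at the center of its star.
    CREATE TABLE fact_sales (
        product_key INTEGER REFERENCES dim_product(product_key),
        units_sold  INTEGER
    );

    -- Second mart, built later: inventory facts keyed on the same dimension.
    CREATE TABLE fact_inventory (
        product_key   INTEGER REFERENCES dim_product(product_key),
        units_on_hand INTEGER
    );

    INSERT INTO dim_product VALUES (1, 'widget'), (2, 'gadget');
    INSERT INTO fact_sales VALUES (1, 10), (1, 5), (2, 7);
    INSERT INTO fact_inventory VALUES (1, 40), (2, 3);
""")

# Aggregate each fact table on its own, then line the results up through
# the shared dimension. This only works because both marts use the same keys.
drill_across = conn.execute("""
    SELECT p.name, s.sold, i.on_hand
    FROM dim_product AS p
    JOIN (SELECT product_key, SUM(units_sold) AS sold
          FROM fact_sales GROUP BY product_key) AS s
      ON s.product_key = p.product_key
    JOIN (SELECT product_key, SUM(units_on_hand) AS on_hand
          FROM fact_inventory GROUP BY product_key) AS i
      ON i.product_key = p.product_key
    ORDER BY p.product_key
""").fetchall()

print(drill_across)
```

If the two marts had defined their own product tables with different keys, the final query would have nothing reliable to join on, which is the cross-mart analysis problem that independent marts run into.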

The hub-and-spoke architecture focuses on building a scalable and maintainable infrastructure. It is developed iteratively. Dependent data marts obtain their data from the warehouse and can be developed for individual departments or special purposes.
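The relationship between the hub and its dependent marts can be sketched as follows, again with Python's sqlite3 module. The names (warehouse_orders, mart_sales, mart_support) are hypothetical. The marts are modeled as views purely to keep the example short; in practice a dependent mart is often a separately loaded set of tables, so treat this as an illustration of the data flow rather than a recommended implementation.

```python
import sqlite3

# Hypothetical names throughout; views stand in for dependent marts.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hub: the central warehouse holds the detailed, integrated data.
    CREATE TABLE warehouse_orders (
        order_id   INTEGER PRIMARY KEY,
        department TEXT,
        region     TEXT,
        amount     REAL
    );
    INSERT INTO warehouse_orders VALUES
        (1, 'sales',   'east', 100.0),
        (2, 'sales',   'west', 250.0),
        (3, 'support', 'east',  40.0);

    -- Spokes: each dependent mart derives its data only from the hub.
    CREATE VIEW mart_sales AS
        SELECT region, SUM(amount) AS revenue
        FROM warehouse_orders
        WHERE department = 'sales'
        GROUP BY region;

    CREATE VIEW mart_support AS
        SELECT region, SUM(amount) AS cost
        FROM warehouse_orders
        WHERE department = 'support'
        GROUP BY region;
""")

sales_mart = conn.execute(
    "SELECT region, revenue FROM mart_sales ORDER BY region"
).fetchall()
support_mart = conn.execute(
    "SELECT region, cost FROM mart_support ORDER BY region"
).fetchall()

print(sales_mart)
print(support_mart)
```

Because every mart reads from the same warehouse table, a new department can be served by adding another spoke without changing the existing ones.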

The centralized data warehouse architecture is similar, but it has no dependent data marts. Queries and applications access the warehouse's relational data and dimensional views directly.

A federated architecture allows data to be accessed from sources such as operational systems, data marts, and data warehouses. The data is logically or physically integrated through shared keys, global metadata, and distributed queries. It is a suitable solution for companies that have a complex decision support environment.
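As a rough illustration of logical integration through a shared key, the sketch below queries two separate in-memory SQLite databases and combines the results in Python. The two sources, the customer_id key, and the federated_revenue_by_customer function are all assumptions made for this example; a real federated system would typically rely on a dedicated query layer and global metadata rather than hand-written glue code.

```python
import sqlite3

# Source 1 (hypothetical): an operational system that owns customer records.
ops = sqlite3.connect(":memory:")
ops.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
""")

# Source 2 (hypothetical): a data mart that holds revenue totals.
mart = sqlite3.connect(":memory:")
mart.executescript("""
    CREATE TABLE revenue (customer_id INTEGER, total REAL);
    INSERT INTO revenue VALUES (1, 1200.0), (2, 300.0);
""")


def federated_revenue_by_customer(ops_conn, mart_conn):
    """Query each source separately and join the rows on customer_id."""
    # Each cursor row is a (customer_id, name) pair, so dict() builds a lookup.
    names = dict(ops_conn.execute("SELECT customer_id, name FROM customers"))
    rows = mart_conn.execute(
        "SELECT customer_id, total FROM revenue ORDER BY customer_id"
    )
    # Keep only rows whose shared key exists in both sources.
    return [(names[cid], total) for cid, total in rows if cid in names]


result = federated_revenue_by_customer(ops, mart)
print(result)
```

The data stays in its original source; only the shared key makes it possible to present the two sources as one answer.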