Data Flow in a Typical BI Stack

Someone new to Business Intelligence (BI) may find it hard to visualize how data flows through a typical BI stack. This post aims to make that easier.

The diagram below represents a typical BI stack.

Let’s tackle each of the represented layers individually.

Data Sources

What makes BI really important is the sheer number of sources that data comes from. These sources typically provide data that is unstructured and follows no consistent format. The data is usually dumped into an Operational Data Store (ODS), although each organization may call it something different.

Operational Data Store

The ODS is designed to integrate data from various sources for further operations. To keep it write-friendly, data in the ODS is left unstructured, which makes for very fast write operations. The downside is that it is very unfriendly to data analytics and relational models.

Extract, Transform, Load

The name ETL describes exactly what these programs do: they extract data from a source, transform it into a different format, and load it into a destination. The actual transformation, of course, differs from stage to stage of the BI stack.
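
The three steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular ETL tool works: the record fields (customer, amount, date), the sample data, and the function names are all hypothetical, and the "destination" is just an in-memory list.

```python
import json

# Hypothetical raw records as they might sit in an ODS:
# inconsistent key casing and inconsistent value types.
RAW_RECORDS = [
    '{"customer": "Acme", "amount": "120.50", "date": "2024-01-05"}',
    '{"Customer": "globex", "Amount": 80, "date": "2024-01-06"}',
]


def extract(raw_lines):
    """Extract: parse each raw JSON line into a dict."""
    return [json.loads(line) for line in raw_lines]


def transform(records):
    """Transform: normalize key casing, customer names, and numeric types."""
    cleaned = []
    for record in records:
        lowered = {key.lower(): value for key, value in record.items()}
        cleaned.append({
            "customer": str(lowered["customer"]).title(),
            "amount": float(lowered["amount"]),
            "date": lowered["date"],
        })
    return cleaned


def load(rows, destination):
    """Load: append the cleaned rows to the destination (a plain list here)."""
    destination.extend(rows)
    return destination


warehouse_rows = load(transform(extract(RAW_RECORDS)), [])
for row in warehouse_rows:
    print(row)
```

Keeping each step in its own function means the transformation can be swapped out for a different stage of the stack without touching the extract or load code.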

Data Warehouse

After an ETL process, when the data ends up in a data warehouse, it is structured and indexed, making it friendly to relational models. At this stage, the warehouse still contains all the data coming in from all sorts of different sources, so the sheer volume can make data analytics slow and cumbersome.
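
To make "structured and indexed" concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for a real warehouse. The table name, columns, index, and sample rows are assumptions made up for this example.

```python
import sqlite3

# An in-memory SQLite database stands in for the warehouse.
conn = sqlite3.connect(":memory:")

# Structured: every row must fit a fixed, typed schema.
conn.execute("""
    CREATE TABLE sales (
        id INTEGER PRIMARY KEY,
        source TEXT NOT NULL,
        region TEXT NOT NULL,
        amount REAL NOT NULL
    )
""")

# Indexed: an index on a commonly filtered column helps lookups on it.
conn.execute("CREATE INDEX idx_sales_region ON sales (region)")

# Rows arriving from different (hypothetical) sources share one schema.
conn.executemany(
    "INSERT INTO sales (source, region, amount) VALUES (?, ?, ?)",
    [("web", "EU", 120.5), ("store", "US", 80.0), ("web", "US", 45.25)],
)

# Relational queries such as grouping and aggregation now work directly.
region_totals = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region").fetchall()
)
print(region_totals)
```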

Data Mart

To ease the pain of data analysis described above, smaller data marts are created, each targeted at a specific business use case. Because each mart contains only a subset of the data residing in the data warehouse, analysis is quick and painless.
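
One simple way to picture a mart is as a table built from a filtered slice of the warehouse. The sketch below again uses sqlite3 as a stand-in; the warehouse_sales and mart_eu_sales tables, their columns, and the "EU sales" use case are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A tiny stand-in for the warehouse, holding data for every region.
conn.execute("CREATE TABLE warehouse_sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?)",
    [
        ("EU", "widget", 100.0),
        ("US", "widget", 50.0),
        ("EU", "gadget", 25.0),
        ("US", "gadget", 75.0),
    ],
)

# The mart keeps only the rows and columns one business use case needs
# (here, an assumed team that only analyzes EU sales).
conn.execute("""
    CREATE TABLE mart_eu_sales AS
    SELECT product, amount
    FROM warehouse_sales
    WHERE region = 'EU'
""")

mart_rows = conn.execute(
    "SELECT product, amount FROM mart_eu_sales ORDER BY product"
).fetchall()
print(mart_rows)
```

Queries against the mart scan far fewer rows than the same queries against the full warehouse, which is where the speedup comes from.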

Reporting Engine

Depending on organizational preference, a wide range of reporting engines is available off the shelf, both commercial and open source. The reporting engine should have access to all data marts at one end and be accessible to all stakeholders at the other.

Reports

Individual reports make up the UI front end of the BI stack. A wide variety of reports can be made available, including, but not limited to, dashboards, drill-down reports, self-serve reports, and even automated alerts that are pushed to stakeholders if the system is headed towards a crash.
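
As a rough illustration of the automated-alert idea, the sketch below compares a few metrics against limits and collects a message for each one that is exceeded. The metric names, the thresholds, and the find_alerts helper are all hypothetical; a real reporting engine would supply its own alerting mechanism and would deliver the messages by email, chat, or similar.

```python
def find_alerts(metrics, thresholds):
    """Return one message per metric whose value exceeds its limit."""
    alerts = []
    for name, limit in thresholds.items():
        value = metrics.get(name)
        if value is not None and value > limit:
            alerts.append(f"{name} is {value}, above the limit of {limit}")
    return alerts


# Hypothetical health metrics and limits for the BI system.
metrics = {"disk_used_pct": 93, "failed_etl_jobs": 0}
thresholds = {"disk_used_pct": 90, "failed_etl_jobs": 2}

alerts = find_alerts(metrics, thresholds)
for message in alerts:
    # A real system would push this to stakeholders instead of printing it.
    print(message)
```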