Building Intelligent and Performing Enterprises
Building Intelligent and Performing Enterprises
 
Login or Register  
 
Business Performance and Information Excellence Practice

Ask a question Listing Page
Data Warehouse Architecture: Regarding the Staging Area and ODS
In general datawarehousing scenarios,there is a Staging Area first which extracts the data dump from source systems.Depending on the need and requirement,there will be an ODS in the second stage and then a datamart.Now,can there be a situation where in there is an ODS first and then a Staging Area.Is the scenario possible/recommended?What can be the driving forces/reasons/requirements which result in this type of architecture?

Want to build robust & scalable Data-warehouse Design?
Join Expert-Level Training Programs on Business Intelligence and Data-Warehouse

This answer can be divided into two parts- What are the various ODS types and methods of creation, and is there a scenario of ODS coming before a staging database.

Types of ODS

Operational Data Store from its very name, means a data store used for operational reasons, created by integrating data from diverse source systems. ODS can have following applications:

TYPE I- Online data store, used for transaction execution and system interface purpose-

These data stores have source system data replicated in the central data store. The source system exchange data with other systems through this data store, instead of exchanging point to point interface files. In this era of EAI, we have seen many such implementations. The advantage of these kind of data stores is that they can do more complex validations, before routing the data to other applications. This ODS becomes part of your production system landscape.

Other applications of this kind of data store architectures is to provide a common database for source systems to directly refer to. For example, you can have the source systems updating and referring to the sanitized master tables existing in the ODS (we will refer to this in our Master Data Management Section, which is still under authoring). There are situations where the source system is directly referring to or updating a table in an ODS. In this case one needs to take care that we systematically remove the concerned tables from the source system database (as they become redundant).

These kind of data stores don't do extensive transformation of the data as they pick it up and consolidate data from the source systems. The reason is that they work as routers and also any major transformation will require changes in the source systems in terms of interpreting the different structure from their own.

TYPE II- Online data stores, used for Servicing and Relationship

This is a similar application as mentioned above, however the focus is limited to getting single customer, process and master data view for the sake of stakeholder servicing (like customer, employee and Vendor servicing). The examples are customer relationship single view, or customer touch point single view. You can retrieve this single view during your in-bound or out-bound interactions with the customers. This online operational access, gives you the benefit of risk management, cross-sell, up-sell etc.

The similarity with TYPE I is that you pick operational data from the source system, you can write back on this ODS. The difference is that your focus is not to execute transactions AND you can do extensive transformation. The purpose here is to do true data integration (Customer data integration, Vendor data integration, product data integration..). This kind of ODS has a strong linkage with Master Data Management. This ODS can ride on a Data Warehouse, but if you are also doing aa write-back on the ODS (for example adding customer risk score..), Data warehouse is not the place, as data warehouse does not (and should not allow) write-back. One option (if you want to ride on BI environment) could be to use OLAP (as it allows write-backs), but you will need a visible change in the way OLAP is modeled.

TYPE III- For reporting

Technically it is not an ODS, but people use the term for this application as well. You can have a reporting data to churn out your operational reporting. It has replica of select data from the source systems. It generally has low-intervention transformation. By low-intervention transformation,we mean that you don't change the data which you got from the source systems, but create separate mapping tables. If you are generating your reporting from TYPE I and TYPE II ODS, it is acceptable as it is one out of many functions fulfilled by ODS.

Some De-mistifying comments

  • An ODS can exist without a BI environment. Operational BI is not exactly same as ODS. If you go into the history, ODS in its concept emerged before BI. In other words ODS may exist without a Data-Mart or a staging area of a Data warehouse.
  • An ODS may or may not do integration or transformation of Data. It can be just a router (as in TYPE I)
  • In case you are not riding on you BI environment for your ODS, you may have an ODS and a Data-Mart both existing in your environment, but not talking to each other.

Scenario Where ODS comes before staging area

ODS may come before staging area for TYPE I or TYPE II (as we don't consider TYPE III technically an ODS as it generally does not have sanitized and well-integrated data):

The context here is that a staging area takes ODS as one of the source systems. In this way ETL can benefit from the data integration, sanitization and transformation which an ODS might have been already doing. This is a move away from the traditional transaction systems to Staging area mapping. Therefore one has to have following cautions

  • the ODS should be mature, and stable and as good as being part of your production system.
  • ODS should ideally the source of most of your enterprise reporting, for the systems which are linked to the ODS. A scenario where you are generating reports from individual systems as well as ODS, one needs to reconsider the source. The reason for this comment is that reporting is typically done from a source which is more reliable and is considered 'system of record'.
  • Study the road-map of the ODS before you commit to this architecture. This is because, if ODS is planned to be retired after a year, you may reconsider your option.

ODS may come before staging area, if most of your data-mart is linked to that ODS:

if ODS fulfills most or all of the data needs of your Data-Mart, it will make sense to use it as the source system for your staging area. If ODS is one of five (say) source system, you may like to reconsider your options.

If you want to have a quicker data-mart implementation

This is the scenario, where you want to use ODS as a stop-gap source. This happens when you are in the midst of figuring out your production system road map OR in the drafting stage of your enterprise level data warehouse OR both. In this situation you can compromise on the robustness or the quality (as long as it is acceptable) and with the help off manual work-arounds and monitoring.


Quick Feedback- Was this information helpful ?
BiPM Support- Let us help you find what you are looking for-


 
Back