Institute for Building Intelligent and Performing Enterprises
Documenting your data-integration system
No data-integration system can fully self-document its data-integration flow and process.


The reasons are:

  • Source system documentation: It is generally not possible to obtain a complete description of:
    • the source system tables,
    • the inter-linkages between the tables,
    • the same data elements stored under different field names in different tables.
  • The transformation rules: Many transformation rules are fairly complex. They cannot be implemented through the standard options a tool provides and have to be programmed into the system.
  • The reasoning: While an ETL system might be able to self-document simple transformation rules, it cannot document the reasons and objectives behind a transformation. For example, why split a customer_ID (AA302456) into sub-components such as customer_type (AA), customer_location (30), and customer_number (2456)? The reasons for this split could include:
    • Enabling specific types of queries around customer type.
    • Two different tables could have different field structures for customer type (AA vs. XXAA).
  • The documentation on risks related to the efficacy of ETL: While designing an ETL system, one learns its limitations. For example, you may not be able to achieve 100% perfect extractions or transformations, given:
    • the limitations of data quality in the source systems,
    • the limitations of extraction flexibility,
    • the performance load caused by a complex extraction query.
    These limitations should always be documented, because they give a more realistic view of the accuracy of the data and the output information.
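
One way to keep the reasoning and the known limitations next to a programmed rule is to record them in the code itself. The following is a minimal sketch in Python, using the customer_ID example above. The `TransformationRule` class, the field names, and the fixed 2-2-4 layout of the ID are assumptions made for illustration; they are not part of any particular ETL tool.

```python
from dataclasses import dataclass, field


@dataclass
class TransformationRule:
    """A transformation rule plus the context a tool cannot infer by itself."""
    name: str
    reasons: list[str]  # why the rule exists
    known_limitations: list[str] = field(default_factory=list)


# Hypothetical record for the customer_ID split (illustrative wording).
SPLIT_CUSTOMER_ID = TransformationRule(
    name="split_customer_id",
    reasons=[
        "Enable queries around customer type.",
        "Tables use different structures for customer type (AA vs. XXAA).",
    ],
    known_limitations=[
        "IDs that do not match the assumed layout are rejected, not repaired.",
    ],
)


def split_customer_id(customer_id: str) -> dict[str, str]:
    """Split an ID such as 'AA302456' into type, location, and number.

    Assumes a fixed layout: 2 letters, 2 digits, then the remaining digits.
    """
    ctype, location, number = customer_id[:2], customer_id[2:4], customer_id[4:]
    if not (ctype.isalpha() and location.isdigit() and number.isdigit()):
        raise ValueError(f"unexpected customer_ID layout: {customer_id!r}")
    return {
        "customer_type": ctype,
        "customer_location": location,
        "customer_number": number,
    }
```

Calling `split_customer_id("AA302456")` returns the three sub-components, while `SPLIT_CUSTOMER_ID` carries the reasons and limitations that a reviewer would otherwise have to reconstruct from memory.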

  • The flow of ETL: A set of data sometimes goes through multiple transformation routines before it reaches its end state and is ready for loading. Good documentation should provide an end-to-end view of this entire flow so that one can understand the purpose behind it. This end-to-end view should answer the following questions:
    • Why are we following X steps and not Y steps to do a transformation?
    • What are the completion criteria for each step of transformation or extraction?
  • Data quality checks: An ETL system generally does not document the data quality checks that need to be done, or the reasoning behind them.
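
The end-to-end view, completion criteria, and data quality checks described above could be captured in a small structure and rendered as plain text. This is only a sketch: `FlowStep`, `describe_flow`, and the two example steps are hypothetical names and values chosen for illustration, not a description of any real ETL product.

```python
from dataclasses import dataclass, field


@dataclass
class FlowStep:
    name: str
    purpose: str              # why this step, rather than an alternative
    completion_criteria: str  # how to tell the step finished correctly
    quality_checks: list[str] = field(default_factory=list)


def describe_flow(steps: list[FlowStep]) -> str:
    """Render the steps, in order, as a plain-text end-to-end view."""
    lines = []
    for number, step in enumerate(steps, start=1):
        lines.append(f"{number}. {step.name}")
        lines.append(f"   Purpose: {step.purpose}")
        lines.append(f"   Done when: {step.completion_criteria}")
        for check in step.quality_checks:
            lines.append(f"   Check: {check}")
    return "\n".join(lines)


# Hypothetical two-step flow; names and criteria are illustrative only.
flow = [
    FlowStep(
        name="extract_customers",
        purpose="Pull only changed rows to limit load on the source system.",
        completion_criteria="Extracted row count matches the source count.",
    ),
    FlowStep(
        name="split_customer_id",
        purpose="Expose customer type as its own field for querying.",
        completion_criteria="Every row has a non-empty customer_type.",
        quality_checks=["customer_type has exactly two letters"],
    ),
]

print(describe_flow(flow))
```

Keeping the purpose and the completion criteria as required fields means a step cannot be added to the flow without someone writing them down.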
