Building Intelligent and Performing Enterprises
Business Performance and Information Excellence Practice

Documenting your data-integration system
No data-integration system can fully self-document its own data-integration flow and process.


The reasons are:

  • Source system documentation: It is generally not possible to have a complete description of:
    • the source system tables
    • the inter-linkages between the tables
    • the same data elements appearing under different field names in different tables
  • The transformation rules: Many transformation rules are too complex to implement through the standard options a tool provides and have to be programmed into the system by hand.
  • The reasoning: While an ETL system might be able to self-document simple transformation rules, it cannot document the reasons and objectives behind a transformation. For example, why are you splitting a customer_ID (AA302456) into sub-components such as customer_type (AA), customer_location (30) and customer_number (2456)? The reasons for this split could include:
    • Enabling queries by customer type.
    • Two different tables may use different field structures for customer type (AA vs. XXAA).
  • The documentation of risks to the efficacy of ETL: While designing an ETL system, one learns its limitations. For example, you may not be able to achieve 100% perfect extractions or transformations because of:
    • data quality limitations in the source systems
    • limited flexibility in extraction
    • the performance load of a complex extraction query
  • These limitations should always be documented, because they give a more realistic view of how accurate the data and the resulting information are.
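
One way to keep the reasoning and the known limitations from getting lost is to record them next to the transformation code itself. The following is a minimal Python sketch of that idea, using the customer_ID example above. The class `DocumentedTransform`, the function `split_customer_id`, and the fixed-width layout (2-character type, 2-digit location, remaining digits as the number) are assumptions made for illustration, not part of any particular ETL tool.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class DocumentedTransform:
    """A transformation bundled with the documentation a tool cannot infer."""
    name: str
    func: Callable[[str], Dict[str, str]]
    reasons: List[str]  # why the rule exists
    known_limitations: List[str] = field(default_factory=list)

    def describe(self) -> str:
        """Render the reasons and limitations as plain text."""
        lines = [f"Transformation: {self.name}", "Reasons:"]
        lines += [f"  - {reason}" for reason in self.reasons]
        lines.append("Known limitations:")
        lines += [f"  - {limit}" for limit in self.known_limitations]
        return "\n".join(lines)


def split_customer_id(customer_id: str) -> Dict[str, str]:
    """Split an ID such as 'AA302456' into type, location and number.

    Assumed (hypothetical) layout: 2-character type, 2-digit location,
    and the remaining characters as the customer number.
    """
    if len(customer_id) < 5:
        raise ValueError(f"customer_id too short: {customer_id!r}")
    return {
        "customer_type": customer_id[:2],
        "customer_location": customer_id[2:4],
        "customer_number": customer_id[4:],
    }


# The reasons come from the example above; the limitation is illustrative.
split_rule = DocumentedTransform(
    name="split_customer_id",
    func=split_customer_id,
    reasons=[
        "Enable queries by customer type.",
        "Reconcile tables that encode customer type differently (AA vs. XXAA).",
    ],
    known_limitations=[
        "IDs that do not follow the assumed fixed-width layout are not handled.",
    ],
)

print(split_rule.func("AA302456"))
print(split_rule.describe())
```

Keeping the reasons in the same object as the function means that a change to the rule and a change to its documentation can be reviewed together.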

  • The flow of ETL: A set of data sometimes goes through multiple transformation routines before it reaches its end state and is ready for loading. Good documentation should provide an end-to-end view of this entire flow so that one can understand the purpose behind it. This end-to-end view should answer the following questions:
    • Why are we following X steps and not Y steps to do a transformation?
    • What are the completion criteria for each transformation or extraction step?
  • Data quality checks: An ETL system generally does not document the data quality checks that need to be done, or the reasoning behind them.
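
The end-to-end view and the completion criteria described above can be sketched as a flow in which every step carries its own "why" and a checkable completion criterion. This is a hedged Python illustration only: the `Step` class, the two example steps, and the reasons attached to them are hypothetical and would be replaced by the real rules of your own ETL.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class Step:
    name: str
    why: str  # why this step is used rather than an alternative
    run: Callable[[List[Row]], List[Row]]
    completion_criterion: str  # human-readable statement
    is_complete: Callable[[List[Row]], bool]  # executable form of the criterion


def drop_missing_ids(rows: List[Row]) -> List[Row]:
    """Remove rows whose customer_id is missing or empty."""
    return [r for r in rows if r.get("customer_id")]


def uppercase_ids(rows: List[Row]) -> List[Row]:
    """Return copies of the rows with customer_id converted to upper case."""
    return [{**r, "customer_id": str(r["customer_id"]).upper()} for r in rows]


# Illustrative flow; the reasons and criteria below are assumptions.
FLOW = [
    Step(
        name="drop_missing_ids",
        why="Rows without a customer_id cannot be linked to a customer.",
        run=drop_missing_ids,
        completion_criterion="No row has an empty customer_id.",
        is_complete=lambda rows: all(r.get("customer_id") for r in rows),
    ),
    Step(
        name="uppercase_ids",
        why="Source tables mix upper- and lower-case IDs.",
        run=uppercase_ids,
        completion_criterion="Every customer_id is upper case.",
        is_complete=lambda rows: all(
            str(r["customer_id"]) == str(r["customer_id"]).upper() for r in rows
        ),
    ),
]


def run_flow(rows: List[Row], flow: List[Step] = FLOW) -> List[Row]:
    """Run each step in order and stop if its completion criterion fails."""
    for step in flow:
        rows = step.run(rows)
        if not step.is_complete(rows):
            raise RuntimeError(
                f"{step.name}: criterion not met: {step.completion_criterion}"
            )
    return rows


def document_flow(flow: List[Step] = FLOW) -> str:
    """Produce a plain-text end-to-end description of the flow."""
    lines = []
    for number, step in enumerate(flow, start=1):
        lines.append(f"{number}. {step.name}")
        lines.append(f"   Why: {step.why}")
        lines.append(f"   Done when: {step.completion_criterion}")
    return "\n".join(lines)


print(document_flow())
print(run_flow([{"customer_id": "aa302456"}, {"customer_id": None}]))
```

Because the completion criterion exists both as text and as a check, the same definition serves as documentation and as a data quality test at run time.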
