Real-life Customer Cases using Data Vault and Data Warehouse Automation
-
Upload
patrick-van-renterghem -
Category
Data & Analytics
-
view
669 -
download
0
Transcript of Real-life Customer Cases using Data Vault and Data Warehouse Automation
1
Real life customer cases using Data Vault and Data Warehouse automation
Dirk Vermeiren – Partner Tripwire Solutions
Historical – Milestone projects
Health Sector1 2009
Data Vault
Rule 1 – Do not implement a Data Vault DWH without DWH Automation
¡ Why ? ¡ You have 3 to 4 times more objects than 3NF, meaning
much more manual development work. ¡ Data Vault objects have generic logic per type (HUB,
SATELLITE & LINK) and there are lots of them. ¡ Therefore code generation can be used to deliver
higher development speeds.
Rule 2 – Do not create a DV model that holds the single version of the Truth in your first layer of your DWH
¡ Why ? ¡ Single version of the Truth :
¡ Is defined by business and changes faster over time than the source systems
¡ The single version of the truth is a myth. ¡ As business definitions changes over time you will
also get multiple versions of the truth over time
Rule 3 – Do not limit what you record in the DV based on user requirements
¡ Why ? ¡ End users can not predict everything they need and
they want to be able change their mind on what is needed.
Historical – Milestone projects
Health Sector1 2009
Data Vault
Health Sector2 2010
Data Vault DWH Automation 1.0
Healthcare Sector Project 2 ¡ Create a foundation layer the holds :
¡ the Single version of the Facts = Stores data in Source system format ¡ Atomic level data ¡ All data from source except for Interface or other technical tables ¡ All History of change ¡ Integrates data across sources ¡ Use a data Vault modeling which is flexible and resilient to change. ¡ Use etl-generation = OWB OMB-code generation
¡ Important : Reuse of investment of existing ETL-tool is important so the automation tool should generate Mappings, not replace the existing tool.
¡ Create a presentation layer ¡ Generate the incremental logic from foundation to presentation layer. ¡ Manual development. ¡ That structures the data in a way end users understands it.
Data Flow – HC Sector Project 2
Rule 4 – Do not automate incremental logic towards PL
¡ Why ? ¡ Generic increment logic can not take in account that
there are driving tables, which means all tables are driving tables in the load logic and this has a huge impact on the performance.
¡ Exception : ¡ Use Engineered systems to run this logic ¡ Use Engineered systems & in Memory technology
to virtualize the Presentation Layer.
Rule 5 – Do not implement all business logic From Foundation layer to Presentation layer
¡ Atomic level objects that do not exist in the source should not be placed in the presentation layer
¡ Why ? ¡ They are typically used in business logic to build
multiple Presentation Layer object ¡ Best to persist them before the PL, otherwise you
have to implement the logic to load them, multiple times
Historical – Milestone projects
Health Sector1 2009
Data Vault
Health Sector2 2010
Data Vault DWH Automation 1.0
Bank project
¡ Introduced new DV features in DWH Automation tool ¡ Transactional Links ¡ Same as Logic ¡ Splitting Satellites over Multiple Satellites ¡ More customization so more customer standards
could be supported
Historical – Milestone projects
Health Sector1 2009
Data Vault
Health Sector2 2010
Data Vault DWH Automation 1.0
The Agile Information Factory
i
Architecture & Approach q Innovation
ü Supports all new concepts in Information management
q Delivers Value
ü Agility
ü Cost Reduction
q Best Practices ü Reuse of approach &
solutions
q Oracle Platform ü Uses Integrated
Software/Hardware stack of Oracle
DWH-Automation Solution 3.0
• Tripwire DWH Foundation Accelerator
Analysis
Source Analysis Automation
• Tripwire DWH Foundation Accelerator
Development
Etl-Code
Generation
• Tripwire DWH Foundation Accelerator
Testing
Automated
Data Validation
• Redbridge Lifecycle Management
Automated Release Management
• Oracle Enterprise Metada Management
Impact Analysis Enterprise Metadata
Management
Raw and Business DV
¡ In the foundation layer there are actually 2 persistent layers (typically stored in 1 schema)
¡ RDV : Raw integration – none to simple business key integration – the data does not represent common business rules
¡ BDV : Business Data Vault
¡ Business rules are applied
¡ Business key integration takes place
¡ New Business Concept introduced
¡ Data Virtualization of exiting business concepts in the Raw Vault – Do not persist objects that already exist in the Raw Data Vault
Foundation Area : The internal Layers
Multiple speed Implementation
¡ The Raw and the Business Data Vault area can be built at different speeds because : ¡ The RAW or Source based Data Vault is :
¡ A technical implementation based on source systems and only requires a source analysis = Single version of the fact
¡ Data Warehouse automation can be used as the target structure is a direct representation of the source.
¡ The Business Data Vault is : ¡ A Business based implementation that requires functional and technical
analysis to understand business requirements = Single or multiple versions of the truth
¡ New Business Concepts can be created (New Hubs) but implementation experience show typically link tables between existing Source Business concepts (source based Hubs) support requirements for 90%.
¡ The multiple speed approach supports better functional and technical analysis when the raw data vault data is already available.
Rule 6 – Put the right business logic in the right layer.
¡ If you do not standardize than you will have to document everything
¡ Supports the multiple speed approach
¡ Increases the ability to change without high impact. ¡
Parameters to define where to place which Business logic : ¡ Stability : Is this logic likely to change a lot over time
¡ Scope : Enterprise wide, Departmental, User specific
¡ Type : Conditional, Calculation, Aggregation, Data Quality Check, …
¡ Result : Factual, Master Data
For questions : Piet De Windt [email protected] +32 473 99 99 89
Everything you need to build something exceptional