Real-life Customer Cases using Data Vault and Data Warehouse Automation

22
1 Real life customer cases using Data Vault and Data Warehouse automation Dirk Vermeiren – Partner Tripwire Solutions

Transcript of Real-life Customer Cases using Data Vault and Data Warehouse Automation

Page 1: Real-life Customer Cases using Data Vault and Data Warehouse Automation

1

Real life customer cases using Data Vault and Data Warehouse automation

Dirk Vermeiren – Partner Tripwire Solutions

Page 2: Real-life Customer Cases using Data Vault and Data Warehouse Automation
Page 3: Real-life Customer Cases using Data Vault and Data Warehouse Automation

Historical – Milestone projects

Health Sector1 2009

Data Vault

Page 4: Real-life Customer Cases using Data Vault and Data Warehouse Automation

Rule 1 – Do not implement a Data Vault DWH without DWH Automation

¡ Why ? ¡ You have 3 to 4 times more objects than 3NF, meaning

much more manual development work. ¡ Data Vault objects have generic logic per type (HUB,

SATELLITE & LINK) and there are lots of them. ¡ Therefore code generation can be used to deliver

higher development speeds.

Page 5: Real-life Customer Cases using Data Vault and Data Warehouse Automation

Rule 2 – Do not create a DV model that holds the single version of the Truth in your first layer of your DWH

¡ Why ? ¡ Single version of the Truth :

¡ Is defined by business and changes faster over time than the source systems

¡ The single version of the truth is a myth. ¡ As business definitions changes over time you will

also get multiple versions of the truth over time

Page 6: Real-life Customer Cases using Data Vault and Data Warehouse Automation

Rule 3 – Do not limit what you record in the DV based on user requirements

¡ Why ? ¡ End users can not predict everything they need and

they want to be able change their mind on what is needed.

Page 7: Real-life Customer Cases using Data Vault and Data Warehouse Automation

Historical – Milestone projects

Health Sector1 2009

Data Vault

Health Sector2 2010

Data Vault DWH Automation 1.0

Page 8: Real-life Customer Cases using Data Vault and Data Warehouse Automation

Healthcare Sector Project 2 ¡  Create a foundation layer the holds :

¡  the Single version of the Facts = Stores data in Source system format ¡  Atomic level data ¡  All data from source except for Interface or other technical tables ¡  All History of change ¡  Integrates data across sources ¡  Use a data Vault modeling which is flexible and resilient to change. ¡  Use etl-generation = OWB OMB-code generation

¡  Important : Reuse of investment of existing ETL-tool is important so the automation tool should generate Mappings, not replace the existing tool.

¡  Create a presentation layer ¡  Generate the incremental logic from foundation to presentation layer. ¡  Manual development. ¡  That structures the data in a way end users understands it.

Page 9: Real-life Customer Cases using Data Vault and Data Warehouse Automation

Data Flow – HC Sector Project 2

Page 10: Real-life Customer Cases using Data Vault and Data Warehouse Automation

Rule 4 – Do not automate incremental logic towards PL

¡ Why ? ¡ Generic increment logic can not take in account that

there are driving tables, which means all tables are driving tables in the load logic and this has a huge impact on the performance.

¡ Exception : ¡ Use Engineered systems to run this logic ¡ Use Engineered systems & in Memory technology

to virtualize the Presentation Layer.

Page 11: Real-life Customer Cases using Data Vault and Data Warehouse Automation

Rule 5 – Do not implement all business logic From Foundation layer to Presentation layer

¡ Atomic level objects that do not exist in the source should not be placed in the presentation layer

¡ Why ? ¡ They are typically used in business logic to build

multiple Presentation Layer object ¡ Best to persist them before the PL, otherwise you

have to implement the logic to load them, multiple times

Page 12: Real-life Customer Cases using Data Vault and Data Warehouse Automation

Historical – Milestone projects

Health Sector1 2009

Data Vault

Health Sector2 2010

Data Vault DWH Automation 1.0

Page 13: Real-life Customer Cases using Data Vault and Data Warehouse Automation

Bank project

¡ Introduced new DV features in DWH Automation tool ¡ Transactional Links ¡ Same as Logic ¡ Splitting Satellites over Multiple Satellites ¡ More customization so more customer standards

could be supported

Page 14: Real-life Customer Cases using Data Vault and Data Warehouse Automation

Historical – Milestone projects

Health Sector1 2009

Data Vault

Health Sector2 2010

Data Vault DWH Automation 1.0

Page 15: Real-life Customer Cases using Data Vault and Data Warehouse Automation

The Agile Information Factory

i

Architecture & Approach q  Innovation

ü  Supports all new concepts in Information management

q  Delivers Value

ü  Agility

ü  Cost Reduction

q  Best Practices ü  Reuse of approach &

solutions

q  Oracle Platform ü  Uses Integrated

Software/Hardware stack of Oracle

Page 16: Real-life Customer Cases using Data Vault and Data Warehouse Automation

DWH-Automation Solution 3.0

• Tripwire DWH Foundation Accelerator

Analysis

Source Analysis Automation

• Tripwire DWH Foundation Accelerator

Development

Etl-Code

Generation

• Tripwire DWH Foundation Accelerator

Testing

Automated

Data Validation

• Redbridge Lifecycle Management

Automated Release Management

• Oracle Enterprise Metada Management

Impact Analysis Enterprise Metadata

Management

Page 17: Real-life Customer Cases using Data Vault and Data Warehouse Automation

Raw and Business DV

Page 18: Real-life Customer Cases using Data Vault and Data Warehouse Automation

¡  In the foundation layer there are actually 2 persistent layers (typically stored in 1 schema)

¡  RDV : Raw integration – none to simple business key integration – the data does not represent common business rules

¡  BDV : Business Data Vault

¡  Business rules are applied

¡  Business key integration takes place

¡  New Business Concept introduced

¡  Data Virtualization of exiting business concepts in the Raw Vault – Do not persist objects that already exist in the Raw Data Vault

Foundation Area : The internal Layers

Page 19: Real-life Customer Cases using Data Vault and Data Warehouse Automation

Multiple speed Implementation

¡  The Raw and the Business Data Vault area can be built at different speeds because : ¡  The RAW or Source based Data Vault is :

¡  A technical implementation based on source systems and only requires a source analysis = Single version of the fact

¡  Data Warehouse automation can be used as the target structure is a direct representation of the source.

¡  The Business Data Vault is : ¡  A Business based implementation that requires functional and technical

analysis to understand business requirements = Single or multiple versions of the truth

¡  New Business Concepts can be created (New Hubs) but implementation experience show typically link tables between existing Source Business concepts (source based Hubs) support requirements for 90%.

¡  The multiple speed approach supports better functional and technical analysis when the raw data vault data is already available.

Page 20: Real-life Customer Cases using Data Vault and Data Warehouse Automation

Rule 6 – Put the right business logic in the right layer.

¡  If you do not standardize than you will have to document everything

¡  Supports the multiple speed approach

¡  Increases the ability to change without high impact. ¡ 

Parameters to define where to place which Business logic : ¡  Stability : Is this logic likely to change a lot over time

¡  Scope : Enterprise wide, Departmental, User specific

¡  Type : Conditional, Calculation, Aggregation, Data Quality Check, …

¡  Result : Factual, Master Data

Page 21: Real-life Customer Cases using Data Vault and Data Warehouse Automation
Page 22: Real-life Customer Cases using Data Vault and Data Warehouse Automation

For questions : Piet De Windt [email protected] +32 473 99 99 89

Everything you need to build something exceptional