Building Data-Centric Businesses

Daniel Aragao & Simon Hope


Page 1: Building Data-Centric Businesses

Daniel Aragao & Simon Hope

Page 2: Building Data-Centric Businesses

Daniel Aragao, Simon Hope

@dear_dr_dan, @mapbutcher

Page 3: Building Data-Centric Businesses

REALESTATE.COM.AU

6B Market Cap

11M Australian Properties

55M Visits in September

4.7M App Downloads …and counting

Page 4: Building Data-Centric Businesses

3,500 PEOPLE

13 COUNTRIES

34 OFFICES

TECHNOLOGY & SOCIAL JUSTICE

Page 5: Building Data-Centric Businesses

• In the beginning…

• Organising our Data

• Implementation approaches

• Hipster Batches

• Reactify

• Bring Your Own Data

• Finding the Data

• What we have learned so far

THIS IS WHAT THE STORY IS ABOUT

Page 6: Building Data-Centric Businesses

SORRY… IT’S OK TO LEAVE NOW

• Nope, we didn’t create a new Hadoop

• No hardcore Data Science

• There are some implementation details

• REA embraced the Cloud. AWS everywhere

• Under construction

Page 7: Building Data-Centric Businesses

IN THE BEGINNING…

Pages 8-29: Building Data-Centric Businesses (no text on these slides)
Page 30: Building Data-Centric Businesses

ORGANISING OUR DATA

Increasingly, content is being distributed through search and social platforms... There’s less visiting of publishers as destinations.

Jeff Weiner, CEO, LinkedIn

Page 31: Building Data-Centric Businesses

Data sources

Data warehouse

PROBLEM…

Page 32: Building Data-Centric Businesses

STRATEGY…

Page 33: Building Data-Centric Businesses

STRATEGY…

Page 34: Building Data-Centric Businesses

STRATEGY…

Page 35: Building Data-Centric Businesses

Data Warehouse

Staging SSIS Dim Fact

PROBLEM…

Page 36: Building Data-Centric Businesses

Data Warehouse

Staging SSIS Dim Fact

PROBLEM…

Star schema leaky details

Page 37: Building Data-Centric Businesses

No Data Warehouse

Staging SSIS Dim Fact

STRATEGY…

Page 38: Building Data-Centric Businesses

STRATEGY…

Data Warehouse Facade

Staging SSIS Dim Fact

Page 39: Building Data-Centric Businesses

???

WHAT’S IN THE BOX?

Page 40: Building Data-Centric Businesses

Good things come in small packages services

THE HIPSTER BATCH

???

Hipster Batch

Page 41: Building Data-Centric Businesses

Hipster Batch

THE HIPSTER BATCH

• Small and short lived

• Decoupled via flat files on S3

• Single purpose

• Idempotent

• Polyglot

• Minimal runtime dependencies

• Discoverable
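
The "single purpose" and "idempotent" properties above can be sketched roughly as follows. This is an illustration only, not the presenters' code: FakeBucket is a hypothetical in-memory stand-in for S3, and the job (counting listings per suburb), the column name "suburb" and the key layout are all assumptions made for the example.

```python
import csv
import hashlib
import io


class FakeBucket:
    """Hypothetical in-memory stand-in for an S3 bucket (key -> bytes)."""

    def __init__(self):
        self.objects = {}

    def get(self, key):
        return self.objects[key]

    def put(self, key, body):
        self.objects[key] = body


def output_key(input_key, body):
    """Derive the output key from the input alone, so a rerun overwrites."""
    digest = hashlib.sha256(body).hexdigest()[:12]
    return f"suburb-counts/{digest}/{input_key}"


def run_batch(source, sink, input_key):
    """Single-purpose job: count listings per suburb in one flat file."""
    body = source.get(input_key)
    counts = {}
    for row in csv.DictReader(io.StringIO(body.decode("utf-8"))):
        counts[row["suburb"]] = counts.get(row["suburb"], 0) + 1

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["suburb", "listings"])
    for suburb in sorted(counts):  # sorted so the output bytes are stable
        writer.writerow([suburb, counts[suburb]])

    key = output_key(input_key, body)
    sink.put(key, out.getvalue().encode("utf-8"))
    return key
```

Because the output key and the output bytes depend only on the input file, running the job twice leaves the sink in the same state as running it once, which is what makes retries safe.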

Page 42: Building Data-Centric Businesses

SNS, SQS

Data

A ‘TYPICAL’ IMPLEMENTATION

Hipster Batch

Page 43: Building Data-Centric Businesses

SNS, SQS

ASG, ECS, Lambda

Data

A ‘TYPICAL’ IMPLEMENTATION

Hipster Batch

Page 44: Building Data-Centric Businesses

SNS, SQS

ASG, ECS, Lambda

KMS

Data

A ‘TYPICAL’ IMPLEMENTATION

Hipster Batch

Page 45: Building Data-Centric Businesses

Logs

SNS, SQS

ASG, ECS, Lambda

KMS

Data

A ‘TYPICAL’ IMPLEMENTATION

Hipster Batch

Page 46: Building Data-Centric Businesses

Logs

SNS, SQS

ASG, ECS, Lambda

KMS

CloudWatch

Data

A ‘TYPICAL’ IMPLEMENTATION

Hipster Batch

Page 47: Building Data-Centric Businesses

Logs

SNS, SQS

ASG, ECS, Lambda

KMS

CloudWatch

S3 buckets

Data

A ‘TYPICAL’ IMPLEMENTATION

Hipster Batch
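
One way the SNS, SQS and S3 pieces could fit together is for a batch to be triggered by an S3 event notification that was published to SNS and delivered to an SQS queue. The slides do not say how the components are wired, so this is a sketch under that assumption, and it also assumes raw message delivery is off (so the SNS envelope is present). The function name and bucket name are hypothetical.

```python
import json
from urllib.parse import unquote_plus


def s3_objects_from_sqs_body(body):
    """Return (bucket, key) pairs from an SQS body holding an SNS-wrapped S3 event."""
    envelope = json.loads(body)
    # SNS carries the original payload as a JSON string under "Message".
    s3_event = json.loads(envelope["Message"])
    found = []
    # Messages without "Records" (for example S3 test events) yield nothing.
    for record in s3_event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys are URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        found.append((bucket, key))
    return found


# Example input shaped like an S3 "ObjectCreated" notification inside SNS.
example_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-staging"},
                "object": {"key": "listings/2015-09-01/part+0001.csv"},
            },
        }
    ]
}
example_body = json.dumps({"Type": "Notification", "Message": json.dumps(example_event)})
print(s3_objects_from_sqs_body(example_body))
```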

Page 48: Building Data-Centric Businesses

Hipster Batch

HIPSTER BATCH DOES SCIENCE

• Behavioural models for targeted marketing

• Recommendation engine

• External channels

Page 49: Building Data-Centric Businesses

Hipster Batch

SCIENCE!

Page 50: Building Data-Centric Businesses

x 20

Hipster Batch

Stats models

SCIENCE!

Page 51: Building Data-Centric Businesses

x 20

API

Hipster Batch

Stats models

SCIENCE!

Page 52: Building Data-Centric Businesses

API

x 20

API

Hipster Batch

Stats models

SCIENCE!

Page 53: Building Data-Centric Businesses

API

x 20

API

Hipster Batch

Stats models

SCIENCE!

Page 54: Building Data-Centric Businesses

API

x 20

API

Hipster Batch

Stats models

Google Now API

SCIENCE!

Page 55: Building Data-Centric Businesses

From legacy to reactive

REACTIFY

Reactify

???

Page 56: Building Data-Centric Businesses

Reactify

http://www.reactivemanifesto.org

REACTIFY

• Manage Data flow with messages

• Protect consumers and care about isolation

• Resilience is important and Data replication is just fine

• Demand is elastic - and your components should be too

Page 57: Building Data-Centric Businesses

Reactify

Listings

Data coupling

No resilience or elasticity

Coupling

PROBLEM…

Page 58: Building Data-Centric Businesses

Reactify

Listings

SOLUTION…

Page 59: Building Data-Centric Businesses

Reactify

Listings Reactify

SOLUTION…

Page 60: Building Data-Centric Businesses

Reactify

Listings Reactify

SOLUTION…

Page 61: Building Data-Centric Businesses

Reactify

Listings Reactify Hipster Batch

SOLUTION…

Page 62: Building Data-Centric Businesses

Reactify

Listings Reactify Hipster Batch

Shielded consumers

Isolation

Decoupled

SOLUTION…

Page 63: Building Data-Centric Businesses

Reactify

Listings

IMPLEMENTATION…

Page 64: Building Data-Centric Businesses

Reactify

Listings REST API

IMPLEMENTATION…

Page 65: Building Data-Centric Businesses

Reactify

Listings REST API

IMPLEMENTATION…

Page 66: Building Data-Centric Businesses

Reactify

Listings REST API Dynamo

Event Maker

Event Differ

IMPLEMENTATION…

Page 67: Building Data-Centric Businesses

Reactify

Listings REST API Dynamo

Event Maker

Event Differ

Kinesis

2

IMPLEMENTATION…

2

Page 68: Building Data-Centric Businesses

• Exposes current state only

• Stream of change notifications

• Hypertext Application Language - HAL

• Clear entity types

• Linking over embedding

• Cacheable and discoverable

REST API

REACTIFY REST API
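
"Linking over embedding" in HAL terms means a resource points at related resources through "_links" rather than nesting copies of them under "_embedded". A minimal sketch follows; the "_links" and "href" names come from HAL itself, while the field names, the agency relation and the example.com URLs are assumptions for illustration and are not taken from the REA API.

```python
def listing_resource(listing, base_url):
    """Build a HAL-style document that links to related resources."""
    listing_id = listing["id"]
    return {
        "type": "listing",  # a clear entity type on every document
        "id": listing_id,
        "price": listing["price"],
        "_links": {
            "self": {"href": f"{base_url}/listings/{listing_id}"},
            # Linked, not embedded: the agency is fetched (and cached) separately.
            "agency": {"href": f"{base_url}/agencies/{listing['agency_id']}"},
        },
    }


def link_href(resource, rel):
    """Look up the URL for a link relation on a HAL document."""
    return resource["_links"][rel]["href"]


doc = listing_resource(
    {"id": "42", "price": 650000, "agency_id": "7"},
    "https://example.com",
)
print(link_href(doc, "agency"))
```

Keeping related entities behind links lets each one be cached under its own URL, at the cost of an extra request when a consumer needs both.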

Page 69: Building Data-Centric Businesses

REST API

https://feeds.listings.realestate.com.au/combined-listings/120449689

Page 70: Building Data-Centric Businesses

REST API

https://feeds.listings.realestate.com.au/combined-listings/120449689

Page 71: Building Data-Centric Businesses

REST API

https://feeds.listings.realestate.com.au/combined-listings/120449689

Page 72: Building Data-Centric Businesses

REST API

https://feeds.listings.realestate.com.au/combined-listings/120449689

Page 73: Building Data-Centric Businesses

REST API

Event Maker

https://feeds.listings.realestate.com.au/combined-listings/-/changes

Page 74: Building Data-Centric Businesses

REST API

Event Maker

https://feeds.listings.realestate.com.au/combined-listings/-/changes

Page 75: Building Data-Centric Businesses

REST API

Event Maker

https://feeds.listings.realestate.com.au/combined-listings/-/changes

Page 76: Building Data-Centric Businesses

REST API

Event Maker

https://feeds.listings.realestate.com.au/combined-listings/-/changes
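
A consumer of a change feed like this one might page through it by following links. The transcript does not capture the feed's format, so the sketch below assumes each page is a HAL document with a "changes" list and an optional "next" link; both names are hypothetical, as is the injected fetch function that stands in for an HTTP GET returning parsed JSON.

```python
def read_changes(fetch, start_url, max_pages=100):
    """Yield change entries page by page, following HAL 'next' links."""
    url = start_url
    for _ in range(max_pages):  # upper bound guards against link loops
        page = fetch(url)
        for change in page.get("changes", []):
            yield change
        next_link = page.get("_links", {}).get("next")
        if next_link is None:
            return
        url = next_link["href"]


# Two fake pages keyed by URL stand in for the real feed.
fake_pages = {
    "https://example.com/changes": {
        "changes": [{"id": "a"}, {"id": "b"}],
        "_links": {"next": {"href": "https://example.com/changes?after=b"}},
    },
    "https://example.com/changes?after=b": {
        "changes": [{"id": "c"}],
        "_links": {},
    },
}
for change in read_changes(fake_pages.__getitem__, "https://example.com/changes"):
    print(change["id"])
```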

Page 77: Building Data-Centric Businesses

Reactify

Event Differ

Page 78: Building Data-Centric Businesses

Reactify

Event Differ

Page 79: Building Data-Centric Businesses

Reactify

Event Differ

Page 80: Building Data-Centric Businesses

Reactify

Event Differ
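
The slides name an Event Differ without describing its logic. One plausible reading is a component that compares the previous and current state of an entity and emits a change event. The sketch below follows that assumption; the event shapes ("created", "updated", "deleted") and field names are invented for illustration.

```python
def diff_events(entity_id, previous, current):
    """Compare two snapshots of one entity and return a list of change events."""
    if previous is None and current is None:
        return []
    if previous is None:
        return [{"type": "created", "id": entity_id, "state": current}]
    if current is None:
        return [{"type": "deleted", "id": entity_id}]
    changed = {
        field: {"old": previous.get(field), "new": current.get(field)}
        for field in sorted(set(previous) | set(current))
        if previous.get(field) != current.get(field)
    }
    if not changed:
        return []  # identical snapshots produce no event
    return [{"type": "updated", "id": entity_id, "changes": changed}]


before = {"price": 600000, "status": "active"}
after = {"price": 650000, "status": "active"}
print(diff_events("42", before, after))
```

Returning nothing for identical snapshots means a replayed write does not produce a spurious event downstream.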

Page 81: Building Data-Centric Businesses

The octopus in the box

— Did you use that data set? — Errr… No, we have another one

BRING YOUR OWN DATA

Page 82: Building Data-Centric Businesses

BRING YOUR OWN DATA - BYOD

• Allow data to flow freely

• Help the business to get what they need when they need it

• Self-service

Page 83: Building Data-Centric Businesses

BYOD

Page 84: Building Data-Centric Businesses

BYOD

CSV

Page 85: Building Data-Centric Businesses

BYOD

CSV

x 5

Page 86: Building Data-Centric Businesses

BYOD

CSV

x 5

Smarts on datatypes

Page 87: Building Data-Centric Businesses

BYOD

CSV

x 5

Tableau Server

Smarts on datatypes

Page 88: Building Data-Centric Businesses

BYOD

CSV

x 5

Tableau Server

Smarts on datatypes

Page 89: Building Data-Centric Businesses

BYOD

CSV

x 5

Tableau Server

Audit, auth, share…

Smarts on datatypes
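
"Smarts on datatypes" is not spelled out in the slides. One simple interpretation is guessing a column type from the values in an uploaded CSV, sketched below. The type names and the order in which types are tried are assumptions, and a real loader would need to handle more formats than this.

```python
import csv
import io
from datetime import date


def _parses(value, parser):
    """True if parser(value) succeeds without a ValueError."""
    try:
        parser(value)
        return True
    except ValueError:
        return False


def infer_column_types(csv_text):
    """Guess a type name per column: integer, float, date, or text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Tried in order; the first type that fits every value wins.
    candidates = [("integer", int), ("float", float), ("date", date.fromisoformat)]
    columns = {name: [] for name in reader.fieldnames or []}
    for row in reader:
        for name in columns:
            value = (row[name] or "").strip()
            if value:  # empty cells do not vote
                columns[name].append(value)
    inferred = {}
    for name, values in columns.items():
        inferred[name] = "text"  # fallback, also used for all-empty columns
        for type_name, parser in candidates:
            if values and all(_parses(v, parser) for v in values):
                inferred[name] = type_name
                break
    return inferred


sample = (
    "listing_id,price,listed_on,suburb\n"
    "1,450000.50,2015-09-01,Richmond\n"
    "2,620000,2015-09-03,Carlton\n"
)
print(infer_column_types(sample))
```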

Page 90: Building Data-Centric Businesses

These were the implementation approaches, now to…

FIND THE DATA

Meaningful, automated, and easy-to-search metadata

Page 91: Building Data-Centric Businesses

WE TRIED

Page 92: Building Data-Centric Businesses

SNS, SQS

ASG, ECS, Lambda

KMS

CloudWatch

Logs

MORE THAN DATA

Hipster Batch

Page 93: Building Data-Centric Businesses

SNS, SQS

ASG, ECS, Lambda

KMS

CloudWatch

Logs

MORE THAN DATA

Hipster Batch

Page 94: Building Data-Centric Businesses

SNS, SQS

ASG, ECS, Lambda

KMS

CloudWatch

Logs

Dataz

Ancestry

MORE THAN DATA

Hipster Batch

Page 95: Building Data-Centric Businesses

SNS, SQS

ASG, ECS, Lambda

KMS

CloudWatch

Logs

Dataz

Ancestry

Metadata

MORE THAN DATA

Hipster Batch

Page 96: Building Data-Centric Businesses

Ancestry

Page 97: Building Data-Centric Businesses

Ancestry

Page 98: Building Data-Centric Businesses

Ancestry

Page 99: Building Data-Centric Businesses

Ancestry

Page 100: Building Data-Centric Businesses

Ancestry
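
The transcript gives no detail on what "Ancestry" records. If it is read as lineage metadata, that is, which inputs each output data set was derived from, then answering "where did this data come from?" becomes a graph walk. The sketch below assumes that reading; the parents mapping and data set names are hypothetical.

```python
def upstream_of(dataset, parents):
    """Return every data set that `dataset` was directly or indirectly derived from."""
    seen = set()
    stack = [dataset]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, []):
            if parent not in seen:  # also guards against cycles
                seen.add(parent)
                stack.append(parent)
    return seen


# Hypothetical ancestry: each output lists the inputs it was built from.
example_parents = {
    "recommendations": ["behaviour-model", "listings-clean"],
    "behaviour-model": ["clickstream"],
    "listings-clean": ["listings-raw"],
}
print(sorted(upstream_of("recommendations", example_parents)))
```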

Page 101: Building Data-Centric Businesses

REST API

METADATA PIPELINE

Producers

Page 102: Building Data-Centric Businesses

REST API

Ancestry

Ancestry

Ancestry

METADATA PIPELINE

Producers

Page 103: Building Data-Centric Businesses

REST API

Ancestry

Ancestry

Ancestry

METADATA PIPELINE

Producers

Page 104: Building Data-Centric Businesses

REST API

Ancestry

Ancestry

Ancestry

METADATA PIPELINE

Producers

Scrapy

Page 105: Building Data-Centric Businesses

REST API

Ancestry

Ancestry

Ancestry

METADATA PIPELINE

Producers

Scrapy

Page 106: Building Data-Centric Businesses

REST API

Ancestry

Ancestry

Ancestry

METADATA PIPELINE

Producers

Scrapy

Page 107: Building Data-Centric Businesses

WHAT WE HAVE LEARNED SO FAR

• Consumers create the last-mile data as needed

• We must work with external, independent delivery channels

• Push quality back to source/producer systems

• Data belongs to the entire organisation, not to a single team

Page 108: Building Data-Centric Businesses

I’ll give you my Data Warehouse when you can pry it from my cold dead hands.

Page 109: Building Data-Centric Businesses

THANK YOU

Daniel Aragao, Simon Hope

@dear_dr_dan, @mapbutcher

REALESTATE.COM.AU