1 Copyright © 2012, Oracle and/or its affiliates. All ... · Title: Big Data in the Data Center...

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 1


Page 1: 1 Copyright © 2012, Oracle and/or its affiliates. All ... · Title: Big Data in the Data Center Author: Jean-Pierre Dijcks Subject: Big Data Architecture Keywords: Big Data, hadoop,


Oracle NoSQL Database Technical Overview

Ralf Lange

Global ISV & OEM Sales

Agenda

NoSQL Background

– How we got here

– Landscape and choices

Oracle NoSQL Database

– Overview

– Use cases

Early Adopter Feedback & Strategic Direction

NoSQL and Big Data: Where did it come from?

SQL

JDBC, ODBC

General Purpose

Managed Schemas

Security, Backups

Analytics

Distributed Processing

Distributed, Replicated File System

Driver

Application

NoSQL databases

Flexible Schemas

Sharded, Replicated Database

High Speed, Simple Ops

More Flexible Schema Management

Globally Distributed, “Always On” data

Competitive Advantages of “Fast Data”

Lower TCO, commodity HW scale-out

Oracle NoSQL Database: Where is it used?

Simple Data Management

Globally Distributed, “Always On” data

Competitive Advantages of “Fast Data”

Lower TCO, commodity HW scale-out

ERP

EAM

Inventory Control

Accting & Payroll

Process Mgmt

Business Analytics

CRM

Driver

Application

Real Time Event Processing

Distributed, Web-scale Applications

Online Gaming

Mobile Data Management

Sensor Data Capture

The NoSQL Landscape

NoSQL

Columnar & Key/Value

• Keyspaces, Tables & Records

• Key-based access

• Limited Transactions

• Broad set of use cases

Document

• Collections

• Document-based access

• JSON & XML

• “Objects as documents” use cases

Graph

• Interconnected graphs

• Relatedness-based access

• Properties and Graphs, RDF

• Specific use cases

• Developer-centric APIs

• Flexible schemas

• Partitioned/sharded data

• Horizontally scalable

• High Availability via Replication

• Integrated with Hadoop

Choose the RIGHT storage option for the job

Hadoop Distributed File System (HDFS)

– File System

– No inherent structure

– High volume writes

– Limited functionality, roll-your-own applications

– Batch Oriented

Oracle NoSQL Database

– Key-Value Database

– Simple data structure

– High volume random reads and writes

– Simple get/put high speed storage, flex configuration

– Real-Time, web-scale specialized applications

Oracle Database

– Relational Database

– Complex data structures, rich SQL

– High volume OLTP with 2-PC

– Security, Backup/Restore, Data life cycle mgmt, XML, etc.

– General purpose SQL platform, multiple applications, ODBC, JDBC

NoSQL Technology Choices: What’s Really Important?

Technical Feature | Importance | Why

Storage Model | Not really | Will merge over time

Specific Features | Somewhat | Application requirements?

Performance | Somewhat | Rapid changes, YMWV

Integration | Critical | Long term, Repetitive cost

Reliability/Support | Critical | Early products, Product direction

Predictability | Critical | Production reqs & SLAs

Big Data Architecture

Data Warehouse Data Reservoir +

Oracle Big Data Connectors

Oracle Data Integrator

Oracle Advanced Analytics

Oracle Database

Oracle Spatial & Graph

Oracle NoSQL Database

Cloudera Hadoop

Oracle R Distribution

Oracle Industry Models

Oracle GoldenGate

Oracle Data Integrator

Oracle Event Processing

Oracle Event Processing

Apache Flume

Oracle GoldenGate

Oracle Advanced Analytics

Oracle Database

Oracle Spatial & Graph

Oracle Industry Models

Oracle Data Integrator

Oracle NoSQL Database

Where does NoSQL fit?

Big Data & NoSQL Primer

1. Big Data != Hadoop

2. NoSQL != Hadoop

3. NoSQL != HDFS

4. NoSQL DB ~= HBase, but better

5. Big Data > Hadoop + HDFS

Simple Data Model

Distributed, Replicated data

Transparent load balancing

Elastic configuration

Simple administration

Enterprise-ready Integration

Commercial grade software and support

Characteristics

Oracle NoSQL Database Scalable, Highly Available, Key-Value Database

Application

Storage Nodes Datacenter B

Storage Nodes Datacenter A

Application

NoSQL DB Driver

Application

NoSQL DB Driver

Application

Features

Release 3.0

Oracle NoSQL Database Scalable, Highly Available, Key-Value Database

Application

Storage Nodes Datacenter B

Storage Nodes Datacenter A

Application

NoSQL DB Driver

Application

NoSQL DB Driver

Application

Key-value, JSON & RDF data

Large Object API

BASE & ACID Transactions

Data Center Support

Online Rolling Upgrade

Online Cluster Management

Table data model

Secondary Indices

Secondary Zones (Data Centers)

Security

Scalability Architecture – Applications View

Elastic Shards

(split, add, contract)

[Diagram: an Application with the NoSQL Driver connects to a Store of three Shards, each with one master (M) and two replicas (R)]

Writes to elected node

Reads from any

node in system
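
The routing rule on this slide can be sketched as follows. This is a minimal illustration, assuming a hypothetical in-memory model: the `Shard` class and the `route` function are invented for this example and are not part of the Oracle NoSQL Database driver API.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Shard:
    """Hypothetical model of one shard: an elected master plus its replicas."""
    master: str
    replicas: list = field(default_factory=list)

    def nodes(self):
        # Every node in the shard, master first.
        return [self.master] + self.replicas


def route(shard, operation, rng=random):
    """Pick the node that serves an operation.

    Writes go to the elected master; reads may be served by any node.
    """
    if operation == "write":
        return shard.master
    if operation == "read":
        return rng.choice(shard.nodes())
    raise ValueError(f"unknown operation: {operation}")


shard = Shard(master="sn1", replicas=["sn2", "sn3"])
write_target = route(shard, "write")   # always the master
read_target = route(shard, "read")     # any node of the shard
```

Because reads can land on any node, adding replicas spreads read load, while write load on a shard stays with its single master.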

Automatic election of new Master

Rejoining nodes automatically synchronize with the Master

Isolated nodes can still service reads

All nodes are symmetric

Automatic Failover

Features - Failover

Replication factor = 5

[Diagram: five Rep Nodes, one Master and four Replicas]

New Master
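
A rough sketch of what an automatic election could look like for a replication factor of 5. The slide only states that election is automatic, so the rule used here (a majority of the group must be reachable, and the most up-to-date reachable node wins) is an assumption for illustration, as are the function names and the sequence numbers.

```python
def majority(replication_factor):
    """Smallest number of nodes that forms a majority of the group."""
    return replication_factor // 2 + 1


def elect_master(reachable, replication_factor):
    """Choose a new master among reachable nodes, or None without a quorum.

    `reachable` maps a node name to the last sequence number it applied
    (a hypothetical measure of how up to date the node is).
    """
    if len(reachable) < majority(replication_factor):
        # No quorum: the remaining nodes can still serve reads,
        # but no new master is elected.
        return None
    # Most up-to-date node wins; ties go to the smallest node name.
    return max(sorted(reachable), key=lambda node: reachable[node])


# The old master rn1 has failed; four replicas remain reachable.
survivors = {"rn2": 40, "rn3": 42, "rn4": 42, "rn5": 10}
new_master = elect_master(survivors, replication_factor=5)

# An isolated pair of nodes cannot elect a master.
isolated = elect_master({"rn2": 40, "rn3": 42}, replication_factor=5)
```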

Features – Flexible Data Model

Simple data model – key-value pair (major+minor-key paradigm)

Simple operations – read/insert/update/delete, read-modify-write (RMW) support

Major key: hashed to a Shard (partition); Minor key: B-tree within a Shard

Raw Key/Value and JSON schema APIs supported
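
The major/minor key paradigm can be illustrated with a toy store. This is a sketch under stated assumptions: the `ToyStore` class, the SHA-256 hash, and the sorted list that stands in for the per-shard B-tree are all choices made for this example, not the product's implementation.

```python
import hashlib
from bisect import insort


def shard_for(major_key, num_shards):
    """Hash the major key to a shard number (the hash choice is an assumption)."""
    digest = hashlib.sha256(major_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ToyStore:
    """Toy key-value store: major key picks the shard, minor key orders records."""

    def __init__(self, num_shards=3):
        self.num_shards = num_shards
        # Per shard: a sorted key list (stand-in for a B-tree) and the values.
        self.index = [[] for _ in range(num_shards)]
        self.data = [{} for _ in range(num_shards)]

    def put(self, major, minor, value):
        shard = shard_for(major, self.num_shards)
        key = (major, minor)
        if key not in self.data[shard]:
            insort(self.index[shard], key)
        self.data[shard][key] = value

    def get(self, major, minor):
        shard = shard_for(major, self.num_shards)
        return self.data[shard].get((major, minor))

    def minor_keys(self, major):
        """Minor keys stored under one major key, in sorted order."""
        shard = shard_for(major, self.num_shards)
        return [mn for mj, mn in self.index[shard] if mj == major]


store = ToyStore()
store.put("user42", "subscriptions", b"expiration date")
store.put("user42", "address", b"email id, phone #")
```

All records that share a major key land in the same shard, which is what lets a single shard answer a lookup over that key's minor keys.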

[Diagram: Key-Value pairs. Major key: userid. Minor key: address, subscriptions (with email id, phone #, expiration date beneath). Value: Strings, Byte Array (picture .jpg).]

Value Options: Key-Value, JSON, RDF Triples, Tables/Rows

Features – Flexible Data Model

Benefits

– Lower barrier to adoption, shorter time to market

– Simplified application modeling

– Uses familiar table concepts

Features

– Layered on top of distributed key-value model

– Compatible with Release 2.0 JSON schemas

– Supports table evolution, retains flexible client access

Sets foundation for future capabilities

NoSQL DB Table Model

Features – Flexible Data Model

table create -name Users

add-field -name userid -type integer

add-field -name lname -type string

add-field -name fname -type string

add-field -name email -type string

primary-key -field userid

shard-key -field userid

exit

plan add-table -name Users -wait

Simple Table Example

Can be specified as a JSON string

Must be proper subset of primary-key

By default shard-key == primary-key
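
The key rules in the callouts above can be expressed as a small check. This is an illustrative sketch, not the product's validation logic: the function names are hypothetical, and the check accepts any shard key drawn from the primary-key fields, including the full primary key, since the slide also gives that as the default.

```python
def resolve_shard_key(primary_key, shard_key=None):
    """Return the effective shard key; it defaults to the primary key."""
    if shard_key is None:
        return tuple(primary_key)
    return tuple(shard_key)


def key_errors(fields, primary_key, shard_key=None):
    """List the problems found in a table's key definition (empty if none)."""
    errors = []
    for name in primary_key:
        if name not in fields:
            errors.append(f"primary-key field {name!r} is not a table field")
    for name in resolve_shard_key(primary_key, shard_key):
        if name not in primary_key:
            errors.append(f"shard-key field {name!r} is not in the primary key")
    return errors


# The Users table from the example above.
users_fields = {
    "userid": "integer",
    "lname": "string",
    "fname": "string",
    "email": "string",
}
users_errors = key_errors(users_fields, ["userid"], ["userid"])
bad_errors = key_errors(users_fields, ["userid"], ["email"])
```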

Features – Flexible Data Model

Simple Table Example

[Diagram: table Users with columns userId, lname, fname, email. userId is the Primary Key and Shard Key; the remaining columns are the Value.]

Features – Flexible Data Model

table create -name Users

add-field -name userid -type integer

add-field -name lname -type string

add-field -name fname -type string

add-field -name email -type string

primary-key -field userid

exit

plan add-table -name Users -wait

table create -name Users.Folders

add-field -name foldername -type string

add-field -name msgcount -type integer

add-field -name favorite -type boolean -default 'F'

primary-key -field foldername

exit

plan add-table -name Users.Folders -wait

Nested Table Example
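
In the nested example, the child table Users.Folders declares only foldername as its primary key, while the accompanying diagram shows its rows keyed by UserId plus Foldername, with UserId as the shard key. A minimal sketch of that composition follows; the helper names are hypothetical and chosen for this illustration only.

```python
def child_primary_key(parent_key, child_key):
    """A child row's full primary key: the parent's key followed by its own."""
    return tuple(parent_key) + tuple(child_key)


def shard_key_of(primary_key, shard_key_length):
    """The shard key is the leading part of the primary key."""
    return tuple(primary_key[:shard_key_length])


user_key = (42,)                                    # Users: userid
inbox_key = child_primary_key(user_key, ("Inbox",))  # Users.Folders row

# Parent and child rows share a shard key, so they land in the same shard.
same_shard = shard_key_of(inbox_key, 1) == shard_key_of(user_key, 1)
```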

Features – Flexible Data Model

Nested Table Example

[Diagram: table Users has columns UserId (Primary Key) and lname, fname, email (Value). Table Users.Folders has columns UserId and Foldername (together the Primary Key, with UserId as the Shard Key) and msgcount, favorite (Value).]

Efficient storage and retrieval of large objects

Client side streaming interface for low memory consumption

Server side splitting and distribution of object chunks across nodes for better read/write latency

Automatic partial LOB detection

Parallel Streaming Interface
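
The chunking idea can be sketched as below. This is an assumption-laden illustration: the chunk size, the round-robin placement, and the function names are invented here, and the real Large Object API may split and place chunks differently.

```python
import io


def split_into_chunks(stream, chunk_size):
    """Yield (index, bytes) pairs, reading the stream one chunk at a time
    so the whole object never has to sit in memory."""
    index = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield index, chunk
        index += 1


def place_chunks(stream, chunk_size, num_shards):
    """Spread chunks across shards round-robin (placement rule is assumed)."""
    placement = {shard: [] for shard in range(num_shards)}
    for index, chunk in split_into_chunks(stream, chunk_size):
        placement[index % num_shards].append((index, chunk))
    return placement


def reassemble(placement):
    """Rebuild the object by ordering chunks by index across all shards."""
    chunks = sorted(c for shard in placement.values() for c in shard)
    return b"".join(data for _, data in chunks)


payload = bytes(range(256)) * 10                     # 2,560-byte object
placement = place_chunks(io.BytesIO(payload), 1000, 2)
restored = reassemble(placement)
```

Spreading chunks over several shards lets a reader fetch them in parallel, which is the latency benefit the slide refers to.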

Features – Large Object Support

[Diagram: Application, NoSQL DB Driver, Large Object, Shard 1, Shard 2, Shard N]

Configurable Durability per operation

Configurable Consistency per operation

ACID by default

Transaction scope is single API call

Records share same major key

Multiple operations supported

Greater Flexibility
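
The transaction scope described above (one API call, multiple operations, all records under the same major key) can be sketched with a toy batch function. The `execute_batch` name, the tuple layout, and the dictionary store are assumptions for this illustration, not the product API.

```python
def execute_batch(store, operations):
    """Apply a batch of operations atomically, or reject it.

    `store` maps (major, minor) to a value. Each operation is a tuple
    (op, major, minor, value) with op being "put" or "delete".
    Returns True if the batch was applied, False if it was rejected.
    """
    majors = {major for _op, major, _minor, _value in operations}
    if len(majors) != 1:
        return False  # all records must share one major key
    staged = dict(store)  # work on a copy so a failure changes nothing
    for op, major, minor, value in operations:
        if op == "put":
            staged[(major, minor)] = value
        elif op == "delete":
            staged.pop((major, minor), None)
        else:
            return False
    store.clear()
    store.update(staged)
    return True


store = {("user42", "address"): "old"}
applied = execute_batch(store, [
    ("put", "user42", "address", "new"),
    ("put", "user42", "subscriptions", "news"),
])
rejected = execute_batch(store, [
    ("put", "user42", "address", "x"),
    ("put", "user7", "address", "y"),   # different major key
])
```

Restricting a transaction to one major key keeps it inside a single shard, so no cross-shard coordination is needed.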

Features – Configurable Transactions

Storage nodes have indication of “capacity”

System allocates replicas per storage node

Intelligent Master/Replica load balancing

Ensures distribution of replicas

Efficient use of system resources

Reduces operator-caused configuration errors

Ensures Data Center integrity

Automated Resource Planning

Features – Smart Topology Management

Application

Smart Topology Driver
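
A rough sketch of capacity-aware replica placement. The greedy rule used here (give each shard's replicas to the distinct storage nodes with the most free capacity) is an assumption chosen for illustration; the slide does not describe the actual planning algorithm, and `place_replicas` is a hypothetical name.

```python
def place_replicas(capacities, num_shards, replication_factor):
    """Assign each shard's replicas to distinct storage nodes.

    `capacities` maps a storage node name to how many replicas it can host.
    Returns {shard: [node, ...]}, or None if capacity runs out.
    """
    remaining = dict(capacities)
    layout = {}
    for shard in range(1, num_shards + 1):
        # Nodes with free capacity, most free first, then by name.
        candidates = sorted(
            (node for node, free in remaining.items() if free > 0),
            key=lambda node: (-remaining[node], node),
        )
        if len(candidates) < replication_factor:
            return None
        chosen = candidates[:replication_factor]
        for node in chosen:
            remaining[node] -= 1
        layout[shard] = chosen
    return layout


layout = place_replicas({"sn1": 2, "sn2": 2, "sn3": 2},
                        num_shards=2, replication_factor=3)
too_small = place_replicas({"sn1": 1, "sn2": 1},
                           num_shards=1, replication_factor=3)
```

Choosing distinct nodes per shard means a single storage node failure removes at most one replica of any shard.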

Web-based read-only console

Command Line Interface (CLI)

Manage Topology

Manage Objects & Configuration

– Tables, Schemas, Security, Users

Monitor Performance

SNMP and JMX monitoring support

Configuration & Monitoring

Features – Simple Administration

Increase Data Capacity

– Add more storage nodes

– New shards automatically created

Increase Data Throughput

– More shards = better write throughput

– More replicas/shard = better read throughput
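
The two throughput rules can be put into numbers with an idealized model. The per-node rates below are made-up inputs and perfectly linear scaling is an assumption; the point is only that writes scale with the number of shards (one master each) while reads scale with the total number of nodes.

```python
def cluster_capacity(shards, replication_factor,
                     writes_per_master, reads_per_node):
    """Idealized capacity: writes go to one master per shard,
    reads can be served by every node."""
    nodes = shards * replication_factor
    return {
        "nodes": nodes,
        "write_ops": shards * writes_per_master,
        "read_ops": nodes * reads_per_node,
    }


# Hypothetical per-node rates, for illustration only.
small = cluster_capacity(2, 3, writes_per_master=5_000, reads_per_node=20_000)
more_shards = cluster_capacity(4, 3, writes_per_master=5_000, reads_per_node=20_000)
more_replicas = cluster_capacity(2, 5, writes_per_master=5_000, reads_per_node=20_000)
```

In this model, doubling the shards doubles both write and read capacity, while raising the replication factor raises read capacity only.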

On Demand

Features – Elasticity

[Diagram: Application with NoSQL DB Driver; Shard-1 and Shard-2 each have one Master and two Replicas spread across three StorageNodes]

On-Demand Cluster Expansion

Supports heterogeneous storage topology

Replicas move from over-utilized to under-utilized storage nodes

Number of shards and replication factor remain unchanged

Improve Performance

Features – Automatic Rebalancing

Storage Node 1 Storage Node 2 Storage Node 3

Represents a partition
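
A simplified sketch of rebalancing: items move from the most loaded storage node to the least loaded one until the counts differ by at most one. The function name and stopping rule are assumptions for this example, and the sketch ignores constraints a real system would enforce, such as keeping replicas of the same shard on different nodes.

```python
def rebalance(assignment):
    """Even out the load across storage nodes.

    `assignment` maps a storage node name to the list of items it hosts.
    Returns (new_assignment, moves) where each move is (item, source, target).
    The total number of items never changes; only their location does.
    """
    nodes = {name: list(items) for name, items in assignment.items()}
    moves = []
    while True:
        busiest = max(sorted(nodes), key=lambda n: len(nodes[n]))
        idlest = min(sorted(nodes), key=lambda n: len(nodes[n]))
        if len(nodes[busiest]) - len(nodes[idlest]) <= 1:
            return nodes, moves
        item = nodes[busiest].pop()
        nodes[idlest].append(item)
        moves.append((item, busiest, idlest))


before = {"sn1": ["p1", "p2", "p3", "p4"], "sn2": ["p5"], "sn3": ["p6"]}
after, moves = rebalance(before)
```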

Flexible configuration

Metro-Local Quorum

– Low latency writes, HA

Secondary Read-Only Zones

– Analytic workloads

– Oracle Reporting

– Asynchronous replication

Topology Aware Client Driver

Provides business continuity and distributed workload management

Availability Zones

Features – Data Center Support

DC1 DC2 DC3

Metropolitan Zones

Reports

Batch Analytics

Configurable enforcement

Authentication

– User/Password

– Configurable client time-outs

– Oracle Wallet integration

– Internal components self-authenticate

Encryption over the wire

– All channels SSL encrypted

Data Access Protection

Features – Security

[Diagram: an Application with the NoSQL Driver connects to a Store of three Shards, each with one master (M) and two replicas (R), using Username, Password, and SSL]

[Chart: Online Rolling Upgrade. x-axis: Total Nodes (Shards x Rep. Factor), with 72 (24x3), 144 (48x3), 216 (72x3); y-axis: Time to Upgrade (min)]

We did do it!

Admin commands available to describe safe upgrade order

Scripts available for a hands-free upgrade experience

Read/Write availability throughout the upgrade process

What’s the Big Deal

Features – Online Rolling Upgrades

Ever tried to upgrade a 200 node system while it’s active?

Query NoSQL data from Oracle Database

Access NoSQL data from Hadoop for DW and analytics

Share data with Coherence for extensible in-memory cache grid

Persist history & event streams for processing with OEP

Store & query RDF data using Oracle RDF for NoSQL

Features – Integration

Oracle NoSQL Database: Integrated out of the box

NoSQL Integration

Available with Oracle NoSQL DB Enterprise Edition

Oracle Database SQL access to NoSQL Database data

Steps:

1. Create NoSQL DB table formatter (use sample template)

2. Define External Table in SQL

3. Define Configuration file (use sample XML template)

4. Use NoSQL Database Publish utility

5. Use SQL to access NoSQL data

Oracle External Tables

OEP for event-driven & streaming applications

Oracle NoSQL DB accessed from KV Cartridge

NoSQL DB data directly accessible via CQL queries

Features

NoSQL Integration

Oracle Event Processing

NoSQL Integration

• Unified content metadata for federated resources

• Validate semantic and structural consistency

Social Media Analysis

Analyze social relations using curated metadata

- Blogs, wikis, video

- Calendars, IM, voice

Semantic Metadata Layer

Find related content & relations by navigating connected entities

“Reason” across entities

Text Mining & Entity Analytics

RDF – Typical Use Cases

RDF Graph for NoSQL DB Enterprise Edition

Benefits

W3C standards compliance

Horizontally scalable graph operations

Develop with Apache Jena open source Java APIs

Query with Apache Jena Joseki SPARQL end point web services

Inference with Apache Jena & open source reasoners

Use tools for query, visualization, and ontology engineering from open source & commercial 3rd parties with Apache Jena

Key Capabilities

Load / Storage

• RDF data on key/value store

• ACID & BASE consistency

• Fast distributed load

Query

• SPARQL graph queries

• Apache Jena Java APIs

• Apache Joseki SPARQL end point

Reasoning

• W3C RDFS and OWL

• Plug-in architecture

Reliability & Support

Decades of widespread, reliable deployment experience

15+ years of mission-critical non-relational database technology

Oracle Support available for both Enterprise and Community Edition

Oracle NoSQL Database: Enterprise-Grade Software & Support

[Chart: Mixed Throughput. x-axis: Cluster Size, with 6 (2x3), 12 (4x3), 24 (8x3), 30 (10x3); y-axes: Throughput (ops/sec) and Average Latency (ms); series: Throughput (ops/sec), Write Latency (ms), Read Latency (ms)]

•1.25M ops/sec

• 2 billion records

• 2 TB of data

• 95% read, 5% update

• Low latency

• High Scalability
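
A quick back-of-the-envelope reading of these figures. Two inputs are assumptions rather than statements from the slide: that the 1.25M ops/sec figure corresponds to the largest cluster on the chart (30 nodes), and that "2 TB" means 2 × 10^12 bytes.

```python
TOTAL_OPS_PER_SEC = 1_250_000
RECORDS = 2_000_000_000
DATA_BYTES = 2 * 10**12      # assumption: decimal terabytes
READ_PERCENT = 95
NODES = 30                   # assumption: the 30 (10x3) cluster

reads_per_sec = TOTAL_OPS_PER_SEC * READ_PERCENT // 100
updates_per_sec = TOTAL_OPS_PER_SEC - reads_per_sec
bytes_per_record = DATA_BYTES // RECORDS
ops_per_node = TOTAL_OPS_PER_SEC / NODES

print(f"reads/sec:        {reads_per_sec:,}")
print(f"updates/sec:      {updates_per_sec:,}")
print(f"bytes per record: {bytes_per_record:,}")
print(f"ops/sec per node: {ops_per_node:,.0f}")
```

Under these assumptions the workload is about 1.19 million reads and 62,500 updates per second, on records averaging roughly 1 KB each.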

Benchmark Results - YCSB (Yahoo! Cloud Serving Benchmark)

Predictability

Oracle NoSQL Database: Designed for Predictability - Insert Test

[Chart: Insert Performance. x-axis: Time (minutes); y-axis: Throughput (ops/sec); series: Oracle NoSQL …, Other NoSQL …]

Predictability

Oracle NoSQL Database: Designed for Predictability

[Chart: 95/5 Read/Update Throughput. x-axis: Nodes (Shards x RF), with 144 (48x3), 20%, 40%, 60%, 80%, 216 (72x3), 216* (72x3); y-axes: Throughput (ops/sec) and Average Latency (ms); series: Throughput (ops/sec), Read Latency (ms), Update Latency (ms)]

Oracle NoSQL Database When it really matters

Integration

Predictability

Reliability & Support

Typical NoSQL Use Cases

Sensor Data Management

Real-Time Event Processing

Web-Scale Transactions, Personalization

Use Case: Scalable Xactions & Personalization

Real-Time Profiling & Mobile Marketing

Objectives

Smartphone personalization & advertising

Improve revenue by increasing market segmentation outside of the carrier

Solution

Oracle NoSQL database to store user access profiles (address, preferences, purchase history, context, etc.)

“Global” ID for every user to allow Oracle NoSQL database low latency lookup

Segmentation analysis, Ad generation and recommendations

Benefits

Enterprise support

Flexible schema for changing profiles

Low latency and high availability

Ease of management and administration

[Diagram: NoSQL DB Driver, Application, Retail Partners, Customer Profiles, Mobile Consumers, Custom Campaign, Billing]

Use Case: Scalable Xactions & Personalization

Customer Loyalty Program, Coupon Redemption

Objectives

Scalable customer loyalty portal

New multi-channel consumer model

Improve operational efficiency

Solution

Personalized multi-channel coupon generation and redemption

Cross-promote affiliated vendors

Scale system with customers and participating retailers

[Diagram: NoSQL DB Driver, Application, Retail Partners, Customer Profiles, End Customers, Available Coupons, Market Segmentation]

“Oracle NoSQL DB handles high volumes of customer loyalty operations every day, minimizing the load to our OLTP Oracle RAC Database.”

Frank Puechl, Senior Data Architect, PAYBACK

Use Case: Real-time Fraud Scoring

Financial Services coordinated theft prevention

Objectives

Combine data sources for complex scoring

Detect, alert analyst with low latency

Handle burst seasonal transaction volumes

Solution

Oracle NoSQL Database for fraud prevention rules

Oracle NoSQL Database customer profile and historic management

Oracle Database for statistics and fraud modeling-related data

Benefits

Simple data model, flexible transactions

Scalable, Low Latency data management

Easy configuration and administration

Enterprise Support

[Diagram: Application, Data Ingestion, Transaction Authorization Processor, NoSQL DB Driver]

Use Case: Real-time Events & Transactions

Innovative In-Play Online Betting

Objectives

Scalable in-play sports betting platform

Increase new business revenue

Improve operational efficiency

Solution

Match in-play bets with incoming events

Promote interaction between customers

Scale system with customers and events

Feeds MySQL database for revenue tracking and operational reporting

“Oracle NoSQL Database enabled the rapid, scalable processing of incoming XML, ensuring high available and guaranteed event ordering.”

James Anthony, Chief Technology Officer, Passoker

[Diagram: Real-Time, In-Play Sports Betting. Providers, XML App, Event Capture & Store, NoSQL DB, MySQL, Accounting & Operations, Customers]

Use Case: Sensor Data Management

Large Scale Sensor Data Capture and Analysis

Objectives

Solution

Benefits

• Increase scalability of data storage

• Deliver higher concurrency analytic data access

• Scale data loading independently from analysis

• Commercial support for mission critical system

• Oracle NoSQL database for high speed storage and range based extraction of time series data.

• Oracle NoSQL database for agile schema, replaced HDF5 storage format, kept analysis client program

• Oracle Big Data Appliance for efficient manageability and lowest TCO

• Hadoop post processing and RDBMS connectivity to Enterprise systems

• Improve scale of storage for flight test sensor data

• Increase concurrency of access to data for analysis

• Improve system availability for analysts by allowing simultaneous data ingestion and analysis

[Diagram: Big Data Appliance; NoSQL DB Driver; Event Ingestion and Extraction; NoSQLDB/Oracle RDBMS; Hadoop/Oracle RDBMS; Oracle or any third-party SQL/Data Analytics Tools]

Oracle NoSQL Database

When you need:

Predictability

Reliability & Support

Integration

For Applications that do:

Web-Scale Transactions, Personalization

Sensor Data Management

Real-Time Event Processing

Appendix Contents

• Licensing and Support

• Resources

• Background material

Oracle NoSQL DB Licensing

Community -or- Enterprise Edition

Enterprise Edition

– Closed Source. Standard Oracle License.

Community Edition has all of the basic functionality and APIs. Gets you started.

Enterprise Edition is for large, production, multi-data center, Oracle integration-centric customers and/or customers who cannot comply with the AGPL.

Oracle NoSQL Database Subscription Model

Oracle NoSQL Database Community Edition

– Open Source AGPL Edition

Support is now available for Community Edition

– Price is $2,000/year per server

– No upfront license fee

– Provides full Oracle support policy response

– Purchase online via the Oracle Store

Offers affordable support option for startups

Provides Oracle expertise for production deployment

New business-friendly support service

The Oracle Store: https://shop.oracle.com/

Oracle NoSQL DB Resources - External

Oracle.com

www.oracle.com/goto/bigdata

www.oracle.com/goto/nosql

Oracle Technology Network:

http://www.oracle.com/technetwork/products/nosqldb/overview/index.html

Downloads, Documentation, Tutorials, White Papers are on OTN

NoSQL Terminology

Server Storage – Physical Allocation

Topology

– System resources (storage) available to NoSQL Database, and the allocation and state of those resources.

Storage Nodes

– Physical or virtual system with CPU, memory and disk. Hosts one or more Replicas, based on its defined “capacity” value (e.g., Capacity=3 implies that the system can manage 3 replicas).

– Each Storage Node is managed by a local Storage Node Agent.

Replica or Replication Nodes

– Storage (log files) containing a copy of a data set. Replicas are allocated to Storage Nodes. A Replication Node can be either a Master (read/write) or a Replica (read-only).

Replication Factor

– The number of copies or replicas of the data set that are automatically maintained by the system. Default Replication Factor = 3, implying a Master and two Replicas.
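The relationship between Storage Node capacity and Replication Factor can be worked through as simple arithmetic. The sketch below assumes that the number of replication groups a deployment can host is the total capacity divided by the Replication Factor; the function name shard_count and the sample cluster sizes are hypothetical.

```python
def shard_count(storage_nodes, capacity_per_node, replication_factor=3):
    """Estimate how many replication groups a set of Storage Nodes can host.

    Assumption: every group needs `replication_factor` replication nodes,
    and each Storage Node can host `capacity_per_node` of them.
    """
    total_slots = storage_nodes * capacity_per_node
    return total_slots // replication_factor


# Three Storage Nodes, each with Capacity=3, at the default Replication Factor.
shards = shard_count(storage_nodes=3, capacity_per_node=3)
print(shards)
```

With three nodes of Capacity=3 there are nine replica slots, enough for three groups of one Master and two Replicas each.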

NoSQL Terminology

Server Storage – Logical Allocation

Keyspace or Datastore

– All of the records in the NoSQL Database.

Shards

– A group of partitions, managed as a single unit of storage. Also called a “replication group”.

Partitions

– A collection of records.

– Smallest unit of data migration when expanding or rebalancing the NoSQL DB.

Records

– Key-Value pairs, where the Key is a multi-part string, and the Value can contain simple or composite values.

– Records are assigned to Partitions by hashing the major key component.

– All records containing the same major key are managed within the same partition.
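The mapping from record to partition to shard can be sketched as follows. This is a minimal illustration under stated assumptions: MD5 is used here only as a convenient stable hash, the partition and shard counts are made up, and the modulo assignment of partitions to shards is hypothetical. The point is that only the major key is hashed, so records that share a major key land in the same partition regardless of their minor key.

```python
import hashlib

NUM_PARTITIONS = 30  # hypothetical partition count
NUM_SHARDS = 3       # hypothetical shard count


def partition_id(major_key):
    """Map the major key components to a partition via a stable hash."""
    joined = "/".join(major_key)
    digest = hashlib.md5(joined.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_PARTITIONS


def shard_id(partition):
    """Hypothetical static assignment of partitions to shards."""
    return partition % NUM_SHARDS


# Two records: same major key, different minor keys (timestamps).
key_a = (["sensor", "42"], ["2012-06-01T10:00"])
key_b = (["sensor", "42"], ["2012-06-01T10:01"])

same_partition = partition_id(key_a[0]) == partition_id(key_b[0])
print(same_partition, shard_id(partition_id(key_a[0])))
```

Because a partition moves as a unit during rebalancing, records that share a major key also stay together when the store is expanded.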

NoSQL Terminology

JSON

– JavaScript Object Notation, an open standard format for defining flexible data record structure and content. Not dependent on JavaScript.

Avro

– Apache project. A data serialization system that produces compact and efficient record representation. Works with JSON objects.

BSON

– Binary JSON serialization format used primarily by MongoDB.

Serialization/De-Serialization

– Application process to convert an application object into/from its binary storage format.

Application object as JSON:

{
  "Name": "John",
  "Age": 35,
  "Address": "CA"
}
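Serialization and de-serialization of the sample record can be sketched in a few lines. The JSON round trip uses Python's standard json module. The binary layout (length-prefixed strings and a one-byte age) is a hypothetical format invented for this illustration, not Avro or BSON; it is there only to show why a binary encoding can be more compact than JSON text.

```python
import json
import struct

record = {"Name": "John", "Age": 35, "Address": "CA"}


def serialize_json(rec):
    # Text encoding: field names travel with every record.
    return json.dumps(rec, separators=(",", ":")).encode("utf-8")


def deserialize_json(data):
    return json.loads(data.decode("utf-8"))


def serialize_binary(rec):
    # Hypothetical layout: <len><name bytes><age><len><address bytes>.
    # Each length and the age must fit in one unsigned byte (0..255).
    name = rec["Name"].encode("utf-8")
    addr = rec["Address"].encode("utf-8")
    fmt = f"!B{len(name)}sBB{len(addr)}s"
    return struct.pack(fmt, len(name), name, rec["Age"], len(addr), addr)


def deserialize_binary(data):
    n = data[0]
    name = data[1:1 + n].decode("utf-8")
    age = data[1 + n]
    m = data[2 + n]
    addr = data[3 + n:3 + n + m].decode("utf-8")
    return {"Name": name, "Age": age, "Address": addr}


as_json = serialize_json(record)
as_binary = serialize_binary(record)
print(len(as_json), len(as_binary))
```

The binary form is shorter because the field names and structure live in the code (or, in a schema-based system, in the schema) rather than in each record.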

JSON Data Format

Avro-based Serialization/De-serialization

Why Avro?

– Compact, highly efficient serialization

– Synergy with Hadoop

Schema

– DDL allows schema creation through Avro JSON definition

– Supports serialization from/to JSON strings

Schema evolution

– Easy-to-use mechanism for schema evolution

– Schema versions can be opaque to readers
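The idea behind schema evolution can be sketched without any Avro library. The two schemas below follow the Avro JSON record syntax, but the record name Person, its fields, and the resolve helper are hypothetical. The helper mimics one rule of Avro schema resolution: when the reader's schema has a field the writer did not supply, the reader's default is used, and fields the reader does not know are dropped.

```python
import json

# Version 1 of a hypothetical schema, written as an Avro JSON definition.
SCHEMA_V1 = json.loads("""
{"type": "record", "name": "Person", "fields": [
  {"name": "Name", "type": "string"},
  {"name": "Age",  "type": "int"}
]}
""")

# Version 2 adds a field with a default, so older records stay readable.
SCHEMA_V2 = json.loads("""
{"type": "record", "name": "Person", "fields": [
  {"name": "Name",    "type": "string"},
  {"name": "Age",     "type": "int"},
  {"name": "Address", "type": "string", "default": "unknown"}
]}
""")


def resolve(record, reader_schema):
    """Project a decoded record onto the reader's schema (simplified)."""
    out = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in record:
            out[name] = record[name]
        elif "default" in field:
            out[name] = field["default"]
        else:
            raise ValueError(f"no value or default for field {name!r}")
    return out


old_record = {"Name": "John", "Age": 35}  # written with SCHEMA_V1
upgraded = resolve(old_record, SCHEMA_V2)  # read with SCHEMA_V2
print(upgraded)
```

A reader holding only the newer schema never has to know which version wrote the record, which is one way to read the slide's point that schema versions can be opaque to readers.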