Presentazione VMware @ VMUGIT UserCon 2015


Transcript of Presentazione VMware @ VMUGIT UserCon 2015

Page 1: Presentazione VMware @ VMUGIT UserCon 2015

VMware Virtual SAN

Duncan Epping, Chief Technologist, Office of CTO, Storage & Availability

The story up to version 6.1, and a hint of what is coming

Page 2: Presentazione VMware @ VMUGIT UserCon 2015

Agenda

1 Introduction

2 Virtual SAN, what is it?

3 Virtual SAN, a bit of a deeper dive

4 Version 6.1 specifics

5 Virtual SAN, the future


Page 3: Presentazione VMware @ VMUGIT UserCon 2015

The Software Defined Data Center


Compute Networking Storage

Management

• All infrastructure services virtualized: compute, networking, storage

• Control of data center automated by software (management, security)

• Unified platform for existing and new apps, delivered to many devices

• Any size / scale!

Page 4: Presentazione VMware @ VMUGIT UserCon 2015

Goodbye SAN/Server Huggers


Page 5: Presentazione VMware @ VMUGIT UserCon 2015

Hardware evolution started the storage revolution

Page 6: Presentazione VMware @ VMUGIT UserCon 2015

What is our goal?

CHOICE, SIMPLICITY, COST, PERFORMANCE AND SCALABILITY

Page 7: Presentazione VMware @ VMUGIT UserCon 2015

The Hypervisor is the Strategic High Ground


Object Storage x86 - HCI SAN/NAS

VMware vSphere

Page 8: Presentazione VMware @ VMUGIT UserCon 2015

Storage Policy-Based Management – App centric automation


Overview

• Intelligent placement

• Fine control of services at VM level

• Automation at scale through policy

• Change policy when required

• Attach new policy

Virtual Machine Storage policy

Reserve Capacity 40GB

Availability 2 Failures to tolerate

Read Cache 50%

Stripe Width 6

Storage Policy-Based Management

vSphere

Virtual SAN Virtual Volumes

Virtual Datastore

Page 9: Presentazione VMware @ VMUGIT UserCon 2015

Storage Policy Based Management – What does it look like?


If the storage can satisfy the VM Storage Policy, the VM Summary tab in the vSphere client will display the VM as compliant.

If not, either due to failures, or other reasons, the VM will be shown as non-compliant.

Page 10: Presentazione VMware @ VMUGIT UserCon 2015

Virtual SAN, what is it?

Page 11: Presentazione VMware @ VMUGIT UserCon 2015

Virtual SAN, what is it?


Hyper-Converged Infrastructure

Distributed, Scale-out Architecture

Integrated with vSphere platform

Ready for today’s vSphere use cases

Software-Defined Storage

vSphere & Virtual SAN

Page 12: Presentazione VMware @ VMUGIT UserCon 2015

VSAN is the Most Widely Adopted HCI Product

Over 2000 customers within 15 months and growing rapidly

Page 13: Presentazione VMware @ VMUGIT UserCon 2015

Virtual SAN Use Cases

VMware vSphere + Virtual SAN

• End User Computing

• Test/Dev

• ROBO

• Staging

• Management

• DMZ

• Business Critical Apps

• DR / DA

Page 14: Presentazione VMware @ VMUGIT UserCon 2015

Hyper-Converged Consumption Models

VMware Virtual SAN Software: Software-Defined Storage Foundation for Hyper-Converged

VMware Hyper-Converged Infrastructure

Virtual SAN Ready Nodes

• Certified server configuration

• 90+ configuration options

EVO:RAIL

• Pre-integrated and pre-configured appliance

• Rapidly deploy and manage

EVO SDDC Suite

• Pre-integrated and pre-configured data-center

• Advanced automation with rapid deployment

Page 15: Presentazione VMware @ VMUGIT UserCon 2015

Tiered Hybrid and All-Flash Options

All-Flash

• 90K IOPS per host + sub-millisecond latency

• Caching tier (SSD, PCIe, Ultra DIMM): writes cached first, reads go directly to the capacity tier

• Capacity tier (data persistence): flash devices

Hybrid

• 40K IOPS per host

• Caching tier (SSD, PCIe, Ultra DIMM): read and write cache

• Capacity tier (data persistence): SAS / NL-SAS / SATA

Page 16: Presentazione VMware @ VMUGIT UserCon 2015

Production – Typical Design

• In production, the most commonly used platform is a 2U server

• 256 or 384GB of memory and 2 x 10-core Intel (v3) CPUs

• Two disk groups is typical per server

• Per disk group:

– Disk controller SAS Expander

– 400GB SSD for read/write cache

– 6 magnetic disks per disk group

• Most customers use SAS

– Those who use NL-SAS buy more flash!

• Note, more than 8 disks? SAS Expander!

• 2 x 10GbE


Page 17: Presentazione VMware @ VMUGIT UserCon 2015

Yes… really simple!

Virtual SAN is a cluster level feature similar to:

– vSphere DRS

– vSphere HA

– Virtual SAN

Deployed, configured and managed from vCenter through the vSphere Web Client

– Radically simple

• Configure VMkernel interface for Virtual SAN

• Enable Virtual SAN by clicking Turn On


Page 18: Presentazione VMware @ VMUGIT UserCon 2015

Virtual SAN, a bit of a deeper dive

Page 19: Presentazione VMware @ VMUGIT UserCon 2015

Virtual SAN, what it is not... and what it is

It’s not a distributed file system, it’s an object store!

• Object Tree with Branches

• Each Object has multiple Components

• This allows you to meet availability and performance requirements

• You can view it as “Distributed RAID” using 2 techniques:

– Striping (RAID-0)

– Mirroring (RAID-1)

• Data is distributed based on policy

[Diagram: a VMDK object built as a RAID-1 mirror of two RAID-0 stripe sets (stripe-1a/stripe-1b and stripe-2a/stripe-2b) spread over ESXi hosts, plus a witness.]

Page 20: Presentazione VMware @ VMUGIT UserCon 2015

Define a policy first…

Virtual SAN currently surfaces five unique storage capabilities to vCenter Server


What If APIs

Page 21: Presentazione VMware @ VMUGIT UserCon 2015

Assign it to a new or existing VM

When the policy is selected, Virtual SAN uses it to place / distribute the VM to guarantee availability and performance

Page 22: Presentazione VMware @ VMUGIT UserCon 2015

Number of Failures to Tolerate

• Defines the number of host, disk or network failures a storage object can tolerate.

• For “n” failures tolerated, “n+1” copies of the object are created and “2n+1” hosts contributing storage are required!
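The n+1 / 2n+1 rule above can be written down as a small calculation. This is a minimal sketch for mirrored (RAID-1) objects only; the function name and the returned keys are hypothetical and not part of any VMware API.

```python
def mirroring_requirements(failures_to_tolerate: int) -> dict:
    """Return copies and minimum hosts for a mirrored object.

    Follows the rule on the slide: n failures tolerated means
    n + 1 copies and 2n + 1 hosts contributing storage.
    """
    if failures_to_tolerate < 0:
        raise ValueError("failures_to_tolerate must be zero or greater")
    n = failures_to_tolerate
    return {"copies": n + 1, "min_hosts": 2 * n + 1}


# FTT = 1 needs 2 copies on at least 3 hosts; FTT = 2 needs 3 copies on 5 hosts.
print(mirroring_requirements(1))
print(mirroring_requirements(2))
```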

Virtual SAN Policy: “Number of failures to tolerate = 1”

[Diagram: hosts esxi-01 to esxi-04; a RAID-1 object with two vmdk replicas, each serving ~50% of I/O, plus a witness.]

Page 23: Presentazione VMware @ VMUGIT UserCon 2015

Number of Disk Stripes Per Object

• The number of HDDs across which each replica of a storage object is distributed. Higher values may result in better performance.
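As a rough illustration, the number of data components follows from the two policy settings: each replica is split into as many stripes as the stripe width. The sketch below counts data components only and deliberately leaves witnesses out, since the slides do not give a rule for how many witnesses are created. The function name is hypothetical.

```python
def data_components(failures_to_tolerate: int, stripe_width: int) -> int:
    """Count stripe components for a mirrored object.

    Each of the (n + 1) replicas is striped across `stripe_width` disks.
    Witness components are not included in this count.
    """
    if failures_to_tolerate < 0 or stripe_width < 1:
        raise ValueError("need failures_to_tolerate >= 0 and stripe_width >= 1")
    replicas = failures_to_tolerate + 1
    return replicas * stripe_width


# FTT = 1 with stripe width 2 gives four stripe components
# (stripe-1a, stripe-1b, stripe-2a, stripe-2b in the slide's naming).
print(data_components(1, 2))
```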

VSAN Policy: “Number of failures to tolerate = 1” + “Stripe Width = 2”

[Diagram: hosts esxi-01 to esxi-04; two RAID-0 stripe sets (stripe-1a/stripe-1b and stripe-2a/stripe-2b) mirrored by RAID-1, plus a witness.]

Page 24: Presentazione VMware @ VMUGIT UserCon 2015

Fault Domains, increasing availability through awareness

• Create fault domains to increase availability

• Four defined fault domains

FD1 = esxi-01, esxi-02

FD2 = esxi-03, esxi-04

FD3 = esxi-05, esxi-06

FD4 = esxi-07, esxi-08

• To protect against one rack failure, only 2 replicas and a witness are required, spread across 3 fault domains!
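The placement idea can be sketched as follows: every replica and the witness go into a different fault domain, so losing one rack never removes more than one of them. This is an illustration only, not Virtual SAN's placement algorithm. The function name, the role labels, the 2n+1 generalization to fault domains, and the choice of simply taking the first domains in sorted order are all assumptions.

```python
def place_across_fault_domains(
    fault_domains: dict[str, list[str]], failures_to_tolerate: int = 1
) -> dict[str, str]:
    """Pick one host per fault domain for each replica and witness.

    Assumes n + 1 replicas and n witnesses, i.e. 2n + 1 distinct
    fault domains (2 replicas + 1 witness across 3 domains for n = 1).
    """
    n = failures_to_tolerate
    needed = 2 * n + 1
    usable = sorted(fd for fd, hosts in fault_domains.items() if hosts)
    if len(usable) < needed:
        raise ValueError(f"need {needed} fault domains, have {len(usable)}")

    placement = {}
    for index, fd in enumerate(usable[:needed]):
        # The first n + 1 domains hold replicas, the rest hold witnesses.
        role = f"replica-{index + 1}" if index <= n else f"witness-{index - n}"
        placement[role] = fault_domains[fd][0]
    return placement


racks = {
    "FD1": ["esxi-01", "esxi-02"],
    "FD2": ["esxi-03", "esxi-04"],
    "FD3": ["esxi-05", "esxi-06"],
    "FD4": ["esxi-07", "esxi-08"],
}
print(place_across_fault_domains(racks, failures_to_tolerate=1))
```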

[Diagram: hosts esxi-01 to esxi-08 grouped into FD1 to FD4; a RAID-1 object with two vmdk replicas and a witness, each placed in a different fault domain.]

Page 25: Presentazione VMware @ VMUGIT UserCon 2015

Virtual SAN, 6.1 specifics

Page 26: Presentazione VMware @ VMUGIT UserCon 2015


VSAN 5.5 March 2014

VSAN 6.0 March 2015

All Flash

64 host Clusters

x2 Hybrid Performance

VSAN Snapshots

VSAN Clones

Rack Awareness

VSAN 6.1 September 2015

Stretched Cluster + ROBO

vROps Management Pack

Replication - 5 Minutes RPO

Health and Performance Monitoring

Root Cause Analysis & Guided Remediation

Page 27: Presentazione VMware @ VMUGIT UserCon 2015

New Flash Hardware Devices Supported

High Density Flash Devices

• NVMe allows both hardware and software to exploit greater parallelism, which results in various performance improvements

• ULLtraDIMM™ SSDs connect flash storage to the memory channel via DIMM slots, achieving very low (<5us) write latency


Ultra DIMM

vSphere & Virtual SAN

Page 28: Presentazione VMware @ VMUGIT UserCon 2015

Virtual SAN – Stretched Cluster


Active-Active data centers

• Virtual SAN cluster split across 2 sites!

• Site-level protection with zero data loss and near-instantaneous recovery

• Support for up to 5ms RTT latency between data sites

– 10Gbps bandwidth expectation

• Witness VM can reside anywhere

– 200ms RTT latency

– 100Mbps bandwidth required at most

• Automated failover

• Requires Virtual SAN Advanced license
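The latency limits above lend themselves to a simple pre-flight check. The sketch below only compares measured round-trip times against the two figures on the slide (5ms between data sites, 200ms to the witness); the function and constant names are hypothetical, and bandwidth is not checked.

```python
DATA_SITE_MAX_RTT_MS = 5.0    # limit between the two data sites
WITNESS_MAX_RTT_MS = 200.0    # limit between a data site and the witness


def check_stretched_cluster_latency(
    inter_site_rtt_ms: float, witness_rtt_ms: float
) -> list[str]:
    """Return a list of problems; an empty list means both limits are met."""
    problems = []
    if inter_site_rtt_ms > DATA_SITE_MAX_RTT_MS:
        problems.append(
            f"inter-site RTT {inter_site_rtt_ms}ms exceeds {DATA_SITE_MAX_RTT_MS}ms"
        )
    if witness_rtt_ms > WITNESS_MAX_RTT_MS:
        problems.append(
            f"witness RTT {witness_rtt_ms}ms exceeds {WITNESS_MAX_RTT_MS}ms"
        )
    return problems


print(check_stretched_cluster_latency(inter_site_rtt_ms=3.2, witness_rtt_ms=120.0))
print(check_stretched_cluster_latency(inter_site_rtt_ms=8.0, witness_rtt_ms=250.0))
```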

[Diagram, labelled “Today”: two data sites running VMware vSphere & Virtual SAN, linked at 5ms RTT over 10GbE, plus a witness.]

Page 29: Presentazione VMware @ VMUGIT UserCon 2015

Stretched Clusters, some things you need to know

• We introduced the concept of site locality

– Latency can be up to 5ms; you don’t want to incur that on every read

– In order for reads to come from “site local” cache, define DRS rules!

• We support both L2 and L3 for the Virtual SAN network

– Keep in mind you need multicast routing for L3 between the data sites

– The witness does not require multicast any longer


Page 30: Presentazione VMware @ VMUGIT UserCon 2015

Virtual SAN – Disaster Recovery

[Diagram, labelled “Today”: a stretched VMware vSphere & Virtual SAN cluster (witness; 5ms RTT, 10GbE) and a separate vSphere & Virtual SAN site, each with Site Recovery Manager.]

• Replication between Virtual SAN datastores enables RPO as low as 5 minutes

– Exclusively available to Virtual SAN 6.x, leverages vSphere Replication

• Leverage Site Recovery Manager for disaster recovery orchestration

• Stretched across metro distance, replicated across geo!

Page 31: Presentazione VMware @ VMUGIT UserCon 2015

ROBO Deployments

• As of VSAN 6.1 we support 2-host clusters for ROBO deployments

– Both hosts only used to store “data”

• Extension of VSAN Stretched Cluster solution

• Each of the nodes will be a Fault Domain (FD)

• One witness host needed per VSAN cluster

– Witness node can be an ESXi VM

– 500ms RTT latency

• All sites managed centrally by one vCenter instance

• Patching and software upgrades performed centrally through vCenter Server

[Diagram: three vSphere & VSAN ROBO sites, each with a witness, and one vSphere & Virtual SAN site.]

Page 32: Presentazione VMware @ VMUGIT UserCon 2015

Virtual SAN Witness Appliance


• Witness appliance:

– ONLY supported with Stretched Cluster and ROBO

– ONLY stores meta-data NOT customer data

– is not able to host any virtual machines

– can be re-created in event of failure

• Appliance requirements:

– at least three VMDKs

– Boot disk for ESXi requires 20GB

– Capacity tier requires 16MB per witness component

– Caching tier is 10% of capacity tier

– Both tiers on the witness could be on magnetic disks (MDs)

• The amount of storage on the witness is related to number of components on the witness
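The sizing figures above reduce to simple arithmetic. A minimal sketch, assuming the numbers on the slide (16MB of capacity tier per witness component, caching tier at 10% of the capacity tier); the 20GB boot disk is separate and not included. The function name is hypothetical.

```python
CAPACITY_MB_PER_COMPONENT = 16   # from the slide: 16MB per witness component
CACHE_PERCENT_OF_CAPACITY = 10   # from the slide: caching tier is 10% of capacity


def witness_storage_mb(witness_components: int) -> dict:
    """Estimate capacity and caching tier sizes for a witness appliance."""
    if witness_components < 0:
        raise ValueError("witness_components must be zero or greater")
    capacity_mb = witness_components * CAPACITY_MB_PER_COMPONENT
    # Integer ceiling division so the cache is never rounded down.
    cache_mb = -(-capacity_mb * CACHE_PERCENT_OF_CAPACITY // 100)
    return {"capacity_mb": capacity_mb, "cache_mb": cache_mb}


# 1,000 witness components: 16,000MB capacity tier, 1,600MB caching tier.
print(witness_storage_mb(1000))
```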

Witness Appliance

vESXi

Page 33: Presentazione VMware @ VMUGIT UserCon 2015

Advanced Monitoring and Troubleshooting with vROps

• Introduced with Virtual SAN 6.1

• Comprehensive global view across multiple Virtual SAN clusters

• Hundreds of KPIs simplified into an easy-to-consume dashboard

• Smart alerts deliver insight and information – correlate symptoms across the stack

• Supported with vROps Standard!

Page 34: Presentazione VMware @ VMUGIT UserCon 2015

Advanced Troubleshooting with VSAN Health Check Plug-in


• Cluster Health

• Network Health

• Data Health

• Limits Health

• Physical Disk Health

Page 35: Presentazione VMware @ VMUGIT UserCon 2015

Virtual SAN, the future...

Page 36: Presentazione VMware @ VMUGIT UserCon 2015

RAID-5 and RAID-6 over the network

• With “FTT=1” availability RAID-5

– 3+1 (4 host minimum)

– 1.33x instead of 2x overhead

• 20GB disk normally takes 40GB, now just ~27GB

[Diagram: RAID-5 across four ESXi hosts, each holding three data components and one parity component, with parity in a different position on each host.]

Beta coming!

• With “FTT=2” availability RAID-6

– 4+2 (6 host minimum)

– 1.5x instead of 3x overhead

• 20GB disk normally takes 60GB, now just ~30GB
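The capacity figures on this slide can be reproduced with a short calculation: mirroring stores n+1 full copies, while an erasure-coded layout stores (data + parity) / data times the object size. The function names below are hypothetical; the 3+1 and 4+2 layouts are the ones named on the slide.

```python
def mirrored_gb(size_gb: float, failures_to_tolerate: int) -> float:
    """Raw capacity used by RAID-1 mirroring: n + 1 full copies."""
    return size_gb * (failures_to_tolerate + 1)


def erasure_coded_gb(size_gb: float, data_parts: int, parity_parts: int) -> float:
    """Raw capacity used by a data+parity layout such as 3+1 or 4+2."""
    return size_gb * (data_parts + parity_parts) / data_parts


disk_gb = 20
# FTT = 1: 40GB mirrored vs ~26.7GB with RAID-5 (3+1), a 1.33x overhead.
print(mirrored_gb(disk_gb, 1), round(erasure_coded_gb(disk_gb, 3, 1), 1))
# FTT = 2: 60GB mirrored vs 30GB with RAID-6 (4+2), a 1.5x overhead.
print(mirrored_gb(disk_gb, 2), erasure_coded_gb(disk_gb, 4, 2))
```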

Page 37: Presentazione VMware @ VMUGIT UserCon 2015

Deduplication and compression for Space Efficiency

• Deduplication and compression per disk group level, up to 8x data reduction

– Will be called “Space Efficiency”

• Space Efficiency enabled on a cluster level

• Deduplicated when de-staging from cache tier to capacity tier

– Fixed block length deduplication (4KB Blocks)

• Compressed after deduplication

– 4KB unique becomes 2KB or 1KB

– Still optimizing cost vs benefit!
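The dedupe-then-compress order described above can be illustrated with a toy in-memory version. This is not how Virtual SAN implements it: the slides name neither the hash nor the compression algorithm, so SHA-256 and zlib are stand-ins, and the function name is hypothetical. Only the 4KB fixed block size and the ordering come from the slide.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # fixed 4KB blocks, as on the slide


def dedupe_and_compress(data: bytes) -> dict[str, bytes]:
    """Store each unique 4KB block once, compressed after deduplication."""
    store: dict[str, bytes] = {}
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        fingerprint = hashlib.sha256(block).hexdigest()  # assumed hash choice
        if fingerprint not in store:
            # Only blocks not seen before are compressed and kept.
            store[fingerprint] = zlib.compress(block)    # assumed compressor
    return store


# Three identical blocks plus one different block leave two unique blocks.
sample = b"A" * BLOCK_SIZE * 3 + b"B" * BLOCK_SIZE
stored = dedupe_and_compress(sample)
print(len(stored), sum(len(blob) for blob in stored.values()) < len(sample))
```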

Beta coming!

[Diagram: three hosts (esxi-01, esxi-02, esxi-03) running vSphere & Virtual SAN, each holding a vmdk.]

Page 38: Presentazione VMware @ VMUGIT UserCon 2015

Serviceability: Performance Monitoring

Cluster Level, Host Level, Disk Group Level, Disk Level

Page 39: Presentazione VMware @ VMUGIT UserCon 2015

VMware Virtual SAN: Generic Object Storage Platform

VMware vSphere

Virtual SAN

VMFS, Block, File, REST

Page 40: Presentazione VMware @ VMUGIT UserCon 2015