SteelEye Standard Proposal

SteelEye Introduction

SteelEye Protection Suite

About SIOS Technology Corp.

[Figure: SIOS Business Area. Timeline 1997, 2001, 2005, 2007 onward, covering Open Source Solution, Java, Value for Business, High Availability, and Cloud. Callouts: #1 Public Cloud Service (NikkeiBP); #2 HA Software; #1 APAC Best Partner Award (Red Hat).]

• 1997: Company founded
• 2003: Business partnership with Red Hat
• 2004: IPO on the JStock Exchange
• 2005: Acquired SteelEye Technology, Inc.

1. About SIOS Technology

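Building on more than 10 years of business in open source environments, SIOS has expanded into redundancy (HA) solutions optimized for those environments and into cloud services.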

• Established in 1999 as SteelEye Technology; part of SIOS Technology (publicly traded in Japan) since 2005
• Provides best-in-class High Availability, Data Replication, and Disaster Recovery solutions
• Over 35,000 licenses installed worldwide
• Strategic relationships with HP, IBM, and SAP
• Multi-time award winner for Linux High Availability, with Red Hat and Novell certified solutions
• Microsoft Gold Certified Partner

2. About SteelEye Solution

Long-term focus on ensuring availability and architecture consistency

• 1992, AT&T: Bell Labs cluster R&D
• 1996, NCR: Spinout to NCR R&D in South Carolina
• 1999, SteelEye: Combined clustering and data replication
• 2006, SIOS: Software for Innovation Open Solutions

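CONTENTS

1. Background of the Proposal
2. SteelEye Introduction
3. SteelEye Configuration Options
4. Reference
5. Conclusion

Appendix 1. SteelEye Architecture
Appendix 2. Comparison of H/A and Similar Solutions
Appendix 3. Comparison of Data Replication Methods
Appendix 4. Introduction to vAppKeeper
Appendix 5. DR Policy Planning and DR Customer Cases
Appendix 6. Heartbeat Configuration and I/O Fencing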

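1. Background of the Proposal

1. IT trend changes

Host
• H/W, OS, S/W: all from one vendor
• High cost, permanent license

Unix
• H/W, OS: one vendor
• S/W: multiple vendors
• Development and support: by each vendor
• High cost, permanent license

Linux
• H/W, OS, S/W: all multi-vendor
• Open source
• Development and support: multiple vendors
• Low cost, subscription license

2. Background of the shift to Linux
• The performance, price, and stability of Linux and x86 servers have been proven
• Growing need for virtualization and cloud: efficiency, scale-out, floor space, low power, eco-friendliness

3. Changes in the IT infrastructure environment
• From UNIX to x86, Linux, virtualization, and cloud environments
• From Oracle DB to open source RDBs
• From vendor S/W to open source S/W

4. Challenges raised by these changes
• For RDBs, will high-cost Oracle remain dominant just for the sake of using RAC?
• Is high-end, large-capacity SAN storage suited to a scale-out environment?
• How should open source software other than RDBs be made redundant?
• Are past approaches to redundancy and DR suited to the new environment?

A redundancy and DR solution suited to the new environment is needed:
• Does it support x86, virtualization, and cloud environments?
• Does it support a variety of Linux distributions?
• Does it also support non-shared storage?
• Does it support a variety of software to be made redundant?
• Can redundancy and DR be built at low cost and high efficiency?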

2. SteelEye Introduction: Basic Concept

Monitoring: SteelEye periodically monitors resources (application, server, storage, data, network, and so on) and, when a failure occurs, fails over automatically to restore service.

SteelEye offers two configurations: an H/A cluster in a shared storage environment, and an H/A cluster that uses replication in a shared nothing environment.

[Figure: Shared Storage Cluster (Active and Standby servers, each with an instance, sharing one data volume) and Shared Nothing Cluster (Active and Standby servers, each with its own instance and data volume, linked by replication). H/A fail-over runs between the two servers in both cases.]
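The monitor-and-fail-over idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not LifeKeeper's implementation: the `Node` class, the probe callables, and `monitor_once` are hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """A cluster node with one hypothetical health probe per monitored resource."""
    name: str
    checks: Dict[str, Callable[[], bool]] = field(default_factory=dict)

    def failed_resources(self) -> List[str]:
        """Return the names of resources whose probe reports a failure."""
        return [name for name, probe in self.checks.items() if not probe()]


def monitor_once(active: Node, standby: Node) -> Node:
    """Run one monitoring cycle and return the node that should be active."""
    if active.failed_resources() and not standby.failed_resources():
        # A monitored resource failed on the active node and the standby
        # is healthy: fail over.
        return standby
    # Either nothing failed, or the standby is unhealthy too: stay put.
    return active


primary = Node("node-a", {"app": lambda: True, "storage": lambda: False})
secondary = Node("node-b", {"app": lambda: True, "storage": lambda: True})
print(monitor_once(primary, secondary).name)  # node-b
```

A real cluster would call such a cycle on a timer and would also move the service IP and storage along with the instance; those steps are left out here.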

2. SteelEye Introduction: Shared Storage vs. Shared Nothing

Shared Storage Cluster
• Requires Fibre Channel SAN, iSCSI, or NAS
• Possible only within the same data center
• I/O fencing for data consistency is provided by default through SCSI 3 PR (additional fencing can be configured)
• Supports multiple storage types
• Supports multiple multipath solutions
• The storage is a single point of failure

Shared Nothing Cluster
• LAN or WAN recovery environments are also possible
• Removes the single point of failure of shared storage
• Well suited to DR configurations
• Lower cost than conventional storage replication
• Replicated data is also managed as one component of H/A automated failover protection

[Figure: Shared Storage Cluster and Shared Nothing Cluster, each with fail-over between two servers; the Shared Nothing pair is linked by replication.]

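2. SteelEye Introduction: Shared Storage vs. Shared Nothing (comparison)

Item | Comparison | Explanation
Cost | Shared Nothing is better | Redundancy on shared storage requires a shared environment built from a SAN switch and external storage, so storage costs more than in an environment built on local disk or DAS.
Component redundancy | Shared Nothing is better | With shared storage, server failures are covered, but the storage remains a SPOF (single point of failure): service cannot fail over when the storage fails.
Write performance of the Active node | Shared Storage is better | With Async replication, performance is the same. With Sync replication, a write on the Active node completes only after the write on the Standby node completes, so writes on the Active node slow down somewhat.
Read performance of the Active node | Same | Reads are handled by the Active node alone, so there is no impact.
DR configuration | Shared Nothing only | DR is built through replication.
Use of replicated storage: temporary test environment | Shared Nothing only | Replication to the Standby node can be paused and the Standby system used for testing. When replication resumes after the test, the whole storage volume is not copied again; only the blocks changed during the test and the blocks whose replication was held back are resynchronized, so the node returns to HA Standby quickly.
Use of replicated storage: rolling patch | Shared Nothing only | In a replicated configuration, volumes that hold system software such as the OS or DB are not replicated; only the data area is. Work such as patching, which changes only the OS or DB software area, can therefore be done by switching the Active node, with service interrupted only for the switchover time.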
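The partial resync described for the temporary test environment can be sketched as follows. This is a toy model under stated assumptions: a Python set stands in for the dirty-block bitmap, and `ReplicatedVolume` and its methods are hypothetical names, not DataKeeper interfaces.

```python
from typing import List, Set


class ReplicatedVolume:
    """Toy source/target block pair; a set stands in for the dirty bitmap."""

    def __init__(self, num_blocks: int) -> None:
        self.source: List[bytes] = [b""] * num_blocks
        self.target: List[bytes] = [b""] * num_blocks
        self.dirty: Set[int] = set()  # blocks that differ between the two sides
        self.paused = False

    def write(self, block: int, data: bytes) -> None:
        """Write on the Active side; mirror it unless replication is paused."""
        self.source[block] = data
        if self.paused:
            self.dirty.add(block)      # replication held back: remember the block
        else:
            self.target[block] = data

    def test_write_on_target(self, block: int, data: bytes) -> None:
        """Write made on the Standby side while it is used for testing."""
        self.target[block] = data
        self.dirty.add(block)

    def resume(self) -> int:
        """Copy only the dirty blocks back to the target; return the count."""
        copied = len(self.dirty)
        for block in self.dirty:
            self.target[block] = self.source[block]
        self.dirty.clear()
        self.paused = False
        return copied


vol = ReplicatedVolume(num_blocks=1000)
vol.write(1, b"a")
vol.paused = True                       # pause replication for a test
vol.write(2, b"b")                      # Active keeps working
vol.test_write_on_target(3, b"test")    # Standby is modified by the test
print(vol.resume())                     # 2
```

Only two of the 1,000 blocks are copied on resume, which is the reason the Standby can rejoin quickly.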

2. SteelEye Introduction: Scalable Availability

All configurations supported across both physical and virtual servers

• Single Node Monitoring & Recovery
• Two Node LAN Failover Cluster with Shared Storage
• Two Node LAN Failover Cluster with Data Replication
• N-Node WAN Failover Cluster with Data Replication (DR)
• Hybrid Shared Storage Cluster with WAN Replication (DR)

2. SteelEye Introduction: Product Composition

SteelEye Protection Suite: LifeKeeper, DataKeeper, and Application Recovery Kits (ARK)

• LifeKeeper: the H/A cluster module, responsible for automatic fail-over based on failure detection for servers and applications
• DataKeeper: the real-time, high-performance data volume replication module, integrated with LifeKeeper
• ARK: built-in knowledge modules for application failure detection and fail-over, integrated with LifeKeeper

Combining High Availability with efficient Data Replication to ensure Business Continuity for your Mission Critical Apps!

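2. SteelEye Introduction: Key Features

Support for diverse x86 environments
• Supports a wide range of enterprise Linux distributions
• Supports a wide range of virtualization and cloud environments
• Supports local and DAS storage in addition to shared storage
• Supports each storage vendor's multipath driver

Strong replication performance
• Host-based replication in LAN/WAN environments
• Sync, Async, and Periodic replication modes
• Block-level volume/LUN replication, suited to handling large files
• Source and target switch automatically on fail-over
• Multi-target support, with different settings per target
• Dirty blocks are tracked in a bitmap to avoid a full resync
• Replication bandwidth limiting and 9 levels of compressed transfer

Support for diverse resources
• Knowledge modules optimized for about 30 major applications
• Start, stop, and status checks optimized for each resource type
• Two-level health check (quick check and deep check) for each resource type

Ease of configuration and operation
• Resource registration and management through a wizard for each resource type
• Java-based GUI and CLI
• Nodes are easy to add, change, and remove as the business changes
• SMTP and SNMP trap support for cluster status monitoring

Diverse configurations
• Supports both shared storage and shared nothing environments
• Supports 1:1, 1:N, N:1, DR, and cross standby configurations
• Supports free mixing of virtual and physical servers in a redundant configuration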
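A two-level health check can be sketched as below. The cadence (a deep check on every fifth cycle) and the names `health_check`, `quick`, and `deep` are assumptions for illustration; the slide only states that a quick check and a deep check exist for each resource type.

```python
from typing import Callable


def health_check(tick: int,
                 quick: Callable[[], bool],
                 deep: Callable[[], bool],
                 deep_every: int = 5) -> bool:
    """Run the cheap check every cycle and the expensive one every Nth cycle."""
    if not quick():
        # Quick check, for example "is the process alive?"
        return False
    if tick % deep_every == 0:
        # Deep check, for example "does a test query succeed?"
        return deep()
    return True


# A process that is alive but no longer answers queries passes quick checks
# and is caught on the next deep-check cycle.
print(health_check(1, quick=lambda: True, deep=lambda: False))  # True
print(health_check(5, quick=lambda: True, deep=lambda: False))  # False
```

Splitting the checks this way keeps routine monitoring cheap while still catching hung applications that a liveness probe alone would miss.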

2. SteelEye Introduction: Supported Environment

O/S
• Linux: RHEL, SLES, OEL, CentOS, Asianux
• Windows: 2003, 2008, 2012

Virtualization
• Citrix XenServer
• MS Hyper-V
• Red Hat KVM
• OracleVM
• VMware ESX

Storage Type
• Shared Storage: SAN, iSCSI, NAS
• Non-Shared Storage: Internal Disk, DAS, Fusion IO

2. SteelEye Introduction: Supported ARK (Application Recovery Kits)

Services
• Apache
• Samba
• NFS
• SW RAID (md)

Applications
• SAP
• WebSphere MQ
• Exchange
• Any custom app

Databases
• Oracle
• MySQL
• PostgreSQL
• Sybase
• DB2
• MSSQL

Storage
• DMMP
• NAS
• EMC PowerPath
• Hitachi HDLM
• IBM SDD
• Data Replication

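2. SteelEye Introduction: GUI

• Resources can be registered and managed through the GUI
• A management menu is provided for each resource type
• A configuration wizard is provided for each resource type
• Logs can be viewed and resource status can be monitored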

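3. SteelEye Configuration Options: Basic Configuration

In every configuration the servers can be physical or virtual; that is, PP, PV, VP, and VV are all possible.

[Figure: Basic configuration. Shared Storage: two servers with instances share one data volume under H/A. Shared Nothing: two servers, each with its own instance and data volume, linked by replication under H/A.]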

3. SteelEye Configuration Options: N:1 Configuration

[Figure: N:1 configuration. Two Active servers (Instance1 with Data1, Instance2 with Data2) are backed by a single Standby server that can run both instances. Shown for Shared Nothing, where Data1 and Data2 are synced to the Standby server, and for Shared Storage.]

3. SteelEye Configuration Options: Cross Standby

[Figure: Cross standby configuration. Each of two servers is Active for one instance and Standby for the other (Instance1 with Data1, Instance2 with Data2). Shown for Shared Nothing, where both data volumes are synced in each direction, and for Shared Storage.]

3. SteelEye Configuration Options: DR Configuration

[Figure: DR configuration. An Active and a Standby server form a local H/A pair, and a third DR server receives the data by Async replication. Shown for Shared Nothing, where data is also synced between Active and Standby, and for Shared Storage.]

4. Reference

http://www.cgbest.co.kr/main/main.jsp

http://www.gilhospital.com/

http://www.nis.go.kr/svc/index.do?method=content&cmid=10200

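5. Conclusion

SteelEye Protection Suite

• Architecture consistency proven over more than 10 years
• Optimized for x86 (Linux, Windows), virtualization, and cloud environments
• Supports a wide range of Linux distributions, including open source ones
• Intelligent application monitoring modules for a wide range of resources, rather than hand-written scripts
• A variety of configurations (1:1, N:1, DR, cross standby, Shared Storage/Shared Nothing)
• Block-based replication for fast performance and for replicating many kinds of data beyond databases
• An intuitive, wizard-style GUI for installation, configuration, and operation
• HA fail-over, data replication, and DR built with a single solution
• Flexible expansion and change as business requirements change
• More flexible and lower in cost to build than storage-based DR/replication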

Appendix 1. SteelEye Architecture: LifeKeeper Architecture

[Figure: Architecture of a LifeKeeper node. The LifeKeeper core contains the LifeKeeper Configuration Database (LCD) with its LCD Interface (LCDI), the LifeKeeper Communications Manager (LCM), which connects to the LCM on another node, the LifeKeeper Alarm Interface, and the LifeKeeper Recovery Action and Control Interface (LRACI). Application Recovery Kits provide resource monitoring and recovery direction/action. The LifeKeeper GUI server and LifeKeeper GUI client sit alongside the core.]

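Appendix 1. SteelEye Architecture: DataKeeper Architecture

The Active node's disk and the remote node's disk are replicated through nbd and software RAID.

[Figure: Active and remote nodes, each with a disk and a bitmap file.]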

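Appendix 1. SteelEye Architecture: Replication Workflow (Sync)

1) A write occurs on the source server
2) The source server's bitmap file is marked dirty
3) The write is performed on the target server's disk
4) The write is performed on the source server's disk
5) The source server's bitmap file is cleared
6) The write completes

[Figure: Write I/O and read I/O paths between the source and target servers, each with a bitmap file, numbered 1 to 6 as above.]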

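Appendix 1. SteelEye Architecture: Replication Workflow (Async)

1) A write occurs on the source server
2) The source server's bitmap file is marked dirty
3) The write request is forwarded to the target server
4) The write is performed on the source server's disk
5) The source server's bitmap file is cleared
6) The write completes

[Figure: Write I/O and read I/O paths between the source and target servers, each with a bitmap file, numbered 1 to 6 as above.]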
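The two six-step workflows differ only in step 3, which a short simulation makes visible. This sketch follows the steps exactly as listed on the slides and models nothing else: the `Mirror` class, its dictionaries, and the `drain` method are hypothetical, and the real nbd and software RAID layers are not represented.

```python
from collections import deque
from typing import Deque, Dict, Set, Tuple


class Mirror:
    """Toy model of one replicated volume in Sync or Async mode."""

    def __init__(self, sync: bool) -> None:
        self.sync = sync
        self.source_disk: Dict[int, bytes] = {}
        self.target_disk: Dict[int, bytes] = {}
        self.bitmap: Set[int] = set()                      # dirty block numbers
        self.pending: Deque[Tuple[int, bytes]] = deque()   # requests in flight

    def write(self, block: int, data: bytes) -> None:
        # Step 1: a write arrives on the source server.
        self.bitmap.add(block)                   # Step 2: mark the bitmap dirty
        if self.sync:
            self.target_disk[block] = data       # Step 3 (Sync): write target disk
        else:
            self.pending.append((block, data))   # Step 3 (Async): forward request
        self.source_disk[block] = data           # Step 4: write source disk
        self.bitmap.discard(block)               # Step 5: clear the bitmap
        # Step 6: returning from this method is "write complete".

    def drain(self) -> None:
        """Apply forwarded requests on the target (background work in Async)."""
        while self.pending:
            block, data = self.pending.popleft()
            self.target_disk[block] = data


sync_mirror = Mirror(sync=True)
sync_mirror.write(7, b"payload")
print(sync_mirror.target_disk == sync_mirror.source_disk)    # True

async_mirror = Mirror(sync=False)
async_mirror.write(7, b"payload")
print(async_mirror.target_disk == async_mirror.source_disk)  # False
async_mirror.drain()
print(async_mirror.target_disk == async_mirror.source_disk)  # True
```

In Sync mode the caller waits for the target, which is why writes on the Active node slow down; in Async mode the caller returns first and the target catches up afterwards.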

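Appendix 2. Comparison of H/A and Similar Solutions

1. CDC
• A CDC (Change Data Capture) solution synchronizes the source node's changes by executing DML on the target
• It is asynchronous, and (part of) the data is only logically identical; the target can hardly be called physically the same DB, so it is difficult to use for redundancy
• Well suited to building a separate read/write DB through partial or aggregate replication

[Figure: CDC. An Active server and a second Active server, each with its own instance and data, linked by CDC.]

2. Storage Replication
• Storage redundancy based on storage replication such as EMC BCV or Hitachi SI
• Replication is fast and its stability is proven, so it is mainly suited to offloading backup and to recovery
• Requires expensive enterprise storage plus the same vendor's expensive replication solution
• Automatic fail-over cannot be configured; it is usually configured to sync as of a specific point in time

[Figure: Storage replication. An Active server with its data, replicated to a copy (Data') used by a Backup/Test server.]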

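3. Oracle: RAC
• Storage is not redundant
• The only approach that allows Active-Active; advantageous in high-cost Unix Oracle environments
• Fail-over time is short, or service continues without interruption
• Oracle only

[Figure: RAC. Two Active servers with instances sharing one data volume.]

4. Oracle: Active Data Guard
• Block-level synchronization based on Oracle's recovery mechanism
• Fast, and the target node can be opened Read Only to spread read load and backup load
• Fail-over automation, including the switch from Read Only to Read/Write, cannot be configured
• Oracle only

[Figure: ADG. An Active server and a Read Only server, each with its own instance and data.]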

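5. H/A Solution - Server Only
• On a server or instance failure, service stops, fails over automatically, and then resumes
• The Standby server is normally idle, so this is advantageous on relatively low-cost Linux or virtualized environments
• Storage is not redundant
• Can be used for all services, not only databases

[Figure: Two servers with instances sharing one data volume under H/A.]

6. H/A Solution - Server+Data
• On a server, instance, or storage failure, service stops, fails over automatically, and then resumes
• The Standby server is normally idle, so this is advantageous on relatively low-cost Linux or virtualized environments
• Can be used for all services, not only databases

[Figure: Two servers, each with its own instance and data volume, linked by replication under H/A.]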
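Redundancy approach | Server | Instance | Data | Scope | Automated fail-over | Main use
CDC | - | - | - | DB | X | Partial or aggregate replication for interfaces to other systems, read load distribution, and a separate read/write DB
Storage Replication | - | - | - | ALL | X | Backup load distribution and first-tier backup for fast recovery
Oracle: RAC | O | O | X | Oracle | O (no interruption) | Configured as RAC + ADG: failure recovery, read load distribution, and backup load distribution in high-cost Oracle environments
Oracle: ADG | △ | △ | O | Oracle | X | Same as Oracle: RAC (configured as RAC + ADG)
HA – Server Only | O | O | X | ALL | O (after interruption) | Building a variety of redundant environments, including DBs, on relatively low-cost Linux and virtualized environments
HA – Server+Data | O | O | O | ALL | O (after interruption) | Same as HA – Server Only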

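Appendix 3. Comparison of Data Replication Methods

CDC
• Description: DML on the Active node is extracted from the log and executed as SQL on the target node
• Applicable scope: databases only
• Solutions: SharePlex, Oracle Golden Gate, MySQL Replica, etc.
• Synchronization: Async
• Performance: slow
• Transfer volume: low
• Notes: Performance is often slow. Depending on the solution, read consistency can be momentarily out of step. Data mismatches are hard to monitor. When synchronization is healthy the data is logically identical, but not physically identical.

Log Apply
• Description: the recovery logs generated by the DB on the Active node are applied (that is, recovered) on the target node
• Applicable scope: databases only
• Solutions: Oracle Active Data Guard (Physical mode), Cubrid Replication
• Synchronization: Async/Sync
• Performance: medium
• Transfer volume: moderate
• Notes: The most stable DB replication method offered by DB vendors. If replication stops partway, every intermediate log must be applied before the target can catch up. The replication target can be used for purposes such as read-only access.

File-level replication
• Description: files changed on the Active node are sent to the target node
• Applicable scope: all files, except raw devices
• Solutions: (not listed)
• Synchronization: Async/Sync
• Performance: medium
• Transfer volume: high
• Notes: Hard to apply where writes do not happen in units of files, as with a DB.

Block-level Volume/LUN replication
• Description: only the blocks changed on the Active node are sent to the target node
• Applicable scope: all data, including raw devices
• Solutions: BCV, DataKeeper, etc.
• Synchronization: Async/Sync
• Performance: fast
• Transfer volume: low
• Notes: At the infrastructure level, the fastest and most stable way to replicate. Because the copy is physically identical, it is the most stable replication method for fail-over and DR.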

Appendix 4. Introduction to vAppKeeper: Limits of VMware HA

• VMware HA's strength lies in protection against hardware failures
• 80% of unplanned outages are the result of application failures, misconfigurations, and other operational errors
• Application awareness is essential to maximizing application availability

vAppKeeper Brings Application Awareness to your VMware HA

• Cost-effective high availability solution for applications that do not require multi-node clustering
  – No standby resources (hardware, software, etc.) necessary
  – Less complex to deploy and manage than multi-node clustering
  – Plug-in allows management and monitoring through the vSphere Client
• Allows customers to fully leverage VMware tools and automation without compatibility issues

[Figure: vAppKeeper (vAK) runs next to the application inside each guest OS and reports through the VMware HA Application Monitoring API. Management is through the vSphere Client with the vAppKeeper plug-in, or through the browser-based vAppKeeper UI (HTTP, SMC). vMotion, HA, and DRS operate underneath.]

VMware HA
• Monitors the physical host for failure
• Monitors the virtual machine for failure
• Can monitor the VMware Tools heartbeat to identify OS failure

vAppKeeper
• Monitors the health of applications (A) and their dependencies (D)
• Withholds the heartbeat to instruct VMware HA to respond to an application failure (restart, vMotion)

Visibility
• vSphere Client dashboard and granular application hierarchy views

Flexible Management Options
• Browser-based user interface
• Command-line interface
• Multi-level policy
• Temporal recovery logic
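The "withhold the heartbeat" mechanism can be sketched without any VMware-specific calls. Everything here is hypothetical: `heartbeat_tick`, the check callables, and `send_heartbeat` are illustrative names, and the real VMware HA Application Monitoring API is not used or reproduced.

```python
from typing import Callable, Iterable, List


def heartbeat_tick(app_checks: Iterable[Callable[[], bool]],
                   send_heartbeat: Callable[[], None]) -> bool:
    """Send one heartbeat only if every application check passes."""
    if all(check() for check in app_checks):
        send_heartbeat()
        return True
    # A check failed: stay silent so the HA layer sees a missing
    # heartbeat and reacts, for example by restarting the VM.
    return False


sent: List[str] = []
healthy = [lambda: True, lambda: True]
broken = [lambda: True, lambda: False]

print(heartbeat_tick(healthy, lambda: sent.append("beat")))  # True
print(heartbeat_tick(broken, lambda: sent.append("beat")))   # False
print(len(sent))                                             # 1
```

The design point is that the application monitor never has to restart the VM itself; it only controls whether the signal the platform already watches keeps arriving.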

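Appendix 4. VMware HA vs. vAppKeeper vs. SPS

Configuration | VMware HA | vAppKeeper | LifeKeeper + ARK | DataKeeper | SPOF / undetectable resources | Redundancy cost
Configuration 1 | ○ | | | | Storage, Application, VM OS | 0
Configuration 2 | ○ | ○ | | | Storage, VM OS | 5
Configuration 3 | | | ○ | | Storage | 10
Configuration 4 | ○ | | ○ | | Storage | 10
Configuration 5 | | | ○ | ○ | - | 20
Configuration 6 | ○ | | ○ | ○ | - | 20

Notes
• When only VM HA is used, only HA for the virtualized server is supported
• vAppKeeper is needed to detect failures of applications running inside the virtualized server
• When VM HA and SPS are used together, once SPS fails over to the preconfigured Standby node, VM HA automatically restarts the failed Active node, which makes a fast fail-back possible
• In other words, even when SPS is used, running VM HA alongside it is better for redundancy
• vAppKeeper is available only as a Linux version for vSphere environments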

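Appendix 5. DR Policy Planning and DR Customer Cases

DR Policy Planning: RPO & RTO

• RPO (Recovery Point Objective): To what point in time must data be recovered?
• RTO (Recovery Time Objective): How much time does it take to restart?

[Figure: Timeline from normal operation, through the disaster and the stop, to the restart. RPO looks back from the disaster (2 weeks, 1 week, 1 day, 0); RTO looks forward from the disaster to the restart (days, hours, or minutes).]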
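As a worked example of the two definitions, the achieved recovery point and recovery time for one incident can be computed and compared with the objectives. The function names and the timestamps are made up for illustration.

```python
from datetime import datetime, timedelta


def achieved_rpo(last_recovery_point: datetime, disaster: datetime) -> timedelta:
    """Window of data lost: from the last recoverable copy to the disaster."""
    return disaster - last_recovery_point


def achieved_rto(disaster: datetime, restart: datetime) -> timedelta:
    """Downtime: from the disaster to the moment service restarts."""
    return restart - disaster


def meets_objectives(rpo: timedelta, rto: timedelta,
                     rpo_target: timedelta, rto_target: timedelta) -> bool:
    """True when both achieved values are within their objectives."""
    return rpo <= rpo_target and rto <= rto_target


disaster = datetime(2024, 1, 10, 12, 0)
nightly_backup = datetime(2024, 1, 9, 12, 0)   # last recoverable copy
restart = datetime(2024, 1, 10, 14, 0)

rpo = achieved_rpo(nightly_backup, disaster)
rto = achieved_rto(disaster, restart)
print(rpo, rto)  # 1 day, 0:00:00 2:00:00
print(meets_objectives(rpo, rto, timedelta(hours=1), timedelta(hours=4)))  # False
```

With a daily backup the one-hour RPO target is missed even though the restart was fast, which is the gap that continuous replication is meant to close.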

DR Policy Planning: Investment vs. RPO/RTO

[Figure: Investment plotted against RPO (before the disaster) and RTO (after the disaster).]

The closer the RPO and RTO are to the moment of the disaster, the more investment is required.

DR Policy Planning: Investment vs. DR Solution

[Figure: Investment against RPO and RTO, with the solution labels Backup, Replication, HA Clustering, Cold Site, Warm Site, and Hot Site.]

• The cost of RPO depends on how the system is protected
• The cost of RTO depends on how the backup site is built

SteelEye builds DR as a hot site, using an HA cluster that includes replication.

DR Case Study – Central Bank of Russia

Challenge
• The customer's payment system involves over:
  – 600 bank offices
  – 1,100 commercial banks
  – 2,400 side offices
• Complicated daily account routines generate a tremendous flow of electronic payment documents: over a billion electronic filings every year
• Ensuring data aggregation and integrity back to a central storage vault was mission critical
• Ensure availability of key Oracle databases and customer applications while avoiding the single points of failure of shared storage
• Minimize cost

Solution
• SPS for Linux was implemented to ensure seamless failover of Oracle databases and custom applications
• Twin clusters were built at the primary and backup data centers, with continuous data replication between nodes. LifeKeeper's scalability allows the customer to grow to geographically dispersed clusters with ease.

Results
• The LifeKeeper cluster eliminates data loss due to any hardware or software faults and provides high availability for applications and processed data, even in the event of a total data center loss
• The customer was able to meet their goals easily and quickly, with minimal expense

"The SteelEye LifeKeeper solution allowed us to considerably improve availability of the payment data gathering system and to eliminate data loss by its replication while receiving data from remote data sources."

-A. L. Danilov, Chief Engineer, Central Bank of Russia (ICI BR)

DR Case Study – U.S. Navy fleet

Challenge
• The customer needed to ensure continuous availability of the infrastructure services that support combat systems
• The solution needed to support commercially available hardware and software running on Red Hat Linux
• Required a solution that supported multi-site cluster configurations to protect against hardware losses
• Required the ability to cascade application recovery across prioritized nodes

Solution
• SPS for Linux Multi-Site Cluster was implemented to ensure NFS services are highly available to support mission critical systems
• At each location the customer implemented a 3-node hybrid cluster:
  – 2 nodes with iSCSI shared storage at the primary data center (IBM Blades and DS3500 storage)
  – A replicated 3rd node at the backup data center

Results
• The LifeKeeper multi-site cluster solution provides high availability for mission critical combat systems, even in the event of a total data center loss
• Less expensive and more flexible than storage-based replication solutions. The customer needed to re-use existing mixed hardware and storage.

"As a leading alternative in reliable, flexible high availability clustering solutions for Linux, SPS for Linux Multi-Site Cluster Edition enables us to provide a disaster recovery infrastructure with the capability to support the critical weaponry of the U.S. Navy fleet."

-Craig Black, GTS program manager for the CPS project

DR Case Study – Paraca: Why DataKeeper?

Request
• Replication for heavy data such as local maps and pictures of parking spaces
• Protection for code/encryption files
• Replication of data only
• Easy to set up
• Restart in a short time

DataKeeper
1. High C/P (cost performance)
2. Block-level replication & groupware support
3. Low cost
4. Easy UI
5. Replication

DR Case Study – Paraca: Data Compression

For long-distance replication, DataKeeper can maintain throughput by using different compression ratios.

[Figure: Data compression levels, DataKeeper (1 to 9) versus other application (about 3).]

DR Case Study – Paraca: Replication

DataKeeper copies data at block level.

[Figure: Replication, DataKeeper (block level) versus other application (file level).]
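The effect of selectable compression levels on transfer size can be illustrated with Python's standard `zlib` module, which also happens to expose levels 1 to 9. This is an assumption for illustration only: the source does not say which algorithm DataKeeper uses, and the payload below is synthetic.

```python
import zlib
from typing import Dict


def compressed_sizes(payload: bytes) -> Dict[int, int]:
    """Return the compressed size in bytes for each zlib level from 1 to 9."""
    return {level: len(zlib.compress(payload, level)) for level in range(1, 10)}


# Synthetic, highly repetitive data standing in for replicated blocks.
payload = b"parking-lot-map-tile " * 5000
sizes = compressed_sizes(payload)

for level, size in sizes.items():
    print(f"level {level}: {size} bytes ({size / len(payload):.2%} of original)")
```

Lower levels cost less CPU per block and higher levels send fewer bytes, so on a slow long-distance link a higher level can keep replication throughput up; the right level depends on the data and the link.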

DR Case Study – NewStar in England

[Figure: Configuration. Exchange servers at four sites (London A, London B, Ireland, and Bermuda) connected over WAN links; the distances shown are 12 km, 460 km, and 5,500 km.]

http://ejje.weblio.jp/content/Bermuda

Case Study – Company K: Configuration Diagram

[Figure: ECM Web, ECM DB, and ECM Search servers (Active) in the xxx building (computer room) on EMC VNX7500 SAN storage, replicated to matching servers (Standby) in the xxx building (DR Center) on EMC VNX5300 SAN storage. Volumes of 6 TB, 2 TB, and 20 TB appear on each side, and each service has a VIP (X.X.X.X). Host label: KRASN130P.]

Servers
• HP DL380G8: CPU (2) 2.50 GHz (12 core); RAM 72 GB; Disk1 300G x2 (R1); Disk2 300G x2 (R1); Disk3 STORAGE; OS Win2008R2 En; DB SQL2012Std
• Virtualization host server: 300G x4 (R5); Disk2 STORAGE; OS Win2012
• Virtual machine: CPU 2 core; RAM 10 GB; OS Win2008R2

Services shown
• ECM WEB, ECM DB, ECM Search, Thumbnail, PDF conversion, ECM development
• Some services are marked as excluded from redundancy

Vendor | Product | Purpose
SIOS | SteelEye | Redundancy and replication solution for services and data
Microsoft | Hyper-V | Windows Server virtualization solution that runs multiple virtual servers on one physical machine

Type | Total capacity (RAW) | Used (TB) | Free space
NL-SAS | 88 TB | 67 TB | 13 TB
SAS | 20 TB | 14 TB | 5.9 TB

Type | Total capacity (RAW) | Used (TB) | Free space
SAS | 36 TB | 28 TB | 8 TB


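Appendix 6. Heartbeat Configuration and I/O Fencing

Heartbeat configuration: SteelEye vs. RHCS

Item | SteelEye | RHCS
Heartbeat lines | Multiple heartbeat lines can be configured (the service network and the iSCSI network can be used) | One heartbeat line by default; the service network is used when the heartbeat network fails
TTY (serial cable) heartbeat | Can be configured | Not supported
Heartbeat method | Unicast | Multicast

Heartbeat configuration: preventing split-brain

• Configuring multiple heartbeats on separate networks is recommended
• A redundant heartbeat on the service network is recommended: it prevents split-brain while the service itself is healthy
• Where serial cabling is physically possible, adding a TTY heartbeat is recommended: it adds redundancy for the heartbeat process
• When iSCSI storage is used, adding a heartbeat on it is recommended: it has the same effect as a disk heartbeat

[Figure: Two servers sharing data, connected by the Service N/W, the HeartBeat N/W, the iSCSI N/W, and a serial cable. HeartBeat1 and HeartBeat2 are required; HeartBeat3 and HeartBeat4 are optional.]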
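The reason multiple heartbeat paths reduce false fail-overs can be shown with a small decision function. The name `peer_is_down`, the path labels, and the timeout value are assumptions for illustration, not LifeKeeper settings.

```python
from typing import Dict


def peer_is_down(last_seen: Dict[str, float], now: float, timeout: float) -> bool:
    """Treat the peer as down only when every heartbeat path has gone silent.

    last_seen maps a path name to the time (in seconds) of the most recent
    heartbeat received on that path.
    """
    if not last_seen:
        return False  # no paths configured: nothing can be concluded
    return all(now - seen > timeout for seen in last_seen.values())


# The dedicated heartbeat network has failed, but the service network and
# the serial cable still carry heartbeats: no fail-over, no split-brain.
paths = {"heartbeat-nw": 90.0, "service-nw": 99.5, "tty": 99.0}
print(peer_is_down(paths, now=100.0, timeout=5.0))  # False

# Every path has been silent for longer than the timeout.
silent = {"heartbeat-nw": 90.0, "service-nw": 91.0, "tty": 92.0}
print(peer_is_down(silent, now=100.0, timeout=5.0))  # True
```

With a single path, the first case would look identical to a dead peer, and both nodes could end up running the service at once.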

I/O Fencing: Why I/O Fencing Is Needed

I/O fencing is needed in shared storage environments.

• SCSI PR3 (provided by default): the Active node that detects a SCSI conflict is fenced by rebooting it
• Quorum Witness Server (optional): a third, witness server confirms the Active node's status, preventing unnecessary fail-over
• STONITH (optional): powers off the Active node that detected a SCSI conflict; an additional measure for carrying out the reboot triggered by SCSI conflict detection

I/O Fencing: Fencing Chart

[Figure: Fencing chart. Combinations of SCSI Reservation (default), Quorum/Witness Server, Watchdog, and STONITH are rated against the Split-brain and Hung Server scenarios and ordered from most reliable to least reliable.]
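The role of a witness server in avoiding unnecessary fail-over can be sketched as a decision rule on the Standby node. This is a simplified illustration, not SteelEye's quorum algorithm; the function and parameter names are hypothetical.

```python
def should_take_over(peer_heartbeat_ok: bool,
                     witness_reachable: bool,
                     witness_sees_peer: bool) -> bool:
    """Decide whether the Standby node should take over the service."""
    if peer_heartbeat_ok:
        return False  # the Active node is alive: nothing to do
    if not witness_reachable:
        # Cut off from both the peer and the witness: this node is
        # probably the isolated one, so it must not start the service.
        return False
    # Heartbeat lost and the witness is reachable: take over only if
    # the witness cannot see the Active node either.
    return not witness_sees_peer


# Heartbeat link broken, but the witness still sees the Active node.
print(should_take_over(False, True, True))    # False
# Heartbeat lost and the witness confirms the Active node is gone.
print(should_take_over(False, True, False))   # True
```

The third party breaks the tie that two nodes cannot break on their own: a lost heartbeat alone does not say which side failed.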

