Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

66
StackStorm If-This-Than-That for DevOps automation © 2016 BROCADE COMMUNICATIONS SYSTEMS, INC.

Transcript of Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

Page 1: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

StackStorm

If-This-Than-That for DevOpsautomation

© 2016 BROCADE COMMUNICATIONS SYSTEMS, INC.

Page 2: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~
Page 3: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

@StackStorm@Stack_Storm

Page 4: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

Agenda

What is StackStorm

Why we made it

Use cases

Why is it better than …

Page 5: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

StackStorm: What is it?

© 2016 BROCADE COMMUNICATIONS SYSTEMS, INC.

Page 6: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~
Page 7: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

StackStorm is like …

7

for DevOps

Page 8: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

IFTTT,for DevOps

8

cat /opt/stackstorm/packs/st2-demos/rules/demo.yaml---

name: "sensu_crit_to_slack"pack: "st2-demos"description: "Post all critical alerts to the demoenabled: truetrigger:

type: "sensu.event_handler"criteria:

trigger.check.status:pattern: 2type: "equals"

trigger.check.name:pattern: "demo_.*"type: "matchregex"

action:ref: "slack.post_message"parameters:

message: >[ALERT]{{trigger.client.name}}{{trigger.check.output}}

channel: "#demos"

Page 9: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

9

Page 10: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

10

Trigger Action

Rule

Page 11: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

Ingredients

11

IT Domains

Config mgmtStorageNetworking ContainersCloud InfraMonitoring

ActionsSensors

WorkflowsRules

Ops Support

Trigger Call

Page 12: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

Automation Example

12

Automation

EngineerService

Monitoring IncidentManagement

Event: “low disk on web301”

Web301 is “low disk”

Resolve known cases, fast. Is it

/var/log? Clean up!

Unknown problem, need

a human

Wake up, buddy. Something real

is going on…

Page 13: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

StackStorm: Why we made it?

© 2016 BROCADE COMMUNICATIONS SYSTEMS, INC.

Page 14: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

Opalis, now Microsoft SC Orchestrator2004-2008

© 2016 BROCADE COMMUNICATIONS SYSTEMS, INC.

Event-driven automation with workflows,for Enterprise IT

Page 15: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

VS

VMware, 2008 - 2013

Page 16: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

VS

© 2016 BROCADE COMMUNICATIONS SYSTEMS, INC. 16

OpenStack VMWare

Page 17: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

DevOps Tools

© 2016 BROCADE COMMUNICATIONS SYSTEMS, INC. 17

VS

Enterprise Suites

Page 18: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

VS

18

Event-driven automation with workflows,for Cloud and DevOps

DevOps Tools Enterprise Suites

Page 19: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

19

What can be it automated?

Page 20: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

20

Got StackStorm !

Every automation looks like a nail

Page 21: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

What SHOULD be automated?From: Practice of Cloud System Administration, by Thomas Limoncelli

Page 22: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

What HAS BEEN automated with StackStorm

• Security checks

– On malware detection in a VM, isolate network port on a switch

• App blue-green deployment

– On Jenkins tests passed, bring new vmclaster, deploy and configure app, set loadbalancer to send % of traffic to new app, monitor, roll forward, or back out

• Networking

– On BGP peer goes down: collect troubleshooting data, post on slack & create JIRA ticket

– On Link aggregation member error, check load, if capacity of rest of LAG bundle enough, disable link with error

• OpenStack– orphan VM clean-up: On orphans

detected, shut down, email owner, keep for few days, delete

– VM evacuation on HW failures: On host RAID failure, get list of impacted VMs, email VM owners, evacuate VMs, create JIRA ticket for hardware replacement.

• Service remediation:– Cassandra “node down” recovery: On ring

node dying, deploy new node, configure, add to the ring.

– Remediating RabbitMQ, Galera cluster, MySQL, and more…

22

Page 23: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

23

Auto-Remediation

FB auto-remediates 98% alarms,can you?

Page 24: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

“Auto-Remediation & Automation at Facebook” @ Auto-Remediation Meetup SFhttps://www.meetup.com/Auto-Remediation-and-Event-Driven-Automation/events/236704012/

Facebook FBAR:43 % Problem Fixed51 % False positives94 % Automated

Page 25: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

“Auto-Remediation & Automation at Facebook” @ Auto-Remediation Meetup SFhttps://www.meetup.com/Auto-Remediation-and-Event-Driven-Automation/events/236704012/

FBAR & StackStorm are friendsStackStorm is inspired by FBARStackStorm and FB FBAR collaborating since 2014

Page 26: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

“Sleep Better at Night: OpenStack Cloud Auto-Healing” @ OpenStack Summit Barcelona

Mirantis: Auto-remediating 2,000 node OpenStack cluster at Symantec with StackStorm

Page 27: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

“Sleep Better at Night: OpenStack Cloud Auto-Healing” @ OpenStack Summit Barcelona

Mirantis: Auto-remediating 2,000 node OpenStack cluster at Symantec with StackStorm

User: Symantec (Mirantis)Use case: OpenStack cluster remediationPresented by Mirantis at OpenStack Barcelona

StackStorm at Symantec

Page 28: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

Engineer Wakes up

Logs in and ACK

Checksrunbook

Studiesthe alert

Fixes theproblem

Runs diagnostics

PagerDuty

Alert

2:02 AM 2:07 AM 2:15 AM2:10 AM 2:30 AM2:20 AM2:00 AM

On-call, Without Automation

Source: “Winston: Helping Netflix Engineers Sleep at Night” @ Qcon ‘16 SFhttps://goo.gl/lHzq4r

Page 29: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

FalsePositive

Winston

2:00 AM

2:05 AM

2:05 AM

2:15 AM

AssistedDiagnostics

Fixed theproblem

On-call With Winston

Source: “Winston: Helping Netflix Engineers Sleep at Night” @ Qcon ‘16 SFhttps://goo.gl/lHzq4r

Page 30: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

Benefits

• Reduce MTR (Mean Time to Resolution)

• Avoid failures (fixing on computer time, not human time)

• Reduce risk of human error (no fat fingers)

• Positive team impact

– Avoid pager fatigue and team burn-out

– Turn from reactive to proactive (break reactive vicious cycle)

– Capture operational knowledge – as code

© 2016 BROCADE COMMUNICATIONS SYSTEMS, INC. 30

Page 31: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

31

Network Automation

Now supporting Multiple Vendors

Page 32: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~
Page 33: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

Device supportProven approach from cloud compute space

33

NAPALM

AUTOMATION

Packs

PYTHON

BINDINGS

INTEGRATION

PACKS

DEVICE / OS

INTERFACES

Some workflows [1] leverage unique device functions, so they call actions of the device’s integration pack. Other workflows [2] need to be abstracted and treat devices alike (e.g. IXP provision on mixture of SLX and MLX). So they use “unified” Napalm pack.

device drivers

Integration packs expose device configurations and operations as st2 actions. VDX and SLX packs will expose most/all of device functionality. MLX is “best effort”. Napalm integration pack provide “unified” device actions, just like libcloud does in “compute” domain.

While integration packs can call device interfaces directly, python bindings provide reuse, and abstract device OS versions. PyNOS binds VDX via netconf. SLX and MLX don’t exist yet.Napalm is Open source project that exposes a unified python API to interact with different network devices via device drivers.

Devices expose various interfaces: RESTCONF, NETCONF, CLI/TELNET, SNMP.

VDX MLX SLX

[1] [2]

Other vendors

Page 34: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

34

Including legacy, business apps …

Integration

Page 35: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

“Innovation at Dimension Data: Optimizing Operations with Event Driven Automation” https://stackstorm.com/2016/12/15/dimension-data-devops-beyond-deployment/

Dimension Data (SP, part of NTT)

Page 36: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

“Innovation at Dimension Data: Optimizing Operations with Event Driven Automation” https://stackstorm.com/2016/12/15/dimension-data-devops-beyond-deployment/

• Integrate IT systems & tools• Security automation• Legacy Run-book replacement• Automation-aaS to end-users• Top st2 contributors

Dimension Data (SP, part of NTT)

Page 37: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~
Page 38: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

38

IoTFun stuff

Page 39: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~
Page 40: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

40

Serverless

Grab StackStorm & DIY

Page 41: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

StackStorm is like …

41

AWS Lambda AWS Step Functions

OpenSource, for DIY Serverless

Page 42: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

StackStorm is like …

42

ActionsSensors

WorkflowsRules

IT Domains

Config mgmtStorageNetworking ContainersCloud InfraMonitoring Ops Support

Step Functions

AWS Lambda

Page 43: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

Serverless with Swarmfor Genomic Annotation ComputingDmitri Ziminehttp://github.com/dzimine/serverless@dzimine

Image by Miki Yoshihito, Creative Commons license

Page 44: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

Many Use Cases – One Platform

44

StackStorm automation platform

Ne

two

rk

Au

tom

atio

n

Ass

iste

dT

rou

ble

sho

otin

g

Au

toR

em

ed

iatio

n

IT p

roce

ss

inte

gra

tio

n

IoT

Inte

rne

t o

f T

hin

gs

Se

rve

rle

ss

CI/

CD

Co

ntin

uo

us

De

plo

ym

en

t

NF

V

Se

curi

tyO

rch

est

ratio

n

Ch

atO

ps

Page 45: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

Why not…

45

Page 46: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

Why not Scripts?

46

Page 47: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

Why not Scripts?

47

• Simple to define, reason, visualize

• Transparent

– state is clear, execution is trackable: running, complete, failed steps

Page 48: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

© 2016 BROCADE COMMUNICATIONS SYSTEMS, INC.48

Page 49: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

Workflows Better in Operations

• Simple to define, reason, visualize

• Transparent

– state is clear, execution is trackable: running, complete, failed steps

• Reliable

– Workflows are long-running

– Crash tolerance

– “Restart from point of failure”

© 2016 BROCADE COMMUNICATIONS SYSTEMS, INC. 49

Page 50: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

50

Why not legacy Runbook Automation?

• Microsoft System Center Orchestrator

• HP Operation Orchestrator

• Cisco Process Orchestrator (CPO)

• VMWare vCO / vRealize

They do not DevOps

Page 51: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

Infrastructure As Code

Leverage social coding and collaboration

OpenSource

Designed for Devops

Page 52: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

Infrastructure As Code

Leverage social coding and collaboration

OpenSource

Designed for Devops

Page 53: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

53

Infrastructure as code

Case Study

• Service Catalog backed up by workflow

• Automate provisioning on VMW/OpenStack, 4 Data centers

• Before: CPO, operator updates via GUI, click and pray, x4

• After: StackStorm, dev -> code review -> staging -> QA-> prod

Page 54: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

Infrastructure As Code

Leverage social coding and collaboration

OpenSource

Designed for Devops

Page 55: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

265 ContributorsSource: https://openhub.net/p/st2

Page 56: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

256 = 100,000,0002

Contributors

Page 57: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

StackStorm & BWC Usage

© 2016 BROCADE COMMUNICATIONS SYSTEMS, INC. 57

0

500

1000

1500

2000

2500

3000

Sep Oct Nov Dec Jan Feb

Installations/month- StackStorm- BWC

Page 58: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

StackStorm Exchange

© 2016 BROCADE COMMUNICATIONS SYSTEMS, INC. 58

Page 59: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

StackStorm Exchange

© 2016 BROCADE COMMUNICATIONS SYSTEMS, INC. 59

Page 60: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

© 2016 BROCADE COMMUNICATIONS SYSTEMS, INC. 60

Page 61: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

© 2016 BROCADE COMMUNICATIONS SYSTEMS, INC. 61

Page 62: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

Take away:* Try it* Use it* Contribute to it

62

Page 63: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

63

Page 64: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

• Use StackStorm.

Try it, find automation, nail POC. Let us know, good & bad.

curl -sSL https://stackstorm.com/packages/install.sh | bash -s

docs.stackstorm.com/install

• Commit code. Become a “community maintainer”It is not hard (2 days?). We help & support.

• Spread the word

Blog. Tweet. Talk. Mention. Bug. Github Star!

64

Contribute! Everything counts

Page 65: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

Thank You!

@Stack_Stormhttp://github.com/StackStorm/st2 Star 1,869

Dmitri Zimine@dzimine

Page 66: Event-driven automation, DevOps way ~IoT時代の自動化、そのリアリティとは?~

OpenSource Apache 2.0

• Github: github.com/StackStorm/st2

• Twitter: Stack_Storm

• IRC: #stackstorm on FreeNode

• stackstorm.slack.com on Slack

• www.stackstorm.com

© 2016 BROCADE COMMUNICATIONS SYSTEMS, INC. 66

StackStorm Brocade Workflow Composer

Commercial Edition

• Enterprise features

• Priority support

• brocade.com/bwc

• docs: bwc-docs.brocade.com

• Network lifecycle automation suite