Cern Cloud Architecture - February, 2016


CERN Cloud Architecture

Ops Midcycle - High Performance Computing with OpenStack - Manchester, 2016

Belmiro Moreira [email protected] @belmiromoreira

What is CERN?


CERN Cloud – LHC and Experiments


CMS detector

https://www.google.com/maps/streetview/#cern

CERN Cloud – AMS


OpenStack at CERN by numbers


~ 5500 Compute Nodes (~140k cores)
•  ~ 5300 KVM
•  ~ 200 Hyper-V

~ 2800 Images ( ~ 44 TB in use)

~ 2000 Volumes ( ~ 800 TB allocated)

~ 2200 Users

~ 2500 Projects

> 17000 VMs running

Number of VMs created (green) and VMs deleted (red) every 30 minutes
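To put the headline figures in proportion, the short sketch below derives a few averages from them. It uses only the rounded numbers quoted on the slide, treats "> 17000 VMs" as 17000 (so the VM density is a lower bound), and assumes 1 TB = 1000 GB; the function and variable names are illustrative.

```python
# Approximate figures quoted on the slide (all values are rounded).
compute_nodes = 5500
cores = 140_000
running_vms = 17_000          # the slide says "> 17000", so this is a lower bound
images = 2800
image_storage_tb = 44
volumes = 2000
volume_storage_tb = 800


def per_unit(total: float, count: int) -> float:
    """Return the average amount of `total` per item, rounded to one decimal."""
    return round(total / count, 1)


# Assumption: 1 TB = 1000 GB.
averages = {
    "cores per compute node": per_unit(cores, compute_nodes),
    "running VMs per compute node": per_unit(running_vms, compute_nodes),
    "GB in use per image": per_unit(image_storage_tb * 1000, images),
    "GB allocated per volume": per_unit(volume_storage_tb * 1000, volumes),
}

for label, value in averages.items():
    print(f"{label}: ~{value}")
```

These are fleet-wide averages only; the slide gives no breakdown per cell or per hypervisor type.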

OpenStack timeline at CERN


Upstream OpenStack releases:
•  ESSEX: 5 Apr 2012
•  FOLSOM: 27 Sep 2012
•  GRIZZLY: 4 Apr 2013
•  HAVANA: 17 Oct 2013
•  ICEHOUSE: 17 Apr 2014
•  JUNO: 16 Oct 2014
•  KILO: 30 Apr 2015
•  LIBERTY

CERN deployments:
•  “Guppy”: Jun 2012
•  “Hamster”: Oct 2013
•  “Ibex”: Mar 2013
•  Grizzly: Jul 2013
•  Havana: February 2014
•  Icehouse: October 2014
•  Juno: April 2015
•  Kilo: October 2015

CERN production infrastructure

OpenStack timeline at CERN

•  Evolution of the number of VMs created since July 2013

[Plot: number of VMs running and number of VMs created (cumulative)]

Infrastructure Overview
•  One region, two data centres, 33 Cells
•  HA architecture only on the Top Cell
•  Child cell control planes are usually VMs running in the shared infrastructure
•  Using nova-network with a custom CERN driver; Neutron in one cell
•  2 hypervisor types (KVM, Hyper-V)
•  Scientific Linux CERN 6; CERN CentOS 7; Windows Server 2012 R2
•  2 Ceph instances
•  Keystone integrated with the CERN account/lifecycle system
•  Nova; Keystone; Glance; Cinder; Heat; Horizon; Ceilometer; Rally; Magnum; Neutron
•  Deployment using OpenStack Puppet modules and RDO

Architecture Overview

[Diagram: components shown]
•  Load Balancer
•  Services: Keystone, Glance, Cinder, Heat, Ceilometer, Horizon, Neutron, Magnum
•  Nova Top Cell
•  Geneva Data Centre: Nova Compute Cells (...), Ceph, DB infrastructure
•  Budapest Data Centre: Nova Compute Cells (...), Ceph, DB infrastructure

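Cells

[Diagram: a set of cells, each labelled with a site (GVA or WIG), a hypervisor type (KVM or HyperV) and an availability zone (AVZ_A, AVZ_B or AVZ_C). The diagram also carries the label "Project: uuid1".]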
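The cell labels on this slide (site, hypervisor type, availability zone, and a project ID) suggest how a request can be narrowed down to a subset of cells. The sketch below is a toy model of that idea, not the nova-cells scheduler: the cell names, the pairing of site, hypervisor and zone, the `PROJECT_CELLS` mapping and the `candidate_cells` function are all hypothetical, and it assumes that the "Project: uuid1" label means some projects are mapped to specific cells.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class Cell:
    name: str        # hypothetical cell name
    site: str        # "GVA" or "WIG", as labelled on the slide
    hypervisor: str  # "KVM" or "HyperV"
    avz: str         # "AVZ_A", "AVZ_B" or "AVZ_C"


# Illustrative inventory only: these combinations are made up and do not
# reproduce the actual CERN layout.
CELLS: List[Cell] = [
    Cell("cell01", "GVA", "KVM", "AVZ_A"),
    Cell("cell02", "GVA", "HyperV", "AVZ_B"),
    Cell("cell03", "WIG", "KVM", "AVZ_A"),
    Cell("cell04", "WIG", "KVM", "AVZ_C"),
]

# Hypothetical mapping from a project ID to the cells it is restricted to.
# Projects without an entry may use any cell.
PROJECT_CELLS: Dict[str, Set[str]] = {"uuid1": {"cell01", "cell03"}}


def candidate_cells(
    project_id: str,
    avz: Optional[str] = None,
    hypervisor: Optional[str] = None,
) -> List[Cell]:
    """Return the cells that satisfy the project mapping and optional filters."""
    allowed = PROJECT_CELLS.get(project_id)
    result = []
    for cell in CELLS:
        if allowed is not None and cell.name not in allowed:
            continue  # project is pinned to other cells
        if avz is not None and cell.avz != avz:
            continue  # wrong availability zone
        if hypervisor is not None and cell.hypervisor != hypervisor:
            continue  # wrong hypervisor type
        result.append(cell)
    return result


if __name__ == "__main__":
    for cell in candidate_cells("uuid1", avz="AVZ_A"):
        print(cell)
```

In this toy model an availability zone can span several cells and both sites, so a request that names only a zone still leaves a choice of cell to a later scheduling step.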

Nova Deployment at CERN

[Diagram: components shown]
•  Load Balancer
•  API node: nova-api
•  Top cell controller: rabbitmq, nova-cells
•  Child cell controller (one per cell): rabbitmq, nova-cells, nova-api, nova-scheduler, nova-conductor, nova-network
•  Compute node: nova-compute
•  DB (shown for the top cell and for each child cell)
•  (...)

Keystone Deployment at CERN

[Diagram: components shown]
•  Load Balancer
•  Keystone (Exposed to Users): Service Catalogue, DB
•  Keystone (Dedicated to Ceilometer): Service Catalogue, DB
•  Active Directory

Glance Deployment at CERN

[Diagram: components shown]
•  Load Balancer
•  Glance node (Exposed to Users): Glance-api, Glance-registry
•  Glance node (Only used for Ceilometer calls): Glance-api, Glance-registry
•  DB
•  Ceph Geneva

Cinder Deployment at CERN

[Diagram: components shown]
•  Load Balancer
•  Cinder node: Cinder-api, Cinder-volume, Cinder-scheduler
•  rabbitmq
•  DB
•  Storage backends: Ceph Geneva, Ceph Budapest, NetApp

Ceilometer Deployment at CERN

[Diagram: components shown]
•  Compute node: nova-compute, ceilometer-compute
•  ceilometer-central-agent
•  Cell rabbitmq (notifications)
•  Ceilometer rabbitmq
•  Ceilometer Notification Agent
•  Collectors: Ceilometer Pulling Collector, Ceilometer Notification Collector, Ceilometer UDP Collector
•  Sample transport: sample RPC, sample UDP
•  Storage: Hbase, Mysql, MongoDB
•  Ceilometer API (shown twice)
•  Aodh API, Aodh Evaluator & Notifier
•  HEAT

Challenges
•  Capacity increase to 200k cores by Summer 2016
•  Live migrate thousands of VMs (~5000)
    •  Upgrade ~800 compute nodes from SLC6 to CC7
    •  Retire old servers
•  Migrate to Neutron
•  Identity Federation with different scientific sites
•  Scale Magnum and containers deployment
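As a rough sense of scale for the capacity target, the sketch below compares the 200k-core goal with the ~140k cores and ~5500 compute nodes quoted earlier in the talk. It assumes, purely for illustration, that new nodes would have the same average core count as the current fleet; newer hardware with more cores per node would need fewer servers. The variable names are illustrative.

```python
import math

# Rounded figures from the talk.
current_cores = 140_000
current_nodes = 5500
target_cores = 200_000

additional_cores = target_cores - current_cores
growth_percent = round(100 * additional_cores / current_cores, 1)

# Assumption: new nodes match today's fleet-wide average cores per node.
additional_nodes = math.ceil(additional_cores * current_nodes / current_cores)

print(f"additional cores needed: {additional_cores}")
print(f"growth over current capacity: ~{growth_percent}%")
print(f"additional nodes at current density: ~{additional_nodes}")
```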


[email protected] @belmiromoreira

http://openstack-in-production.blogspot.com