Integrating OpenStack to Existing Infrastructure

Post on 12-Jan-2015

Hui Cheng's Thurs, 1 pm session

Cheng, Hui (freedomhui@gmail.com)

2012-04-19

Agenda

Background

● Who We Are

● Infrastructure & Platform

● Challenges

Integration Challenges

● Network Deployment

● Security Consideration

● Load Balancer

● Swift Evaluation

Our Contributions

● Billing

● Monitoring

Who Are We

Sina.com
• Largest infotainment web portal in China
• Provides various online services such as news, finance, video, email, and blog hosting
• Operates the first PaaS cloud computing platform

Sina Weibo
• Twitter-like microblog service
• Over 300 million users
• Huge influence on Chinese society

We are building a reliable, scalable, and secure infrastructure and platform to support our business.

Infrastructure & Platform

Physical Servers
● Traditional operation

Virtualization Platform (IaaS)
● VM Management System (VMMS) → Sina Web Service (SWS)
● VMMS is a private solution developed in-house
● SWS is based on OpenStack

Application Platform (PaaS)
● Virtual Host → Sina App Engine (SAE)
● SAE provides both public and private service
● Proven to be efficient and robust

Sina App Engine

• No. 1 public PaaS platform in China, launched in Nov 2009
• PHP, Python, Java, and Ruby support
• Numbers
  • 160,000+ developers
  • 200,000+ apps on SAE
  • 800 million page views per day
  • 20+ services
• SAE Cloud Storage Service is replaced by Swift
• Deploy SAE on OpenStack

Challenges

SAE meets the majority of business needs but does not cover all of them, especially for web games

Customers require a full cloud computing stack

We chose OpenStack as our IaaS solution

Why Choose OpenStack

100% Python & Open Source

OpenStack Deployment

[Diagram: components of the deployment]
● nova-api
● scheduler
● nova-compute + nova-network (shown on two nodes)
● glance
● Swift
● dashboard
● keystone
● Rabbit, MySQL
● Sina SSO

Nova Network

Networking is the biggest challenge for IaaS

Network Topology:

• VLAN

• FlatDHCP

• FlatDHCP & Multihost

Network Topology --- VLAN

Capability:
• VMs within one tenant can reach each other
• VMs of different tenants are isolated from each other
• VMs can access the public network
• VMs are accessible from the public network
• The virtual network is isolated from the internal network

Drawback:
• Networks must be pre-allocated for future projects
• Traffic bottleneck at the NAT gateway
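
The pre-allocation drawback can be illustrated with a small sketch. It assumes a hypothetical fixed range of 10.0.0.0/16, one /24 per project, and VLAN IDs starting at 100; none of these values come from the talk, and the function name is made up for illustration.

```python
import ipaddress


def preallocate_networks(fixed_range, prefix, first_vlan, count):
    """Carve `count` per-project subnets out of `fixed_range`, one VLAN each."""
    subnets = ipaddress.ip_network(fixed_range).subnets(new_prefix=prefix)
    plan = []
    # zip stops after `count` items, or earlier if the range runs out of subnets.
    for offset, subnet in zip(range(count), subnets):
        plan.append({"vlan": first_vlan + offset, "cidr": str(subnet)})
    return plan


# Hypothetical values: every project network has to exist before the project does.
plan = preallocate_networks("10.0.0.0/16", 24, 100, 4)
for net in plan:
    print(net["vlan"], net["cidr"])
```

Each entry reserves address space and a VLAN ID up front, whether or not a project ever uses it.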

Network Topology (Flat)

Capability:
• All VMs in the fixed IP range can reach each other
• VMs can access the public network
• VMs are accessible from the public network
• Full isolation between the virtual network and the internal network

Drawback:
• Weaker tenant isolation
• Traffic bottleneck at the NAT gateway

Network Topology (Flat & Multihost)

Capability:
• All VMs in the fixed IP range can reach each other
• VMs can access the public network
• VMs are accessible from the public network

Bonus:
• Fully distributed architecture avoids a single point of failure
• Multiple gateways eliminate the NAT bottleneck
• High throughput between OS regions

Drawback:
• Weaker tenant isolation
• Needs a security facility (SWS filter) to protect the intranet

If the security problems were solved, this would be our best choice!

Security in OpenStack

Static filters --- Layer 2 Filter
• MAC, IP, and ARP spoofing protection
• Not configurable
• Defined in /etc/libvirt/nwfilter/*.xml
• Implemented by ebtables: ebtables -t nat --list

Security Group --- Layer 3 Filter
• Role-based firewall
  • One security group is a role
• Ingress filtering
  • Target is the instance
  • Source can be a CIDR or another group
• Implemented by iptables
  • See details: iptables -t filter -n -L
  • Whitelist mechanism (ACCEPT rules)
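
The security group semantics above (ingress only, whitelist, source given as a CIDR or as another group) can be modelled in a few lines. This is a minimal sketch and not how nova builds its iptables rules; the `Rule` class, the function name, and all addresses are hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    protocol: str            # "tcp" or "udp"
    port: int                # destination port on the instance
    source_cidr: str = ""    # allowed source range, or "" if unused
    source_group: str = ""   # allowed source security group, or "" if unused


def ingress_allowed(rules, protocol, port, src_ip, src_groups):
    """Whitelist check: accept only if some rule matches, otherwise drop."""
    addr = ipaddress.ip_address(src_ip)
    for rule in rules:
        if rule.protocol != protocol or rule.port != port:
            continue
        if rule.source_cidr and addr in ipaddress.ip_network(rule.source_cidr):
            return True
        if rule.source_group and rule.source_group in src_groups:
            return True
    return False


rules = [
    Rule("tcp", 80, source_cidr="0.0.0.0/0"),   # anyone may reach port 80
    Rule("tcp", 3306, source_group="web"),      # only members of group "web"
]
print(ingress_allowed(rules, "tcp", 80, "203.0.113.7", set()))     # True
print(ingress_allowed(rules, "tcp", 3306, "10.0.0.5", {"web"}))    # True
print(ingress_allowed(rules, "tcp", 22, "10.0.0.5", {"web"}))      # False
```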

Security Enhancement

SWS Filter

Prevent Intranet Penetration
• Intranet is the internal network outside of OpenStack

Egress filtering
• Target is the internal network
• Source is instances in OpenStack

Implementation
• Whitelist mechanism (ACCEPT rules)
• On top of nova-filter-top in the FORWARD chain

Rationale
• SWS filter is managed by the cloud manager
• Only explicitly authorized packets can reach the internal network
• Packets should be controlled within the compute node
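
A sketch of the egress decision described above, assuming a hypothetical intranet range and a single whitelist entry. The real SWS filter is a set of iptables rules on the compute node; this only models the verdict, and the names and addresses are illustrative.

```python
import ipaddress

# Hypothetical values for illustration only.
INTRANET = [ipaddress.ip_network("172.16.0.0/12")]
WHITELIST = [
    # (destination network, protocol, destination port)
    (ipaddress.ip_network("172.16.10.20/32"), "tcp", 3306),
]


def egress_verdict(dst_ip, protocol, port):
    """Decide what happens to a packet leaving an instance."""
    dst = ipaddress.ip_address(dst_ip)
    if not any(dst in net for net in INTRANET):
        return "PASS"        # not intranet traffic; left to the other filters
    for net, proto, dport in WHITELIST:
        if dst in net and proto == protocol and dport == port:
            return "ACCEPT"  # explicitly authorized by the cloud manager
    return "DROP"            # intranet traffic without a whitelist entry


print(egress_verdict("172.16.10.20", "tcp", 3306))  # ACCEPT
print(egress_verdict("172.16.10.21", "tcp", 3306))  # DROP
print(egress_verdict("8.8.8.8", "udp", 53))         # PASS
```

The default for intranet destinations is DROP, which is what makes this a whitelist rather than a blacklist.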

Security Enhancement

Security Group vs. SWS Filter

Load Balancer Design

Load Balance
• Dispatch requests
• Support multiple routing algorithms
• Health check

Acceleration
• Reality: narrow bandwidth between ISPs
• Build fiber channels from each ISP to a pivot
• Give each user an endpoint within the user's own ISP

IPv4 Shortage
• Reality: dozens of public IPs support hundreds of VMs
• IPv4 has been exhausted
• IPv6 is not realistic yet in China

DNS Acceleration Design

[Diagram labels: Unicom, Mobile, Telecom, other ISPs; Smart DNS; public network; high-speed fiber channel; pivot]

Load Balancer

Layer 7 Load Balancer
Consideration:
1. Dispatch requests by Host header
2. nginx module
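
Dispatching by Host header lets many applications share one public IP. The sketch below shows only the routing decision, with round-robin inside each pool; it is not the nginx module from the talk, and the class name, host names, and backend addresses are hypothetical.

```python
import itertools


class HostRouter:
    """Map an HTTP Host header to a backend pool, round-robin within the pool."""

    def __init__(self, pools):
        self._cycles = {host.lower(): itertools.cycle(backends)
                        for host, backends in pools.items()}

    def pick(self, host_header):
        # Drop an optional ":port" suffix and normalize case before the lookup.
        # (IPv6 literals in the Host header are not handled in this sketch.)
        host = host_header.split(":")[0].lower()
        cycle = self._cycles.get(host)
        return next(cycle) if cycle else None


router = HostRouter({
    "app1.example.com": ["10.0.0.11:80", "10.0.0.12:80"],
    "app2.example.com": ["10.0.0.21:80"],
})
print(router.pick("app1.example.com"))      # 10.0.0.11:80
print(router.pick("APP1.example.com:80"))   # 10.0.0.12:80
print(router.pick("unknown.example.com"))   # None
```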

Load Balancer

Layer 4 Load Balancer
Consideration:
1. Dispatch requests by TCP port
2. lvs + haproxy
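
For non-HTTP traffic the dispatch key is the TCP port, so one public IP can front many VMs. A minimal sketch of the mapping table follows, assuming a hypothetical public port range; it models the bookkeeping only, not lvs or haproxy configuration.

```python
class PortMapper:
    """Share one public IP among many VMs by giving each service its own TCP port."""

    def __init__(self, first_port, last_port):
        self._free = iter(range(first_port, last_port + 1))
        self._table = {}   # public port -> (vm_ip, vm_port)

    def expose(self, vm_ip, vm_port):
        """Reserve the next free public port for a service running on a VM."""
        public_port = next(self._free, None)
        if public_port is None:
            raise RuntimeError("public port range exhausted")
        self._table[public_port] = (vm_ip, vm_port)
        return public_port

    def lookup(self, public_port):
        """Return the (vm_ip, vm_port) behind a public port, or None."""
        return self._table.get(public_port)


mapper = PortMapper(20000, 20999)           # hypothetical range
ssh_a = mapper.expose("10.0.0.5", 22)
ssh_b = mapper.expose("10.0.0.6", 22)
print(ssh_a, mapper.lookup(ssh_a))          # 20000 ('10.0.0.5', 22)
print(ssh_b, mapper.lookup(ssh_b))          # 20001 ('10.0.0.6', 22)
```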

Swift Evaluation

• Extremely durable and highly available
• Superior scalability
• Linear growth of performance
• Symmetric architecture
• No single point of failure
• Simple & reliable

Swift Evaluation

[Diagram: a load balancer and five zones (Zone1 to Zone5); each zone runs a Proxy Server, Object Server, Container Server, and Account Server. Example requests: PUT abc.png, GET abc.png]

• 1 zone = 1 physical server with 12 x 2 TB disks
• Writes and reads apply a quorum protocol
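
The quorum rule for writes can be sketched as a simple majority over the replicas. This is a simplified model under that assumption, not Swift's proxy code, and it leaves out how reads pick a replica; the function names are illustrative.

```python
def quorum(replicas):
    """Smallest majority of `replicas`."""
    return replicas // 2 + 1


def write_succeeds(replicas, acks):
    """A write is reported successful once a majority of replicas acknowledged it."""
    return acks >= quorum(replicas)


REPLICAS = 3
print(quorum(REPLICAS))              # 2
print(write_succeeds(REPLICAS, 2))   # True
print(write_succeeds(REPLICAS, 1))   # False
```

With three replicas, a PUT can still succeed while one zone is unavailable.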

Swift Evaluation

Physical Deployment

[Diagram: 12 physical disks (disk1 ... disk12) exposed as devices sda ... sdk, with a RAID 1 pair; labels: OS installation, Swift packages, Storage Nodes, Proxy Server, Account Server, Container Server, Object Server]

Swift Evaluation

Performance issue
CPU utilization reaches 100% even without requests

Audit:
• swift-account-auditor: 1.5 min
• swift-account-replicator: 9.5 min
• swift-container-auditor: 8.4 min
• swift-container-replicator: 9.3 min
• swift-container-updater: 19.0 min
• swift-object-updater: 0.1 s
• swift-object-replicator: 10.5 hours
• swift-object-auditor: 48.3 hours

Testing environment:
• Nodes: 5 x Dell R510
• CPU: Intel® Xeon® E5360
• Memory: 12 GB
• Replica: 3
• No. of Objects: 150,000,000
• No. of Accounts: 120,000
• No. of Containers: 160,000

Result: periodic scanning of all partitions, checksum calculation, and synchronization
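
A back-of-the-envelope check on the object auditor figure, using the numbers from the test environment. It assumes that replicas are spread evenly over the five nodes and that 48.3 hours is one full auditor pass on a single node; the talk states neither, so treat the result as an order of magnitude only.

```python
OBJECTS = 150_000_000
REPLICAS = 3
NODES = 5
AUDIT_HOURS = 48.3

# Object replicas each node has to scan in one pass (even spread assumed).
replicas_per_node = OBJECTS * REPLICAS // NODES
audit_seconds = AUDIT_HOURS * 3600
rate = replicas_per_node / audit_seconds

print(replicas_per_node)                 # 90000000
print(f"{rate:.0f} object replicas/s")   # about 518
```

Under these assumptions each node checksums several hundred objects per second around the clock, which is consistent with the CPU staying busy without any client requests.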

● Billing & Monitoring

[Diagram labels: Compute, Network, Storage; Monitoring (Metering); Billing; Dashboard; RDBMS; NoSQL; legend: RPC, Database, Client]

● Kanyun: Monitoring system

[Diagram labels: Compute, Network, Storage; Worker (x2); Aggregator; API daemon; NoSQL; Dashboard; Billing; RDBMS]

Annotations:
• Responds to client requests
• Calculates/stores metrics
• Retrieves usage info

http://github.com/lzyeval/kanyun
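
The flow of workers reporting samples, an aggregator storing them, and billing retrieving usage can be sketched as below. This is not Kanyun's actual interface; the class, method names, metric name, and values are hypothetical, and an in-memory dict stands in for the NoSQL store.

```python
from collections import defaultdict


class Aggregator:
    """Stores metric samples and answers usage queries (hypothetical interface)."""

    def __init__(self):
        # (instance_id, metric) -> list of (timestamp, value) samples
        self._samples = defaultdict(list)

    def record(self, instance_id, metric, timestamp, value):
        """Called by a worker after it samples a compute/network/storage metric."""
        self._samples[(instance_id, metric)].append((timestamp, value))

    def usage(self, instance_id, metric, start, end):
        """Sum of samples with start <= timestamp < end."""
        samples = self._samples.get((instance_id, metric), [])
        return sum(value for ts, value in samples if start <= ts < end)


agg = Aggregator()
agg.record("vm-1", "net_out_bytes", 0, 100)
agg.record("vm-1", "net_out_bytes", 60, 250)
agg.record("vm-1", "net_out_bytes", 120, 50)
print(agg.usage("vm-1", "net_out_bytes", 0, 120))   # 350
```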

● Dough: Billing system

[Diagram labels: Compute, Network, Storage; Monitoring (Metering); Collector (x2); Farmer; API daemon; Dashboard; RDBMS; legend: RPC, Database, Client]

Annotations:
• Subscribe or unsubscribe products / Query info
• Dispatch jobs
• Check status / Retrieve usage / Create purchases

http://github.com/lzyeval/dough
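
The "retrieve usage, create purchases" step can be sketched as a price lookup over metered usage. This is not Dough's data model; the resource names, prices, and function name are hypothetical and only show the shape of the calculation.

```python
from decimal import Decimal

# Hypothetical price list: resource -> price per unit of usage.
PRICES = {
    "instance_hours": Decimal("0.50"),
    "network_out_gb": Decimal("0.80"),
}


def create_purchases(usage):
    """Turn metered usage into (resource, amount) line items."""
    purchases = []
    for resource, quantity in sorted(usage.items()):
        if resource not in PRICES:
            raise KeyError(f"no price for {resource}")
        # Go through str() so float quantities do not leak binary rounding noise.
        amount = PRICES[resource] * Decimal(str(quantity))
        purchases.append((resource, amount))
    return purchases


items = create_purchases({"instance_hours": 24, "network_out_gb": 1.5})
total = sum(amount for _, amount in items)
print(items)
print(total)
```

Decimal is used instead of float so that amounts add up exactly.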

Q & A
