Integrating OpenStack into Existing Infrastructure
Agenda

Background
● Who We Are
● Infrastructure & Platform
● Challenges
Integration Challenges
● Network Deployment
● Security Consideration
● Load Balancer
● Swift Evaluation
Our Contributions
● Billing
● Monitoring
Who Are We
Sina.com
● Largest infotainment web portal in China
● Provides various online services, like news, finance, video, email, blog hosting, etc.
● Operates the first PaaS cloud computing platform in China
Sina Weibo
● Twitter-like microblog service
● Over 300 million users
● Huge influence on Chinese society
We are building a reliable, scalable, and secure infrastructure and platform to support our business.
Infrastructure & Platform
Physical Servers
● Traditional operation
Virtualization Platform (IaaS)
● VM Management System (VMMS) → Sina Web Service (SWS)
● VMMS is a proprietary solution developed in-house
● SWS is based on OpenStack
Application Platform (PaaS)
● Virtual Host → Sina App Engine (SAE)
● SAE provides both public and private service
● Proven to be efficient and robust
Sina App Engine
● No. 1 public PaaS platform in China, launched in Nov 2009
● PHP, Python, Java, and Ruby support
● Numbers:
  160,000+ developers
  200,000+ apps on SAE
  800 million page views per day
  20+ services
● SAE Cloud Storage Service has been replaced by Swift
● Deploy SAE on OpenStack
Challenges
SAE meets the majority of business needs but does not cover them all, especially web games.
Customers require the full stack of cloud computing.
We chose OpenStack as our IaaS solution.
Why Choose OpenStack
100% Python & Open Source
OpenStack Deployment
[Deployment diagram: dashboard, nova-api, scheduler, keystone (integrated with Sina SSO), glance, and Swift, with RabbitMQ and MySQL connecting them to multiple nova-compute/nova-network nodes.]
Nova Network
Networking is the biggest challenge for IaaS
Network Topology:
• VLAN
• FlatDHCP
• FlatDHCP & Multihost
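For reference, the FlatDHCP + multi-host topology discussed below is selected through nova-network flags. A hedged sketch of the relevant nova.conf fragment (nova-network era; the interface names and IP range here are illustrative, not the actual deployment values):

```ini
# Sketch only: FlatDHCP with multi_host, so each compute node
# runs its own nova-network gateway (no central NAT bottleneck).
network_manager = nova.network.manager.FlatDHCPManager
multi_host = True
flat_interface = eth1        # carries VM traffic (assumed name)
public_interface = eth0      # NAT / floating IPs (assumed name)
fixed_range = 10.0.0.0/16    # illustrative fixed IP range
```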
Network Topology --- VLAN
Capability:
● Accessibility of VMs within one tenant
● Isolation of VMs from different tenants
● VMs are able to access the public network
● VMs can be accessible from the public network
● Isolation between the virtual network and the internal network

Drawback:
● Networks must be pre-allocated for future projects
● Traffic bottleneck at the NAT gateway
Network Topology (Flat)

Capability:
● Accessibility of all VMs in the fixed IP range
● VMs are able to access the public network
● VMs can be accessible from the public network
● Full isolation between the virtual network and the internal network

Drawback:
● Tenant isolation lessens
● Traffic bottleneck at the NAT gateway
Network Topology(Flat & Multihost)
Capability:
● Accessibility of all VMs in the fixed IP range
● VMs are able to access the public network
● VMs can be accessible from the public network

Bonus:
● Fully distributed architecture avoids single points of failure
● Multiple gateways eliminate the NAT bottleneck
● High throughput between OS regions

Drawback:
● Tenant isolation lessens
● Needs a security facility (SWS filter) to protect the intranet

If the security problems were solved, this would be our best choice!
Security in OpenStack

Static filters --- Layer 2 filter
● MAC, IP, and ARP spoofing protection
● Not configurable
● Defined in /etc/libvirt/nwfilter/*.xml
● Implemented with ebtables: ebtables -t nat --list

Security Group --- Layer 3 filter
● Role-based firewall: one security group is a role
● Ingress filtering: the target is the instance; the source can be a CIDR or another group
● Implemented with iptables; see details: iptables -t filter -n -L
● Whitelist mechanism (ACCEPT rules)
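To illustrate the security-group semantics above, here is a hedged Python sketch of whitelist-style ingress matching: a packet is accepted only if some rule explicitly allows it, where a rule's source is either a CIDR or another group's members. All names and data structures are invented for the example; they are not OpenStack API objects.

```python
import ipaddress

def ingress_allowed(packet, rules, group_members):
    """Whitelist check: packet is a dict with 'src_ip', 'proto', 'port'.
    Returns True only if an ACCEPT rule matches; default is drop."""
    src = ipaddress.ip_address(packet["src_ip"])
    for rule in rules:
        if rule["proto"] != packet["proto"]:
            continue
        lo, hi = rule["port_range"]
        if not (lo <= packet["port"] <= hi):
            continue
        # source may be a CIDR ...
        if "cidr" in rule and src in ipaddress.ip_network(rule["cidr"]):
            return True
        # ... or another security group (matched by its member IPs)
        if "src_group" in rule and packet["src_ip"] in group_members.get(rule["src_group"], set()):
            return True
    return False  # no ACCEPT rule matched: default drop

rules = [
    {"proto": "tcp", "port_range": (22, 22), "cidr": "10.0.0.0/8"},
    {"proto": "tcp", "port_range": (80, 80), "src_group": "web"},
]
members = {"web": {"192.168.1.5"}}
print(ingress_allowed({"src_ip": "10.1.2.3", "proto": "tcp", "port": 22}, rules, members))  # True
print(ingress_allowed({"src_ip": "8.8.8.8", "proto": "tcp", "port": 22}, rules, members))   # False
```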
Security Enhancement
SWS Filter

Prevent intranet penetration
● The intranet is the internal network outside of OpenStack

Egress filtering
● Target is the internal network
● Source is instances in OpenStack

Implementation
● Whitelist mechanism (ACCEPT rules)
● On top of the nova-filter-top FORWARD chain

Rationale
● The SWS filter is managed by the cloud manager
● Only explicitly authorized packets can reach the internal network
● Packets should be controlled within the compute node
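A hedged sketch of the egress-whitelist idea: generate iptables ACCEPT rules for explicitly authorized VM → intranet flows, with a default DROP toward the intranet evaluated on the compute node itself. The chain name, addresses, and the final DROP target network are illustrative; the real SWS filter hooks into nova's FORWARD chain handling.

```python
def sws_filter_rules(authorized, chain="sws-filter"):
    """Build iptables command strings for an egress whitelist.
    authorized: list of (vm_ip, intranet_cidr, proto, port) tuples
    that the cloud manager has explicitly approved (example data)."""
    cmds = [
        f"iptables -N {chain}",
        f"iptables -I FORWARD 1 -j {chain}",  # evaluate before other FORWARD rules
    ]
    for vm_ip, dst_cidr, proto, port in authorized:
        cmds.append(
            f"iptables -A {chain} -s {vm_ip} -d {dst_cidr} "
            f"-p {proto} --dport {port} -j ACCEPT")
    # Default: traffic toward the intranet that was not explicitly
    # whitelisted is dropped on the compute node (illustrative CIDR).
    cmds.append(f"iptables -A {chain} -d 10.0.0.0/8 -j DROP")
    return cmds

for cmd in sws_filter_rules([("172.16.0.5", "10.2.0.0/16", "tcp", 3306)]):
    print(cmd)
```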
Security Enhancement
Security Group vs. SWS Filter
Load Balancer Design

Load balancing
● Dispatches requests
● Supports multiple routing algorithms
● Health checks

Acceleration
● Reality: narrow bandwidth between ISPs
● Building fiber channels from ISPs to a pivot
● Users are given the same endpoint within their own ISP

IPv4 shortage
● Reality: dozens of public IPs support hundreds of VMs
● IPv4 has been exhausted
● IPv6 is not yet realistic in China
DNS Acceleration Design

[Diagram: a Smart DNS resolves users from each ISP (Telecom, Unicom, Mobile, and others) to the pivot, which is reached over high-speed fiber channels and the public network.]
Load Balancer: Layer 7

Considerations:
1. Dispatch requests by Host header
2. nginx module
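A hedged sketch of host-header dispatch in stock nginx; the upstream names and addresses are invented for illustration, and the actual deployment used a custom nginx module rather than plain configuration:

```nginx
# Sketch only: route by the Host header to per-app backends.
upstream app_foo { server 172.16.0.11:8080; }
upstream app_bar { server 172.16.0.12:8080; }

server {
    listen 80;
    server_name foo.example.com;            # dispatch by Host header
    location / { proxy_pass http://app_foo; }
}
server {
    listen 80;
    server_name bar.example.com;
    location / { proxy_pass http://app_bar; }
}
```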
Load Balancer: Layer 4

Considerations:
1. Dispatch requests by TCP port
2. lvs + haproxy
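For the lvs + haproxy combination, a minimal hedged haproxy fragment showing TCP-port dispatch (backend addresses and the port mapping are invented for the example; LVS would balance across the haproxy instances themselves):

```haproxy
# Sketch only: one public port fronting many VMs' SSH.
listen ssh_pool
    bind *:2222
    mode tcp
    balance roundrobin
    server vm1 172.16.0.21:22 check
    server vm2 172.16.0.22:22 check
```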
Swift Evaluation
● Extremely durable and highly available
● Superior scalability
● Linear growth of performance
● Symmetric architecture
● No single point of failure
● Simple & reliable
Swift Evaluation
[Diagram: a load balancer dispatches PUT/GET requests for abc.png across proxy servers in five zones (Zone1 ... Zone5); each zone runs its own proxy, object, container, and account servers.]

● 1 zone = 1 physical server with 12 x 2 TB disks
● Writes and reads apply a quorum protocol
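The quorum idea can be sketched in a few lines: with 3 replicas, an operation succeeds once a majority of the replica operations succeed (2 of 3). This is a hedged illustration of the principle, not Swift's actual code:

```python
def quorum_write(replicas, payload):
    """replicas: callables that attempt to store payload and return
    True/False. Succeeds once a majority of replicas succeed."""
    needed = len(replicas) // 2 + 1  # majority, e.g. 2 of 3
    ok = sum(1 for store in replicas if store(payload))
    return ok >= needed

# One zone down out of three: the write still succeeds.
stores = [lambda p: True, lambda p: True, lambda p: False]
print(quorum_write(stores, b"abc.png"))  # True
```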
Swift Evaluation
Physical Deployment

[Diagram: each storage node has twelve disks (disk1 ... disk12, exposed as sda ... sdk); the OS is installed on a RAID 1 pair, and the node runs the Swift packages: proxy, account, container, and object servers.]
Swift Evaluation: Performance Issue

CPU utilization reaches up to 100% even without requests.

Audit/replication cycle times:
● swift-account-auditor: 1.5 m
● swift-account-replicator: 9.5 m
● swift-container-auditor: 8.4 m
● swift-container-replicator: 9.3 m
● swift-container-updater: 19.0 m
● swift-object-updater: 0.1 s
● swift-object-replicator: 10.5 hours
● swift-object-auditor: 48.3 hours

Testing environment:
● Nodes: 5 x Dell R510
● CPU: Intel Xeon E5360
● Memory: 12 GB
● Replicas: 3
● No. of objects: 150,000,000
● No. of accounts: 120,000
● No. of containers: 160,000

Result: the auditors and replicators periodically scan all partitions, calculating checksums and synchronizing.
Billing & Monitoring

[Architecture diagram: the billing and monitoring (metering) services sit between the dashboard/client and the compute, network, and storage services, communicating over RPC and a database, with data stored in an RDBMS and a NoSQL store.]
Kanyun: Monitoring System

[Diagram: an API daemon responds to client requests from the dashboard and the billing system; workers retrieve usage info from compute, network, and storage; an aggregator calculates and stores metrics in an RDBMS and a NoSQL store.]

http://github.com/lzyeval/kanyun
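The worker/aggregator split can be illustrated with a hedged sketch (not Kanyun's code): workers emit raw usage samples, and an aggregator reduces them into per-instance metrics that the API daemon can serve. The sample format and metric names are invented:

```python
from collections import defaultdict

def aggregate(samples):
    """samples: iterable of (instance_id, metric, value) tuples
    emitted by workers. Returns per-instance metric totals."""
    totals = defaultdict(lambda: defaultdict(float))
    for instance_id, metric, value in samples:
        totals[instance_id][metric] += value
    return {i: dict(m) for i, m in totals.items()}

samples = [("vm-1", "net_tx_bytes", 1024),
           ("vm-1", "net_tx_bytes", 2048),
           ("vm-2", "cpu_seconds", 30.0)]
print(aggregate(samples))
# {'vm-1': {'net_tx_bytes': 3072.0}, 'vm-2': {'cpu_seconds': 30.0}}
```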
Dough: Billing System

[Diagram: a farmer API daemon lets clients subscribe or unsubscribe products and query info; it dispatches jobs to collectors, which check status, retrieve usage from the monitoring (metering) service, and create purchases; the dashboard talks to it over RPC, with records kept in an RDBMS.]

http://github.com/lzyeval/dough
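The collector's create-purchases step can be sketched as follows (hedged illustration, not Dough's code): metered usage plus a subscription is turned into a purchase record. The product catalog and prices are invented for the example:

```python
# Hypothetical per-unit prices; not real Dough products.
PRICES = {"vm.small": 0.05, "net_gb": 0.01}

def create_purchase(subscription, usage):
    """subscription: {'product': ..., 'hours': ...};
    usage: {'net_gb': ...} retrieved from the metering service."""
    cost = PRICES[subscription["product"]] * subscription["hours"]
    cost += PRICES["net_gb"] * usage.get("net_gb", 0)
    return {"product": subscription["product"], "amount": round(cost, 2)}

print(create_purchase({"product": "vm.small", "hours": 24}, {"net_gb": 10}))
# {'product': 'vm.small', 'amount': 1.3}
```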
Q & A