Cloud and Big Data

Cloud and Big Data. Kun-Ta Chuang (莊坤達), Ph.D., Assistant Professor, National Cheng Kung University


∗ Before the discussion, we watch some videos about the future


Preliminaries

∗ We watch one more video before starting the discussion of ‘Cloud Computing’


What is Cloud Computing?

∗ Sure. We also start by watching a video!


What is Big Data?

∗ Cloud Computing and Big Data are inevitable consequences of the Internet age!

∗ We start the discussion with ‘Cloud Computing’


∗ What is Cloud Computing?
∗ There are different perspectives from different sides
∗ According to Wikipedia, "Cloud computing is Internet-based ("Cloud") development and use of computer technology."

Introduction to Cloud Computing

The NIST Cloud Definition Framework

∗ Deployment Models: Private Cloud, Community Cloud, Public Cloud, Hybrid Clouds
∗ Service Models: Software as a Service (SaaS), Platform as a Service (PaaS), Infrastructure as a Service (IaaS)
∗ Essential Characteristics: On Demand Self-Service, Broad Network Access, Resource Pooling, Rapid Elasticity, Measured Service
∗ Common Characteristics: Massive Scale, Resilient Computing, Homogeneity, Geographic Distribution, Virtualization, Service Orientation, Low Cost Software, Advanced Security

∗ A new business opportunity?
∗ Is it far beyond distributed/grid/cluster computing?
∗ Or just a new term?
∗ Is it a new Holy Grail?
∗ Web 3.0, a new web-scale problem?
∗ Social, Location, Mobile

What is Cloud Computing?

“I don’t understand what we would do differently in the light of cloud computing other than changing the wording of some of our ads.” (Oracle’s CEO Larry Ellison)

New philosophy? What we did in the past

In the Cloud Era

We don’t need to work here

The Rise of a New Era in IT

∗ Mainframe: COBOL
∗ PC / Client-Server: Unix Services
∗ Web: Application Servers
∗ Cloud: Platform as a Service

Each new era in computing brings a new application platform: for the Cloud era it is “PaaS”


Money?

Where can we get money?

From Gartner (March, 2009)

∗ Let’s review the history of the IC industry
∗ Why do you think fabless design houses have been so strong over the past 10+ years?

It is a new era, but is it a new business model?

[Figure: EDA tool map spanning Design through Manufacturing. Labels include Systems, HW/SW, CHIP, Test, Power, DFM, Libraries, Yield Mgmt, Virtual Platform, DesignWare, Connect. IP, Analog IP (Phys), Test Chips, and tools such as Saber, SysStudio, Magellan, DC Ultra, SysVerilog, VCS, NTB, VIP, VMM, Formality, IC Compiler, PrimeTime, Star RCXT, Hercules, HSIM, HSPICE, NanoSim, CATS, Proteus, SiVL, Sigma C, PrimeYield, and FE/BE/Manuf. TCAD]

Today: Global IC Market (2008 data; *2006)

∗ Systems: $1.26T (Computers, Communications, Consumer, Industrial, Military, …)
∗ Embedded SW: $2.5B
∗ IP: $1.4B
∗ Semiconductors (chips): $269.9B (Micros, DSP, Memory, ASIC, ASSP, Analog, Discrete)
∗ Silicon Wafers: $11.4B
∗ Front-End Manufacturing: $21.9B (Lithography/Mask Making, CMP equipment, Ion Implanters, Deposition, Etching and Cleaning, Other)
∗ Back-End Manufacturing: $6.6B (Assembly Equipment, Assembly Inspect., Dicing, Bonding, Packaging, Int. Assembly Sys, Total Test)
∗ Foundry Wafers: $20.9B
∗ Masks*: $3.3B
∗ EDA: $4.0B

Source: VLSI Research, Gartner, IC Insights, SEMI, Information Network, Synopsys Estimates

A mature business


∗ Is ‘Cloud Computing’ far beyond distributed/grid/cluster computing?

∗ Is it also mature?

∗ Learn from the past to understand the present

Cloud -- Not Just a New Term?

∗ Amazon AWS Marketplace


Do we have TSMC and Synopsys in the Cloud IT industry?

∗ We have TSMC and Synopsys, but we still need ASML, National Instruments

Look back

∗ VMWARE


∗ IaaS: Infrastructure
∗ PaaS: Platform
∗ SaaS: Software

Cloud Hierarchy


Technology Hierarchy

∗ Applications: Social Computing, Enterprise, ISV, …
∗ Programming languages: Web 2.0 interfaces, Mashups, Workflows, …
∗ Control: QoS Negotiation, Admission Control, Pricing, SLA Management, Metering, …
∗ Virtualization: VM, VM Management and Deployment
∗ Layers, top to bottom: User Level, User-Level Middleware, Core Middleware, System Level


Deployment models

∗ Public cloud
∗ Community cloud
∗ Hybrid cloud
∗ Private cloud

We talk about the Public Cloud: a cloud made available to the general public in a pay-as-you-go manner

Utility Computing -- Pay as you go

∗ Hours purchased via cloud computing can be distributed non-uniformly in time

∗ Cloud computing offers economic benefits of elasticity and transference of risk

Utility Computing: the service being sold in a public cloud

Cloud Services = SaaS + Utility Computing

∗ Large up-front capital is no longer required
∗ No need to worry about over-provisioning or under-provisioning caused by inaccurate demand prediction
∗ Course registration systems
∗ Startup companies
∗ Companies with large batch-oriented tasks can finish them quickly
∗ More elasticity of resources

The spirit of ‘Pay as you go’

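Example (Provision for peak load)

∗ Peak load: 500 servers
∗ Lowest load: 100 servers
∗ Cloud: 24 × 300 = 7,200 server-hours per day, where 300 is the average number of servers in use
∗ Traditional model (provisioned for the peak): 500 × 24 = 12,000 server-hours per day
∗ The traditional model costs about 1.7 times as much as the cloud!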
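As a rough check of the peak-load arithmetic above, the following Python sketch recomputes both totals. It assumes that the 300 in 24*300 is the average number of servers in use over the day, and that a server-hour costs the same in both models. The function name is illustrative only.

```python
HOURS_PER_DAY = 24


def daily_server_hours(servers: int) -> int:
    """Server-hours consumed in one day by a fixed number of servers."""
    return servers * HOURS_PER_DAY


peak_servers = 500     # the traditional model provisions for the peak
average_servers = 300  # assumption: average load implied by 24*300

traditional = daily_server_hours(peak_servers)
cloud = daily_server_hours(average_servers)
ratio = traditional / cloud

print(f"traditional: {traditional} server-hours")
print(f"cloud:       {cloud} server-hours")
print(f"traditional / cloud = {ratio:.2f}")
```

Under these assumptions the ratio is 12000 / 7200, which is roughly 1.7.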

Example (Under-provision)

∗ Active users: people who use the site regularly
∗ Defectors: people who abandon the site
∗ Suppose 10% of the active users who receive poor service due to under-provisioning become defectors
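The under-provisioning example can be explored with a small simulation. This is only a sketch under stated assumptions: the capacity and arrival numbers are hypothetical, every user beyond capacity is treated as receiving poor service, and 10% of those users defect each hour. The function and field names are made up for illustration.

```python
def simulate_defection(capacity, arrivals, defect_percent=10):
    """Track active users hour by hour for a site with fixed capacity.

    Users beyond `capacity` receive poor service, and `defect_percent`
    percent of them abandon the site (integer arithmetic, rounded down).
    """
    active = 0
    history = []
    for new_users in arrivals:
        demand = active + new_users
        poorly_served = max(0, demand - capacity)
        defectors = poorly_served * defect_percent // 100
        active = demand - defectors
        history.append({
            "demand": demand,
            "poorly_served": poorly_served,
            "defectors": defectors,
            "active": active,
        })
    return history


# Hypothetical numbers: capacity for 400,000 users, a surge of 500,000
# users in the first hour, and no new arrivals in the second hour.
history = simulate_defection(capacity=400_000, arrivals=[500_000, 0])
for hour, row in enumerate(history, start=1):
    print(hour, row)
```

Even with no new arrivals, the site keeps losing users in this model until demand falls back under capacity. That lost revenue is the cost of under-provisioning.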


∗ The appearance of infinite computing resources, available on demand to absorb load surges
∗ The elimination of an up-front commitment by cloud users
∗ The ability to pay for the use of computing resources on a short-term basis
∗ Remember: to drink milk, you don’t need to buy a cow

Cloud can help

∗ 30,000,000 users
∗ Based on Amazon AWS
∗ Django web framework
∗ PostgreSQL database
∗ In-memory caching with Redis
∗ Acquired by Facebook

Famous new Companies

Quoted from http://instagram-engineering.tumblr.com/post/13649370142/what-powers-instagram-hundreds-of-instances-dozens-of

∗ Also based on Amazon AWS

Famous new Companies


Cloud Cost

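∗ In Silicon Valley, renting a server costs x per month and bandwidth costs x; in Taiwan, renting a server costs 0.5x to 1x per month, but bandwidth costs 30x to 40x!! (翟本喬)

∗ Renting servers in the US cost US$169 to US$229 per server per month, but the traffic exceeded my expectations… In the end I needed a credit card limit of US$30,000 (about NT$900,000) per month to cover it (陳士駿)

∗ In Taiwan it would be even worse: US$900,000 (NT$27 million) per month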

∗ Are cloud services really cheaper?
∗ It depends: just as your age and financial situation determine whether you rent or buy a house

Price

General Obstacles and Opportunities in Clouds

Top 10 Obstacles and Opportunities for Cloud Computing

∗ 1. Availability/Business Continuity

∗ Q: Users and organizations worry whether utility computing services will have adequate availability, or whether the provider may even go out of business

∗ A: Use multiple, different cloud computing providers
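To see why multiple providers help with obstacle 1, here is a minimal Python sketch. It assumes provider outages are statistically independent, which real outages may not be, and the 99% availability figures are hypothetical. The function name is illustrative.

```python
def combined_availability(availabilities):
    """Probability that at least one provider is up, assuming independent failures."""
    all_down = 1.0
    for a in availabilities:
        all_down *= 1.0 - a  # probability that this provider is also down
    return 1.0 - all_down


# Hypothetical availabilities of 99% per provider
single = combined_availability([0.99])
dual = combined_availability([0.99, 0.99])

print(f"one provider:  {single:.4f}")
print(f"two providers: {dual:.4f}")
```

Under the independence assumption, a second provider reduces the expected downtime from 1% to 0.01%.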


∗ 2. Data Lock-In

∗ Q: The storage APIs for cloud computing are still essentially proprietary, so customers cannot easily extract their data

∗ A: Standardize APIs; compatible software to enable surge or hybrid cloud computing


∗ 3. Data Confidentiality/Auditability

∗ Q: Cloud users face security threats from both outside and inside the cloud
∗ Outside: any third party, the cloud vendor
∗ Inside: cloud users
∗ A: Against cloud users: virtualization
∗ Against the cloud vendor: user-level encryption
∗ Against any third party: firewalls


∗ 4. Data Transfer Bottlenecks

∗ Q: The cost of data transfer is high and the transfer rate is slow because the data is surprisingly large

∗ A: Ship disks
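A back-of-the-envelope calculation shows why shipping disks can beat the network. The sketch below uses hypothetical numbers (10 TB of data and a sustained 20 Mbit/s link) that do not come from the slides, and the function name is illustrative.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(data_bytes: float, link_bits_per_second: float) -> float:
    """Days needed to push data_bytes through a link of the given speed."""
    seconds = data_bytes * 8 / link_bits_per_second
    return seconds / SECONDS_PER_DAY


data_bytes = 10 * 10**12  # hypothetical: 10 TB
link_bps = 20 * 10**6     # hypothetical: 20 Mbit/s sustained throughput

days = transfer_days(data_bytes, link_bps)
print(f"network transfer: about {days:.1f} days")
```

With these assumed figures the transfer takes more than six weeks. A courier delivering the same disks within a few days would be much faster.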


∗ 7. Bugs in Large-Scale Distributed Systems

∗ Q: Some bugs cannot be reproduced in smaller configurations and appear only in production data centers

∗ A: Use distributed VMs


∗ 10. Software Licensing

∗ Q: Software licenses make cloud provisioning cost more

∗ A: Open source or pay-for-use licenses
∗ Why open source? Cost is a major issue for startup teams


Question?


Talking about ‘Big Data’

∗ The number of smartphones is expected to exceed 1 billion in 2014

New Data Source

Web-Scale Problems: It is BIG DATA!

∗ Characteristics:
  ∗ Definitely data-intensive
  ∗ May also be processing-intensive
∗ Examples:
  ∗ Crawling, indexing, searching, mining the Web
  ∗ Social networks
  ∗ Web 3.0 applications

∗ Twitter is the 8th most visited website

Quoted from http://www.alexa.com/topsites

∗ Wayback Machine has 2 PB + 20 TB/month (2006)
∗ Google processes 20 PB a day (2008)
∗ “all words ever spoken by human beings” ~ 5 EB
∗ NOAA has ~1 PB climate data (2007)
∗ CERN’s LHC will generate 15 PB a year (2008)



640K ought to be enough for anybody.

http://archive.org/index.php

Quoted from “Nosql big data Hadoop with microsoft”

∗ We can grasp the scale of 300 GB, since a single hard disk today is larger than that


What is the scale of BigData?


Quoted from “Nosql big data Hadoop with microsoft”


Quoted from “Big data: The next frontier for innovation, competition, and productivity”

∗ They cannot be solved by a set of machines
∗ Many machines?
∗ Distributed/grid/cluster computing?
∗ We need huge machines!
∗ Less communication between computers
∗ Less synchronization between systems

For Big Data Analytics


Big Data Initiative in US


Big Data is the trend. Unlock its power!

Databases in the cloud era

Relational Database Performance


∗ Act as web services that provide relational database functionality

∗ Address obstacle (2), the Data Lock-In issue

Third-party Cloud Services

Snapshot of database.com


The traditional database model is no longer workable!


∗ We have data and computing everywhere!
∗ New terms: M2M, Internet of Things
∗ The IT industry is growing but changing
∗ Software and ideas are more valuable than hardware and labor
∗ Small, diverse, open-source software is more beneficial


They are the future

∗ Cross-disciplinary work will be the best way to evolve with the trend
∗ It is good to get in touch with data-driven sciences
∗ Data Mining
∗ Since software is king, you are welcome to join us
∗ 9:00~12:00 Thursday
∗ 4204@CSIE Building
∗ Many talks about software and big data processing from experts at software companies such as Google, Yahoo!, Synopsys, and Trend Micro


They are the future

∗ Is Taiwan ready?
∗ Our network environment?
∗ Our software environment?
∗ Our creativity?
∗ Whether you like it or not, the surge is coming
∗ Think big about the new opportunities!


Q & A