Thursday, March 13, 2014
From Big Data to Big Value: Infrastructure Needs and Huawei Best Practice
1
Data-driven insight
Making better, more informed decisions, faster
Capture Store Process Insight
Raw Data
2
Data Landscape continues to evolve
Business process generated: structured data
OLTP
Machine generated: semi-structured data
Satellite images, bioinformatics, M2M log files, sensors, video, audio
Human generated: unstructured data
Web logs, documents, social
Timeline: 1990, 2000, 2008, 2013
Data Volume: captured and processed
Data Velocity: of ingest, and time sensitivity for analysis
Data Variability: data format
3
Big data analytics data flows
MPP DW
ERP
SCM
CRM
OLTP
Capture Store Process Insight
ETL OLTP DB
Terabytes
MPP Data Store
Converged Compute & Storage
Machine
Exabytes
SAN
Web Logs NAS
Human
Petabytes
4
Example of an “Exabyte” requirement
"CERN is hitting the technology limits for resource-intensive simulations and analysis. Our collaboration with Huawei shows
an exciting new approach, where their novel architecture extends the capabilities in preparation for the Exascale data rates
and volumes we expect in the future," said Bob Jones, head of CERN OpenLAB.
5
Infrastructure Needs
Scale-out distributed storage platforms
Bring the computation to the data
Can’t move petabytes around the network
High-throughput streaming workloads
Batch-oriented processing
Column-oriented NoSQL and MPP databases
Flexible schemas, massive scale
Real time analytics requires massive flows
New platforms combine real-time with batch
Trigger on events and process historical data
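The point that petabytes cannot practically be moved around the network can be checked with back-of-envelope arithmetic. The sketch below is illustrative only: the link speeds, and the assumption of a dedicated link running at full line rate, are not from the slides.

```python
def transfer_days(num_bytes: float, link_gbps: float) -> float:
    """Days needed to move num_bytes over a link of link_gbps (decimal Gb/s)."""
    bits = num_bytes * 8
    seconds = bits / (link_gbps * 1e9)
    return seconds / 86_400  # seconds per day


PETABYTE = 10**15  # decimal petabyte, in bytes

# Hypothetical link speeds, chosen only for illustration.
for gbps in (1, 10, 100):
    print(f"1 PB over {gbps} Gb/s: {transfer_days(PETABYTE, gbps):.1f} days")
```

Even under these optimistic assumptions, one petabyte takes roughly nine days at 10 Gb/s, which is why the processing is scheduled where the data already sits.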
6
Huawei Strategy on big data
“Build the Most Efficient Big Data Platform”
Intelligent Application Awareness
Multi-protocol interface
Openness and cooperation
Infrastructure is the key to big data
Scale-out x86 architecture, all IP-based
Fully symmetric, distributed file system
Native multi-workload support
Integrated storage, analysis, and archiving functions
Full data life cycle management
Huawei Strategy
7
Huawei Enterprise-level Big data Platform
WORKLOAD: High Performance Store and Archive; Query and Retrieval for Structured Data; Analysis Processing for Unstructured Data
STANDARD EXPOSURE
EB-level Storage
Resource Pool Mgmt
OCEANSTOR BIG DATA FRAMEWORK
TELECOM M&E BANKING GOVERNMENT ENERGY
“HIGH SCALABILITY” DISTRIBUTED STORAGE SYSTEM
NFS/CIFS/HDFS SQL HTTP/S3 MR/HBASE
DISTRIBUTED RAID
LOAD BALANCE
QUOTA MGMT
STORAGE TIERING
MPP DB ENGINE
ENTERPRISE HADOOP ENGINE
OBJECT STORAGE ENGINE
NATIVE INTERFACE
NATIVE INTERFACE
HDFS
• A storage platform with world-leading performance and scalability as the infrastructure.
• Natively integrated Hadoop, MPP DB, and object engines for efficient data loading and processing.
• End-to-end data protection and life cycle management.
8
OceanStor 9000 big data storage
No.1 Scalability: 288 nodes
No.1 Capacity: 40 PB
No.1 Performance: 5,000,000 OPS
Performance 3x (chart, values in OPS):
EMC Isilon: 1,112,705
NetApp FAS6240: 1,512,784
Avere FXT3500: 1,564,404
OceanStor N8000: 3,064,602
OceanStor 9000: 5,000,000
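The "3x" label on the chart can be related to the listed figures with simple ratios. This is a sketch, not a benchmark analysis: pairing each value with a product by listing order is an assumption, and the helper name `speedup_over` is hypothetical.

```python
# Throughput figures (OPS) as listed on the slide. Pairing each value with a
# product by listing order is an assumption.
ops = {
    "EMC Isilon": 1_112_705,
    "NetApp FAS6240": 1_512_784,
    "Avere FXT3500": 1_564_404,
    "OceanStor N8000": 3_064_602,
    "OceanStor 9000": 5_000_000,
}


def speedup_over(baseline: str, target: str = "OceanStor 9000") -> float:
    """Ratio of the target system's OPS to the baseline system's OPS."""
    return ops[target] / ops[baseline]


for name in ops:
    if name != "OceanStor 9000":
        print(f"OceanStor 9000 vs {name}: {speedup_over(name):.2f}x")
```

Under that pairing, the ratio against the fastest non-Huawei system listed is about 3.2, which is consistent with the "3x" label.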
9
Customized Hadoop
Reliability improvements
Redundancy, Failover, SPoF elimination
Security/privacy improvements
Encryption of data and metadata, Kerberos access control
Management simplification
GUI platform management tools, role-based admin
All Hadoop tools, such as Hive, Pig, etc.
Enterprise-level Hadoop platform
Innovative DR Solution
DR site up to 1000km
Special VM instances for Hadoop processing
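The 1000 km distance for the DR site implies a floor on network round-trip time that any replication scheme has to live with. The sketch below estimates it; the propagation speed of about 200,000 km/s in optical fiber is an assumed approximation, and the slides do not say whether replication is synchronous or asynchronous.

```python
# Approximate signal speed in optical fiber (assumed, about two thirds of c).
FIBER_KM_PER_S = 200_000


def round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip time in milliseconds over distance_km of fiber."""
    return 2 * distance_km / FIBER_KM_PER_S * 1000


print(f"1000 km round trip: at least {round_trip_ms(1000):.0f} ms")
```

This is a lower bound only: real fiber routes are longer than the geographic distance, and switching and protocol overhead add to it.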
10
Manager snapshot
Dashboard – Overall System Status
Service Management
Resource Management
11
“OceanStor” Big Data Platform Highlights
NFS/CIFS/HDFS SQL HTTP/S3 MR/HBASE
Multi-Workload Scale-Out Storage Platform
Leading Storage Efficiency and Scalability
End-To-End Data Protection
Enterprise-Level Hadoop Model
Native Integrated Hadoop/MPP DB/Object
Unified Management
DISTRIBUTED STORAGE SYSTEM
MPP DB ENGINE
HADOOP ENGINE
OBJECT ENGINE
NATIVE INTERFACE
NATIVE INTERFACE
NATIVE HDFS
DISTRIBUTED RAID
LOAD BALANCE
QUOTA MGMT
STORAGE TIERING
12
OceanStor 18000 Series:
No.1 Performance Enterprise Storage
No.1 reliability: 16 controllers
No.1 performance: 1,005,893 SPC-1 IOPS™
No.1 capacity: 7 PB
No.1 cache: 3 TB
No.1 stable delay: 0.7 ms
OceanStor 18500/18800
Secure and Trusted: 20X data recovery speed, 99.9999% availability, mission-critical applications always on
Flexible and Efficient: million-level IOPS, 2X performance, stable microsecond latency, and 10X response speed
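The 99.9999% availability figure is easier to judge when converted into permitted downtime. The sketch below does that conversion; the function name is hypothetical, and a 365-day year is assumed.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # non-leap year (assumption)


def downtime_seconds_per_year(availability_percent: float) -> float:
    """Maximum downtime per year, in seconds, allowed by an availability percentage."""
    return (1 - availability_percent / 100) * SECONDS_PER_YEAR


for pct in (99.9, 99.99, 99.999, 99.9999):
    print(f"{pct}% availability: {downtime_seconds_per_year(pct):.1f} s/year")
```

Six nines works out to roughly half a minute of downtime per year, against about 8.8 hours for three nines.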
13
Huawei IT business coverage
Distributed Cloud Data Center
Data Center Facilities
Servers Storage
Cloud Computing
Converged Infrastructure
Applications
Management
Big Data
14
Keep your competitive advantage
Big data is here
Big data presents new challenges to infrastructure
Be careful with open-source Hadoop
Implementing a robust foundation and carefully selecting tools lets you benefit from big data
Copyright©2014 Huawei Technologies Co., Ltd. All Rights Reserved.
The information in this document may contain predictive statements including, without limitation, statements regarding the future financial and operating results, future product
portfolio, new technology, etc. There are a number of factors that could cause actual results and developments to differ materially from those expressed or implied in the predictive
statements. Therefore, such information is provided for reference purpose only and constitutes neither an offer nor an acceptance. Huawei may change the information at any time
without notice.
HUAWEI ENTERPRISE ICT SOLUTIONS A BETTER WAY