State-of-the-Art Analysis and Perspectives of China HPC Development and Applications
SIAM PP 2012, Savannah, Georgia, USA, 2/17/2012
Yunquan Zhang, Laboratory of Parallel Software and Computational Science, Institute of Software, Chinese Academy of Sciences; State Key Lab of Computer Science. Collaborators: Jiachang Sun, Guoxing Yuan, Linbo Zhang. zyq@mail.rdcps.ac.cn
Analysis and Outlook of High-Performance Computer Development and Application Trends in Mainland China
Outline
• Background of China HPC TOP100
• Analysis of 2011 China HPC TOP100
• Overview of China 863 key project
• Petascale Applications on TianHe-1A
• Future HPC performance development trends of China
• Summary
• First released in 2002; it has become the de facto standard HPC ranking in Mainland China, widely adopted by researchers, users, vendors, and government;
• Serves as one procurement index for customers and is cited by many technical reports and project proposals;
• Partially supported by the National 863 key project on high-performance computers and kernel software;
• Technical reports based on the TOP100 were selected as chapters of the Annual Progress Report of China Computer Science and Technology, edited by the CCF, from 2005 to 2007 and from 2009 to 2010;
• In 2004, Prof. David Keyes presented a talk on China HPC development, "Supercomputing in China," based on statistics from the China HPC TOP100 list, and did so again in 2008;
• The English version of the TOP100 is exchanged with the editors of the TOP500, Prof. Hans Meuer and Prof. Jack Dongarra;
• The TOP500 and TOP100 websites link to each other, and the TOP500 has reported the release of the China HPC TOP100 for two years;
• Invited by NSF, we presented an invited talk at the HPC in China workshop at SC2007 in Reno, USA;
• Invited plenary talk at ISC 2011.
China HPC TOP100
2011 China HPC TOP100 Rank List
Yunquan Zhang, Jiachang Sun, Guoxing Yuan, Linbo Zhang
The Specialty Association of Mathematical & Scientific Software (SAMSS)
Evaluation Center of High Performance Computer, National 863 Plan
CCF Technical Committee on High Performance Computing
Remarks
• Data are from Mainland China only
• "Q": certified by SAMSS through testing, spot checks, or ministerial-level appraisal
• "T": data published by the TOP500 (http://www.top500.org)
• "C": from the system vendor
• "U": from public company data or user-completed questionnaires
• "S": extrapolated proportionally from the Linpack result of a larger same-model system on the TOP500
• For user/vendor data, the association only checks plausibility; users/vendors are responsible for the accuracy of the data they provide
• The list is published at least once a year, in late October or early November
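The "S" methodology above scales a measured Linpack result from a larger same-model system in proportion to machine size. A minimal sketch of that proportional-scaling rule, with purely hypothetical numbers:

```python
def extrapolate_linpack(ref_linpack_gflops, ref_cores, target_cores):
    """Scale a measured Linpack result linearly with core count.

    This mirrors the 'S' rule above: a same-model system's measured
    Rmax is scaled by the ratio of system sizes. Real efficiency is
    not perfectly linear in size, so this is only an estimate.
    """
    return ref_linpack_gflops * target_cores / ref_cores

# Hypothetical example: a 10,000-core sibling measured at 60,000 GFlops
# implies about 30,000 GFlops for a 5,000-core system of the same model.
print(extrapolate_linpack(60000.0, 10000, 5000))  # → 30000.0
```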
2011 China HPC Top 10

Rank | Vendor | Configuration | Installation Site | Year | App. Area | Cores | Linpack (GFlops) | Peak (GFlops) | Efficiency
1 | NUDT | Tianhe-1A: 7168 x 2 Intel six-core Xeon X5670 2.93 GHz + 7168 NVIDIA Tesla M2050 @1.15 GHz + 2048 hex-core FT-1000 @1 GHz; NUDT proprietary network, 80 Gbps | National Supercomputing Center, Tianjin | 2010 | Supercomputing center | 202752 | 2566000.00 | 4701000.00 | 0.546
2 | NPCEC | Sunway BlueLight: 8575 x 16-core ShenWei 1600 @975 MHz; QDR InfiniBand | National Supercomputing Center, Jinan | 2011 | Supercomputing center | 137200 | 795900.00 | 1070160.00 | 0.744
3 | NUDT | Tianhe-1A-HN: 2048 x 2 Intel six-core Xeon X5670 2.93 GHz + 2048 NVIDIA Tesla M2050 @1.15 GHz; NUDT proprietary network, 80 Gbps | National Supercomputing Center, Changsha | 2011 | Supercomputing center | 53248 | 771700.00 | 1343200.00 | 0.575
4 | Sugon | Nebulae: Dawning TC3600 blade, 2560 x (2 Intel six-core X5650 + NVIDIA Tesla C2050 GPU); QDR InfiniBand | National Supercomputing Center, Shenzhen | 2011 | Supercomputing center | 52416 | 749200.00 | 1296320.26 | 0.578
5 | IBM | xSeries x3650M3: Intel Xeon X56xx 2.53 GHz; Gigabit Ethernet | Network company | 2011 | Internet service | 113040 | 636985.00 | 1143965.00 | 0.557
6 | IPE, CAS | Mole-8.5 cluster: 320 x 2 Intel quad-core Xeon E5520 2.26 GHz + 320 x 6 NVIDIA Tesla C2050; QDR InfiniBand | IPE, CAS | 2010 | Scientific computing | 33120 | 496500.00 | 1138440.00 | 0.436
7 | Sugon | Nebulae: Dawning TC3600 blade, 3040 x 2 Intel six-core X5650; QDR InfiniBand | Shenzhen Cloud Computing Center | 2011 | Cloud computing | 36480 | 342300.00 | 389168.64 | 0.880
8 | IBM | xSeries x3650M3: Intel Xeon X56xx 2.93 GHz; Gigabit Ethernet | Telecom | 2011 | Industry | 36336 | 204754.40 | 425856.00 | 0.481
9 | IBM | xSeries x3650M2 cluster: Intel quad-core Xeon E55xx 2.53 GHz; Gigabit Ethernet | Network company | 2011 | Internet service | 34688 | 196228.00 | 351044.00 | 0.559
10 | Sugon | Magic Cube: Dawning 5000A, 1920 x 4 AMD quad-core Barcelona 1.9 GHz; DDR InfiniBand; WCCS + Linux | Shanghai Supercomputing Center | 2008 | Supercomputing center | 30720 | 180600.00 | 233472.00 | 0.774
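The efficiency column is simply the ratio of sustained Linpack (Rmax) to peak (Rpeak). A quick reproduction for the table's top three entries, with the numbers taken from the table:

```python
# Efficiency = Rmax / Rpeak, reproduced for the table's top three systems.
systems = {
    "Tianhe-1A":        (2_566_000.00, 4_701_000.00),  # (Linpack, Peak) in GFlops
    "Sunway BlueLight": (  795_900.00, 1_070_160.00),
    "Tianhe-1A-HN":     (  771_700.00, 1_343_200.00),
}
for name, (rmax, rpeak) in systems.items():
    print(f"{name}: {rmax / rpeak:.3f}")
# Tianhe-1A: 0.546, Sunway BlueLight: 0.744, Tianhe-1A-HN: 0.575
```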
China HPC TOP100 Authors with Tianhe-1A
The three SAMSS authors, Jiachang Sun, Guoxing Yuan, and Yunquan Zhang, inspect the petaflops supercomputing system Tianhe-1A, developed by the National University of Defense Technology, on site.
International Collaboration
TOP500 co-author Prof. Jack Dongarra (University of Tennessee), "father of Beowulf" Prof. Thomas Sterling (LSU), SAMSS vice chairman Prof. Xuebin Chi, and secretary-general Prof. Yunquan Zhang inspect Tianhe-1A on site.
Outline
• Background of China HPC TOP100
• Analysis of 2011 China HPC TOP100
• Overview of China 863 key project
• Petascale Applications on TianHe-1A
• Future HPC performance development trends of China
• Summary
China HPC TOP100 Performance Analysis
• Tianhe-1A from the National University of Defense Technology takes #1 again, with a Linpack performance of 2.56 PFlops
• The total Linpack performance of the China TOP100 is 12 PFlops, 1.9 times that of 2010
• Every system on the list exceeds 22.1 TFlops Linpack performance and 25.6 TFlops peak performance
• Four of the top 10 systems (#1, #3, #4, and #6) are heterogeneous CPU+GPU clusters
• 97 of the 100 systems are clusters (98 in 2010)
Cluster Share in China HPC TOP100 [chart: system count by year]
Manufacturer Analysis (China HPC TOP100)

Manufacturer | Systems | Share | Rmax [TF/s] | Rpeak [TF/s] | Avg. Efficiency | Cores
Domestic: Sugon | 35 | 35% | 2848.18 | 4544.56 | 61.40% | 363864
Domestic: Inspur | 7 | 7% | 306.93 | 535.39 | 60.50% | 55748
Domestic: Sunway | 5 | 5% | 1087.80 | 1404.71 | 84.34% | 165512
Domestic: NUDT | 2 | 2% | 3337.70 | 6044.20 | 56.00% | 256000
Domestic: IPE, CAS | 1 | 1% | 496.50 | 1138.44 | 43.60% | 33120
Domestic: Lenovo | 1 | 1% | 102.80 | 145.29 | 70.80% | 12160
Domestic total | 51 | 51% | 8204.11 | 13812.59 | 62.90% | 886404
Import: IBM | 35 | 35% | 3264.31 | 6020.59 | 57.60% | 588524
Import: HP | 13 | 13% | 509.51 | 927.77 | 57.60% | 98056
Import: Dell | 1 | 1% | 23.40 | 44.93 | 72.43% | 6880
Import total | 49 | 49% | 3797.22 | 6993.28 | 57.50% | 690900
Total | 100 | 100% | 12001.33 | 20805.87 | 59.63% | 1577304
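Note that the overall 59.63% figure appears to be an unweighted mean of per-system efficiencies: the capacity-weighted efficiency implied by the grand totals is slightly lower. A sketch of the distinction, using the totals from the table:

```python
# Capacity-weighted efficiency from the table's grand totals (TF/s).
total_rmax, total_rpeak = 12001.33, 20805.87

weighted = total_rmax / total_rpeak
print(f"{weighted:.2%}")
# ≈ 57.68%, below the listed 59.63% average, consistent with the
# listed figure being an unweighted per-system mean.
```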
Manufacturer Share Trend (Domestic vs. Import) [chart]
Manufacturer Shares by Number of Systems [chart]
Manufacturer Share by Performance [chart]
2011 China HPC TOP100 — http://www.samss.org.cn
HPC TOP100 Application Areas

App Area | Systems | Share | Linpack [GF/s] | Peak [GF/s] | Efficiency | Cores
Internet service | 21 | 21% | 2133.82 | 3963.18 | 53.30% | 404568
Government | 16 | 16% | 763.91 | 1450.00 | 52.00% | 155648
Education | 9 | 9% | 293.01 | 424.04 | 76.30% | 30740
Supercomputing center | 8 | 8% | 5333.40 | 8892.26 | 66.84% | 502616
Telecom | 7 | 7% | 474.31 | 923.01 | 53.20% | 88192
Engineering | 6 | 6% | 541.98 | 1026.46 | 54.10% | 95720
Scientific computing | 5 | 5% | 742.70 | 1455.37 | 67.70% | 56300
On-line gaming | 5 | 5% | 388.62 | 682.08 | 57.00% | 68648
Weather forecasting | 5 | 5% | 202.46 | 236.82 | 85.20% | 22064
Energy | 4 | 4% | 112.02 | 208.98 | 59.30% | 13852
Cloud computing | 3 | 3% | 436.35 | 571.11 | 63.60% | 44300
Service provider | 2 | 2% | 213.88 | 383.26 | 55.80% | 37872
Power | 2 | 2% | 81.87 | 118.27 | 67.70% | 13440
Semiconductor | 2 | 2% | 79.20 | 150.37 | 53.50% | 15352
Bioinformatics | 2 | 2% | 78.93 | 147.76 | 53.00% | 8480
Video | 1 | 1% | 46.38 | 81.79 | 56.70% | 9600
Logistics | 1 | 1% | 31.03 | 58.40 | 53.10% | 5840
Earthquake engineering | 1 | 1% | 23.27 | 32.69 | 71.20% | 3072
Total | 100 | 100% | 12001.33 | 20805.87 | 59.63% | 1577304
Application Areas Analysis
• The number of application areas increased to 18 from previous years
• By number of systems, the top three areas are internet service, government, and education
• By total Linpack performance, the top three areas are supercomputing centers, internet service, and government
• Main users: internet service, government, supercomputing centers, and education
• New users: cloud computing and semiconductor
Application Area Trend [chart]
Application Area System Shares [chart]
Outline
• Background of China HPC TOP100
• Analysis of 2011 China HPC TOP100
• Overview of China 863 key project
• Petascale Applications on TianHe-1A
• Future HPC performance development trends of China
• Summary
China 863 Program
• The National High-tech R&D Program (863 Program)
• Proposed by four senior Chinese scientists and approved by former leader Mr. Deng Xiaoping in March 1986
• One of the most important national science and technology R&D programs in China
• Now a regular national R&D program planned in 5-year terms; the 11th five-year plan has just finished and the 12th is beginning
Overview of the 863 key projects on HPC and Grid
• "High Performance Computer and Core Software"
  • 4-year project, May 2002 to Dec. 2005
  • 100 million yuan in funding from MOST
  • More than 2x associated funding from local governments, application organizations, and industry
  • Outcome: China National Grid (CNGrid)
• "High Productivity Computer and Grid Service Environment"
  • Period: 2006-2010
  • 940 million yuan from MOST and more than 1 billion yuan in matching funds from other sources
Major R&D activities
• Developing Petaflops Supercomputers
• Building up a grid service environment--CNGrid
• Developing Grid and HPC applications in selected areas
Two-phase development
• First phase: two 100 TFlops machines
  • Dawning 5000A for SSC
  • Lenovo DeepComp 7000 for SCCAS
• Second phase: three petaflops machines
  • Tianhe-1A: NUDT / Inspur / Tianjin Supercomputing Center
  • Dawning 6000: ICT / Dawning / South China Supercomputing Center (Shenzhen)
  • Sunway BlueLight: National Engineering Center on Parallel Computer / Shandong Supercomputing Center
Dawning 5000A (2008)
• China surpassed Japan in HPC performance
• ICT regained the performance crown in China, following Machine-757 (1983) and Dawning 1000 (1995)
• Peak: 233.5 TFlops; Linpack: 180.6 TFlops (77.34% efficiency)
• Power: <800 KW; MPI latency: 1.6 us
• Top 10 in the TOP500, Nov. 2008
Dawning 5000A
• Constellation based on AMD multicore processors
• Low-power CPUs and a high-density blade design
• High-performance InfiniBand switch
• 233.472 TFlops peak performance, 180.6 TFlops Linpack performance
• 10th in the TOP500 of Nov. 2008, the fastest machine outside the USA
Lenovo DeepComp 7000
• Hybrid cluster architecture using Intel multicore processors
• Two sets of interconnects: InfiniBand and Gb Ethernet
• SAN connection between I/O nodes and disk arrays
• 145.965 TFlops peak performance
• 106.5 TFlops Linpack performance
• 19th in the TOP500 of Nov. 2008
Dawning Nebulae: 3 PFlops (2010)
Ranked #2 in the TOP500, Linpack 1.271 PFlops
Dawning 6000: hybrid system
• Service unit (Nebulae): 9600 Intel six-core Westmere processors and 4800 NVIDIA Fermi GPGPUs; 3 PFlops peak performance, 1.27 PFlops Linpack performance; 2.6 MW
• Computing unit: domestic processor
Tianhe-1A
• Hybrid system
  • 14336 general-purpose units: Intel six-core processors
  • 7168 acceleration units: NVIDIA Fermi GPUs
  • 2048 service units: FT-1000 processors
• 80 Gbps NUDT proprietary TH-Net (hierarchical fat tree)
• Kylin Linux OS
• MPI + OpenMP/Pthreads + CUDA/OpenCL
• 4.7 PFlops peak, 2.57 PFlops Linpack (>50% efficiency)
• 262 TB memory, 2 PB storage
• Water cooling, 4.04 MW (635.15 MFlops/W)
• 120 compute, 14 storage, and 6 communication cabinets
• Installed in Aug. 2010; TOP500 #1 in Nov. 2010
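The quoted 635.15 MFlops/W follows directly from the sustained Linpack figure and the power draw:

```python
# Tianhe-1A energy efficiency: sustained Linpack flops per watt.
linpack_flops = 2.566e15   # 2.566 PFlops sustained Linpack
power_watts = 4.04e6       # 4.04 MW

mflops_per_watt = linpack_flops / power_watts / 1e6
print(f"{mflops_per_watt:.2f} MFlops/W")  # 635.15 MFlops/W
```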
TH-1A System [diagram]: from chips (FT-1000, Xeon X5670, Tesla M2050) to quad-CPU and twin-GPU blades, compute nodes (4 CPUs + 2 GPUs), racks (16 compute nodes each), cabinets (4 racks each), on-line storage, and TH-Net.
TH-1A software stack [diagram]
Sunway BlueLight MPP
• Designed by the National Engineering Center for Parallel Computer
• Developed for the National Supercomputing Center (Shandong) in Jinan, China
• 8704 CPUs, 1.07 PFlops peak performance
• Linpack 795.9 TFlops (74.37% efficiency); 741.06 MFlops/W
• QDR InfiniBand, 40 Gbps; power consumption 1.07 MW; water cooling
• 16-core processor SW1600, designed in China
• Released at HPC China 2011 in Jinan
Sunway BlueLight Architecture — Parameters:
• SW1600 CPU: 16 cores, 975-1100 MHz, 124.8-140.8 GFlops
• Fat tree, QDR 4X (4 × 10 Gbps) InfiniBand, MPI latency 2 us
• SWCC C/C++/Fortran/UPC compilers, MPI, and mathematical libraries
• Storage: 2 PB; peak I/O: 200 GB/s; IOR ~60 GB/s
Outline
• Background of China HPC TOP100
• Analysis of 2011 China HPC TOP100
• Overview of China 863 key project
• Petascale Applications on TianHe-1A
• Future HPC performance development trends of China
• Summary
Number of Users Profile on TH-1A [pie chart; segment shares: 37%, 20%, 10%, 8%, 7%, 6%, 2%, 2%, 8%; areas: basic science research (physics, chemistry, astronomy, etc.), bio-medical research, new material and new energy research, computational fluid dynamics, engineering design/simulation and analysis, environment science, weather and climate forecasting, petroleum exploration, animation]
Resource Usage Profile on TH-1A [pie chart; segment shares: 41.8%, 24%, 8.2%, 7%, 7%, 5%, 4%, 2%; areas: petroleum exploration, bio-medical research, new material and new energy research, environment science, basic science research (physics, chemistry, astronomy, etc.), computational fluid dynamics, weather and climate forecasting, animation, engineering design/simulation and analysis]
Parallel Computing Software Platform for Astrophysics
• Joint work: Shanghai Astronomical Observatory, CAS (SHAO); Institute of Software, CAS (ISCAS); Shanghai Supercomputer Center (SSC)
• Building a high-performance parallel computing software platform for astrophysics research, focusing on planetary fluid dynamics (thermal convection in the Earth's outer core) and N-body problems
• New parallel computing models and parallel algorithms are studied, validated, and adopted to achieve high performance
Software Architecture [diagram]: physical and mathematical models, parallel computing models, and numerical methods feed a software platform for astrophysics (fluid dynamics and N-body problems) with a web portal on CNGrid; the platform builds on improved preconditioners, an improved collective-communication library, SpMV, PETSc, Aztec, FFTW, GSL, and Lustre, via MPI, OpenMP, Fortran, and C on a 100T supercomputer, and supports software development, data processing, and scientific visualization.
• Early performance evaluation of the Aztec and PETSc codes on Dawning 5000A is shown.
• For the 80×80×50 mesh, the execution time of the Aztec program is 4-7 times that of the PETSc version (6x on average);
• For the 160×160×100 mesh, the execution time of the Aztec program is 2-5 times that of the PETSc version (4x on average).
PETSc Optimized Version 1 (Speedup 4-6)
Runtime vs. processor cores on Dawning 5000A [charts]: Aztec vs. PETSc on the 160×160×100 mesh (32-2048 cores) and the 80×80×50 mesh (16-2048 cores).
Method 1: Domain Decomposition Ordering Method for Field Coupling
Method 2: Preconditioner for Domain Decomposition Method
Method 3: PETSc Multi-physics Data Structure
PETSc Optimized Version 2 (Speedup 15-26)
Left: mesh 128×128×96; right: mesh 192×192×128. Computation speedup: 15-26.
Strong scalability: original code normal, new code ideal. Test environment: BlueGene/L at NCAR (HPCA 2009).
Strong Scalability on Dawning 5000A [chart]: runtime vs. processor cores (32-8192), 160×160×100 mesh, Aztec vs. PETSc.
Strong Scalability, 192×192×128 mesh [chart]: runtime vs. number of processor cores (64-8192) on BlueGene/L, Dawning 5000A, and DeepComp 7000.
TianHe-1A Application Case Study (CPU only)
• Strong scalability on TianHe-1A:
  • A fully implicit shallow-water atmospheric model (ISCAS): 82,944 cores, parallel efficiency 60%, 680M unknowns
• Petroleum seismic data processing (BGP): GeoEast-Lightning single/double-way wave prestack depth migration software; 85,860 cores, 24.6 TB of data, 16 hours
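As a back-of-the-envelope check (not a figure from the slides), the seismic run's average data throughput follows from the stated volume and wall-clock time:

```python
# Average throughput of the GeoEast-Lightning run: 24.6 TB in 16 hours.
data_bytes = 24.6e12       # 24.6 TB (decimal)
runtime_s = 16 * 3600      # 16 hours in seconds

throughput_mb_s = data_bytes / runtime_s / 1e6
print(f"{throughput_mb_s:.0f} MB/s")  # ≈ 427 MB/s averaged over the run
```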
TianHe-1A Application Case Study (CPU+GPU)
• Direct numerical simulation of turbulent flow (PKU)
  • GPU-accelerated FFT solver (PKUFFT)
  • Taylor micro-scale Reynolds number up to 1164
  • Grid resolution up to 14336^3
  • 7168 nodes, >3.2 million CUDA cores (>100,000 GPU cores)
  • 30 TFlops (SP) / 17 TFlops (DP) sustained FFT performance
[chart: PKUFFT (with GPU) vs. MKL (without GPU) vs. Jaguar]
TianHe-1A Application Case Study (CPU+GPU)
• High-speed particle collision system simulation
  • Force calculation accelerated by GPU
  • 21.9x speedup on a single GPU compared to a single CPU core
  • Excellent weak and strong scalability with up to 4096 nodes (106,496 CPU/GPU cores) for problems with up to 11.16 billion atoms
  • Embedded Atom Method potential; scaling to the whole system is expected
TianHe-1A Application Case Study (CPU+GPU)
• Trans-scale simulation of the silicon deposition process (IPE, CAS)
  • Scalable bond-order potential (BOP) for molecular dynamics simulation of crystalline silicon
  • 26 nm × 54 nm × 1,560,000 nm (1.56 mm) domain, 110.1 billion atoms
  • Peak 7.38 PFlops (SP) on 7168 nodes (Tesla M2050 + 2-way Xeon X5670)
  • 1.17 PFlops (SP) plus 92.1 TFlops (DP) on 7168 GPUs and 86,016 CPU cores, 5 TB memory
  • 1.87 PFlops (SP) on 7168 GPUs (25.3% of peak)
  • 758 flops per step per atom; 44.53 s per 1000 steps
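The 1.87 PFlops (SP) sustained figure is consistent with the per-atom cost and timing quoted above:

```python
# Sustained rate from per-atom cost: 758 flop per atom per step,
# 110.1 billion atoms, 1000 steps in 44.53 seconds.
flop_per_atom_step = 758
atoms = 110.1e9
seconds_per_1000_steps = 44.53

flops = flop_per_atom_step * atoms * 1000 / seconds_per_1000_steps
print(f"{flops / 1e15:.2f} PFlops")  # 1.87 PFlops, matching the quoted rate
```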
Outline
• Background of China HPC TOP100
• Analysis of 2011 China HPC TOP100
• Overview of China 863 key project
• Petascale Applications on TianHe-1A
• Future HPC performance development trends of China
• Summary
• Ten public supercomputing centers
  • Beijing (CAS), Tianjin, Shandong, Shanghai, Shenzhen, Chengdu, Hunan, Wuhan, Guangzhou, Chongqing
  • Covering the developed areas of China
  • Growing use in industrial design and simulation
• Five private centers in mature fields
  • Petroleum, meteorology, aerospace, defense, energy
  • Related to national security
• Four centers in emerging areas
  • Cyberspace security, internet service, Sensing China, triple-play
Demand for petascale HPC will keep growing in the next 5 years.
Performance Development Trend of China TOP100 HPC (1993-2011)
[chart: GFlops on a log scale (1 to 1e13) vs. year (1993-2025); series: No.1 Linpack, No.1 peak, total performance, and their trend lines]
Trend & Outlook (1)
China HPC performance development, 1993-2011:
• 1993-1996: slow, steady growth
• 1996-1999: the first big jump
• 1999-2001: slow, steady growth
• 2001-2005: another period of rapid growth
• 2005-2007: slow, steady growth again
• 2008-2010: another active growth cycle, lasting about 2-3 years
• From 2011: slow, steady growth again for the next 2-3 years
Trend & Outlook (2)
Previous predictions (and reality):
• 2007-2008: a system with 100 TFlops peak performance (reality: Oct. 2008)
• 2008-2009: total Linpack performance exceeds 1 PFlops (reality: Oct. 2008)
• 2010-2011: a system with 1 PFlops peak performance (reality: Oct. 2009, ahead of schedule!)
• 2011-2012: total Linpack performance reaches 10 PFlops (reality: Oct. 2011)
Trend & Outlook (3)
Future predictions:
• 2012-2013: a system with 10 PFlops peak performance will appear
• 2014-2015: a system with 100 PFlops peak performance will appear
• 2013-2014: total Linpack performance will reach 100 PFlops
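Predictions like these come from extrapolating exponential growth on the log-scale trend chart. A minimal sketch of a log-linear fit through two anchor points taken from this deck; this is an illustration, not the actual TOP100 regression:

```python
import math

# Anchor points from this deck: (year, No.1 peak performance in GFlops).
p1_year, p1_perf = 2008, 233_472      # Dawning 5000A peak
p2_year, p2_perf = 2010, 4_701_000    # Tianhe-1A peak

# Fit perf = 10 ** (a + b * year) through the two points.
b = (math.log10(p2_perf) - math.log10(p1_perf)) / (p2_year - p1_year)
a = math.log10(p1_perf) - b * p1_year

def predict(year):
    return 10 ** (a + b * year)

# Growth is monotone under this model; the absolute values depend
# heavily on the chosen anchors, so treat them as illustrative only.
print(f"{predict(2013) / 1e6:.0f} PFlops peak extrapolated for 2013")
```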
Outline
• Background of China HPC TOP100
• Analysis of 2011 China HPC TOP100
• Overview of China 863 key project
• Petascale Applications on TianHe-1A
• Future HPC performance development trends of China
• Summary
Summary
• With the right strategies, China won the "HPC Olympic Games" in 2011, and HPC is genuinely helping China's scientific and economic development.
• Real HPC applications still lag behind the US, Europe, and Japan.
• On TianHe-1A, several applications can scale up to 80,000 CPU cores.
• The growth rate of China's HPC performance is the fastest in the world.
• There will be at least 19 major petaflops supercomputing centers within 5 years.
Summary (cont.)
• The first petaflops supercomputer fully powered by domestically designed processors was released at HPC China 2011 in Jinan;
• According to TOP100 predictions, a supercomputer with 10 PFlops peak performance will appear before 2013;
• According to TOP100 predictions, a supercomputer with 100 PFlops peak performance may appear before 2015.
Thank You
• Thanks to Yutong Lu, Chao Yang, and Wei Ge
• Contact: Yunquan Zhang, Ph.D.
• Emails: zyq@mail.rdcps.ac.cn, samss@mail.rdcps.ac.cn