
Post on 31-Jan-2018


State-of-the-Art Analysis and Perspectives of China HPC Development and Applications

SIAM PP 2012, Savannah, Georgia, USA, 2/17/2012

Yunquan Zhang (张云泉)
Laboratory of Parallel Software and Computational Science, Institute of Software, Chinese Academy of Sciences
State Key Lab. of Computer Science
Collaborators: Jiachang Sun (孙家昶), Guoxing Yuan (袁国兴), Linbo Zhang (张林波)
zyq@mail.rdcps.ac.cn


Analysis and Outlook of HPC Development and Application Trends in Mainland China

Outline

• Background of China HPC TOP100
• Analysis of 2011 China HPC TOP100
• Overview of China 863 key project
• Petascale Applications on TianHe-1A
• Future HPC performance development trends of China
• Summary

• First released in 2002, the list has become the de facto industry standard for HPC ranking in Mainland China, widely adopted by researchers, users, vendors, and government.

• Used by customers as a procurement index, and cited in many technical reports and project proposals.

• Partially supported by the National 863 Plan key project on high performance computers and core software.

• Technical reports based on TOP100 were selected as chapters of the Annual Progress Report of China Computer Science and Technology, edited by CCF, from 2005 to 2007 and from 2009 to 2010.

• In 2004, Prof. David Keyes presented a talk on China HPC development, "Supercomputing in China", based on statistics from the China HPC TOP100 list, and did so again in 2008.

• The English version of TOP100 is exchanged with the TOP500 editors, Prof. Hans Meuer and Prof. Jack Dongarra.

• The TOP500 and TOP100 websites link to each other; TOP500 reported the release of the China HPC TOP100 for two years.

• Invited by NSF, we gave an invited talk at the HPC in China workshop at SC2007, Reno, USA.

• Invited plenary talk at ISC 2011.

China HPC TOP100

2011 China HPC TOP100 Rank List

Yunquan Zhang, Jiachang Sun, Guoxing Yuan, Linbo Zhang

The Specialty Association of Mathematical & Scientific Software (SAMSS), China Software Industry Association

Evaluation Center of High Performance Computer, National 863 Plan

China HPC Technical Committee, China Computer Federation

Remarks

• Data come from Mainland China only.

• Data source codes:
  • "Q": tested or spot-checked by SAMSS, or accepted at a ministry-level appraisal (certified by SAMSS)
  • "T": data published by TOP500 (http://www.top500.org)
  • "C": from the vendor
  • "U": public data from commercial companies and survey forms filled in by users
  • "S": extrapolated proportionally from the Linpack value of a larger system of the same model published by TOP500 (http://www.top500.org)

• For data supplied by users and vendors, SAMSS only performs a sanity check; the users and vendors who filled in the survey are responsible for its accuracy.

• The list is published at least once a year, in late October or early November.

2011 China HPC Top 10

Rank | Vendor | Configuration | Installation Site | Year | App. Area | Num of Proc | Linpack (Gflops) | Peak (Gflops) | Efficiency
1 | NUDT | Tianhe-1A/7168x2 Intel Hexa Core Xeon X5670 2.93GHz + 7168 Nvidia Tesla M2050@1.15GHz + 2048 Hex Core FT-1000@1GHz/NUDT Private Network 80Gbps | National Supercomputing Center at Tianjin | 2010 | Supercomputing Center | 202752 | 2566000.00 | 4701000.00 | 0.546
2 | NPCEC | Sunway BlueLight/8575x16 Core Shenwei 1600@975MHz/QDR Infiniband | National Supercomputing Center at Jinan | 2011 | Supercomputing Center | 137200 | 795900.00 | 1070160.00 | 0.744
3 | NUDT | Tianhe-1A-HN/2048x2 Intel Hexa Core Xeon X5670 2.93GHz + 2048 Nvidia Tesla M2050@1.15GHz/NUDT Private Network 80Gbps | National Supercomputing Center at Changsha | 2011 | Supercomputing Center | 53248 | 771700.00 | 1343200.00 | 0.575
4 | Sugon | Nebulae/Dawning TC3600 Blade/2560x (2 Intel Hexa Core X5650 + Nvidia Tesla C2050 GPU)/QDR Infiniband | National Supercomputing Center at Shenzhen | 2011 | Supercomputing Center | 52416 | 749200.00 | 1296320.26 | 0.578
5 | IBM | xSeries x3650M3/Intel Xeon X56xx 2.53 GHz/Giga-E | Network Company | 2011 | Internet Service | 113040 | 636985.00 | 1143965.00 | 0.557
6 | IPE, CAS | Mole-8.5 Cluster/320x2 Intel QC Xeon E5520 2.26 GHz + 320x6 Nvidia Tesla C2050/QDR Infiniband | IPE, CAS | 2010 | Scientific Computing | 33120 | 496500.00 | 1138440.00 | 0.436
7 | Sugon | Nebulae/Dawning TC3600 Blade/3040 x 2 Intel Hexa Core X5650/QDR Infiniband | Shenzhen Cloud Computing Center | 2011 | Cloud Computing | 36480 | 342300.00 | 389168.64 | 0.880
8 | IBM | xSeries x3650M3/Intel Xeon X56xx 2.93 GHz/Giga-E | Telecomm | 2011 | Industry | 36336 | 204754.40 | 425856.00 | 0.481
9 | IBM | xSeries x3650M2 Cluster/Intel Xeon QC E55xx 2.53 GHz/Giga-E | Network Company | 2011 | Internet Service | 34688 | 196228.00 | 351044.00 | 0.559
10 | Sugon | Magic Cube/Dawning 5000A/1920x4 AMD QC Barcelona 1.9GHz/DDR Infiniband/WCCS+Linux | Shanghai Supercomputing Center | 2008 | Supercomputing Center | 30720 | 180600.00 | 233472.00 | 0.774
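The efficiency column in the Top 10 list appears to be the ratio of Linpack to peak performance. The short Python sketch below recomputes it from the Gflops figures listed above; the short system labels and the names `TOP10` and `efficiency` are illustrative choices, not part of the original list.

```python
# Recompute Linpack efficiency (Rmax / Rpeak) for the 2011 China HPC Top 10.
# Tuples: (rank, short label, Linpack Gflops, peak Gflops, efficiency as listed).
TOP10 = [
    (1, "Tianhe-1A", 2566000.00, 4701000.00, 0.546),
    (2, "Sunway BlueLight", 795900.00, 1070160.00, 0.744),
    (3, "Tianhe-1A-HN", 771700.00, 1343200.00, 0.575),
    (4, "Nebulae (Shenzhen NSC)", 749200.00, 1296320.26, 0.578),
    (5, "IBM x3650M3 2.53 GHz", 636985.00, 1143965.00, 0.557),
    (6, "Mole-8.5", 496500.00, 1138440.00, 0.436),
    (7, "Nebulae (Shenzhen Cloud)", 342300.00, 389168.64, 0.880),
    (8, "IBM x3650M3 2.93 GHz", 204754.40, 425856.00, 0.481),
    (9, "IBM x3650M2", 196228.00, 351044.00, 0.559),
    (10, "Magic Cube", 180600.00, 233472.00, 0.774),
]


def efficiency(linpack_gflops, peak_gflops):
    """Fraction of theoretical peak achieved on Linpack."""
    return linpack_gflops / peak_gflops


for rank, label, rmax, rpeak, listed in TOP10:
    print(f"{rank:2d} {label:26s} computed {efficiency(rmax, rpeak):.3f}  listed {listed:.3f}")
```

Each recomputed value should agree with the listed one to three decimal places, which is a quick way to check a reconstructed row for transcription errors.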

China HPC TOP100 Authors with Tianhe 1A

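Jiachang Sun, Guoxing Yuan, and Yunquan Zhang of SAMSS on a site visit to Tianhe-1A, the petaflops supercomputing system developed by the National University of Defense Technology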

International Collaboration

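Prof. Jack Dongarra of the University of Tennessee (one of the TOP500 authors) and Prof. Thomas Sterling of LSU (the father of Beowulf), with SAMSS Vice Chairman Prof. Xuebin Chi and Secretary-General Prof. Yunquan Zhang, on a site visit to Tianhe-1A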


China HPC TOP100 Performance Analysis

• Tianhe-1A from the National University of Defense Technology (NUDT) again takes #1 in the China TOP100, with Linpack performance of 2.56 PFlops

• Total Linpack performance of the China TOP100 is 12 PFlops, 1.9 times that of 2010

• Every system has Linpack performance above 22.1 TFlops and peak performance above 25.6 TFlops

• Four of the top 10 systems (No.1, No.3, No.4, and No.6) are CPU+GPU heterogeneous clusters

• 97 of the 100 systems are clusters (98 in 2010)

Figure: Cluster Share in China HPC TOP100 (y-axis: Count)

Manufacturer Analysis

Category | Manufacturer | Systems | Share | Rmax [TF/s] | Rpeak [TF/s] | Average Efficiency | Num of Proc
Domestic | Sugon | 35 | 35% | 2848.18 | 4544.56 | 61.40% | 363864
Domestic | Inspur | 7 | 7% | 306.93 | 535.39 | 60.50% | 55748
Domestic | Sunway | 5 | 5% | 1087.80 | 1404.71 | 84.34% | 165512
Domestic | NUDT | 2 | 2% | 3337.70 | 6044.20 | 56.00% | 256000
Domestic | IPE | 1 | 1% | 496.50 | 1138.44 | 43.60% | 33120
Domestic | Lenovo | 1 | 1% | 102.80 | 145.29 | 70.80% | 12160
Domestic Total | | 51 | 51% | 8204.11 | 13812.59 | 62.90% | 886404
Import | IBM | 35 | 35% | 3264.31 | 6020.59 | 57.60% | 588524
Import | HP | 13 | 13% | 509.51 | 927.77 | 57.60% | 98056
Import | Dell | 1 | 1% | 23.40 | 44.93 | 72.43% | 6880
Import Total | | 49 | 49% | 3797.22 | 6993.28 | 57.50% | 690900
Total | | 100 | 100% | 12001.33 | 20805.87 | 59.63% | 1577304
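As a small worked example, the domestic and import subtotals above can be turned into shares by system count and by Linpack performance. The sketch below uses only the two subtotal rows; the names `SUBTOTALS` and `shares` are illustrative.

```python
# Share of the 2011 China HPC TOP100 by system count and by Linpack (Rmax),
# using the Domestic Total and Import Total rows of the manufacturer table.
SUBTOTALS = {
    "Domestic": {"systems": 51, "rmax_tf": 8204.11},
    "Import": {"systems": 49, "rmax_tf": 3797.22},
}


def shares(subtotals):
    """Return {category: (share of systems, share of Rmax)} as fractions."""
    total_systems = sum(row["systems"] for row in subtotals.values())
    total_rmax = sum(row["rmax_tf"] for row in subtotals.values())
    return {
        name: (row["systems"] / total_systems, row["rmax_tf"] / total_rmax)
        for name, row in subtotals.items()
    }


for name, (by_count, by_perf) in shares(SUBTOTALS).items():
    print(f"{name:8s} systems {by_count:.1%}  Linpack {by_perf:.1%}")
```

With these figures, domestic vendors hold roughly half of the systems but about two thirds of the aggregate Linpack performance, which is the contrast the two share charts are drawn to show.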

Figure: Manufacturer Share Trend (series: Import, Domestic)

Figure: Manufacturer Shares by Number of Systems (2011 China HPC TOP100, http://www.samss.org.cn)

Figure: Manufacturer Share by Performance (2011 China HPC TOP100, http://www.samss.org.cn)

HPC TOP100 Application Areas

App Area | # Systems | Share | Linpack [TF/s] | Peak [TF/s] | Efficiency | # of Proc
Internet Service | 21 | 21% | 2133.82 | 3963.18 | 53.30% | 404568
Government | 16 | 16% | 763.91 | 1450.00 | 52.00% | 155648
Education | 9 | 9% | 293.01 | 424.04 | 76.30% | 30740
SC Center | 8 | 8% | 5333.40 | 8892.26 | 66.84% | 502616
Telecomm | 7 | 7% | 474.31 | 923.01 | 53.20% | 88192
Engineering | 6 | 6% | 541.98 | 1026.46 | 54.10% | 95720
Scientific Computing | 5 | 5% | 742.70 | 1455.37 | 67.70% | 56300
On-line Gaming | 5 | 5% | 388.62 | 682.08 | 57.00% | 68648
Weather Forecasting | 5 | 5% | 202.46 | 236.82 | 85.20% | 22064
Energy | 4 | 4% | 112.02 | 208.98 | 59.30% | 13852
Cloud Computing | 3 | 3% | 436.35 | 571.11 | 63.60% | 44300
Service Provider | 2 | 2% | 213.88 | 383.26 | 55.80% | 37872
Power | 2 | 2% | 81.87 | 118.27 | 67.70% | 13440
Semi-Conductor | 2 | 2% | 79.20 | 150.37 | 53.50% | 15352
Bioinformatics | 2 | 2% | 78.93 | 147.76 | 53.00% | 8480
Video | 1 | 1% | 46.38 | 81.79 | 56.70% | 9600
Logistics | 1 | 1% | 31.03 | 58.40 | 53.10% | 5840
Earthquake Engineering | 1 | 1% | 23.27 | 32.69 | 71.20% | 3072
Total | 100 | 100% | 12001.33 | 20805.87 | 59.63% | 1577304

Application Areas Analysis

• Number of application areas: 18, up from previous years

• Number of systems: the top 3 areas are internet service, government, and education

• Total Linpack performance: the top 3 areas are supercomputing centers, internet service, and government

• Main users: internet service, government, supercomputing centers, and education

• New users: cloud computing, semiconductor
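The two top-3 rankings above can be reproduced from the application area table. The following Python sketch sorts the areas once by system count and once by aggregate Linpack; the table values are copied from the slide, while `AREAS` and `top_n` are illustrative names.

```python
# Rank 2011 China HPC TOP100 application areas by system count and by Linpack.
# Values: (number of systems, aggregate Linpack in TF/s) from the area table.
AREAS = {
    "Internet Service": (21, 2133.82),
    "Government": (16, 763.91),
    "Education": (9, 293.01),
    "SC Center": (8, 5333.40),
    "Telecomm": (7, 474.31),
    "Engineering": (6, 541.98),
    "Scientific Computing": (5, 742.70),
    "On-line Gaming": (5, 388.62),
    "Weather Forecasting": (5, 202.46),
    "Energy": (4, 112.02),
    "Cloud Computing": (3, 436.35),
    "Service Provider": (2, 213.88),
    "Power": (2, 81.87),
    "Semi-Conductor": (2, 79.20),
    "Bioinformatics": (2, 78.93),
    "Video": (1, 46.38),
    "Logistics": (1, 31.03),
    "Earthquake Engineering": (1, 23.27),
}


def top_n(areas, column, n=3):
    """Names of the n largest areas by column (0 = systems, 1 = Linpack)."""
    ranked = sorted(areas.items(), key=lambda item: item[1][column], reverse=True)
    return [name for name, _ in ranked[:n]]


print("By systems:", top_n(AREAS, 0))
print("By Linpack:", top_n(AREAS, 1))
```

Supercomputing centers account for only 8 systems yet lead by performance, since the four largest machines in the Top 10 are all installed at such centers.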

Figure: Application Area Trend

Figure: Application Area System Shares (2011 China HPC TOP100, http://www.samss.org.cn)


China 863 Program

• The National High-tech R&D Program (863 Program)
  • Proposed by 4 senior Chinese scientists and approved by former leader Mr. Deng Xiaoping in March 1986

• One of the most important national science and technology R&D programs in China

• Now a regular national R&D program planned in 5-year terms; the 11th five-year plan has just finished and the 12th five-year plan is beginning.

Overview of 863 key project on HPC and Grid

• "High performance computer and core software"
  • 4-year project, May 2002 to Dec. 2005
  • 100 million Yuan funding from the MOST
  • More than 2x associated funding from local government, application organizations, and industry
  • Outcomes: China National Grid (CNGrid)

• "High productivity Computer and Grid Service Environment"
  • Period: 2006-2010
  • 940 million Yuan from the MOST and more than 1B Yuan matching money from other sources

Major R&D activities

• Developing Petaflops Supercomputers

• Building up a grid service environment--CNGrid

• Developing Grid and HPC applications in selected areas

Two phase development

• First phase: two 100TFlops machines
  • Dawning 5000A for SSC
  • Lenovo DeepComp 7000 for SCCAS

• Second phase: three Petaflops machines
  • Tianhe-1A: NUDT/Inspur/Tianjin Supercomputing Center
  • Dawning 6000: ICT/Dawning/South China Supercomputing Center (Shenzhen)
  • Sunway BlueLight: National Engineering Center on Parallel Computer/Shandong Supercomputing Center

Dawning5000A (2008)

• China surpassed Japan in HPC performance
• ICT regained the performance crown in China, following Machine-757 (1983) and Dawning1000 (1995)
• Peak: 233.5TFlops
• Linpack: 180.6TFlops (Eff. 77.34%)
• Power: <800KW
• MPI Latency: 1.6us
• Top10, Nov 2008

Dawning 5000A

• Constellation based on AMD multicore processors
• Low power CPU and high density blade design
• High performance InfiniBand switch
• 233.472TFlops peak performance, 180.6TFlops Linpack performance
• The 10th in TOP500 in Nov. 2008, the fastest machine outside the USA

Lenovo DeepComp 7000

• Hybrid cluster architecture using Intel multicore processors
• Two sets of interconnect: InfiniBand and Gb Ethernet
• SAN connection between I/O nodes and disk array
• 145.965TFlops peak performance
• 106.5TFlops Linpack performance
• The 19th in TOP500 in Nov. 2008

Dawning Nebulae: 3PFlops (2010)

Ranked Top500 #2, Linpack 1.271PFlops

Dawning 6000

• Hybrid system
  • Service unit (Nebulae)
    • 9600 Intel 6-core Westmere processors
    • 4800 nVidia Fermi GPGPUs
    • 3PF peak performance
    • 1.27PF Linpack performance
    • 2.6 MW
  • Computing unit
    • Domestic processor

Tianhe-1A

• Hybrid system
  • 14336 general purpose units: Intel 6-core processors
  • 7168 acceleration units: NVIDIA Fermi GPUs
  • 2048 service units: FT-1000 processors
  • 80Gbps NUDT proprietary TH-Net (hierarchical fat tree)
  • Kylin Linux OS
  • MPI + OpenMP/Pthread + CUDA/OpenCL
• 4.7PFlops peak, 2.57PFlops Linpack (>50% Eff.)
• 262TB memory, 2PB storage
• Water cooling, 4.04MW (635.15MF/W)
• 120 Compute, 14 Storage, 6 Communication
• Installed in Aug. 2010
• TOP500 No.1 in Nov. 2010
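The 635.15 MF/W figure appears to follow directly from Linpack performance divided by power draw. A minimal Python sketch of that arithmetic is below; it assumes the unrounded Linpack value of 2566000 Gflops from the Top 10 list rather than the rounded 2.57 PFlops, and `mflops_per_watt` is an illustrative name.

```python
# Power efficiency in MFlops per watt from Linpack performance and power draw.
def mflops_per_watt(linpack_gflops, power_megawatts):
    """Convert Gflops and MW to MFlops and W, then divide."""
    mflops = linpack_gflops * 1e3
    watts = power_megawatts * 1e6
    return mflops / watts


# Tianhe-1A: 2566000 Gflops Linpack (Top 10 list), 4.04 MW (this slide).
tianhe_1a = mflops_per_watt(2566000.0, 4.04)
print(f"Tianhe-1A: {tianhe_1a:.2f} MFlops/W")
```

The same function can be applied to any other system in the deck for which both a Linpack figure and a power figure are given.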

Figure: TH-1A, from chips to entire system. Labels: Chips (FT-1000, X5670, M2050); Quad CPU blade; Twin GPU blade; (4CPU+2GPU); Compute node; Rack (16 x cn); Cabinet (4 x rack); TH-Net; On-line storage; TH-1A System

TH-1A software stack

Sunway BlueLight MPP

• Designed by the National Engineering Center for Parallel Computer
• Developed for the National Supercomputing Center (Shandong), Jinan, China
• 8704 CPUs, 1.07 Petaflops peak performance
• Linpack 795.9TFlops, 74.37% efficiency, 741.06MFlops/W
• InfiniBand QDR 40Gbps, power consumption 1.07MW, water cooling
• 16-core processor SW1600, designed in China
• Released at HPC China 2011 in Jinan

Sunway BlueLight Architecture

Parameters:

• SW1600 CPU: 16 cores, 975~1100MHz, 124.8~140.8Gflops
• Fat tree, QDR 4x10Gbps InfiniBand, MPI latency 2us
• SWCC/C++/Fortran/UPC/MPICC/Mathematical Library
• Storage: 2PB, peak I/O 200GB/s, IOR (~60GB/s)
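The SW1600 peak range quoted above is consistent with a simple cores × clock × flops-per-cycle product. The sketch below assumes 8 floating-point operations per core per cycle; that value is inferred from 124.8 Gflops / (16 cores × 0.975 GHz) and is not stated on the slide. The count of 8575 processors is taken from the Top 10 list entry for this system.

```python
# Peak performance of a multicore chip and of a system built from such chips.
# ASSUMPTION: 8 flops per core per cycle, inferred from the quoted 124.8 Gflops.
FLOPS_PER_CYCLE = 8


def chip_peak_gflops(cores, clock_ghz, flops_per_cycle=FLOPS_PER_CYCLE):
    """Theoretical peak of one chip in Gflops."""
    return cores * clock_ghz * flops_per_cycle


low = chip_peak_gflops(16, 0.975)   # SW1600 at 975 MHz
high = chip_peak_gflops(16, 1.100)  # SW1600 at 1100 MHz
system_peak_gflops = 8575 * low     # 8575 processors, per the Top 10 list

print(f"SW1600 peak: {low:.1f} to {high:.1f} Gflops")
print(f"System peak: {system_peak_gflops / 1e6:.2f} PFlops")
```

Under that assumption the product reproduces both ends of the per-chip range and the 1070160 Gflops system peak given in the Top 10 list.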


Figure: Number of Users Profile on TH-1A (pie chart)
Legend: Basic science research (physics, chemistry, astronomy, etc.); Bio-medical research; New material, new energy research; Computational fluid dynamics; Engineering design, simulation and analysis; Environment science; Weather and climate forecasting; Petroleum exploration; Animation
Slice values: 37%, 20%, 10%, 8%, 7%, 6%, 2%, 2%, 8%

Figure: Resource Usage Profile on TH-1A (pie chart)
Legend: Petroleum exploration; Bio-medical research; New material, new energy research; Environment science; Basic science research (physics, chemistry, astronomy, etc.); Computational fluid dynamics; Weather and climate forecasting; Animation; Engineering design, simulation and analysis
Slice values: 24%, 7%, 7%, 2%, 4%, 5%, 8.2%, 41.8%

Parallel Computing Software Platform for Astrophysics

• Joint work
  • Shanghai Astronomical Observatory, CAS (SHAO)
  • Institute of Software, CAS (ISCAS)
  • Shanghai Supercomputer Center (SSC)

• Building a high performance parallel computing software platform for astrophysics research, focusing on planetary fluid dynamics (thermal convection in the Earth's outer core) and N-body problems

• New parallel computing models and parallel algorithms were studied, validated, and adopted to achieve high performance.

Figure: Software Architecture of the Software Platform for Astrophysics. Components: Physical and Mathematical Model; Parallel Computing Model; Numerical Methods; Fluid Dynamics; N-body Problem; Improved Preconditioner; Improved Lib. for Collective Communication; SpMV; PETSc; Aztec; FFTW; GSL; MPI; OpenMP; Fortran; C; Lustre; Web Portal on CNGrid; Software Development; Data Processing; Scientific Visualization; 100T Supercomputer

• Early performance evaluation of the Aztec code and the PETSc code on Dawning 5000A:

• For the 80×80×50 mesh, the execution time of the Aztec program is 4-7 times that of the PETSc version, 6 times on average.

• For the 160×160×100 mesh, the execution time of the Aztec program is 2-5 times that of the PETSc version, 4 times on average.

PETSc Optimized Version 1 (Speedup 4-6)

Figure: Mesh 160×160×100 (Dawning 5000A). Runtime (s, 0 to 1600) vs. processor cores (32 to 2048); series: Aztec, PETSc

Figure: Mesh 80×80×50 (Dawning 5000A). Runtime (s, 0 to 400) vs. processor cores (16 to 2048); series: Aztec, PETSc

Method 1: Domain Decomposition Ordering Method for Field Coupling

Method 2: Preconditioner for Domain Decomposition Method

Method 3: PETSc Multi-physics Data Structure

PETSc Optimized Version 2 (Speedup 15-26)

Left: mesh 128 x 128 x 96; right: mesh 192 x 192 x 128
Computation speedup: 15-26
Strong scalability: original code normal, new code ideal
Test environment: BlueGene/L at NCAR (HPCA2009)

2012/2/16

Strong Scalability on Dawning 5000A

Figure: Dawning 5000A (160×160×100 mesh size). Time (seconds, 0 to 1600) vs. processor cores (32 to 8192); series: Aztec, PETSc

Figure: Strong Scalability (rotmp linear), 192x192x128 mesh. Time (s, log scale, 1 to 1000) vs. number of processor cores (64 to 8192); series: BG/L, Dawning 5000A, DeepComp 7000
Data labels: 433.6, 212.8, 98.5, 51.1, 26.1, 14.4, 8.3, 4.7; 12.0, 144.8, 65.5, 19.2, 32.3, 69.3, 157.7, 257.1, 13.5, 23.8, 344.7
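Strong-scaling charts like this one are usually summarized as speedup and parallel efficiency relative to the smallest run. The Python sketch below shows that calculation. It assumes, purely for illustration, that the eight data labels 433.6 down to 4.7 are one series measured at 64 through 8192 cores in order; the chart residue does not confirm which machine that series belongs to or how the labels map to core counts.

```python
# Speedup and parallel efficiency for a strong-scaling run (fixed problem size).
# ASSUMPTION: these eight times (s) pair with these core counts, in order.
CORES = [64, 128, 256, 512, 1024, 2048, 4096, 8192]
TIMES_S = [433.6, 212.8, 98.5, 51.1, 26.1, 14.4, 8.3, 4.7]


def strong_scaling(cores, times):
    """Return (cores, speedup, efficiency) tuples relative to the first run."""
    base_cores, base_time = cores[0], times[0]
    rows = []
    for p, t in zip(cores, times):
        speedup = base_time / t
        efficiency = speedup / (p / base_cores)
        rows.append((p, speedup, efficiency))
    return rows


for p, speedup, eff in strong_scaling(CORES, TIMES_S):
    print(f"{p:5d} cores  speedup {speedup:6.2f}  efficiency {eff:.1%}")
```

Under that pairing, a 128-fold increase in cores yields roughly a 92-fold speedup, or about 72% parallel efficiency relative to the 64-core run.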


Strong Scalability on TianHe-1A

TianHe-1A Applications Case Study (CPU only)

• A fully implicit shallow water atmospheric model (ISCAS)
  • Using 82,944 cores
  • Parallel efficiency 60%
  • #unknowns: 680M

• Petroleum seismic data processing (BGP)
  • GeoEast-lightning single/double-way wave prestack depth migration software
  • Using 85,860 cores
  • 24.6TB data
  • 16 hours

TianHe-1A Application Case Study (CPU+GPU)

• Direct Numerical Simulation of Turbulent Flow (PKU)
  • GPU-accelerated FFT solver (PKUFFT)
  • Taylor micro-scale Reynolds number up to 1164
  • Grid resolution up to 14336^3
  • 7168 nodes, >3.2 million CUDA cores (>100,000 GPU cores)
  • 30TFlops (SP) / 17TFlops (DP) sustained FFT performance

Figure legend: Jaguar; PKUFFT (with GPU); MKL (without GPU)

TianHe-1A Application Case Study (CPU+GPU)

• High speed particle collision system simulation
  • Force calculation is accelerated by GPU
  • 21.9x speedup on a single GPU compared to a single CPU core
  • Excellent weak and strong scalability with up to 4096 nodes (106,496 CPU/GPU cores) for problems with up to 11.16 billion atoms
  • Embedded Atom Method potential; scaling to the whole system is expected
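The core counts quoted in the CPU+GPU case studies can be reproduced from per-node figures. The sketch below assumes each TH-1A compute node has two six-core Xeon X5670 CPUs and one Tesla M2050 (consistent with the 7168x2 CPUs and 7168 GPUs in the Top 10 list), and that an M2050 has 448 CUDA cores organized as 14 streaming multiprocessors; counting one "GPU core" per multiprocessor is an interpretation that happens to match the quoted totals.

```python
# Core counts for N TH-1A compute nodes under the stated per-node assumptions.
CPU_CORES_PER_NODE = 2 * 6      # two six-core Xeon X5670
CUDA_CORES_PER_GPU = 448        # Tesla M2050
SMS_PER_GPU = 14                # streaming multiprocessors per M2050
GPUS_PER_NODE = 1


def core_counts(nodes):
    """Return CPU cores, CUDA cores, GPU multiprocessors, and CPU+SM total."""
    cpu = nodes * CPU_CORES_PER_NODE
    cuda = nodes * GPUS_PER_NODE * CUDA_CORES_PER_GPU
    sms = nodes * GPUS_PER_NODE * SMS_PER_GPU
    return {"cpu": cpu, "cuda": cuda, "sms": sms, "cpu_plus_sms": cpu + sms}


full = core_counts(7168)
partial = core_counts(4096)
print(full)
print(partial)
```

With 7168 nodes this gives 86,016 CPU cores, about 3.2 million CUDA cores, and just over 100,000 multiprocessors; with 4096 nodes the CPU-plus-multiprocessor total is 106,496, matching the figure quoted for the particle collision runs.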

TianHe-1A Application Case Study (CPU+GPU)

• Trans-scale Simulation of Silicon Deposition Process (IPE, CAS)
  • Scalable bond-order potential (BOP) for the molecular dynamics simulation of crystalline silicon
  • 26 nm × 54 nm × 1560000 nm (1.56 mm), 110.1 billion atoms
  • Peak performance 7.38PFlops (SP) on 7168 nodes (Tesla M2050 + 2-way Xeon X5670)
  • 1.17PFlops in SP plus 92.1TFlops in DP on 7168 GPUs and 86,016 CPU cores, 5TB memory
  • 1.87PFlops (SP) on 7168 GPUs (25.3% of peak)
  • 758 flop per step per atom, 44.53 s per 1000-step run

Figure labels: 1.56 mm, 0.54 nm
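The 1.87 PFlops sustained figure for the silicon simulation can be cross-checked against the per-atom operation count and the timing given on the slide. The Python sketch below performs that arithmetic; the function and variable names are illustrative.

```python
# Sustained performance = atoms x flop per atom per step x steps / wall time.
def sustained_pflops(atoms, flop_per_atom_step, steps, seconds):
    """Sustained floating-point rate in PFlops."""
    total_flop = atoms * flop_per_atom_step * steps
    return total_flop / seconds / 1e15


rate = sustained_pflops(atoms=110.1e9, flop_per_atom_step=758, steps=1000, seconds=44.53)
fraction_of_peak = rate / 7.38  # 7.38 PFlops single-precision peak from the slide

print(f"Sustained: {rate:.2f} PFlops ({fraction_of_peak:.1%} of peak)")
```

The result lands at about 1.87 PFlops and roughly a quarter of the quoted single-precision peak, in line with the figures on the slide.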


Demand for petascale HPCs will be growing in the next 5 years

• Ten public supercomputing centers
  • Beijing (CAS), Tianjin, Shandong, Shanghai, Shenzhen, Chengdu, Hu'nan, Wuhan, Guangzhou, Chongqing
  • Covering the developed areas of China
  • Growing in industry design and simulation

• Five private centers in mature fields
  • Petroleum, Meteorology, Aerospace, Defense, Energy
  • Related to national security

• Four centers in emerging areas
  • Cyberspace security, Internet service, Sensing China, Triple-play

Performance Development Trend of China TOP100 HPC

Figure: Performance Development Trend of China HPC (1993-2011). GFlops (log scale, 1 to 1E+13) vs. year (1993 to 2025); series: No.1 Linpack, No.1 Peak, Total Perf., each with a trend line

Trend & Outlook (1)

• China HPC performance increase, 1993-2011
  • 1993-1996: slow, steady growth
  • 1996-1999: first big jump
  • 1999-2001: slow, steady growth
  • 2001-2005: another period of rapid growth
  • 2005-2007: slow, steady growth again
  • 2008-2010: start of another active growth cycle, expected to last about 2-3 years
  • From 2011: start of a slow, steady period, expected to last about 2-3 years

Trend & Outlook (2)

Previous predictions (and what actually happened)

• 2007-2008: a system with peak performance of 100 TFlops appears (reality: Oct 2008)

• 2008-2009: total Linpack performance exceeds 1 PFlops (reality: Oct 2008)

• 2010-2011: a system with peak performance of 1 PFlops appears (reality: Oct 2009, ahead of schedule)

• 2011-2012: total Linpack performance reaches 10 PFlops (reality: Oct 2011)

Trend & Outlook (3)

Future predictions

• 2012-2013: a system with peak performance of 10 PFlops appears

• 2014-2015: a system with peak performance of 100 PFlops appears

• 2013-2014: total Linpack performance reaches 100 PFlops
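To put the total-Linpack prediction in context, one can compute the annual growth factor it implies. The sketch below is not the authors' forecasting method; it simply solves start × g^years = target, taking the 2011 total of 12001.33 TFlops from the TOP100 tables as the starting point.

```python
# Annual growth factor g needed so that start * g**years == target.
def implied_annual_growth(start, target, years):
    """Constant yearly multiplier that reaches target from start in `years`."""
    return (target / start) ** (1.0 / years)


TOTAL_2011_TF = 12001.33     # total Linpack of the 2011 list, in TFlops
TARGET_TF = 100_000.0        # 100 PFlops

growth_by_2013 = implied_annual_growth(TOTAL_2011_TF, TARGET_TF, years=2)
growth_by_2014 = implied_annual_growth(TOTAL_2011_TF, TARGET_TF, years=3)

print(f"Reach 100 PFlops by 2013: {growth_by_2013:.2f}x per year")
print(f"Reach 100 PFlops by 2014: {growth_by_2014:.2f}x per year")
```

For comparison, the total grew 1.9 times from 2010 to 2011, so meeting the 2013-2014 window would require growth at or somewhat above that rate.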


• With the right strategies, China won the HPC Olympic Games in 2011, and HPC is really helping the scientific and economic development of China.

• Real HPC applications still lag behind those of the US, Europe, and Japan.

• On TianHe-1A, several applications can scale up to 80,000 CPU cores.

• The growth rate of China HPC performance is the fastest.

• There will be at least 19 major petaflops supercomputing centers within 5 years.

Summary

• The first petaflops supercomputer powered entirely by domestic processors designed in China was released at HPC China 2011 in Jinan.

• According to TOP100 predictions, a supercomputer with 10 petaflops peak performance will appear before 2013.

• According to TOP100 predictions, a supercomputer with 100 petaflops peak performance may appear before 2015.

Summary

Thank You

• Thanks to Yutong Lu, Chao Yang, and Wei Ge
• Contact: Yunquan Zhang, Ph.D.
• Emails: zyq@mail.rdcps.ac.cn, samss@mail.rdcps.ac.cn