Page 1

Beyond Earth Simulator

Makoto Tsukakoshi (塚越 眞)
Information Systems Department, Center for Earth Information Science and Technology
Japan Agency for Marine-Earth Science and Technology (JAMSTEC)

September 11, 2017
International Computing for the Atmospheric Sciences Symposium (iCAS2017)

Page 2

Japan Agency for Marine-Earth Science and Technology

Page 3

The seven main research and development issues of the third mid-term plan

During the third mid-term plan, we are addressing seven research and development issues with all our strength, in order to promote strategic and focused research and development based on national and social needs.

• Exploring untapped submarine resources
• Detecting signals of global environmental change
• Understanding seismogenic zones, and contributing to disaster mitigation
• Marine Bioscience: exploring the unknown extreme biosphere to solve the mystery of life
• Ocean drilling: getting to know the Earth from beneath the seabed
• Information Science: predicting the Earth's future by simulations
• Construction of a research base to spawn the ocean frontier

Page 4

Typhoon-ocean interaction using a non-hydrostatic atmosphere-wave-ocean coupled model

Typhoon Vera in 1959, by Kazuhisa Tsuboki (2015), "Typhoon-Ocean Interaction Study Using the Coupled Atmosphere-Ocean Non-hydrostatic Model: With Careful Consideration of Upper Outflow Layer Clouds of Typhoon"

Page 5

• Earth Simulator:
- Developed by NASDA, JAERI and JAMSTEC (563.4* oku yen, i.e. 56.34 billion yen)
- Operation started in March 2002, and it was immediately recognized as the #1 supercomputer
- “It will keep #1 for 2 years in peak and 5 years in effective performance” (H. Miyoshi)

* including facilities and building

Page 6

Earth Simulator changed its role when the “K computer” was completed

Council for Science and Technology Policy (CSTP) report, November 25, 2005 (translated from the Japanese original):

After the completion of the Leading Edge High Performance Supercomputer (i.e. the K computer), the existing supercomputers should continue to carry out their roles, each centered on the field in which it excels and according to its purpose and performance. For example, the Earth Simulator should be responsible mainly for marine and earth science, making effective and efficient use of its accumulated application software assets. Under this division of roles, they should meet the demand for HPC in Japan, which is expected to keep growing.

Operated as national infrastructure through HPCI*

* “Innovative High Performance Computing Infrastructure” initiative

[Figure: HPCI Strategy (MEXT), showing ES and the K computer]

Page 7

3 Generations of Earth Simulator since 2002

• March 2002: Earth Simulator (1st), 40 TFLOPS / 10 TB
- Foundation of Marine-Earth Simulation
- Proof of feasibility of simulation and prediction
- “Computenik” (Jack Dongarra, NY Times, April 20, 2002)

• March 2009: Earth Simulator (ES2), 131 TFLOPS / 20 TB
- Peak performance 3x
- System for Marine-Earth Science

• June 2015: Earth Simulator, 1.3 PFLOPS / 320 TB
- Peak/effective performance 10x
- 8x to 10x effective performance based on application programs*
- System for prediction and creation of Marine-Earth Science data and information for society
- Utilize simulation, prediction and data for society
- Challenge new subjects in Marine-Earth Science

* Single-program turnaround time performance: 2.1x to 2.8x. Number of programs: 4x. Together: system throughput 8x to 10x.

Effective power consumption labels on the chart, in order of appearance: about 5 MW, about 3 MW, about 1.5 MW

[Figure: peak performance over time on a logarithmic axis from 1 TFLOPS to 10 PFLOPS, with a separate track labelled “Flagship supercomputer for All Science”]
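The throughput figure on this slide is the product of two factors. A minimal Python sketch of that arithmetic, assuming (as the slide's "=" suggests) that system throughput is simply the per-program turnaround speedup multiplied by the increase in the number of programs run; the function name is illustrative and not from the talk:

```python
def system_throughput_gain(tat_speedup: float, program_count_ratio: float) -> float:
    """Assumed model: throughput gain = per-program speedup x number-of-programs ratio."""
    return tat_speedup * program_count_ratio

# Slide values: turnaround-time speedup of 2.1x to 2.8x, and 4x as many programs.
low = system_throughput_gain(2.1, 4)
high = system_throughput_gain(2.8, 4)
print(f"system throughput gain: {low:.1f}x to {high:.1f}x")
```

Under this simple model the range comes out slightly wider than the rounded "8x to 10x" quoted on the slide, which suggests the slide figure is an approximation.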

Page 8

JAMSTEC Information System Outline

[Diagram labels]
• Earth Simulator (NEC SX-ACE)
• Large Scale Shared Memory Server (SGI UV2000)
• Supercomputer systems (ICE-X, UV2000, SX-9)
• (HPE Apollo 6000, 380 nodes in 2018)
• Mass Storage System
• Visualization Server
• Service servers
• User terminals
• JAMSTEC backbone network
• Earth Simulator backbone
• SINET (Science Information NETwork 4), 1 to 10 Gbps
• Link speeds shown: 10 Gbps × 2, 40 Gbps × 2

Page 9

Earth Simulator Outline

Main Computer: Computing Nodes (5,120) + Internode Network
Model: NEC SX-ACE

Total Performance
- Number of nodes: 5,120
- Peak performance: 1.31 PFLOPS
- Memory bandwidth: 1.31 PB/s
- Memory size: 320 TB

Node Performance
- Number of CPUs: 1 (4 cores)
- Peak performance: 256 GFLOPS (64 GFLOPS x 4 cores), double precision
- Memory bandwidth: 256 GB/s
- Memory size: 64 GB

Storage
- Work: 13.5 PB
- /home: 87.6 TB
- /data: 4.7 PB

Other diagram labels: Mass Storage System, Frontend Server, Large Scale Shared Memory Server (UV2000), Pre and Post System, JAMSTEC Backbone network, 4 servers, 17 PB
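The system totals follow from the per-node figures. A short Python sketch of that check, using only the numbers on this slide; the unit conventions (decimal prefixes for FLOPS and bandwidth, 1 TB = 1024 GB for memory) are assumptions chosen because they reproduce the quoted totals:

```python
NODES = 5120
NODE_PEAK_GFLOPS = 64 * 4   # 64 GFLOPS per core x 4 cores = 256 GFLOPS per node
NODE_MEM_BW_GBS = 256       # GB/s per node
NODE_MEM_GB = 64            # GB per node

# Decimal prefixes assumed for performance and bandwidth (1 P = 1e6 G).
total_peak_pflops = NODES * NODE_PEAK_GFLOPS / 1e6
total_mem_bw_pbs = NODES * NODE_MEM_BW_GBS / 1e6
# Binary prefix assumed for memory size (1 TB = 1024 GB).
total_mem_tb = NODES * NODE_MEM_GB / 1024

print(f"peak: {total_peak_pflops:.2f} PFLOPS")
print(f"memory bandwidth: {total_mem_bw_pbs:.2f} PB/s")
print(f"memory size: {total_mem_tb:.0f} TB")
```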

Page 10

Benchmark Test showed ES's value

(Performance ratio vs. ES2. Both ES2 and SX-ACE deploy the same number of CPUs.)

Benchmark codes: MIROC, MIROC, NICAM, Specfem3D, RSGDX, MSSG, ODA
Application areas: global warming, ocean/data assimilation, earthquake, micro-scale atmosphere, global atmosphere

• Average (ASIS): 2.1
• Average (TUNE): 2.8

The same number of CPUs (sockets) DOUBLES the performance with as-is code, and reaches 2.8x with further tuning.

This means that effective throughput performance is > 10x (turnaround time 2x to 2.8x, number of jobs 4x).

Indeed, executed FLOP counts increased by 12x, from 260 EFLOPs (executed in FY2012) to 3,119 EFLOPs (executed in FY2016).
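The 12x figure can be checked directly from the two annual totals. A minimal Python sketch; the variable names are illustrative:

```python
EXECUTED_EFLOP_FY2012 = 260    # executed on ES2 in FY2012
EXECUTED_EFLOP_FY2016 = 3119   # executed on the current Earth Simulator in FY2016

growth = EXECUTED_EFLOP_FY2016 / EXECUTED_EFLOP_FY2012
print(f"executed FLOP growth, FY2012 to FY2016: {growth:.1f}x")
```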

Page 11

Effective Resource Comparison with K computer

K computer usage in FY2015
• Earth Science: 18.8%
• Engineering: 22.0%
• Biology, Life Science: 15.5%
• Atomic Energy, Fusion: 1.0%
• Mathematics: 1.7%
• Particle Physics, Astronomy: 16.4%
• Material Science, Chemistry: 24.3%
• Others: 0.3%

Earth Simulator in FY2016
• Earth Science: 90.3%

Equivalent to 28% of Next Earth Simulator

Chart labels: ~28%, ~2/3 (estimation)

Page 12

Earth Simulator Resource Allocation/Usage (FY2016)

Earth Simulator resource allocation (71 projects) [node hour]
• JAMSTEC proposed project (23)
• Strategic project with special support (6)
• Fee based usage (8)
• Government contract (7)
• External proposed project (27)
Chart labels: 40%, 10%, 30%, 20%

Earth Simulator usage by application [node hour]
• Environmental Change
• Earth Internal, Earthquake, Tsunami
• Math Sci & Engineering
• Industry & Innovation
• Life Science
Chart labels: 70.0%, 20.3%, 5.4%, 3.7%, 0.7%

Institutes using Earth Simulator
• Outside Japan
• Industry
• Government Agency
• University

Page 13

Earth Simulator Operation Highlights

• Stable operation in FY2016
- Availability: 99.86%
- Utilization: 89.07%
- Total of 3,119 EFLOPs* executed in FY2016 (7.5% of total peak performance: 24 x 366 x all nodes)

• Power efficiency
- Improved 30% from the previous ES2 (NEC SX-9)
- Continued effort to optimize the cooling
- “Total executed FLOPs / total power” (including facilities): 171 TFLOPs/kWh in 2016, improved 9.2% from 2015

* Using 2016 Yellowstone CPU hours / (24 x 366) (81%), 3,119 EF requires 2.3% of peak performance on Cheyenne
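The 7.5% figure compares executed operations with what the machine could have executed at peak over the whole year. A Python sketch of that calculation, assuming the 1.31 PFLOPS system peak from the Earth Simulator outline and the slide's 24 x 366 hours:

```python
PEAK_FLOPS = 1.31e15       # system peak: 1.31 PFLOPS (operations per second)
HOURS = 24 * 366           # hours in the period, as written on the slide
EXECUTED_FLOP = 3119e18    # 3,119 EFLOPs executed (operation count)

# Operations the system would execute running at peak for the whole period.
peak_capacity_flop = PEAK_FLOPS * HOURS * 3600
fraction_of_peak = EXECUTED_FLOP / peak_capacity_flop

print(f"executed / peak capacity: {fraction_of_peak:.1%}")
```

Note that the result is a year-long average over all nodes, so it already includes idle time and maintenance, not only the efficiency of individual jobs.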

Page 14

(Proposal) Use actual FLOPs per watt to evaluate the total energy efficiency of computing centers

Efficiency should be “actually achieved performance / total energy consumed”.
• Peak or Linpack performance (“Green500”): does not represent real workloads.
• HPCG or other benchmark suites: better, but they do not fit each center's application spectrum (each center should have been optimized for its own application workloads).
• Benchmarks show only capability. They do not represent the center's operational efforts.
• Power consumption must be actual and total, including cooling.

Evaluate the energy efficiency by “executed FLOPs per watt”, where the energy is measured as the total, inclusive amount.

(Example) In 2016 (Jan-Dec), Earth Simulator (entire system including cooling facilities) actually achieved 178,956 GFLOPs/kWh (7.6% of theoretical peak x 100% non-stop operation). Using the HPCG/peak performance ratio (5.2%) and the annual GWh from the 2014 annual report, the efficiency of the “K computer” is estimated at 109,221 GFLOPs/kWh.
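The proposed metric is a plain ratio of executed operations to total facility energy. A minimal Python sketch under stated assumptions: the function name is illustrative, the 1.31 PFLOPS peak is taken from the system outline, and the energy and mean power printed below are back-calculated from the slide's own figures rather than values reported in the talk:

```python
def executed_flop_per_kwh(executed_flop: float, total_energy_kwh: float) -> float:
    """Executed floating-point operations per kWh of total energy (IT plus cooling)."""
    if total_energy_kwh <= 0:
        raise ValueError("total_energy_kwh must be positive")
    return executed_flop / total_energy_kwh

PEAK_FLOPS = 1.31e15                 # system peak, operations per second
HOURS_2016 = 24 * 366                # calendar year 2016
REPORTED_FLOP_PER_KWH = 178_956e9    # 178,956 GFLOPs/kWh from the slide

# Operations executed at 7.6% of peak with non-stop operation.
executed_flop = 0.076 * PEAK_FLOPS * HOURS_2016 * 3600

# Derived quantities (not stated in the source): the total energy and mean
# power that would make the metric equal the reported value.
implied_energy_kwh = executed_flop / REPORTED_FLOP_PER_KWH
implied_mean_power_mw = implied_energy_kwh / HOURS_2016 / 1000

print(f"implied total energy: {implied_energy_kwh / 1e6:.1f} GWh")
print(f"implied mean power:   {implied_mean_power_mw:.1f} MW")
```

A center applying the metric would go the other way: take metered total energy and the executed operation count from its accounting system and call `executed_flop_per_kwh` directly.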

Page 15

Goal: “Marine-Earth Informatics”

1. Produce large and accurate data
• Data from observation, experimentation and simulation
• Data of various forms (structured/unstructured, images, texts)
• Building large and precise data sets by accurate observation and simulation, applying the latest data handling methods

2. Create information from large and diverse data
• Development of computing methods to deal with large data with high performance
• Applying ML, pattern recognition, and analysis of precursor characteristics of rare events

3. Present new applications and uses of information
• New use cases of information for government and industry
• New application areas

4. Develop ICT technology to handle large and complex data
• Integrating simulation and data analysis
• Interactive real-time visualization
• Leading-edge high-performance platforms and devices

Page 16

Database for Policy Decision making for Future climate change (d4PDF)

• d4PDF is a database including both historical and +4K future climate simulation results. It provides probabilistic information on climate change in extreme events through high-resolution large-ensemble simulations with a 60 km AGCM and a 20 km NHRCM.
• 2.6M node hours* and 2 PB of output on Earth Simulator in 2015, by the MRI and MIROC team.

* 9% of entire resources in 2015

MRI-AGCM (60 km), downscaled to MRI-NHRCM (20 km)

Historical simulations: 6,000 (AGCM) + 3,000 (RCM) years
・60 years (1950-2010)
・100 members (AGCM)
・50 members (NHRCM)
(SST perturbations representing observational error)

+4K future climate simulations: 5,400 years (AGCM+RCM)
・60 years (1950-2010)
・90 members (AGCM)
・90 members (NHRCM)
(6 ΔSST from 6 CMIP5 models x 15 SST perturbations)

[Figure: histogram of daily mean precipitation at Tokyo (60 km AGCM). (a) Present day: frequency (%) against daily precipitation (mm/day). (b) Changes in the +4K world: ratio of frequency (+4K/present) against daily precipitation (mm/day). Markers: 10yr, 100yr.]

Shiogama, H. (2016) Database for Policy Decision making for Future climate change (d4PDF)
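The simulated-year totals follow from the ensemble design. A short Python sketch of the arithmetic, using the member counts and the 60-year span quoted on this slide; reading the 5,400-year future figure as 90 members x 60 years per model is an assumption:

```python
YEARS_PER_MEMBER = 60   # length of each ensemble member, as quoted on the slide

# Historical experiment: ensemble members per model.
historical_members = {"AGCM": 100, "NHRCM": 50}
historical_years = {
    model: members * YEARS_PER_MEMBER for model, members in historical_members.items()
}

# +4K future experiment: 6 warming patterns (delta-SST) x 15 SST perturbations.
future_members = 6 * 15
future_years = future_members * YEARS_PER_MEMBER

print("historical simulated years:", historical_years)
print("future members:", future_members)
print("future simulated years per model:", future_years)
```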

Page 17

Four-dimensional Variational Ocean ReAnalysis for the Western North Pacific over 30 years (FORA-WNP30)

• First-ever dataset covering the western North Pacific (15-65N, 117E-160W) over the last three decades (1982-2014) at eddy-resolving (about 10 km) resolution.
• Produced by the four-dimensional variational ocean data assimilation system MOVE-4DVAR, developed at JMA/MRI (Usui et al., 2015).
- 0.1° latitude × 0.1° longitude
- 54 vertical layers (0-6300 m depth)
- Temperature and salinity profiles above 1500 m depth, gridded sea surface temperature (SST) and satellite altimeter-derived sea surface height (SSH) are assimilated
• Provides accurate estimation of ocean environmental changes around Japan.
• Available as a basic ocean reanalysis dataset for oceanography, climatology, meteorology, and fisheries.

http://synthesis.jamstec.go.jp/FORA/e/index.html
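The domain and resolution above give a rough sense of the grid size. A back-of-the-envelope Python sketch, assuming a regular 0.1° grid over the full stated domain with no land mask; the actual FORA-WNP30 grid may differ at the boundaries, so the result is an estimate only:

```python
CELLS_PER_DEGREE = 10           # 0.1 degree spacing
LAT_SOUTH, LAT_NORTH = 15, 65   # degrees north
LON_WEST = 117                  # 117E
LON_EAST = 360 - 160            # 160W expressed in degrees east (200E)
LAYERS = 54                     # vertical layers

n_lat = (LAT_NORTH - LAT_SOUTH) * CELLS_PER_DEGREE
n_lon = (LON_EAST - LON_WEST) * CELLS_PER_DEGREE
cells = n_lat * n_lon * LAYERS

print(f"{n_lat} x {n_lon} x {LAYERS} = {cells:,} grid cells (upper bound, ocean and land)")
```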

Page 18

Advancement of prediction of tropical cyclone (TC) generation by deep learning

• (Currently) Prediction two days before TC generation, by real-time simulation and satellite observation data. Example: Yamaguchi et al., 2016
• (Target) Prediction 3 to 7 days before TC generation, by machine learning
- Learning feature quantities of the “egg” of TCs from simulation data
- Applying them to observation data
• In total, more than 10,000,000 images and 2,535 typhoon tracking data generated by NICAM simulation
• Image classes: egg of TCs, TCs, ex-TCs, not TCs

Matsuoka et al., 7th International Workshop on Climate Informatics, Boulder, USA, Sep 2017

Page 19

Next “SC” system: complements ES and meets growing needs

Earth Simulator [High-end Supercomputer]
Custom CPU, high memory bandwidth, high vector performance, proprietary codes

Next “SC” system [Standard Linux Cluster]
Standard CPU, considerable computing loads, commercial APs, community codes

Interoperability (SIMULATION, ANALYSIS): job, file, accounting, etc.

Growing & emerging needs
• Enhance existing functions: pre and post processing, data analysis, hosting community APs
• Flexible enough to meet the emerging and growing needs from various scientific approaches
• Capable of hosting project servers in the future

Used for:
• Community codes on standard Linux clusters
• Programs with integer and logical operations, large memory space or high I/O performance
• Bioinformatics
• Earth informatics
• Big data analysis
• Real-time computing for observation and ship operation
• Industry applications

Integrated use with Earth Simulator (examples):
• Run regional model downscaling from a global model on ES
• Run short-term forecasts while ES runs seasonal forecasts

Page 20

JAMSTEC Future Platform: From Data, Create and Supply Information

[Diagram labels]
• Layers: Data Management Layer, Computing Layer, User/Service Layer
• Components: ES/SC Administration Server, Cyber Manager, Data Analysis, Data Management Server, Database Server, Grand Challenge Simulation, Meta D/B (data directory), Mass Storage System, Frontend Servers, Visualization
• Functions: user I/F for both information and operation, autonomous data search and retrieval, interface to Web Service
• Connectivity: Backbone Network 100GbE, SINET5, Network Service, Internet

Page 21

JAMSTEC Future Platform: From Data, Create and Supply Information

[Diagram labels]
• Layers: Data Management Layer, Computing Layer, User/Service Layer
• External parties: Research Institutes, Users, Industry, Administration
• Data sources: Smartphones, Sensors (images, locations)
• Edge/Cloud Computing Platform: Edge Server, Data
• Connectivity: Internet, Various networks, Network Service

Page 22

JAMSTEC Future Platform: From Data, Create and Supply Information

[Diagram labels]
• Layers: Data Management Layer, Computing Layer, User/Service Layer
• Components: ES/SC Administration Server, Data Analysis, Grand Challenge Simulation, Visualization
• External parties: Research Institutes, Users, Industry, Administration
• Data sources: Smartphones, Sensors (images, locations)
• Edge/Cloud Computing Platform: Edge Server, Data
• Connectivity: Backbone Network 100GbE, Internet, Various networks, Network Service

Page 23

JAMSTEC Future Platform: From Data, Create and Supply Information

[Diagram labels]
• Layers: Data Management Layer, Computing Layer, User/Service Layer
• Activities: Search, Remote Observation, Data handling, Simulation, Analysis
• External parties: Research Institutes, Users, Industry, Administration
• Data sources: Smartphones, Sensors (images, locations)
• Edge/Cloud Computing Platform: Edge Server, Data
• Connectivity: Backbone Network 100GbE, SINET5, leased line, Internet, Various networks, Network Service

Page 24

JAMSTEC Future Platform: From Data, Create and Supply Information

[Diagram labels]
• Layers: Data Management Layer, Computing Layer, User/Service Layer
• Components: Cyber Manager, Meta D/B (data directory), Data Management Server, Database Server, Mass Storage System, Frontend Servers
• Functions: user I/F for both information and operation, autonomous data search and retrieval, interface to Web Service
• Activities: Search, Remote Observation, Data handling, Simulation, Analysis
• External parties: Research Institutes, Users, Industry, Administration
• Data sources: Smartphones, Sensors (images, locations)
• Edge/Cloud Computing Platform: Edge Server, Data
• Connectivity: Backbone Network 100GbE, SINET5, leased line, Internet, Various networks, Network Service

Page 25

JAMSTEC Future Platform: From Data, Create and Supply Information

(1) Curation: curate >200 TB of deep sea image data
(2) Data Management: process and manage observation data from >100 research cruises per year
(3) Integration of observation and simulation: create new data sets by integrating observation and simulation data
(4) HPC Technology: PFLOPS-class supercomputers for large-scale simulation, data analysis and machine learning

Diagram labels: Visualization, Large data handling, Various Database, Collaboration, Large Simulation, Data Analysis, Information

Page 27

JAMSTEC Future Platform: From Data, Create and Supply Information

[Diagram labels]
• Layers: Data Management Layer, Computing Layer, User/Service Layer
• Themes: (1) Curation, (2) Data Management, (3) Integration of observation and simulation, (4) HPC Technology
• Components: ES/SC Administration Server, Cyber Manager, Data Analysis, Data Management Server, Database Server, Grand Challenge Simulation, Meta D/B (data directory), Mass Storage System, Frontend Servers, Visualization
• Functions: user I/F for both information and operation, autonomous data search and retrieval, interface to Web Service
• Technologies: visualization, information processing, large-scale data compression technology, collaboration
• Activities: Remote Observation, Data handling, Simulation, Analysis, Information
• External parties: Research Institutes, Users, Industry, Administration
• Data sources: Smartphones, Sensors (images, locations)
• Edge/Cloud Computing Platform: Edge Server, Data
• Connectivity: Backbone Network 100GbE, SINET5, leased line, Internet, Various networks, Network Service

Page 28

Summary

• “Earth Simulator”, once the flagship computer, has been renewed and provides major computing resources for earth science in Japan.

• CEIST, the Center for Earth Information Science and Technology at JAMSTEC, is widening its goals, focusing on data and information.

• The future platform will be an integrated and interoperable system consisting of diverse layers.

Page 29

Thank you for listening!