
Part 1 ▶ Harnessing the Fastest Computing Speed in the World
Part 2 ▶ What “K computer” Offers to the World and to the Next Generation
Part 3 ▶ Science and Technology for the Benefit of Society

A Supercomputer to Construct a New Social Paradigm

CrossMedia Special日経コンピュータ ITpro

This offprint contains a slightly edited version of an article that appeared in the Nikkei Computer edition of January 5, 2012.

Nikkei Business Publications, Inc. All Rights Reserved. Not for Sale

AA0202

The “K computer”


“There is no such thing as too much computing power.” In some fields of research, pushing the limits of computing power directly affects the outcome of scientific efforts. As of November 2011, the supercomputer named “K” (after the Japanese word kei for 10 quadrillion) has reached 10.51 petaflops, the highest ranked performance in the world, making it an extremely valuable tool for scientists in many fields. Early-access use has already begun and is producing concrete results, with fully shared operation slated to begin in November 2012. The Ministry of Education, Culture, Sports, Science and Technology (MEXT), the agency behind the development of the “K computer,” has identified five strategic areas expected to benefit in particular from access to this level of computing power. Representative examples of the expected benefits to society are analytic projections of the effects of earthquakes and tsunami, the discovery of drugs for cancer treatment, and the development of new materials for faster semiconductors.

Part 1 ▶ Harnessing the Fastest Computing Speed in the World
Bringing Tsunami Prediction, New Drug Discovery, and Next-Generation Materials Closer to Reality

Predicting tsunami effects in addition to earthquakes
One example of how the computing power of a supercomputer can directly influence daily life is earthquake and tsunami simulation. When such simulations are used as the basis for disaster prevention and mitigation measures, it becomes possible to “quickly calculate the expected seismic motion and tsunami height at the first onset of an earthquake, so that the situation can be assessed with minimum delay,” says Takuto Maeda (Photo 1), Project Research Associate at the Earthquake Research Institute, the University of Tokyo.

By analyzing the values obtained through earthquake observations, scientists can better understand the internal structure of the earth. The composition of the earth differs from place to place, as does the speed at which seismic energy propagates. Using such knowledge of geological structure, a supercomputer can model an earthquake occurring at a given location and assess its effects.

Figure 1 shows an example in which the tsunami triggered by the March 2011 earthquake off the Pacific coast of the Tohoku region was calculated and visualized on the “Earth Simulator” supercomputer operated by the Japan Agency for Marine-Earth Science and Technology. Using data such as the epicenter location and the energy of the quake, the simulation produced results that closely match the actual event.

Reproducing a tsunami together with an earthquake in this way involves an enormous amount of computation, because the target region must be divided into a mesh of very fine cells. Since an earthquake affects a wide and deep area, the space to be calculated is very large: in the Tohoku example it measures 1,200 km (north-south) x 800 km (east-west) x 200 km (depth). By contrast, the sea is only about 10 kilometers deep even at the Japan Trench, so simulating a tsunami requires depth increments of at most 250 meters. The analysis of the Tohoku earthquake was carried out with cells of 1,000 x 1,000 x 250 meters, but according to Mr. Maeda, who was in charge of the calculation, even this was insufficient. He explains: “With the ‘K computer’ we can perform a more detailed analysis. The ability to faithfully visualize a tsunami on a fine mesh will be a breakthrough.”

The software used for this purpose is called “Seism3D” and requires high memory transfer speeds, which makes it well suited to the vector-type “Earth Simulator.” A run that took 8 hours on the “Earth Simulator” (using about 1/6 of its total capacity) was completed in only one hour on the “K computer” (using somewhat over 2,000 nodes, only about 1/40 of its total capacity).
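To give a sense of the scale involved, the following back-of-the-envelope sketch (Python) works out the grid size implied by the figures above. The number of wave-field variables stored per cell is an illustrative assumption, not a figure from the article.

# Domain of the Tohoku calculation: 1,200 km x 800 km x 200 km,
# divided into cells of 1,000 m x 1,000 m x 250 m.
domain_m = (1_200_000, 800_000, 200_000)
cell_m = (1_000, 1_000, 250)

nx, ny, nz = (d // c for d, c in zip(domain_m, cell_m))
n_cells = nx * ny * nz
print(nx, ny, nz, f"{n_cells:,} cells")   # 1200 800 800 -> 768,000,000 cells

# Assuming roughly 10 single-precision wave-field variables per cell
# (velocities, stresses, and so on), the working set alone is large:
bytes_per_cell = 10 * 4
print(f"{n_cells * bytes_per_cell / 1e9:.0f} GB")   # ~31 GB, before halo regions or work arrays

This is why memory capacity, memory bandwidth, and total node count matter so much for a wave-propagation code such as Seism3D.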

Supercomputer progress enables IT drug discovery
Another field where supercomputers prove useful is drug development. Since August 2010, the Research Center for Advanced Science and Technology of the University of Tokyo has been using a supercomputer in a program aimed at developing drugs to treat liver cancer and rectal cancer (Figure 2). About ten years are expected to elapse before the drugs reach the market, including the time required for clinical trials. However, by repeating a cycle of supercomputer-based calculation and manufacture/testing, the core formulation can be completed in about three years. The supercomputer used is a cluster with 3,600 CPU cores built from Fujitsu PRIMERGY BX922 S2 blade servers. Its performance of 38.3 teraflops is roughly equal to that of the first-generation “Earth Simulator” of 2002 (40.98 teraflops). Even at this level of performance, it is only possible to work on two drugs at the same time.

When the “K computer” becomes available, its full performance will be about 240 times greater than current levels. As Project Professor Hideaki Fujitani (Photo 2) of the Research Center for Advanced Science and Technology puts it, “the more the computing power of supercomputers increases, the more drugs can be developed.” If a calculation for a drug that currently takes about a month can be completed in several hours, the repeated cycle of calculation and prototyping/testing becomes smoother and development times can be shortened.
Why can pharmaceutical drugs be developed on a supercomputer? Such drugs are compounds designed to inactivate a target protein that is causing an illness. By binding the compound (antibody) to the protein (antigen), the protein's ability to break down or bond with other proteins is inhibited. This can, for example, be used to inhibit the function of cancer cells, thereby preventing recurrence or spread.
The role of the supercomputer is to design the compound so that it best matches the shape of the target protein. Proteins are constantly moving and changing shape in the human body, and simulating this process with molecular dynamics requires a fast supercomputer. The computational model has existed since the 1990s, and the simulation method is well established. But with the computers of five to ten years ago, the simulations could not be run and used for the design of drugs. “The progress made in the last two to three years has finally made it possible to design real drugs,” says Prof. Fujitani. The molecular dynamics application has already been transferred to the “K computer.”
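A quick arithmetic check (Python) of the figures quoted above; the assumption that the roughly 240-fold gain applies directly to a single month-long molecular dynamics job is ours, made only for illustration.

# Scaling the current 3,600-core cluster by the quoted factor of about 240
current_cluster_tflops = 38.3
speedup = 240
print(f"{current_cluster_tflops * speedup / 1000:.1f} petaflops")   # ~9.2 PF, the K computer's class

# A job that now takes about a month would then finish in a few hours
job_days_now = 30
print(f"{job_days_now * 24 / speedup:.1f} hours")                   # 3.0 hours, i.e. "several hours"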

Photo 1 Takuto Maeda, Project Research Associate, Center for Integrated Disaster Information Research, Earthquake Research Institute, the University of Tokyo

Figure 1 Example of Earthquake and Tsunami Simulation: The simulation provides a visualization of the seismic activity and tsunami during the Great East Japan Earthquake

Figure 2 Development Example of a Cancer Treatment Drug: The interaction between an antigen that is part of a cancer cell and the antibody used for treatment is simulated using molecular dynamics, to verify the effect of synthetic antibodies (diagram labels: antigen, antibody)

Photo 2 Hideaki Fujitani, Project Professor, Department of Systems Biology and Medicine, Research Center for Advanced Science and Technology, the University of Tokyo


The 38th edition of the TOP500 list (Table 1), the latest ranking of the world's most powerful supercomputers, was released on November 14, 2011 (U.S. time). Maintaining its ranking from the previous edition

(released in June 2011), the “K computer” at RIKEN occupies the top of the list. The system consists of 864 racks (with a total of 88,128 CPUs) and has achieved an impressive 10.51 petaflops, topping its own previous record for the world's highest processing power. Delivery of all nodes of the “K computer” to RIKEN was completed in August 2011. Operation management software is currently being prepared, with full operation scheduled to start in November 2012. Various functions, such as efficiently dividing the nodes into partitions for scheduling multiple jobs, are also being developed. In January 2012, Fujitsu will start shipping the PRIMEHPC FX10 commercial supercomputer (Photo 1), which represents a further advancement of the technology realized in the “K computer.”
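A small sanity check (Python) of the TOP500 figures above, assuming one SPARC64 VIIIfx CPU per node and the 128-gigaflops peak per CPU given later in Table 2.

cpus = 88_128                      # total CPUs in the 864-rack system
peak_per_cpu_gflops = 128          # SPARC64 VIIIfx theoretical peak
rmax_pflops = 10.51                # measured LINPACK result

rpeak_pflops = cpus * peak_per_cpu_gflops / 1e6
print(f"Rpeak ~ {rpeak_pflops:.2f} PF")                 # ~11.28 PF
print(f"efficiency ~ {rmax_pflops / rpeak_pflops:.1%}") # ~93.2%, matching Table 1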

Selected by the Information Technology Center of the University of Tokyo
Masahiko Yamada (Photo 2), President of Fujitsu's Technical Computing Solutions Unit and the person in charge of promoting the company's global supercomputer business, points out that the biggest appeal of the FX10 is the

existence of the “K computer.” The “K computer” boasts the highest computing performance in the world, and its facilities are likewise unique. It will have an accelerating effect on research in the petaflops range, enabling R&D projects that until now were simply out of reach. Use by industry is also expected. The FX10 is the ideal supercomputer for driving R&D in the petaflops range, and its software compatibility with the “K computer” provides an efficiency

boost for work in this area. The Information Technology Center of the University of Tokyo has already committed to the purchase of a 50-rack system with 4,800 nodes (theoretical performance 1.13 petaflops). This system is scheduled to start operation in April 2012.
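The quoted theoretical performance of the University of Tokyo system can be checked the same way (Python); the assumption of one CPU per node and the per-CPU peak taken from Table 2 is ours.

nodes = 4_800
peak_per_node_gflops = 236.5       # SPARC64 IXfx, 1.848 GHz model (see Table 2)
print(f"{nodes * peak_per_node_gflops / 1e6:.3f} PF")   # ~1.135 PF, matching the quoted 1.13 petaflops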

Reflecting advances in CPU and memory
Another advantage of the FX10 is that it offers further improved semiconductor performance (CPU, memory, and so on) compared to the “K computer.” While retaining the basic construction of the “K” system, including its interconnects and water cooling, the semiconductor technology has been further refined.

Gordon Bell Prize for next-generation semiconductors
Supercomputer-based simulation also plays a large role in the design of new semiconductor devices, which drive progress in computer performance. On November 17, 2011 (U.S. time), a study group comprising members from the RIKEN research institute, the University of Tsukuba, the University of Tokyo, and Fujitsu received the Gordon Bell Prize for Peak Performance, in recognition of a calculation performed on the “K computer” of the properties of silicon nanowire, a promising core material for next-generation semiconductors. The Gordon Bell Prize is a prestigious international award in the field of high-performance computing; the Peak Performance category in particular is awarded for demonstrating the highest performance worldwide with a genuine application program. Professor Atsushi Oshiyama (Photo 3) of the University of Tokyo, who is promoting the project, points out that “from this century onward, science is driven no longer by mathematics but by ‘computics,’ a discipline that integrates computational materials science and computer science.” The application used to calculate the electron states of the nanowire was developed by Project Lecturer Junichi Iwata

(Photo 4) of the University of Tokyo and is called “Real Space Density Functional Theory” (RSDFT). Silicon nanowire is a next-generation semiconductor device (Figure 3). With a conventional transistor, leakage current increases as the size gets smaller. With a silicon nanowire transistor, on the other hand, leakage current is blocked by a gate that surrounds the nanowire carrying the current. Figure 4 shows the atomic structure of a nanowire as calculated by the supercomputer. The important point is that creating prototypes of such next-generation materials takes considerable time. If

the cycle of prototype creation and testing were performed without the help of a supercomputer, it would go on almost endlessly. “Calculation using a supercomputer on the level of ‘K’ is therefore essential” (Prof. Oshiyama). The research for which the Gordon Bell Prize was awarded explored the electron states of a nanowire with approximately 100,000 atoms (20 nanometers in diameter and 6 nanometers long), close to the actual size of the material. The computations on the “K computer” took some two to three weeks. Building on the results of such research, next-generation transistors using silicon nanowire are expected to become commercially available by about 2020.
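The atom counts mentioned above and in Figure 4 can be estimated from the bulk density of silicon (about 5 x 10^22 atoms per cm^3, i.e. roughly 50 atoms per nm^3); this is our own rough check (Python), not a calculation from the article.

import math

ATOMS_PER_NM3 = 50   # approximate atomic density of crystalline silicon

def cylinder_atoms(diameter_nm: float, length_nm: float) -> int:
    """Atoms in a cylindrical silicon nanowire of the given dimensions."""
    volume_nm3 = math.pi * (diameter_nm / 2) ** 2 * length_nm
    return round(volume_nm3 * ATOMS_PER_NM3)

print(cylinder_atoms(5, 10))   # ~9,800  -> "about 10,000 silicon atoms" (Figure 4)
print(cylinder_atoms(20, 6))   # ~94,000 -> "approximately 100,000 atoms" (Gordon Bell run)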

Photo 3 Atsushi Oshiyama, Professor, Department of Applied Physics, School of Engineering, the University of Tokyo

Photo 4 Junichi Iwata, Project Lecturer, Department of Applied Physics, Graduate School of Engineering, Computational Materials Science Initiative (CMSI), the University of Tokyo

Figure 3 Spatial Structure of the Next-Generation Transistor: With a conventional planar transistor, reduction in size leads to an increase in leakage current at the lower part of the substrate. With the silicon nanowire type, the surrounding gate prevents the flow of leakage current.

Figure 4 Atomic Model of Silicon Nanowire: A cylindrical wire with a diameter of 5 nanometers and a length of 10 nanometers contains about 10,000 silicon atoms. The “K computer” can perform computations for a wire with 10 times that size (about 100,000 atoms).

[Figure 3 diagram labels: gate, source, and drain for the conventional planar transistor and for the silicon nanowire type transistor]

Part 2 ▶ What “K computer” Offers to the World and to the Next Generation
Never-Ending Development Creates the New “Next”

Photo 1 PRIMEHPC FX10 further improves on supercomputer technology realized in the “K computer”

Table 1 Top 10 Systems in the TOP500 List (38th Edition, November 2011)

Rank  System Name            System/Processor Configuration  Petaflops  Execution Efficiency (%)
1     “K computer” (Japan)   SPARC64 VIIIfx                  10.51      93.2
2     Tianhe-1A (China)      Xeon + NVIDIA GPU               2.566      54.6
3     Jaguar (USA)           Cray XT5-HE (Opteron)           1.759      75.5
4     Nebulae (China)        Xeon + NVIDIA GPU               1.271      42.6
5     Tsubame 2.0 (Japan)    Xeon + NVIDIA GPU               1.192      52.1
6     Cielo (USA)            Cray XE6 (Opteron)              1.110      81.3
7     Pleiades (USA)         SGI Altix ICE (Xeon)            1.088      82.7
8     Hopper (USA)           Cray XE6 (Opteron)              1.054      81.8
9     Tera-100 (France)      Bull bullx super-node (Xeon)    1.050      83.7
10    Roadrunner (USA)       Opteron + PowerXCell            1.042      75.7


With a view towards use in general data centers, a Cooling Distribution Unit (CDU) has been newly made available as peripheral equipment. Semiconductor technology marches forward day by day, and the FX10 reflects the latest advances (Table 2). The number of cores per CPU has been doubled from 8 to 16, and the performance of a single CPU has jumped from 128 gigaflops to a maximum of 236.5 gigaflops. In the memory system, transfer performance has increased from 64 GB/s to 85 GB/s, and the maximum memory capacity per CPU is now 64 GB, up from 16 GB. To the application user, the FX10 looks very similar to the “K computer”: compiled applications run as is, and changes in the number of nodes used at the same time (the degree of parallelism) can be accommodated by changing compiler options or tuning the source code.

Providing availability to foreign government agencies
The FX10 will not be available only in Japan. As President Yamada stresses, “Many businesses the world over have requirements for seriously fast computing. This includes, for example, flight simulation of a whole aircraft, which is difficult to implement with conventional means, oil field drilling exploration, and other cases where simple petaflops are not enough.” Fujitsu also plans to provide FX10 hardware to overseas government agencies that require very high processing power.

Looking towards the exaflops domain
Fujitsu has begun investigating a next-generation supercomputer whose performance will far exceed that of the “K computer” and the FX10. As its technological basis, the company will first develop its next-generation commercial product, the design of which has already started. The “K computer,” too, was the result of continuous development from the commercial product FX1. The interconnect architecture of the FX1 is the widely used InfiniBand, but it implements in hardware the collective communication and barrier synchronization that often become communication bottlenecks; this became a basis of the “K computer.” Yuji Oinaga (Photo 3), Head of the Next Generation Technical Computing Unit and leader of the development team, offers his outlook on the next-generation supercomputer: “Over the last 50 years, supercomputer

performance has doubled about every 18 months. Following this trend, we expect to reach the exaflops level, 100 times faster than the ‘K computer,’ in 2018 to 2019.” Fujitsu clearly intends to carry its development assets from the “K computer” over to the next generation.

Photo 2 Masahiko Yamada, President of the Technical Computing Solutions Unit, Fujitsu

Photo 3 Yuji Oinaga, President of the Next Generation Technical Computing Unit, Fujitsu

Table 2 Comparison between the “K computer” and PRIMEHPC FX10 (the main improvements in the PRIMEHPC FX10 are in the CPU and memory)

Item                                    “K computer”     PRIMEHPC FX10
CPU name                                SPARC64 VIIIfx   SPARC64 IXfx
Number of cores                         8                16
Semiconductor process technology        45 nm            40 nm
Operating frequency                     2 GHz            1.65 / 1.848 GHz
Theoretical performance (per CPU)       128 gigaflops    211.2 / 236.5 gigaflops
Memory capacity (per CPU)               16 GB            32 / 64 GB
Memory transfer speed                   64 GB/s          85 GB/s
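The per-CPU peak figures in Table 2 follow directly from core count and clock frequency. The sketch below (Python) assumes each SPARC64 core retires 8 double-precision floating-point operations per cycle; that figure is our own assumption, though it is consistent with the numbers given.

def peak_gflops(cores: int, ghz: float, flops_per_cycle_per_core: int = 8) -> float:
    """Theoretical peak of one CPU in gigaflops."""
    return cores * ghz * flops_per_cycle_per_core

print(f"{peak_gflops(8, 2.0):.1f}")      # 128.0 -> "K computer" SPARC64 VIIIfx
print(f"{peak_gflops(16, 1.65):.1f}")    # 211.2 -> PRIMEHPC FX10, 1.65 GHz model
print(f"{peak_gflops(16, 1.848):.1f}")   # 236.5 -> PRIMEHPC FX10, 1.848 GHz model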

There is not a single day when I do not use a personal computer, because a computer is indispensable for both of my favorite pursuits: the study of insects and playing computer games. I am researching a genus of beetle called the elephant beetle, and I use the computer to record data about individual specimens, such as where and by whom they were found, along with photographs and other image data. This is absolutely necessary, since there are several tens of thousands of specimens of the elephant beetle alone. To put the vast amount of data I have amassed into a form useful for research, I am planning to turn it into a database, but so far that database exists only in my head.

As a matter of fact, I have made a valiant attempt to organize some of my elephant beetle data using Excel, but I came to realize that the enormous effort required might take me until the end of my days. Not only is it an extremely complex task to correlate photos of actual insects with various other data in a form a computer can handle, but the sheer difficulty and unwieldiness of data input also precludes smooth progress. The case of insects is only one example showing that the most important considerations when harnessing a computer for a task are data standards and input methods. Can the task really be handled properly on a computer? That depends on the qualities of the data to be

processed. The same can be said of a supercomputer: its usefulness will differ depending on the purpose. In cases where data input is not a problem, in other words for applications that lend themselves to computer processing, it is bound to prove an extremely valuable tool.

The “informatization” of the real world
The enormous amount of data generated by the real world has to be input in some form and processed by computers. This need exists practically everywhere. But when it comes to extracting some kind of universal theory from the mountains of data, some cultures are more adept at it than others.

Part 3 ▶ Science and Technology for the Benefit of Society
What Can We See With the World's Best Microscope?

RIKEN and Fujitsu are jointly engaged in developing the supercomputer “K computer.” As related in Part 2, in November 2011 it was ranked the world's fastest for the second time in a row. With shared use scheduled to begin in November 2012, great expectations are being placed on the results to be achieved with this system. We asked Dr. Takeshi Yoro, an anatomist who is also well known as a social commentator and philosopher of science, for his views on the use of the “K computer.”

Takeshi Yoro
Born in 1937. Graduated from the University of Tokyo Graduate School of Medicine in 1962 and joined the Department of Anatomy. Retired from his post as professor at the University of Tokyo Graduate School of Medicine in 1995, then served as professor at Kitasato University. Currently professor emeritus of the University of Tokyo. Active not only in anatomy and the philosophy of science but also as a social and literary arts commentator. Published works include Karada no mikata [Looking at the Body] (Chikuma Shobo), Yuinoron [Brain Theory] (Seidosha), Ningen kagaku [Human Science] (Chikuma Shobo), Baka no kabe [The Wall of Fools] (Shinchosha; received the Mainichi Publishing Culture Award), Shi no kabe [Wall of Death] (Shinchosha), and Dejitaru konchu zukan [Digital Insect Encyclopedia] (Nikkei Business Publications), among others.


The Englishman Charles Darwin collected data about all sorts of creatures during his voyage on the Beagle and then used these data to develop a universal idea, namely the theory of evolution. This, of course, was long before the age of computers. Generally speaking, Japanese people have trouble arriving at universal constructs from large amounts of data; I personally am not very good at it either. Why is that so? I think it may be due to the strong emphasis that Japanese culture places on subjective feelings. Japanese have a definite knack for isolating certain aspects of the real world and dealing with them intuitively in concrete terms. But when it comes to collecting all these data from the real world in computers and using them to build abstract theories, they tend to fail rather miserably. While they are good at building the computer hardware, they are less successful at populating it with actual data and using it on a more conceptual level.
Why would that be? I believe it is because Japanese value sensibility, while people from Western cultures value structure. This can be seen, for example, in cinema: Japanese films are often centered on subjective emotions, while Western films tend to be more structured. I saw “Wild Strawberries” by the Swedish director Ingmar Bergman when I was a graduate student, and at the time I was totally baffled by it. But when I watched it again in middle age, it was a very different experience. It simply depicts a day on which an aged physician travels to his university to receive an honorary degree; yet this single day and the protagonist's whole life unfold in the film as a parallel construct. When I first saw the film as a young man, I did not notice this construct and did not understand what makes the film so interesting.
Such differences in society and culture can also be observed in language. The fact that the Japanese language has no relative clauses (clauses introduced by a relative pronoun such as “that” or “which”) is significant here, I believe. A statement in a Western language uses certain conjunctions (such as “therefore”) to link multiple main clauses, and others (such as “although” and “because”) to link a main clause and a subordinate clause. Japanese does not have these constructs; a Japanese statement often consists simply of a string of clauses linked by conjunctions that do not specify the relationship between them. I think this clearly points to a difference in how Western people and Japanese people use their brains.

The supercomputer as a microscope
Manually entering data about the real world is a difficult undertaking. But once the data have been input, computers are extremely useful tools for arranging them and building logical systems. This usefulness is taken even further when

sensors are employed to mechanically supply values derived from observation of the real world. When I was a student at university, I studied the structure of proteins through X-ray analysis. At the time, I thought, “combining X-ray analysis with a computer creates a microscope.” While the X-ray analysis data by themselves bear no relationship to the shape of molecules, processing them in a computer can make the three-dimensional structure of the molecules visible. The computer can become an ultra-high-resolution microscope, and a telescope as well. The same applies to CT

(Computed Tomography) and MRI (Magnetic Resonance Imaging). Raw CT and MRI data are simply a collection of numbers; just viewing these figures would tell the doctor nothing. But by converting the figures into images, the computer becomes a tool that supplies concrete visual information, making it highly useful for diagnosis and treatment. Like CT and MRI, a supercomputer also has the potential for tremendous usefulness if used properly. Simulation calculations run on a supercomputer drastically enhance the capability of computers to function as microscopes and telescopes. Earthquake simulations are, in principle, similar to viewing the earth with CT and MRI. But there are also simulations that do not require observation data collected from the real world. Although it has not actually been calculated, the number of amino acid types is only somewhat more than ten, which should make it theoretically possible to create all possible proteins in the computer.

Using the world's best tool
If we consider the “K computer” to be the world's most powerful microscope or telescope, what will we be able to see with it? Since it is a tool unlike any other, it should show us things more clearly than ever before. However, seeing something too well can also bring problems, making it harder to take in the whole picture: we might not be able to see the proverbial forest for the trees. Of course, the scientists and researchers who work with supercomputers are surely well aware of this. When the ability to see is enhanced, things that were not visible before come into focus, and new results are obtained. This is sure to happen, and the pursuit of tools to see even better is bound to continue. But whether a clearer view can contribute to the betterment of society depends, in the end, on what is being observed and which universal concepts can be drawn from it. We need to keep this in mind. (Transcribed from an interview)