Automatic Generation of Efficient Codes from Mathematical Descriptions of Stencil Computation
Takayuki Muranushi1 Seiya Nishizawa1 Hirofumi Tomita1
Keigo Nitadori1 Masaki Iwasawa1 Yutaka Maruyama1
Hisashi Yashiro1 Yoshifumi Nakamura1 Hideyuki Hotta2
Junichiro Makino3 Natsuki Hosono4 Hikaru Inoue5
1 RIKEN Advanced Institute for Computational Science, 2 Chiba University, 3 Kobe University, 4 Kyoto University, 5 Fujitsu Ltd.
Sep 22, 2016
for FHPC 2016 workshop / ICFP’16 Nara, Japan
T. Muranushi et al. (RIKEN AICS) Formura Sep 22, 2016 1 / 37
Programming Language
Formura
Programming language Formura
Domain-specific language for stencil computation
Good news of Formura 1/2
1.184 Petaflops (11.62% of the peak)
on 663,552 cores
ACM Gordon Bell Prize Finalist
Good news of Formura 2/2
$$\frac{\partial \rho}{\partial t} = - \sum_{i=1}^{3} \frac{\partial}{\partial x_i} (\rho v_i)$$
ddt_ρ = - Σ fun(i) ∂ i (ρ * v i)
Formura
is a functional programming language
is implemented in a functional programming
language (Haskell)
Backend: How we generate efficient codes
Stencil Computation
Byte/Flops (B/F) ratios of hardware are decreasing
Naive implementation of stencil computation
The optimal
$$\frac{B}{F} = \frac{2 H_e}{C_e}$$
Temporal Blocking
The optimal
$$\frac{B}{F} = \frac{2 H_e}{C_e} \left( \frac{1}{N_F} + \frac{2 d N_s}{N_T} \right)$$
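As a rough illustration of the idea, the Python sketch below compares naive time stepping with one simple variety of temporal blocking (overlapped tiles with redundant halo computation) on a 1D three-point stencil. The function names, the stencil, the tile width and the periodic boundary are assumptions made for this example; the slides do not say which blocking scheme Formura generates.

```python
def step(u):
    """One naive time step of a periodic 1D three-point stencil."""
    n = len(u)
    return [u[(i - 1) % n] + u[i] + u[(i + 1) % n] for i in range(n)]


def run_naive(u, n_steps):
    """Sweep the whole array through memory once per time step."""
    for _ in range(n_steps):
        u = step(u)
    return u


def run_blocked(u, n_steps, width):
    """Advance each tile by n_steps at once, using a halo of n_steps cells."""
    n = len(u)
    out = [0] * n
    for start in range(0, n, width):
        end = min(start + width, n)
        # Load the tile plus a halo wide enough for n_steps local updates.
        tile = [u[i % n] for i in range(start - n_steps, end + n_steps)]
        for _ in range(n_steps):
            # Each local step consumes one halo cell on each side.
            tile = [tile[j - 1] + tile[j] + tile[j + 1]
                    for j in range(1, len(tile) - 1)]
        out[start:end] = tile
    return out


u0 = [1, 4, 2, 8, 5, 7, 3, 6, 9, 0]
same = run_blocked(u0, 3, 4) == run_naive(u0, 3)
```

The blocked version reads each tile from the full array once per `n_steps` updates instead of once per update, at the cost of recomputing the halo cells.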
Decompose & fuse array computations in space-time
manifest :: a[i]
b[i] = a[i-1] + a[i] + a[i+1]
manifest :: c[i] = b[i-1] * b[i] * b[i+1]
d[i] = c[i-1] + c[i] + c[i+1]
manifest :: e[i] = d[i-1] * d[i] * d[i+1]
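One possible reading of this slide, sketched in Python: arrays marked `manifest` are stored in memory, while the unmarked ones (`b`, `d`) are fused into their consumers and recomputed on demand. That interpretation, the periodic boundary and the function names are assumptions of this sketch, not statements about Formura's generated code.

```python
def chain_all_stored(a):
    """Store every intermediate array (b, c, d, e) in memory."""
    n = len(a)
    b = [a[(i - 1) % n] + a[i] + a[(i + 1) % n] for i in range(n)]
    c = [b[(i - 1) % n] * b[i] * b[(i + 1) % n] for i in range(n)]
    d = [c[(i - 1) % n] + c[i] + c[(i + 1) % n] for i in range(n)]
    e = [d[(i - 1) % n] * d[i] * d[(i + 1) % n] for i in range(n)]
    return e


def chain_fused(a):
    """Store only c and e; b and d are evaluated where they are used."""
    n = len(a)

    def b(i):
        return a[(i - 1) % n] + a[i % n] + a[(i + 1) % n]

    c = [b(i - 1) * b(i) * b(i + 1) for i in range(n)]

    def d(i):
        return c[(i - 1) % n] + c[i % n] + c[(i + 1) % n]

    e = [d(i - 1) * d(i) * d(i + 1) for i in range(n)]
    return e


sample = [1, 2, 3, 4, 5]
same = chain_fused(sample) == chain_all_stored(sample)
```

Fusing trades extra arithmetic (each `b(i)` is evaluated up to three times) for fewer arrays written to and read from memory.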
In which language shall we code?
Paraiso : a DSL embedded in Haskell (Muranushi, 2012)
among Nikola (Mainland & Morrisett, 2010), Obsidian (Svensson, 2011), Accelerate (Chakravarty et al., 2011), SPOC (Bourgoin et al., 2012), NOVA (Collins et al., 2014), and LMS series (Rompf, 2012).
Paraiso: a bad sell
Our team
Formura : a standalone DSL
Design principle of Formura
Simple enough
Rich enough
Syntax of Formura
# dimension declaration
dimension :: 3
# array declaration
double [] :: vx, vy, vz
# array computation
A2[i,j,k] = A[i-1] + A[i+1]
# Tuple
v = (vx, vy, vz)
# Lambda expression
triple = fun (x) 3 * x
Tuples are functions
(a,b) 1 = b
(f,(h,p,c)) 1 2 = c
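The two rules above can be mimicked in Python by treating tuple application as indexing with 0-based positions. The helper name `apply` is hypothetical and exists only for this illustration.

```python
def apply(value, *indices):
    """Apply a (possibly nested) tuple to a sequence of 0-based indices."""
    for index in indices:
        value = value[index]
    return value


first_rule = apply(("a", "b"), 1)
second_rule = apply(("f", ("h", "p", "c")), 1, 2)
```

Here `first_rule` picks the second component of the pair, and `second_rule` first picks the inner tuple and then its third component, matching the two rules on the slide.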
Inferred promotion to tuples and functions
x + (a,b) = (x+a,x+b)
(x,y) + (a,b) = (x+a,y+b)
(x,y) + (a,b,c) = ?
(f + g) x = f x + g x
(f + g + 1) x = f x + g x + 1
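A minimal Python sketch of these promotion rules for `+` follows. The function name `add` is hypothetical, and rejecting tuples of different lengths with a `TypeError` is an assumption of this sketch; the slide leaves that case as an open question.

```python
def add(x, y):
    """Add two values, promoting scalars to tuples and to functions."""
    if callable(x) or callable(y):
        # Promote a non-function operand to a constant function.
        f = x if callable(x) else (lambda _arg: x)
        g = y if callable(y) else (lambda _arg: y)
        return lambda arg: add(f(arg), g(arg))
    if isinstance(x, tuple) and isinstance(y, tuple):
        if len(x) != len(y):
            # Assumed behaviour: mismatched tuple shapes are an error.
            raise TypeError("tuple lengths differ")
        return tuple(add(p, q) for p, q in zip(x, y))
    if isinstance(x, tuple):
        return tuple(add(p, y) for p in x)
    if isinstance(y, tuple):
        return tuple(add(x, q) for q in y)
    return x + y


def double(x):
    return 2 * x


def square(x):
    return x * x


scalar_plus_tuple = add(1, (10, 20))
tuple_plus_tuple = add((1, 2), (10, 20))
function_sum = add(add(double, square), 1)
```

With these definitions, `function_sum(3)` evaluates `double(3) + square(3) + 1`, which mirrors the last rule on the slide.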
rk4 = fun(ddt) \
  fun(sys_0) let \
    sys_q4 = sys_0 + dt/4 * ddt(sys_0)
    sys_q3 = sys_0 + dt/3 * ddt(sys_q4)
    sys_q2 = sys_0 + dt/2 * ddt(sys_q3)
    sys_next = sys_0 + dt * ddt(sys_q2)
  in sys_next
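The same four-stage scheme can be written as a Python higher-order function. This sketch uses a scalar state and passes `dt` explicitly, both of which are assumptions; in Formura the state would typically be a tuple of arrays handled by the promotion rules.

```python
import math


def rk4(ddt, dt):
    """Return a function that advances a state by one step of size dt."""
    def advance(sys_0):
        sys_q4 = sys_0 + dt / 4 * ddt(sys_0)
        sys_q3 = sys_0 + dt / 3 * ddt(sys_q4)
        sys_q2 = sys_0 + dt / 2 * ddt(sys_q3)
        sys_next = sys_0 + dt * ddt(sys_q2)
        return sys_next
    return advance


# Exponential decay dy/dt = -y, advanced by one step from y = 1.
approx = rk4(lambda y: -y, 0.1)(1.0)
exact = math.exp(-0.1)
```

For a linear right-hand side such as this one, the nested form reproduces the Taylor expansion of the exponential through fourth order in `dt`, so `approx` and `exact` agree closely.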
Differentiation Operators
ddx = fun(a) (a[i+1/2,j,k] - a[i-1/2,j,k])/dx
ddy = fun(a) (a[i,j+1/2,k] - a[i,j-1/2,k])/dy
ddz = fun(a) (a[i,j,k+1/2] - a[i,j,k-1/2])/dz
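To make the half-integer offsets concrete, here is a small Python sketch of `ddx` in which a field is modelled as a function of grid indices and exact half steps come from `fractions.Fraction`. Modelling fields as functions, and passing `dx` as an argument, are choices made for this illustration only.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def ddx(a, dx):
    """Central difference in x, sampled half a cell to either side."""
    return lambda i, j, k: (a(i + HALF, j, k) - a(i - HALF, j, k)) / dx


def squared_x(i, j, k):
    """Example field: a[i,j,k] = i^2."""
    return i * i


first_derivative = ddx(squared_x, 1)
second_derivative = ddx(first_derivative, 1)
```

Applying `ddx` once moves the result onto the staggered (half-integer) grid; applying it twice returns to integer indices and yields the familiar three-point second difference.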
Nabla and Summation
∂ = (ddx, ddy, ddz)
Σ = fun (e) e 0 + e 1 + e 2
Evaluation of formura expression
Σ fun(i) ∂ i (ρ * v i)
Σ = fun (e) e 0 + e 1 + e 2
→ (fun(i) ∂ i (ρ * v i)) 0
+ (fun(i) ∂ i (ρ * v i)) 1
+ (fun(i) ∂ i (ρ * v i)) 2
(fun(i) ∂ i (ρ * v i)) 0
→ ∂ 0 (ρ * v 0)
∂ = (ddx, ddy, ddz)
v = (vx, vy, vz)
(a,b,c) 0 = a
→ ddx (ρ * vx)
ddx (ρ * vx)
ddx = fun(a) (a[i+1/2,j,k] - a[i-1/2,j,k])/dx
→ ((ρ * vx)[i+1/2,j,k] -
   (ρ * vx)[i-1/2,j,k])/dx
→ (ρ[i+1/2,j,k] * vx[i+1/2,j,k] -
   ρ[i-1/2,j,k] * vx[i-1/2,j,k])/dx
$$\sum_{i=1}^{3} \frac{\partial}{\partial x_i} (\rho v_i)$$
Σ fun(i) ∂ i (ρ * v i)
→ (ρ[i+1/2,j,k] * vx[i+1/2,j,k] -
   ρ[i-1/2,j,k] * vx[i-1/2,j,k])/dx +
  (ρ[i,j+1/2,k] * vy[i,j+1/2,k] -
   ρ[i,j-1/2,k] * vy[i,j-1/2,k])/dy +
  (ρ[i,j,k+1/2] * vz[i,j,k+1/2] -
   ρ[i,j,k-1/2] * vz[i,j,k-1/2])/dz
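The whole evaluation chain can be checked with a small Python model that builds the expression compositionally from a tuple of difference operators and a summation function, then compares it with the hand-expanded stencil. The grid spacings, the test fields and all helper names are assumptions of this sketch, and fields are again modelled as functions of grid indices.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
DX, DY, DZ = Fraction(1, 2), Fraction(1, 3), Fraction(1, 5)  # assumed spacings


def ddx(a):
    return lambda i, j, k: (a(i + HALF, j, k) - a(i - HALF, j, k)) / DX


def ddy(a):
    return lambda i, j, k: (a(i, j + HALF, k) - a(i, j - HALF, k)) / DY


def ddz(a):
    return lambda i, j, k: (a(i, j, k + HALF) - a(i, j, k - HALF)) / DZ


def times(f, g):
    """Pointwise product of two fields (the promoted `*`)."""
    return lambda i, j, k: f(i, j, k) * g(i, j, k)


def sigma(e):
    """Sum e 0 + e 1 + e 2, with `+` promoted to fields."""
    return lambda i, j, k: e(0)(i, j, k) + e(1)(i, j, k) + e(2)(i, j, k)


# Arbitrary polynomial test fields (assumptions for this example).
def rho(i, j, k):
    return 1 + i * i + j + k


def vx(i, j, k):
    return i * j


def vy(i, j, k):
    return j * k + 2


def vz(i, j, k):
    return k * k - i


nabla = (ddx, ddy, ddz)
v = (vx, vy, vz)

# Compositional form: sum over axes of nabla[axis] applied to rho * v[axis].
divergence = sigma(lambda axis: nabla[axis](times(rho, v[axis])))


def expanded(i, j, k):
    """The fully expanded stencil expression, written out by hand."""
    return (
        (rho(i + HALF, j, k) * vx(i + HALF, j, k)
         - rho(i - HALF, j, k) * vx(i - HALF, j, k)) / DX
        + (rho(i, j + HALF, k) * vy(i, j + HALF, k)
           - rho(i, j - HALF, k) * vy(i, j - HALF, k)) / DY
        + (rho(i, j, k + HALF) * vz(i, j, k + HALF)
           - rho(i, j, k - HALF) * vz(i, j, k - HALF)) / DZ
    )


agrees = divergence(2, 3, 4) == expanded(2, 3, 4)
```

Because all arithmetic uses exact fractions, the comparison is an equality check rather than a floating-point tolerance test.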
More to talk about
Modular Reifiable Matching (MRM) (Oliveira et al., 2015) + pattern synonyms solve the “expression problem”
Details of code transformation paths
Varieties of temporal blocking methods
How we have given proofs for certain types of temporal blocking methods
Conclusion
Functional programming
is a good choice for user interface
→ weather scientists and astronomers can use it
is crucial in implementing all the program transformations
→ achieves high performance
1.184 Pflops
Formura
Bibliography
Bourgoin, M., Chailloux, E., & Lamotte, J.-L. 2012, Parallel Processing Letters, 22, 1240007
Chakravarty, M. M., Keller, G., Lee, S., McDonell, T. L., & Grover, V. 2011, in Proceedings of the sixth workshop on Declarative aspects of multicore programming, ACM, 3–14
Collins, A., Grewe, D., Grover, V., Lee, S., & Susnea, A. 2014, in Proceedings of ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming, ACM, 8
Mainland, G., & Morrisett, G. 2010, ACM, 67–78
Oliveira, B. C. d. S., Mu, S.-C., & You, S.-H. 2015, in Proceedings of the 8th ACM SIGPLAN Symposium on Haskell, ACM, 82–93
Rompf, T. 2012, PhD thesis, École Polytechnique Fédérale de Lausanne
Svensson, J. 2011, PhD thesis, Chalmers University of Technology