Preparando o SPRACEpara a era do HL-LHC
SPRACE
T H I A G O TO M E I , R O G E R I O I O P E , J A D I R M A R R A , M A R C I O C O S TA
High-Luminosity LHC Schedule
2
SPRACE Computing Power Evolution
2000 2004 2009 2014 2020 2024
210
310
410
510
Pled
ged
Com
putin
g Po
wer
(HS0
6)
2000 2005 2010 2015 2020 2025Year
3
CMS HL-LHC Resource Projections
4
CHF/HS06 Future Projections
2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 2026 2027 2028 2029
Last 5 year average improvement factor = 1.25
1
10
210
CH
F / H
S06
Past evolution
Conservative projection
Pessimistic projection
1.331.21
1.08 1.05 1.14
0.77
1.071.69
1.62
1.15
1.20
v8 Jan 2021Price / performance evolution of installed CPU servers (CERN)
SSD®HDD
120% RAM price increase
3GB/core memory®2GB
INTEL-AMD price war, low RAM prices
AMD market push
Improvement factor/year
5
Machine Learning in HEP: Jet GenerationGenerative Adversarial Networks (GANs)
q Generator + discriminator in tandem
Hadronic jetsq As calorimeter images
§ Convolutional NNs§ But: sparse and unordered
q As point-cloud-style datasets§ Graph NNs§ But: physics features and
variable particle number
Message Passing GAN (MPGAN)q Outperforms point cloud GANsq https://arxiv.org/abs/2106.11535
6
Machine Learning in HEP: Jet GenerationGenerative Adversarial Networks (GANs)
q Generator + discriminator in tandem
Hadronic jetsq As calorimeter images
§ Convolutional NNs§ But: sparse and unordered
q As point-cloud-style datasets§ Graph NNs§ But: physics features and
variable particle number
Message Passing GAN (MPGAN)q Outperforms point cloud GANsq https://arxiv.org/abs/2106.11535
0.4- 0.3- 0.2- 0.1- 0 0.1 0.2 0.3 0.4relh
0.4-
0.3-
0.2-
0.1-
0
0.1
0.2
0.3
0.4
rel
f
2-10
1-10
1
7
Machine Learning in HEP: Jet GenerationGenerative Adversarial Networks (GANs)
q Generator + discriminator in tandem
Hadronic jetsq As calorimeter images
§ Convolutional NNs§ But: sparse and unordered
q As point-cloud-style datasets§ Graph NNs§ But: physics features and
variable particle number
Message Passing GAN (MPGAN)q Outperforms point cloud GANsq https://arxiv.org/abs/2106.11535
0.05-0
0.05
relh0.1-
0.05-
0
0.05
relf
1-10
T re
lp
8
Machine Learning in HEP: Jet GenerationGenerative Adversarial Networks (GANs)
q Generator + discriminator in tandem
Hadronic jetsq As calorimeter images
§ Convolutional NNs§ But: sparse and unordered
q As point-cloud-style datasets§ Graph NNs§ But: physics features and
variable particle number
Message Passing GAN (MPGAN)q Outperforms point cloud GANsq https://arxiv.org/abs/2106.11535
Generator
fe
fn
…
{Initial Noise/Intermediate Features
h1
h2
{Final Featuresη φ pT 1
FC Layer…
Initial Noise
…1
1
1
0
mask
Number of particles randomly samples from real distribution N = ∑ masks
Masks assigned to first points, sorted in point
feature space
N
Message Passing
Generated Particle Cloud
FC LayerAvePool
…
{Initial Featuresη φ pT 1
fe
fn
…
{Initial Features
Discriminator
Real Particle Cloud
Generated Particle Cloud
Real or Generated
Message Passing
9
Machine Learning in HEP: TrackingOne of the most challenging HL-
LHC computing tasksq Exponential complexity with nhits
q Track-ML challenge in 2018
Joining efforts with Exa.TrkXq Geometric Deep Learning
§ Metric learning§ (Attention) Graph NNs
q https://arxiv.org/abs/2103.0699510
Machine Learning in HEP: TrackingOne of the most challenging HL-
LHC computing tasksq Exponential complexity with nhits
q Track-ML challenge in 2018
Joining efforts with Exa.TrkXq Geometric Deep Learning
§ Metric learning§ (Attention) Graph NNs
q https://arxiv.org/abs/2103.0699511
Heterogeneous ComputingPrototype: Phase-2 CMS HLTq https://cds.cern.ch/record/2766559q 37.3 MHS06 farm (PU = 200)q Process events at 750 kHzq High efficiency, low output rate
General-purpose GPUsq Cheaper CHF/HS06q Structure of Arraysq Vectorized loops
CMSSW with GPUsq Run-3: pixel tracks, calo local recoq Phase-2: HGCAL reconstruction
12
Heterogeneous ComputingPrototype: Phase-2 CMS HLTq https://cds.cern.ch/record/2766559q 37.3 MHS06 farm (PU = 200)q Process events at 750 kHzq High efficiency, low output rate
General-purpose GPUsq Cheaper CHF/HS06q Structure of Arraysq Vectorized loops
CMSSW with GPUsq Run-3: pixel tracks, calo local recoq Phase-2: HGCAL reconstruction
13
Heterogeneous ComputingPrototype: Phase-2 CMS HLTq https://cds.cern.ch/record/2766559q 37.3 MHS06 farm (PU = 200)q Process events at 750 kHzq High efficiency, low output rate
General-purpose GPUsq Cheaper CHF/HS06q Structure of Arraysq Vectorized loops
CMSSW with GPUsq Run-3: pixel tracks, calo local recoq Phase-2: HGCAL reconstruction
14
Networking – LHCONEPrivate network connecting
Tier1s and Tier2s:q Serving any LHC sites according to their
needs and allowing them to growq Model: sharing the cost and
use of expensive resourcesq A collaborative effort among Research
& Education Network Providers
LHCONE services q L3VPN (VRF): routed Virtual
Private Network – operationalq P2P: dedicated, bandwidth
guaranteed, point-to-pointlinks – in development
q Monitoring: monitoring infrastructure – operational
15
16
TEIN
GÉANT,Europe
LHCONE Map Ver. 5.4, September 2020 – WEJohnston, ESnet, [email protected]
LHCONE L3VPN: A global infrastructure for High Energy Physics data analysis (LHC, Belle II, Pierre Auger Observatory, NOvA, XENON, JUNO)TIFR
NKN
, India
toCE
RN
GÉANTOpen
London
CESNETCzechia
ESnet
Geneva
RedIRISSpain
RoEduNetRomania
UAIC xx, ISS,NIHAM,
ITIM, NIPNEx3
JANETUK
UChi(MWT2)
ARNESSlovenia
SiGNET xx
AGLT2MSU
AGLT2UM
Budapest
LHC1-RENATERFrance, AS2091
LPC,CPPM,IPHC,
SUBATECH
Korea
CAN
AR
IE
TO: ESnet, via
StCANARIEarLight, then to
GÉANT, CERN, RU-VRF via
NETHERLIGHT
To: E
Snet
, CAN
ARIE
,
Inte
rnet
2 via
St
arLig
ht, L
A, th
en T
o
GÉAN
T, N
ORDU
net,
CERN
via N
ethe
rLigh
t
NORDUne
t , ES
net,
GÉANT,
via M
AN
LAN
GÉANT
CERN
CERN
To JA
NET
To In
tern
et2,
SINE
T, A
ARNe
t
Peer
: Calt
ech,
UCS
D,
UCSB
, TAC
C,To: Internet2
ESnet, CalREN, U Fl
Caltech
MIT
RedIRIS
ESne
tRE
NATE
R
Pacif
icWav
e(d
istrib
uted
exc
hang
e)
ESnet
Bucharest
PSNC
ESnet
TO CSTNETVia TEIN andCNGI-CERNET
GARR
UWiscMadison
SINET,Tokyo
To: ESnet, Internet2, CANETvia MAN LAN
To: CERN
RNP / REDNESPBrazil
AtlanticWave
(distributedexchange)
CERN
UIUC(MWT2)
All sites at MANLAN
multipoint VLAN, inc MIT,
NORDUnet, NET2 (North East Tier 2)
UCSD
ToMANLAN
to G
ÉAN
T Ge
neva
,CE
RN
Internet2USA
KR
EON
et2
RU-VRF
Global Research Platform Network (GRPnet)
To Amsterdam
Paci
ficW
ave/
PNW
G
TransPAC,
(US N
SF)
Singapore-LA,
(Internet2+SingAREN/NSCC)
Madrid
CANARIE
to G
ÉAN
T(L
ondo
n vi
a O
rient
+)
FCCNPortugal
London
NORDUnet
RoEd
uNet
ESnet
GÉANT GÉAN
T
SINET
(Via ANA-400
via MOXY)
To G
ÉANT
, Par
is
RENATER
Prague
Brussels
Lisbon
ESnetUSA
FNAL-T1
ESnet, GenevaESnet, Am
st.ESnet, London
Peer: Internet2UNL, Wisc, AGLT2,UChi, IU, Purdue,
UT Arlington, UIUC,Duke via Internet2 L2
To WIX, ANSP, RNP
Peer: Vanderbilt and UFL via
SOX, Amaparo via RNP
prague_cesnet_lcg2,FZU/praguelcg2
DFNGermany
GSI
>NORDUnet, RU-VRF
via ANA-400
via NetherlLight
ANA-400
ANA-400
GÉANT+
CERN
NO
RDU
net
to CERN
ANA-400
RU-VRF-ESnet
NO
RD
Unet
CAN
AR
IE
Peer: ASGC, ESnet, Internet2, CANARIE,
KREONet2;------------
GÉANT, GARR, CSTNET
UT Arlington
Kharkov-KIPT
URANUkraine
Beijing/TEIN North
CSTNETChina
China Next Generation Internet, Internet Exchange
to SI
NET,
ASG
C, C
OMSA
TS, J
GN N
CP
via Tr
ansP
AC, J
GN ,T
EIN,
HEP
PERN
,
and
AARN
et
CIEMAT-LCG2,UAM-LCG2
AARNetAustralia
UMel
AARNet
PERNPakistan
NCP-LCG2
PK-CIIT
PNNL, SLAC, ORNL
JGN –
SingAREN/NSCC
ASGCTaiwanASGC-T1
JGN
–Sin
gARE
N/NS
CC
IndianaUSA
UNL
KR
EON
et2
ESnetInternet2
GRNET
Peers:ESnet,
Internet2
To ES
net
NET2 / BU(North East Tier 2)
Peer:
ESnet
To ES
net
To ESnet
ESnet
UCSB
ESnet
To ESnet
To:ESnet, Internet2
StarLight
To: SINET,ESnet, Internet2
CANARIE/CANETCanada
StarLight
Montreal
To: ESnet,
Internet2
Harvard SingaporeA
AR
Net, TEIN
,SingA
REN
SANETSlovakia
FMPh-UNIBA xx
IEPSAS-Kosice
GARR
Amsterdam
To R
U-VR
F
Peers: ASGC, CERNLight,
NORDUnet2, GÉANT, RU-
VRF/KIAE, SINET, NL-T1, KREONet2;
----------------SURFsara, ESnet,
PSNC,
NORDUnet
SURF
sara
ASG
C
Vienna
PIONIERPoland PSNC
AGH Kraków, U.Warsaw
To CERNLight,
peer: CERN, GARR
RU-V
RF
RU-VRF
To GARR
Via GÉANT
CERNLight
Geneva
GRNETGreece
ANA-400
UIC
Vand
erbil
t
BTAA
Om
niPo
P(C
hica
go)
SINE
T to
ESn
et, I
nter
net2
,CA
NARI
E vi
a Pa
cWav
e, LA
Peers: UOK, UFL, CANET,IU, PURDUE, IUPUI,
ESnet, CALTECH, MIT,UOC, NORDUNET,GÉANT, AARNet,
(same VRF asESnet Europe)
ND
WIX(Washington)
Peer
: ESn
et
ASGC2
SINETJapan
Hirosh.,Tsukuba,
Tokyo
CUDIMexico
UNAM
SINE
T,To
kyo
SINETTokyo
To SINET, ESnet
via Seattle
KREONet2
KREONet2
To Se
attle
Hong Kong
CNGI-CERNETChina
KREONet2Korea
PNUKISTI –T1
KNU, KCMS
NORD
Unet
GÉANT,CANARIE,
NORDUnet, SINETESnet
ESnet
MOXY(Montreal)
To RU-VRF, GÉANT,
MAN LAN,
to
JGN, SINet,
TEIN
KRLightDaejeon
MREN(Chicago)
GÉA
NT to B
eijingvia O
rient+
to G
ÉAN
TLo
ndon
KR
EON
et2 ARNES
CERN, GÉANTto TIFR
toJGN
KREONet
CEN
IC/
CalR
EN
HEPNET-JJapanKEK T1
UTNETICEPP
U Tokyo
Tokyo
Korea
TANet2Taiwan
NCU, NTU
Peers:
ESnet, Internet2
U Fl
RU-V
RF/K
IAE
BelnetBelgium
BEgrid-ULB-VUB,Begrid-UCL, IIHE
Belnet
Belnet
LondonAmsterdam
DE-KIT-T1RWTH
DESY
To ESnet
IUPUI(MWT2)
Purdue
SURFsaraNetherlands
CEASaclay
CC-IN2P3-T1
DNIC
HEPCAS,IHEP BeijingCMS ATLAS
SimFraU
BCNetUBC
UTorontoUVic
TRIUMF-T1
NL-T1
NikhefA
SGC Taiw
an via Singapore
CSTNet, Bejing
To ASGC,Taiwan
BNL-T1BNL-T1
ANL
ERX-ERNETIndia
To: JGN, ESnet,
TEIN, GÉANT
To: KERONet,
ASGC, TEIN
to CERN
ASG
C
ASGC, Taiwan
via Starlight
ASGC
to G
ÉAN
T Am
ster
dam
, CER
N
TIFRMumbai
SINETto LA
Peer: ESnet,
TEIN, via JGN
To TEIN
To GÉANT
To ASGC,KREONet2,CANARIE
SouthGridT2
Cyfronet
POZMAN
ICM-PUB
Ioannina
ThaiRENThailand
THAISARN,PUBNET-TH,NSTDA-TH
UNINET-TH,SUT-TH
UNINET,ERX-CHULANET
PIC-T1
Marseille
ASGC toNetherLight
ASGC toSingapore
ASGC toSingapore
ASGC toAmsterdam
ASGC
ASGC to StarLight
To A
SGC
via LA
ASGC
SINET,To MANLAN
SINETTo LA
SINET to Tokyo
SINET
To Tokyo SINETTo Tokyo
SIN
ET t
o A
mst
erda
m
SINETto Amsterdam
SINET to TOKYO
SINE
Tto
Sin
gapo
re
CAE-1: Singapore-London,(TEINCC, SurfNet, Nordunet, AARnet, SingAREN, GÉANT)
ToLondon
Perth
SINET
ANA-400 ANA-400
To: I
nter
net2
ANA-400
ANA-400SINET
SINET
Peers: SINET, ESnet,
Internet2, TransPAC
SingARENSingapore
Peers:ASGC, SINET, JGN,
TEIN, AARNet, GÉANT
Singapore
MAN LAN(New York)
Internet2, CERN, ESnet, NORDUnet,
CANARIE
NORDUnetNordic
NDGF-T1,NDGF-T1bNDGF-T1c
To NetherLight,
Peer: CANARIE,
SINET
ASGC
MyRENMalaysia
GARR LHCONE VRF domain/aggregator- A provider network.
Provider network PoP router
NREN/site router at exchange point
London
Connector network – provides,e.g., an L2 path between VRFs.ANSP
CANARIE
CUDI
Communication links:1/10, 20/30/40, and 100Gb/s or N x 100GUnderlined link information indicates link provider, not use
Exchange point
Not currently connected to LHCONE
JGN –SingAREN/NSCC
NDGF-T1
RAL
Dotted outline indicates distributed site
Blue dashed outline indicating a WLCG federation site not currently on LHCONE
Dublin
To G
ÉANT
, Lon
don
ANA-400GÉANT, NEAAR
NEAAR/IU/US NSF
PacWave /PNWG
(Seattle)Peer:
ESnet, KREONet2
To StarLight, Chicago
IU
GÉANT, CERNET
GÉANT, CERNET
NORDUnetNORDUnet
>NOR
DUne
t Peer: NORDUnet
KIAE-TRANSIT(RU-VRF)
Russia
Moscow
GridPNPI
INR (Troitsk)
BINP IHEPProtvino
JINR-T1/T2SarFTI
RRC-KI T1/T2
GARRItaly
INFNBari, Catania,
Frascati, Legnaro,Milano, Roma1,
Torino
INFNPisa, Napoli
CNAF-T1
Peer
s:GÉ
ANT,
RU-
VRF
PacWave(Los Angeles)
NKNIndiaVECC
Kolkata
Peers:Internet2, RU-VRF,
NORDUnet,ESnet
Starlight(Chicago)
JGNJapan
HongKong
TransPACUS/NSF/IU
HongKong
TransPAC,via PIREN)
HKIXHong Kong
GOREX(U of Guam)
Guam
PacWave(Sunnyvale)
Peers: ESnet,UCSD,UCSD
TransPAC,
Waterloo
NetherLight(Amsterdam)
Milan
CERNGeneva
CERN-T1
CERN-T0
Amsterdam
Peer: RU-VRF
Peer:
SURFsara,
GÉANT,
PIONIER,ASGC,
KREONet2,
ESnet
Paris
London
To ESnet
EEnetEstonia
HEPC
GÉANT+
CERN
Poznan
Talli
nn
Hamburg
RAL-PPD
Bristol
Oxford
Sussex
ScotGrid T2Durham
ECDF
GlasgowNorthGrid
T2Lancs
Liv
Man
Shef
To: GÉANT
International infrastructure by provider/collaboration
GÉANT
JGN, SingAREN (+ TIENCC, SurfNet, NORDUnet,GÉANT on the London link)
SINET, JapanASGC, Taiwan
KREONet2, Korea
ESnet transatlantic, USA
NORDUnetvariousKIAE, Russia
ANA-300/400 - Various links provided by CANARIE, ESnet, GÉANT, Internet2, NORDUnet, SURFnet, SINET,IU/NSF
LondonT2
IC-HEP
Brunel
QMUL
RHUL
LPNHEAPC,LAL
IPNHOLLR,CEA-IRFU
Helsinki>NORDUnet VRF
Frankfurt
CAE-1
GÉANT, CSTNET
RAL-LCG
RedCLARA,S. America
AMPath/AMLight
NAP of Americas(Miami)
Panama
REUNAChile
UTFSM/CCTVal
ANA-400
SOX(Atlanta)
São PauloPTTA
NAP do Brasil
to ESnet, Internet2, and GÉANT London
and Paris via ANA-400
ANA-400
Chile
BELLA: GÉANT, et al, RedCLARA, et al
Fortaleza
To: GÉANT
ToRedCLARA
BELL
A: G
ÉAN
T, e
t al,
RedC
LARA
, et a
l
SPRACE(UNESP)
HEPGrid(UERJ)
SAMPA(USP)
CBPF
Adapted by R. L. Iope(July/2021)
NCC – NAP Network ConnectionAcademic Network of São Paulo - REDNESP (@ Equinix SP4*)
40G
40G
10G
10G
40G
40G
10G
10G
Huawei CE 6881
Brocade MLXe(ANSP Router)
~ 40 Km
DWDM linkMegatelecom(2x 100 Gbps)
Padtec Flexponder
Padtec Flexponder
UNESP Center for Scientific Computing
100G
Padtec Transponder
100G
Padtec Transponder
Padtec Mux /
Demux
PadtecMux /
Demux
Huawei CE 8861
Cisco Catalyst 6506E
10G
Brocade MLXe(SOL - SouthernLight)
To AmLight(4x 10G + 2x 100G)
100G
Juniper(RNP Router)
40G
4x 10G
4x 10G
10GCH 27
CH 31
CH 27
CH 31
* https://www.equinix.com.br/data-centers/americas-colocation/brazil-colocation/sao-paulo-data-centers
SPRACE cluster
40G10G
17
Conclusions and OutlookThe HL-LHC represents a monumental step in computingq Similar to the Tevatron à LHC jumpq Computing power evolution alone will not save us
Two approaches to Research and Developmentq Machine Learningq Heterogeneous computing
SPRACE will be part of the HL-LHC computing eraq Upgrades foreseen for 2021, 2023 – and beyond.q Involvement in R&D activities
18
Computação no SPRACE na era do HL-LHCInfrastrutura de Computação
q Upgrade (Storage + WNs) 2021: (92k + 150k) = 242k USDq Upgrade (Storage + WNs) 2023: (92k + 150k) = 242k USDq Instalação de GPUs na Tier-2: 108k USDq Upgrade para HL-LHC (2026): 242k USDq Upgrade para HL-LHC (2031): 242k USDq Infrastrutura do datacenter: 40k USD
P&D para Computaçãoq ML TT-V: 2020-2025: 178k BRLq HetComp TT-V: 2020-2025: 178k BRLq ML TT-V: 2026-2027 178k BRLq HetComp TT-V: 2026-2027 178k BRL
Cluster de Desenvolvimentoq 2 servidores, 8 GPUs: 59k USD
19
Azul: previsto no Projeto TemáticoFAPESP 2018/25225-9
Vermelho:ainda não custeado
Top Related