Machine Learning techniques for long lived Dark …...Iacopo Longarini / 6th LHC LLP workshop...
Transcript of Machine Learning techniques for long lived Dark …...Iacopo Longarini / 6th LHC LLP workshop...
Machine Learning techniques for long lived Dark Photon at ATLAS
Iacopo Longarinifor the ATLAS collaboration
Iacopo Longarini / 6th LHC LLP workshop
• Hidden sector weakly coupled with SM• Minimal model!
• No direct coupling to SM - new U(1) gauge invariance• Kinetic mixing with only one parameter (ε) • Search oriented to small ε and mass,
highly collimated decay products
Benchmark model
!2
e+μ+π+
e−μ−π−
e+μ+π+
e−μ−π−
Dark-QED U(1)
L / ✏e�µd J
emµ
<latexit sha1_base64="Lzx2vBmoBXeGzTW0eGFaKNilRB0=">AAACNXicdVBNaxQxGM7Uj9b1a61HL8FFqAeHzGzttreCFUR6aMFtC5vtkMm+uw2bL5KMdBnmT3nxf3jSgwdFvPoXzGxXUNEHEh6e531Jnqe0UvhAyKdk7dr1GzfXN251bt+5e+9+98HmiTeV4zDkRhp3VjIPUmgYBhEknFkHTJUSTsv5i9Y/fQvOC6PfhIWFsWIzLaaCsxClontIA1yG+oC5+bPjlwd4uJU9bSjtUMXCBWeyPmyodcYGQ8F6IY3GgOmMKcWKyTlVFX5dxPu8BtV0im6PpCR/3s93MUnzPhnkg0i28/5evoezlCzRQyscFd0PdGJ4pUAHLpn3o4zYMK6ZC4JLaDq08mAZn7MZjCLVTIEf18vUDX4SlQmeGhePDnip/r5RM+X9QpVxsg3j//Za8V/eqArT3XEttK0CaH710LSSOBjcVognwgEPchEJ407Ev2J+wRzjIRbdlvArKf4/OcnTjKTZ8XZvf2dVxwZ6hB6jLZShAdpHr9ARGiKO3qGP6Av6mrxPPiffku9Xo2vJauch+gPJj58Vpquk</latexit><latexit sha1_base64="Lzx2vBmoBXeGzTW0eGFaKNilRB0=">AAACNXicdVBNaxQxGM7Uj9b1a61HL8FFqAeHzGzttreCFUR6aMFtC5vtkMm+uw2bL5KMdBnmT3nxf3jSgwdFvPoXzGxXUNEHEh6e531Jnqe0UvhAyKdk7dr1GzfXN251bt+5e+9+98HmiTeV4zDkRhp3VjIPUmgYBhEknFkHTJUSTsv5i9Y/fQvOC6PfhIWFsWIzLaaCsxClontIA1yG+oC5+bPjlwd4uJU9bSjtUMXCBWeyPmyodcYGQ8F6IY3GgOmMKcWKyTlVFX5dxPu8BtV0im6PpCR/3s93MUnzPhnkg0i28/5evoezlCzRQyscFd0PdGJ4pUAHLpn3o4zYMK6ZC4JLaDq08mAZn7MZjCLVTIEf18vUDX4SlQmeGhePDnip/r5RM+X9QpVxsg3j//Za8V/eqArT3XEttK0CaH710LSSOBjcVognwgEPchEJ407Ev2J+wRzjIRbdlvArKf4/OcnTjKTZ8XZvf2dVxwZ6hB6jLZShAdpHr9ARGiKO3qGP6Av6mrxPPiffku9Xo2vJauch+gPJj58Vpquk</latexit><latexit sha1_base64="Lzx2vBmoBXeGzTW0eGFaKNilRB0=">AAACNXicdVBNaxQxGM7Uj9b1a61HL8FFqAeHzGzttreCFUR6aMFtC5vtkMm+uw2bL5KMdBnmT3nxf3jSgwdFvPoXzGxXUNEHEh6e531Jnqe0UvhAyKdk7dr1GzfXN251bt+5e+9+98HmiTeV4zDkRhp3VjIPUmgYBhEknFkHTJUSTsv5i9Y/fQvOC6PfhIWFsWIzLaaCsxClontIA1yG+oC5+bPjlwd4uJU9bSjtUMXCBWeyPmyodcYGQ8F6IY3GgOmMKcWKyTlVFX5dxPu8BtV0im6PpCR/3s93MUnzPhnkg0i28/5evoezlCzRQyscFd0PdGJ4pUAHLpn3o4zYMK6ZC4JLaDq08mAZn7MZjCLVTIEf18vUDX4SlQmeGhePDnip/r5RM+X9QpVxsg3j//Za8V/eqArT3XEttK0CaH710LSSOBjcVognwgEPchEJ407Ev2J+wRzjIRbdlvArKf4/OcnTjKTZ8XZvf2dVxwZ6hB6jLZShAdpHr9ARGiKO3qGP6Av6mrxPPiffku9Xo2vJauch+gPJj58Vpquk</latexit><latexit sha1_base64="Lzx2vBmoBXeGzTW0eGFaKNilRB0=">AAACNXicdVBNaxQxGM7Uj9b1a61HL8FFqAeHzGzttreCFUR6aMFtC5vtkMm+uw2bL5KMdBnmT3nxf3jSgwdFvPoXzGxXUNEHEh6e531Jnqe0UvhAyKdk7dr1GzfXN251bt+5e+9+98HmiTeV4zDkRhp3VjIPUmgYBhEknFkHTJUSTsv5i9Y/fQvOC6PfhIWFsWIzLaaCsxClontIA1yG+oC5+bPjlwd4uJU9bSjtUMXCBWeyPmyodcYGQ8F6IY3GgOmMKcWKyTlVFX5dxPu8BtV0im6PpCR/3s93MUnzPhnkg0i28/5evoezlCzRQyscFd0PdGJ4pUAHLpn3o4zYMK6ZC4JLaDq08mAZn7MZjCLVTIEf18vUDX4SlQmeGhePDnip/r5RM+X9QpVxsg3j//Za8V/eqArT3XEttK0CaH710LSSOBjcVognwgEPchEJ407Ev2J+wRzjIRbdlvArKf4/OcnTjKTZ8XZvf2dVxwZ6hB6jLZShAdpHr9ARGiKO3qGP6Av6mrxPPiffku9Xo2vJauch+gPJj58Vpquk</latexit>
c⌧ =1
�tot�d
/ 1
✏2m�d
Current status in ATLAS
!3
• Muonic Lepton Jets: collimated muon bundle in the MS with no tracks in the IDmain background: cosmic
• Hadronic Lepton Jets:Low EECAL/EHCAL jets with no tracks in the IDmain background: QCD multijet, beam induced background
µ
µ
e +e -,π +π -Event Selection
Dedicated Triggers, Quality cuts, PV
LJ Selection2 LJ per event, Cuts on LJ, BDT selection
ABCD plane Isolation in ID, Δφ between LJ
CERN-EP-2019-140
Iacopo Longarini / 6th LHC LLP workshop
Current status in ATLAS
!4
Results on 36 fb-1 data collected in 2015-2016 at √s = 13 TeVExclusion possible only in muonic Lepton Jet case due to low sensitivity!
CERN-EP-2019-140
Main Limitations
Muonic channel:• LLP triggers limited due to pT
requirements to keep rate low. • maximum number of muons in the
same RoI • Intrinsic pointing to IP
Hadronic channel:• BDT not enough powerful to
discriminate signal from background
Deep Learning can help us to overcome these limitations!
Iacopo Longarini / 6th LHC LLP workshop
Learn from RAW data!
!5
The idea: Make a map of jet energy deposits!
Processing this information, additional LLP features may be learnt
and used alongside of jet-level variables
(jet pT, jet timing, jet mass, JVT, charge …)
This information could be obtained from: Jet associated cells position (x,y,z) of calorimeter cell
Jet associated calocluster information (eta, phi, E in each calo sampling)
This study
A Convolutional Neural Network could process RAW information obtained
from jet deposits relative position and energy distribution
Iacopo Longarini / 6th LHC LLP workshop
Calorimetric jet images
!6
preSamplerB + EMB
TileBar0
TileBar1TileBar2
preSamplerE + EMEHEC0HEC1HEC2
TileExt0TileExt1TileExt2 HEC3
Jet clusters
η
ɸ
Sampling
Images corresponds to a 25x25 binned ηxɸ = [1.4;1.4] windowcentred around the jet axis
Iacopo Longarini / 6th LHC LLP workshop
CNN architecture
!7
Conv3DBatchNorm
LeakyReLU
(x3)
Dense
MaxPooling3D
Dropout
Conv3DBatchNorm
LeakyReLU
(x3)
Dense
MaxPooling3D
Dropout
Conv3DBatchNorm
LeakyReLU
(x3)
Dense
MaxPooling3D
Dropout
Dense
P(signal)
Sigmoid
Dense
Dense
(x2)
Concatenate
Jet-level variables
masscharge
EECAL/EHCALWidthJVT
Iacopo Longarini / 6th LHC LLP workshop
CNN expected performance
!8
• Improvement to BDT possible due to use of ConvNets• Standard Network architecture, many optimisation possible
0 0.2 0.4 0.6 0.8 1Signal Efficiency
0
0.2
0.4
0.6
0.8
1
Back
grou
nd R
ejec
tion
BDT ROC curve
CNN without jet-level variables as input
CNN with jet-level variables as input
accuracy ~ 88% on test sample
Expected performances
Iacopo Longarini / 6th LHC LLP workshop
What about trigger?
!9
• Searches for displaced decays in the MS are highly sensitive to triggering algorithms
• Common trigger strategy
• Recursive search for hits in corresponding coincidence windows
• Stable and reliable with good performances
• Momentum estimation limited to a pT range
• Pointing constraint to primary vertex
Iacopo Longarini / 6th LHC LLP workshop
Learn from RAW data!
!10
• Phase II ATLAS trigger will be based on high performance Xilinx FPGAs
• Main requirement is on processing time (latency ≤ few µs)
• This opens the possibility to run Deep Learning model at Level-0 trigger
• Signals from RPC mapped to images, which can be processed by a CNN
• Additional features such as secondary vertex, precise pT measurement, higher trigger efficiency
ATLAS-TDR-026
Iacopo Longarini / 6th LHC LLP workshop
RPC hit maps
!11
• 384 bins in η x 9 RPC station• Training sample from Phase II FullSim
flat η,ϕ, pT distribution ([0; 20] GeV)balanced sample with 0 to 3 muons per image
Not
revi
ewed
,for
inte
rnal
circ
ulat
ion
only
0 50 100 150 200 250 300 350η
0
1
2
3
4
5
6
7
8
9
La
yer
ind
ex
ATLAS Simulation Preliminary RPC hit
Figure 7: An example input to a Convolutional Neural Network (CNN) for the Phase-2 ATLAS Level-0 muontrigger, implemented on a FPGA, is shown. This example contains three muons with background. Resistive PlateChambers (RPC) hits of a fixed � sector are arranged in a matrix-like object. Each bin of the y-axis corresponds to adetector layer (3 detector layers for inner station, 4 for the middle and 2 for the outer station). The x-axis maps the ⌘coordinates of each physics RPC strip: for the i-th strip ⌘ibin = 384 ⌘i�⌘min
⌘max�⌘min , where ⌘max and ⌘min are respectivelythe maximum (⌘max = 0.95) and the minimum (⌘min = 0.07) ⌘ values for the barrel RPC strips chosen to preventmuons to fall outside any layer of a specified sector; 384 is a realistic number of strips per layer. This particularchoice has been taken in order to evaluate ML algorithm performances, without any geometrical acceptance e�ect.Random background has been added. The background rate has been evaluated from minimum bias events. Eventsused in the training phase of the CNN can also contain two or more muons in the same sector. Events with more thanone muon are built superimposing one muon images with no background, which is then included. The CNN output isset to evaluate transverse momentum and ⌘ of the leading and sub-leading muons (if the latter exists) in the sectorand returns also a flag for events that contain more than 2 muons.
8
Not
revi
ewed
,for
inte
rnal
circ
ulat
ion
only
0 50 100 150 200 250 300 350η
0
1
2
3
4
5
6
7
8
9
La
yer
ind
ex
ATLAS Simulation Preliminary RPC hit
Figure 2: An example input to a Convolutional Neural Network (CNN) for the Phase-2 ATLAS Level-0 muon trigger,implemented on a FPGA, is shown for background only. Resistive Plate Chambers (RPC) hits of a fixed � sector arearranged in a matrix-like object. Each bin of the y-axis corresponds to a detector layer (3 detector layers for innerstation, 4 for the middle and 2 for the outer station). The x-axis maps the ⌘ coordinates of each physics RPC strip:for the i-th strip ⌘ibin = 384 ⌘i�⌘min
⌘max�⌘min , where ⌘max and ⌘min are respectively the maximum (⌘max = 0.95) and theminimum (⌘min = 0.07) ⌘ values for the barrel RPC strips chosen to prevent muons to fall outside any layer of aspecified sector; 384 is a realistic number of strips per layer. This particular choice has been taken in order to evaluateML algorithm performances, without any geometrical acceptance e�ect. Random background has been added. Thebackground rate has been evaluated from minimum bias events. Events used in the training phase of the CNN canalso contain two or more muons in the same sector. Events with more than one muon are built superimposing onemuon images with no background, which is then included. The CNN output is set to evaluate transverse momentumand ⌘ of the leading and sub-leading muons (if the latter exists) in the sector and returns also a flag for events thatcontain more than 2 muons.
3
ATLAS-L0-MUON-PUBLIC
Iacopo Longarini / 6th LHC LLP workshop
Ternary ConvNet on FPGA
!12
• Limited resources on FPGA → ternary network!
• {-1, 0, 1} as network weights to reduce memory usage
• Lower number of operations: no multiplication if w=0
• Parallelisation of the input for smarter FPGA usage
• Simplified VGG-like network
Con
v2D
Batc
hNO
RMM
axPo
olin
g
Den
se
Con
v2D
Batc
hNO
RMM
axPo
olin
g
Con
v2D
Batc
hNO
RMM
axPo
olin
g
Batc
hNO
RMD
ense
Batc
hNO
RMD
ense
(4,3)56
(4,1) (4,3)56
(4,1) (4,3)128
(4,1) 56 56 5
Iacopo Longarini / 6th LHC LLP workshop
Trigger performance
!13
Not
revi
ewed
,for
inte
rnal
circ
ulat
ion
only
2 4 6 8 10 12 14 16 18 20
[GeV]T
p
0
0.2
0.4
0.6
0.8
1
Leve
l-0 E
ffic
iency
ATLAS
Simulation Preliminary
Standard algorithm
CNN (384 x 9)
tCNN (384 x 9)
Figure 11: A comparison of the e�ciencies of the Phase-2 ATLAS Level-0 standard muon trigger algorithm (cyan),the Convolutional Neural Network (red, CNN) and the ternary Convolutional Neural Network (blue, tCNN) is shownas a function of the transverse momentum pT . For the standard algorithm only the e�ciency obtained with theconfiguration which requires at least 3 spatial coincidences over the 4 Resistive Plate Chambers stations is shown. Allthe e�ciency curves take the single-muon images without background as input, and are tuned so that the e�ciencyreached at nominal threshold is the same (82%) to ease the comparison.
12
Not
revi
ewed
,for
inte
rnal
circ
ulat
ion
only
0.994 0.003
0.006 0.969 0.022
0.028 0.940 0.054
0.038 0.945
0 1 2 3True number of muons
0
1
2
3
Pre
dic
ted
nu
mb
er
of
mu
on
s
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
Eve
nt
fra
ctio
n
ATLAS Simulation Preliminary tCNN
Figure 10: A ternary Convolutional Neural Network (tCNN, arXiv:1605.04711) implementation for the Phase-2ATLAS Level-0 muon trigger has been set up and trained. The inputs to this network are all the strips of all thedetector layers of a sector. The predicted number of muon is shown as a function of the numbers of muons obtainedfrom truth information after detector simulation. Ternary networks have weights made of just 2 bits. For this reason,tCNNs represent an optimal solution for FPGA synthesis and implementation, since the resource occupancy can bereduced up to a factor of 16, compared to a same-architecture non-ternary network. The columns of the histogramare normalized to unity. No pT selection is applied to muon candidates. Fake muons rate is at the level of 0.6% but isfurthermore reduced requiring reasonable pT thresholds (great fraction of fakes is predicted at very low momentum).
11
efficiency for pT = 10 GeV threshold Confusion matrix
fake rate of 0.6% without pT requirementreduced to 0.01% for pT = 10 GeV threshold
ATLAS-L0-MUON-PUBLIC
Iacopo Longarini / 6th LHC LLP workshop
Secondary vertices
!14
100− 50− 0 50 100strips z position
500
550
600
650
700
750
800
850
900
950
1000
Laye
r R [c
m]
RPC0
RPC1
RPC2
RPC3
decay radius 461 cmToy images for displaced
vertex identification
pT > 20 GeVL ∈ [0,5] m
ΔΦ ∈ [0.05,0.2] rad
Ongoing studies on FullSim events
σ~ 3.5cm, μ~-0.3cm
Iacopo Longarini / 6th LHC LLP workshop
Wrapping up
!15
• Dark Photon searches can be heavily improved with better object identification and trigger ability to identify secondary vertices
• (Deep) Learning using RAW information can be the key!
• 3D jet image ConvNet can exploit key features from jet cluster topology providing a highly efficient selection
• Exploiting FPGA CNN at L0 for precise pT measurement and secondary vertexing
• Trigger features already implemented for prompt muons!