Machine Learning techniques for long lived Dark …...Iacopo Longarini / 6th LHC LLP workshop...

Machine Learning techniques for long lived Dark Photon at ATLAS

Iacopo Longarinifor the ATLAS collaboration

Iacopo Longarini / 6th LHC LLP workshop

• Hidden sector weakly coupled with SM• Minimal model!

• No direct coupling to SM - new U(1) gauge invariance• Kinetic mixing with only one parameter (ε) • Search oriented to small ε and mass,

highly collimated decay products

Benchmark model

!2

e+μ+π+

e−μ−π−

e+μ+π+

e−μ−π−

Dark-QED U(1)

L / ✏e�µd J

emµ

<latexit sha1_base64="Lzx2vBmoBXeGzTW0eGFaKNilRB0=">AAACNXicdVBNaxQxGM7Uj9b1a61HL8FFqAeHzGzttreCFUR6aMFtC5vtkMm+uw2bL5KMdBnmT3nxf3jSgwdFvPoXzGxXUNEHEh6e531Jnqe0UvhAyKdk7dr1GzfXN251bt+5e+9+98HmiTeV4zDkRhp3VjIPUmgYBhEknFkHTJUSTsv5i9Y/fQvOC6PfhIWFsWIzLaaCsxClontIA1yG+oC5+bPjlwd4uJU9bSjtUMXCBWeyPmyodcYGQ8F6IY3GgOmMKcWKyTlVFX5dxPu8BtV0im6PpCR/3s93MUnzPhnkg0i28/5evoezlCzRQyscFd0PdGJ4pUAHLpn3o4zYMK6ZC4JLaDq08mAZn7MZjCLVTIEf18vUDX4SlQmeGhePDnip/r5RM+X9QpVxsg3j//Za8V/eqArT3XEttK0CaH710LSSOBjcVognwgEPchEJ407Ev2J+wRzjIRbdlvArKf4/OcnTjKTZ8XZvf2dVxwZ6hB6jLZShAdpHr9ARGiKO3qGP6Av6mrxPPiffku9Xo2vJauch+gPJj58Vpquk</latexit><latexit sha1_base64="Lzx2vBmoBXeGzTW0eGFaKNilRB0=">AAACNXicdVBNaxQxGM7Uj9b1a61HL8FFqAeHzGzttreCFUR6aMFtC5vtkMm+uw2bL5KMdBnmT3nxf3jSgwdFvPoXzGxXUNEHEh6e531Jnqe0UvhAyKdk7dr1GzfXN251bt+5e+9+98HmiTeV4zDkRhp3VjIPUmgYBhEknFkHTJUSTsv5i9Y/fQvOC6PfhIWFsWIzLaaCsxClontIA1yG+oC5+bPjlwd4uJU9bSjtUMXCBWeyPmyodcYGQ8F6IY3GgOmMKcWKyTlVFX5dxPu8BtV0im6PpCR/3s93MUnzPhnkg0i28/5evoezlCzRQyscFd0PdGJ4pUAHLpn3o4zYMK6ZC4JLaDq08mAZn7MZjCLVTIEf18vUDX4SlQmeGhePDnip/r5RM+X9QpVxsg3j//Za8V/eqArT3XEttK0CaH710LSSOBjcVognwgEPchEJ407Ev2J+wRzjIRbdlvArKf4/OcnTjKTZ8XZvf2dVxwZ6hB6jLZShAdpHr9ARGiKO3qGP6Av6mrxPPiffku9Xo2vJauch+gPJj58Vpquk</latexit><latexit sha1_base64="Lzx2vBmoBXeGzTW0eGFaKNilRB0=">AAACNXicdVBNaxQxGM7Uj9b1a61HL8FFqAeHzGzttreCFUR6aMFtC5vtkMm+uw2bL5KMdBnmT3nxf3jSgwdFvPoXzGxXUNEHEh6e531Jnqe0UvhAyKdk7dr1GzfXN251bt+5e+9+98HmiTeV4zDkRhp3VjIPUmgYBhEknFkHTJUSTsv5i9Y/fQvOC6PfhIWFsWIzLaaCsxClontIA1yG+oC5+bPjlwd4uJU9bSjtUMXCBWeyPmyodcYGQ8F6IY3GgOmMKcWKyTlVFX5dxPu8BtV0im6PpCR/3s93MUnzPhnkg0i28/5evoezlCzRQyscFd0PdGJ4pUAHLpn3o4zYMK6ZC4JLaDq08mAZn7MZjCLVTIEf18vUDX4SlQmeGhePDnip/r5RM+X9QpVxsg3j//Za8V/eqArT3XEttK0CaH710LSSOBjcVognwgEPchEJ407Ev2J+wRzjIRbdlvArKf4/OcnTjKTZ8XZvf2dVxwZ6hB6jLZShAdpHr9ARGiKO3qGP6Av6mrxPPiffku9Xo2vJauch+gPJj58Vpquk</latexit><latexit sha1_base64="Lzx2vBmoBXeGzTW0eGFaKNilRB0=">AAACNXicdVBNaxQxGM7Uj9b1a61HL8FFqAeHzGzttreCFUR6aMFtC5vtkMm+uw2bL5KMdBnmT3nxf3jSgwdFvPoXzGxXUNEHEh6e531Jnqe0UvhAyKdk7dr1GzfXN251bt+5e+9+98HmiTeV4zDkRhp3VjIPUmgYBhEknFkHTJUSTsv5i9Y/fQvOC6PfhIWFsWIzLaaCsxClontIA1yG+oC5+bPjlwd4uJU9bSjtUMXCBWeyPmyodcYGQ8F6IY3GgOmMKcWKyTlVFX5dxPu8BtV0im6PpCR/3s93MUnzPhnkg0i28/5evoezlCzRQyscFd0PdGJ4pUAHLpn3o4zYMK6ZC4JLaDq08mAZn7MZjCLVTIEf18vUDX4SlQmeGhePDnip/r5RM+X9QpVxsg3j//Za8V/eqArT3XEttK0CaH710LSSOBjcVognwgEPchEJ407Ev2J+wRzjIRbdlvArKf4/OcnTjKTZ8XZvf2dVxwZ6hB6jLZShAdpHr9ARGiKO3qGP6Av6mrxPPiffku9Xo2vJauch+gPJj58Vpquk</latexit>

c⌧ =1

�tot�d

/ 1

✏2m�d

Current status in ATLAS

!3

• Muonic Lepton Jets: collimated muon bundle in the MS with no tracks in the IDmain background: cosmic

• Hadronic Lepton Jets:Low EECAL/EHCAL jets with no tracks in the IDmain background: QCD multijet, beam induced background

µ

µ

e +e -,π +π -Event Selection

Dedicated Triggers, Quality cuts, PV

LJ Selection2 LJ per event, Cuts on LJ, BDT selection

ABCD plane Isolation in ID, Δφ between LJ

CERN-EP-2019-140

https://cds.cern.ch/record/2688373


Current status in ATLAS

!4

Results on 36 fb-1 data collected in 2015-2016 at √s = 13 TeVExclusion possible only in muonic Lepton Jet case due to low sensitivity!

CERN-EP-2019-140

Main Limitations

Muonic channel:• LLP triggers limited due to pT

requirements to keep rate low. • maximum number of muons in the

same RoI • Intrinsic pointing to IP

Hadronic channel:• BDT not enough powerful to

discriminate signal from background

Deep Learning can help us to overcome these limitations!

https://cds.cern.ch/record/2688373


Learn from RAW data!

!5

The idea: Make a map of jet energy deposits!

Processing this information, additional LLP features may be learnt

and used alongside of jet-level variables

(jet pT, jet timing, jet mass, JVT, charge …)

This information could be obtained from: Jet associated cells position (x,y,z) of calorimeter cell

Jet associated calocluster information (eta, phi, E in each calo sampling)

This study

A Convolutional Neural Network could process RAW information obtained

from jet deposits relative position and energy distribution


Calorimetric jet images

!6

preSamplerB + EMB

TileBar0

TileBar1TileBar2

preSamplerE + EMEHEC0HEC1HEC2

TileExt0TileExt1TileExt2 HEC3

Jet clusters

η

ɸ

Sampling

Images corresponds to a 25x25 binned ηxɸ = [1.4;1.4] windowcentred around the jet axis


CNN architecture

!7

Conv3DBatchNorm

LeakyReLU

(x3)

Dense

MaxPooling3D

Dropout

Conv3DBatchNorm

LeakyReLU

(x3)

Dense

MaxPooling3D

Dropout

Conv3DBatchNorm

LeakyReLU

(x3)

Dense

MaxPooling3D

Dropout

Dense

P(signal)

Sigmoid

Dense

Dense

(x2)

Concatenate

Jet-level variables

masscharge

EECAL/EHCALWidthJVT


CNN expected performance

!8

• Improvement to BDT possible due to use of ConvNets• Standard Network architecture, many optimisation possible

0 0.2 0.4 0.6 0.8 1Signal Efficiency

0

0.2

0.4

0.6

0.8

1

Back

grou

nd R

ejec

tion

BDT ROC curve

CNN without jet-level variables as input

CNN with jet-level variables as input

accuracy ~ 88% on test sample

Expected performances


What about trigger?

!9

• Searches for displaced decays in the MS are highly sensitive to triggering algorithms

• Common trigger strategy

• Recursive search for hits in corresponding coincidence windows

• Stable and reliable with good performances

• Momentum estimation limited to a pT range

• Pointing constraint to primary vertex


Learn from RAW data!

!10

• Phase II ATLAS trigger will be based on high performance Xilinx FPGAs

• Main requirement is on processing time (latency ≤ few µs)

• This opens the possibility to run Deep Learning model at Level-0 trigger

• Signals from RPC mapped to images, which can be processed by a CNN

• Additional features such as secondary vertex, precise pT measurement, higher trigger efficiency

ATLAS-TDR-026

http://cdsweb.cern.ch/record/2285580


RPC hit maps

!11

• 384 bins in η x 9 RPC station• Training sample from Phase II FullSim

flat η,ϕ, pT distribution ([0; 20] GeV)balanced sample with 0 to 3 muons per image

Not

revi

ewed

,for

inte

rnal

circ

ulat

ion

only

0 50 100 150 200 250 300 350η

0

1

2

3

4

5

6

7

8

9

La

yer

ind

ex

ATLAS Simulation Preliminary RPC hit

Figure 7: An example input to a Convolutional Neural Network (CNN) for the Phase-2 ATLAS Level-0 muontrigger, implemented on a FPGA, is shown. This example contains three muons with background. Resistive PlateChambers (RPC) hits of a fixed � sector are arranged in a matrix-like object. Each bin of the y-axis corresponds to adetector layer (3 detector layers for inner station, 4 for the middle and 2 for the outer station). The x-axis maps the ⌘coordinates of each physics RPC strip: for the i-th strip ⌘ibin = 384 ⌘i�⌘min

⌘max�⌘min , where ⌘max and ⌘min are respectivelythe maximum (⌘max = 0.95) and the minimum (⌘min = 0.07) ⌘ values for the barrel RPC strips chosen to preventmuons to fall outside any layer of a specified sector; 384 is a realistic number of strips per layer. This particularchoice has been taken in order to evaluate ML algorithm performances, without any geometrical acceptance e�ect.Random background has been added. The background rate has been evaluated from minimum bias events. Eventsused in the training phase of the CNN can also contain two or more muons in the same sector. Events with more thanone muon are built superimposing one muon images with no background, which is then included. The CNN output isset to evaluate transverse momentum and ⌘ of the leading and sub-leading muons (if the latter exists) in the sectorand returns also a flag for events that contain more than 2 muons.

8

Not

revi

ewed

,for

inte

rnal

circ

ulat

ion

only

0 50 100 150 200 250 300 350η

0

1

2

3

4

5

6

7

8

9

La

yer

ind

ex

ATLAS Simulation Preliminary RPC hit

Figure 2: An example input to a Convolutional Neural Network (CNN) for the Phase-2 ATLAS Level-0 muon trigger,implemented on a FPGA, is shown for background only. Resistive Plate Chambers (RPC) hits of a fixed � sector arearranged in a matrix-like object. Each bin of the y-axis corresponds to a detector layer (3 detector layers for innerstation, 4 for the middle and 2 for the outer station). The x-axis maps the ⌘ coordinates of each physics RPC strip:for the i-th strip ⌘ibin = 384 ⌘i�⌘min

⌘max�⌘min , where ⌘max and ⌘min are respectively the maximum (⌘max = 0.95) and theminimum (⌘min = 0.07) ⌘ values for the barrel RPC strips chosen to prevent muons to fall outside any layer of aspecified sector; 384 is a realistic number of strips per layer. This particular choice has been taken in order to evaluateML algorithm performances, without any geometrical acceptance e�ect. Random background has been added. Thebackground rate has been evaluated from minimum bias events. Events used in the training phase of the CNN canalso contain two or more muons in the same sector. Events with more than one muon are built superimposing onemuon images with no background, which is then included. The CNN output is set to evaluate transverse momentumand ⌘ of the leading and sub-leading muons (if the latter exists) in the sector and returns also a flag for events thatcontain more than 2 muons.

3

ATLAS-L0-MUON-PUBLIC

https://twiki.cern.ch/twiki/bin/view/AtlasPublic/L0MuonTriggerPublicResults


Ternary ConvNet on FPGA

!12

• Limited resources on FPGA → ternary network!

• {-1, 0, 1} as network weights to reduce memory usage

• Lower number of operations: no multiplication if w=0

• Parallelisation of the input for smarter FPGA usage

• Simplified VGG-like network

Con

v2D

Batc

hNO

RMM

axPo

olin

g

Den

se

Con

v2D

Batc

hNO

RMM

axPo

olin

g

Con

v2D

Batc

hNO

RMM

axPo

olin

g

Batc

hNO

RMD

ense

Batc

hNO

RMD

ense

(4,3)56

(4,1) (4,3)56

(4,1) (4,3)128

(4,1) 56 56 5


Trigger performance

!13

Not

revi

ewed

,for

inte

rnal

circ

ulat

ion

only

2 4 6 8 10 12 14 16 18 20

[GeV]T

p

0

0.2

0.4

0.6

0.8

1

Leve

l-0 E

ffic

iency

ATLAS

Simulation Preliminary

Standard algorithm

CNN (384 x 9)

tCNN (384 x 9)

Figure 11: A comparison of the e�ciencies of the Phase-2 ATLAS Level-0 standard muon trigger algorithm (cyan),the Convolutional Neural Network (red, CNN) and the ternary Convolutional Neural Network (blue, tCNN) is shownas a function of the transverse momentum pT . For the standard algorithm only the e�ciency obtained with theconfiguration which requires at least 3 spatial coincidences over the 4 Resistive Plate Chambers stations is shown. Allthe e�ciency curves take the single-muon images without background as input, and are tuned so that the e�ciencyreached at nominal threshold is the same (82%) to ease the comparison.

12

Not

revi

ewed

,for

inte

rnal

circ

ulat

ion

only

0.994 0.003

0.006 0.969 0.022

0.028 0.940 0.054

0.038 0.945

0 1 2 3True number of muons

0

1

2

3

Pre

dic

ted

nu

mb

er

of

mu

on

s

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

Eve

nt

fra

ctio

n

ATLAS Simulation Preliminary tCNN

Figure 10: A ternary Convolutional Neural Network (tCNN, arXiv:1605.04711) implementation for the Phase-2ATLAS Level-0 muon trigger has been set up and trained. The inputs to this network are all the strips of all thedetector layers of a sector. The predicted number of muon is shown as a function of the numbers of muons obtainedfrom truth information after detector simulation. Ternary networks have weights made of just 2 bits. For this reason,tCNNs represent an optimal solution for FPGA synthesis and implementation, since the resource occupancy can bereduced up to a factor of 16, compared to a same-architecture non-ternary network. The columns of the histogramare normalized to unity. No pT selection is applied to muon candidates. Fake muons rate is at the level of 0.6% but isfurthermore reduced requiring reasonable pT thresholds (great fraction of fakes is predicted at very low momentum).

11

efficiency for pT = 10 GeV threshold Confusion matrix

fake rate of 0.6% without pT requirementreduced to 0.01% for pT = 10 GeV threshold

ATLAS-L0-MUON-PUBLIC

https://twiki.cern.ch/twiki/bin/view/AtlasPublic/L0MuonTriggerPublicResults


Secondary vertices

!14

100− 50− 0 50 100strips z position

500

550

600

650

700

750

800

850

900

950

1000

Laye

r R [c

m]

RPC0

RPC1

RPC2

RPC3

decay radius 461 cmToy images for displaced

vertex identification

pT > 20 GeVL ∈ [0,5] m

ΔΦ ∈ [0.05,0.2] rad

Ongoing studies on FullSim events

σ~ 3.5cm, μ~-0.3cm


Wrapping up

!15

• Dark Photon searches can be heavily improved with better object identification and trigger ability to identify secondary vertices

• (Deep) Learning using RAW information can be the key!

• 3D jet image ConvNet can exploit key features from jet cluster topology providing a highly efficient selection

• Exploiting FPGA CNN at L0 for precise pT measurement and secondary vertexing

• Trigger features already implemented for prompt muons!

Machine Learning techniques for long lived Dark …...Iacopo Longarini / 6th LHC LLP workshop...

Documents

Transcript of Machine Learning techniques for long lived Dark …...Iacopo Longarini / 6th LHC LLP workshop...