Probability, Statistics and Errors in High Energy Physics
Wen-Chen Chang, Institute of Physics, Academia Sinica
章文箴, 中央研究院物理研究所
Outline
• Errors
• Probability distribution: Binomial, Poisson, Gaussian
• Confidence Level
• Monte Carlo Method
Why do we do experiments?
1. Parameter determination: determine the numerical value of some physical quantity.
2. Hypothesis testing: test whether a particular theory is consistent with our data.
Why estimate errors?
• We are concerned not only with the answer but also with its accuracy.
• For example, the speed of light is 2.998×10⁸ m/sec. Compare three hypothetical measurements: (3.09±0.15)×10⁸ is consistent with the accepted value; (3.09±0.01)×10⁸ disagrees, signalling either a discovery or a mistake; (3.09±2)×10⁸ is too imprecise to say anything.
Source of Errors
• Random (statistical) error: the inability of any measuring device to give infinitely accurate answers.
• Systematic error: uncertainty arising from reproducible effects in the apparatus or technique (detailed on the next slide).
Systematic Errors
“Systematic effects is a general category which includes effects such as background, scanning efficiency, energy resolution, angle resolution, variation of counter efficiency with beam position and energy, dead time, etc. The uncertainty in the estimation of such a systematic effect is called a systematic error.” – Orear
“Systematic error: reproducible inaccuracy introduced by faulty equipment, calibration, or technique.” – Bevington
Error = mistake? Error = uncertainty?
Experimental Examples
• Energy in a calorimeter: E = aD + b, with a and b determined by a calibration experiment.
• Branching ratio: B = N/(εN_T), with the efficiency ε found from Monte Carlo studies.
• Steel rule calibrated at 15°C but used in a warm lab:
  If not spotted, this is a mistake.
  If the temperature is measured, it is not a problem.
  If the temperature is not measured, guess the uncertainty; repeating measurements doesn't help.
The Binomial
A random process with exactly two possible outcomes which occur with fixed probabilities p and q = 1 − p.
n trials, r successes, individual success probability p:
$$P(r; n, p) = \frac{n!}{r!\,(n-r)!}\, p^r (1-p)^{n-r}$$
Mean: $\mu = \langle r \rangle = \sum r\, P(r) = np$
Variance: $V = \langle (r-\mu)^2 \rangle = \langle r^2 \rangle - \langle r \rangle^2 = np(1-p)$
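As a numerical check of these formulas, here is a minimal sketch (Python with NumPy, our choice since the slides name no language) that builds the binomial probabilities term by term and recovers the mean np and variance np(1−p):

```python
import numpy as np
from math import comb

def binomial_pmf(n, p):
    """P(r; n, p) for r = 0..n, from the formula on the slide."""
    r = np.arange(n + 1)
    return r, np.array([comb(n, k) * p**k * (1 - p)**(n - k) for k in r])

n, p = 10, 0.2
r, P = binomial_pmf(n, p)
mean = np.sum(r * P)                # should equal n*p = 2.0
var = np.sum(r**2 * P) - mean**2    # should equal n*p*(1-p) = 1.6
print(mean, var)
```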
Binomial Examples
[Plots: binomial distributions P(r) for n = 10 with p = 0.2, 0.5, 0.8, and for p = 0.1 with n = 5, 20, 50]
Poisson
‘Events in a continuum’: the probability of observing r independent events in a time interval t, when the counting rate is λ and the expected number of events in the interval is λt:
$$P(r; \lambda t) = \frac{e^{-\lambda t}\,(\lambda t)^r}{r!}$$
Mean: $\mu = \langle r \rangle = \sum r\, P(r) = \lambda t$
Variance: $V = \langle (r-\mu)^2 \rangle = \langle r^2 \rangle - \langle r \rangle^2 = \lambda t$
The limit of the binomial for N → ∞, p → 0 with Np = λt held constant.
[Plot: Poisson distribution with λt = 2.5]
More about Poisson
• The binomial approaches the Poisson distribution as N increases (with Np fixed).
• The mean value of r for a variable with a Poisson distribution is λt and so is the variance. This is the basis of the well known n ± √n formula that applies to statistical errors in many situations involving the counting of independent events during a fixed interval.
• As λt → ∞, the Poisson distribution tends to a Gaussian one.
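The first bullet is easy to verify numerically. A minimal sketch (assuming SciPy is available): hold Np fixed while N grows and compare the binomial with the Poisson probabilities:

```python
import numpy as np
from scipy.stats import binom, poisson

mu = 2.5                         # fixed expected count Np = mu
r = np.arange(10)
for N in (10, 50, 1000):
    p = mu / N                   # keep Np constant as N grows
    max_diff = np.max(np.abs(binom.pmf(r, N, p) - poisson.pmf(r, mu)))
    print(N, max_diff)           # the difference shrinks as N increases
```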
Poisson Examples
[Plots: Poisson distributions P(r) for λt = 0.5, 1.0, 2.0, 5.0, 10, 25]
Examples
• The number of particles detected by a counter in a time t, in a situation where the particle flux and detector are independent of time, and where the counter dead-time τ is such that λτ << 1.
• The number of interactions produced in a thin target when an intense pulse of N beam particles is incident on it.
• The number of entries in a given bin of a histogram when the data are accumulated over a fixed time interval.
Binomial and Poisson
From an exam paper: A student is standing by the road, hoping to hitch a lift. Cars pass according to a Poisson distribution with a mean frequency of 1 per minute. The probability of an individual car giving a lift is 1%. Calculate the probability that the student is still waiting for a lift
(a) after 60 cars have passed
(b) after 1 hour.
(a) Binomial: 0.99⁶⁰ = 0.5472.  (b) Poisson with μ = 60 × 0.01 = 0.6: e⁻⁰·⁶ × 0.6⁰/0! = 0.5488.
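Both answers are reproduced by this short sketch (plain Python; the variable names are ours). It makes explicit that (a) is a binomial statement about 60 fixed trials, while (b) is a Poisson statement about the expected number of lift-giving cars in an hour:

```python
from math import exp

p_no_lift_per_car = 0.99
p_a = p_no_lift_per_car ** 60   # binomial: 60 cars pass, none gives a lift
mu = 60 * 0.01                  # expected lift-giving cars in 1 hour
p_b = exp(-mu)                  # Poisson P(0; mu) = e^{-mu} mu^0 / 0!
print(p_a, p_b)                 # 0.5472..., 0.5488...
```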
Gaussian (Normal)
Probability density:
$$P(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/2\sigma^2}$$
Mean: $\langle x \rangle = \int x\, P(x)\, dx = \mu$
Variance: $V = \langle (x-\mu)^2 \rangle = \langle x^2 \rangle - \langle x \rangle^2 = \sigma^2$
Different Gaussians
There's only one!
• Normalisation (if required)
• Location change: μ
• Width scaling factor: σ
• Falls to 1/e of peak at x = μ ± σ√2
Probability Contents
68.27% within 1σ, 95.45% within 2σ, 99.73% within 3σ.
90% within 1.645σ, 95% within 1.960σ, 99% within 2.576σ, 99.9% within 3.290σ.
These numbers apply to Gaussians and only Gaussians.
Other distributions have equivalent values which you could use if you wanted.
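These coverage figures follow from the Gaussian integral: the fraction within ±kσ is erf(k/√2). A sketch using only the Python standard library recomputes the table:

```python
from math import erf, sqrt

def coverage(k):
    """Fraction of a Gaussian within mu +/- k*sigma."""
    return erf(k / sqrt(2))

for k in (1, 2, 3, 1.645, 1.960, 2.576, 3.290):
    print(k, coverage(k))   # 0.6827, 0.9545, 0.9973, 0.90, 0.95, 0.99, 0.999
```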
Central Limit Theorem
Or: why is the Gaussian normal? If a variable x is produced by the convolution of variables x₁, x₂, … x_N:
I) <x> = μ₁ + μ₂ + … + μ_N
II) V(x) = V₁ + V₂ + … + V_N
III) P(x) becomes Gaussian for large N
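Statement III can be watched happening. In this sketch (NumPy assumed), the sum of N uniform variables, each nothing like a Gaussian on its own, acquires the mean and variance given by I and II and a Gaussian shape:

```python
import numpy as np

rng = np.random.default_rng(42)
N, trials = 12, 100_000
x = rng.uniform(0, 1, size=(trials, N)).sum(axis=1)  # convolve N uniforms

# I) and II): each uniform has mu = 0.5 and V = 1/12
print(x.mean(), N * 0.5)      # ~6.0
print(x.var(), N * (1 / 12))  # ~1.0
# III): a histogram of x is already very close to a Gaussian for N = 12
```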
Multidimensional Gaussian
With correlation ρ:
$$P(x, y; \mu_x, \mu_y, \sigma_x, \sigma_y, \rho) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}} \exp\!\left[-\frac{1}{2(1-\rho^2)}\left(\frac{(x-\mu_x)^2}{\sigma_x^2} + \frac{(y-\mu_y)^2}{\sigma_y^2} - \frac{2\rho(x-\mu_x)(y-\mu_y)}{\sigma_x\sigma_y}\right)\right]$$
Uncorrelated (ρ = 0):
$$P(x, y; \mu_x, \mu_y, \sigma_x, \sigma_y) = \frac{1}{2\pi\sigma_x\sigma_y}\, e^{-(x-\mu_x)^2/2\sigma_x^2}\, e^{-(y-\mu_y)^2/2\sigma_y^2}$$
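One common way to draw samples from the correlated form (not covered on the slide; a sketch assuming NumPy) is to assemble the covariance matrix from σx, σy, ρ and use a multivariate-normal generator:

```python
import numpy as np

mux, muy = 1.0, -2.0
sx, sy, rho = 2.0, 0.5, 0.8
cov = np.array([[sx**2, rho * sx * sy],
                [rho * sx * sy, sy**2]])  # covariance from sigmas and rho

rng = np.random.default_rng(0)
pts = rng.multivariate_normal([mux, muy], cov, size=100_000)
print(np.corrcoef(pts[:, 0], pts[:, 1])[0, 1])  # ~0.8, recovering rho
```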
Chi squared
Sum of squared discrepancies, scaled by the expected error:
$$\chi^2 = \sum_{i=1}^{n} \frac{(x_i - \mu_i)^2}{\sigma_i^2}$$
Integrate out all but one dimension of a multi-D Gaussian:
$$P(\chi^2; n) = \frac{(\chi^2)^{n/2-1}\, e^{-\chi^2/2}}{2^{n/2}\, \Gamma(n/2)}$$
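In practice the integral of this density is looked up rather than done by hand. A sketch with SciPy (an assumed dependency) computes the probability of exceeding an observed χ², the goodness-of-fit figure used in the fitting slides below:

```python
from scipy.stats import chi2

chisq_obs, ndf = 25.0, 18
p_value = chi2.sf(chisq_obs, ndf)  # P(chi^2 >= 25) for 18 degrees of freedom
print(p_value)                     # ~0.12: acceptable agreement
```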
About Estimation
Theory → Data (Probability Calculus): given these distribution parameters, what can we say about the data?
Data → Theory (Statistical Inference): given this data, what can we say about the properties or parameters or correctness of the distribution functions?
What is an estimator?
An estimator (written with a hat) is a function of the data whose value, the estimate, is intended as a meaningful guess for the value of the parameter. (from PDG)
Examples:
$$\hat\mu\{x_i\} = \frac{1}{N}\sum_i x_i = \bar x \qquad \hat\mu\{x_i\} = \frac{x_{max} + x_{min}}{2}$$
$$\hat V\{x_i\} = \frac{1}{N}\sum_i (x_i - \hat\mu)^2 \qquad \hat V\{x_i\} = \frac{1}{N-1}\sum_i (x_i - \hat\mu)^2$$
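The two variance estimators differ only by the 1/N versus 1/(N−1) factor, but for small samples the difference is visible. A sketch (NumPy assumed) shows that the 1/N form is biased low by a factor (N−1)/N, while the 1/(N−1) form averages to the true variance:

```python
import numpy as np

rng = np.random.default_rng(1)
N, trials = 5, 200_000

samples = rng.normal(0.0, 1.0, size=(trials, N))  # true variance = 1
v_biased = samples.var(axis=1, ddof=0)    # 1/N estimator
v_unbiased = samples.var(axis=1, ddof=1)  # 1/(N-1) estimator
print(v_biased.mean())    # ~0.8 = (N-1)/N times the true variance
print(v_unbiased.mean())  # ~1.0
```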
What is a good estimator?
A perfect estimator is:
• Consistent: $\lim_{N\to\infty} \hat a = a$
• Unbiased: $\langle \hat a \rangle = \int \cdots \int \hat a(x_1, x_2, \ldots)\, P(x_1; a)\, P(x_2; a) \cdots\, dx_1\, dx_2 \cdots = a$
• Efficient: $V(\hat a) = \langle (\hat a - \langle \hat a \rangle)^2 \rangle$ is minimum, saturating the Minimum Variance Bound
$$V(\hat a) \ge \frac{1}{\langle -\,d^2 \ln L / da^2 \rangle}$$
One often has to work with less-than-perfect estimators.
The Likelihood Function
Set of data {x₁, x₂, x₃, … x_N}
Each x may be multidimensional – never mind
Probability depends on some parameter a
a may be multidimensional – never mind
Total probability (density):
$$P(x_1; a)\, P(x_2; a)\, P(x_3; a) \cdots P(x_N; a) = L(x_1, x_2, x_3, \ldots x_N; a)$$
the Likelihood
Maximum Likelihood Estimation
Given data {x₁, x₂, x₃, … x_N}, estimate a by maximising the likelihood L(x₁, x₂, x₃, … x_N; a):
$$\left.\frac{dL}{da}\right|_{a=\hat a} = 0$$
In practice usually maximise ln L as it's easier to calculate and handle; just add the ln P(xᵢ).
ML has lots of nice properties.
[Sketch: ln L versus a, peaking at â]
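To illustrate maximising ln L numerically, here is a sketch (NumPy/SciPy assumed; the exponential-decay example is ours, not the slides'). For decay times with P(t; τ) = e^(−t/τ)/τ, minimising −ln L recovers the analytic answer, the sample mean:

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(7)
t = rng.exponential(scale=2.0, size=1000)   # decay times, true tau = 2.0

def neg_log_likelihood(tau):
    # -ln L = sum of -ln P(t_i; tau) for P(t; tau) = exp(-t/tau)/tau
    return np.sum(t / tau + np.log(tau))

res = minimize_scalar(neg_log_likelihood, bounds=(0.1, 10.0), method="bounded")
print(res.x, t.mean())   # the numerical tau-hat agrees with the sample mean
```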
Properties of ML estimation
• It's consistent (no big deal)
• It's biased for small N – may need to worry
• It is efficient for large N – saturates the Minimum Variance Bound
• It is invariant – if you switch to using u(a), then û = u(â)
[Sketches: ln L versus a peaking at â, and ln L versus u peaking at û]
More about MLMore about ML
• It is not ‘right’. Just sensible.
• It does not give the ‘most likely value of a’. It’s the value of a for which this data is most likely.
• Numerical Methods are often needed
• Maximisation / Minimisation in >1 variable is not easy
• Use MINUIT but remember the minus sign
ML does not give goodness-of-fit
• ML will not complain if your assumed P(x;a) is rubbish
• The value of L tells you nothing
Example: fit P(x) = a₁x + a₀ to data that is actually flat. The fit gives a₁ = 0, constant P, and L = a₀^N, just like you get from fitting the true distribution: the value of L itself carries no goodness-of-fit information.
Least Squares
• Measurements of y at various x with errors σ and a prediction f(x; a)
• Probability: $P(y) \propto e^{-(y - f(x;a))^2/2\sigma^2}$
• ln L = $-\frac{1}{2}\sum_i \left(\frac{y_i - f(x_i; a)}{\sigma_i}\right)^2$ + constant
• To maximise ln L, minimise
$$\chi^2 = \sum_i \left(\frac{y_i - f(x_i; a)}{\sigma_i}\right)^2$$
So ML ‘proves’ Least Squares. But what ‘proves’ ML? Nothing.
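A minimal worked version of this recipe (NumPy/SciPy assumed; data and model invented for illustration): fit a straight line f(x; a) = a₀ + a₁x by minimising the χ² above and inspect χ² per degree of freedom:

```python
import numpy as np
from scipy.optimize import minimize

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])   # roughly y = 2x
sigma = np.full_like(y, 0.2)               # known measurement errors

def chi2(a):
    a0, a1 = a
    return np.sum(((y - (a0 + a1 * x)) / sigma) ** 2)

res = minimize(chi2, x0=[0.0, 1.0])
ndf = len(x) - 2                           # data points minus parameters
print(res.x, res.fun / ndf)                # fitted a0, a1 and chi^2/ndf
```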
Least Squares: The Really nice thing
• Should get χ² ≈ 1 per data point.
• Minimising χ² makes it smaller – the effect is 1 unit of χ² for each variable adjusted. (The dimensionality of the multi-D Gaussian is decreased by 1.)
• N_degrees of freedom = N_data points – N_parameters
• Provides a ‘goodness of agreement’ figure which allows a credibility check.
Chi Squared Results
Large χ² comes from:
1. Bad measurements
2. Bad theory
3. Underestimated errors
4. Bad luck
Small χ² comes from:
1. Overestimated errors
2. Good luck
Fitting Histograms
Often put {xᵢ} into bins. The data are then {nⱼ}, with each nⱼ given by a Poisson of mean fⱼ = f(xⱼ) = P(xⱼ)Δx.
4 techniques:
• Full ML
• Binned ML
• Proper χ²
• Simple χ²
What you maximise/minimise
• Full ML: $\ln L = \sum_i \ln P(x_i; a)$
• Binned ML: $\ln L = \sum_j \ln \mathrm{Poisson}(n_j; f_j) = \sum_j \left(n_j \ln f_j - f_j\right) + \mathrm{const}$
• Proper χ²: $\chi^2 = \sum_j (n_j - f_j)^2 / f_j$
• Simple χ²: $\chi^2 = \sum_j (n_j - f_j)^2 / n_j$
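To make the options concrete, this sketch (NumPy/SciPy assumed; the exponential model and binning are our choices) fits the same histogram with the binned-ML and simple-χ² objectives; with well-populated bins the two estimates agree closely, while the simple χ² has to discard empty bins:

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(3)
data = rng.exponential(scale=1.5, size=2000)
edges = np.linspace(0, 8, 33)
n, _ = np.histogram(data, bins=edges)
centers = 0.5 * (edges[:-1] + edges[1:])
width = edges[1] - edges[0]

def expected(tau):
    # f_j = N * P(x_j) * bin width, for P(x; tau) = exp(-x/tau)/tau
    return len(data) * np.exp(-centers / tau) / tau * width

def binned_ml(tau):      # minimise -(sum of n_j ln f_j - f_j)
    f = expected(tau)
    return -np.sum(n * np.log(f) - f)

def simple_chi2(tau):    # must skip bins with n_j = 0
    f = expected(tau)
    mask = n > 0
    return np.sum((n[mask] - f[mask]) ** 2 / n[mask])

for obj in (binned_ml, simple_chi2):
    res = minimize_scalar(obj, bounds=(0.5, 5.0), method="bounded")
    print(obj.__name__, res.x)   # both near the true tau = 1.5
```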
Confidence Level: Meaning of Error Estimates
• How often do we expect to include “the true fixed value of our parameter” P₀ within our quoted range p ± σₚ, for a repeated series of experiments?
• For the actual value P₀, the probability that a measurement will give us an answer in a specific range of p is given by the area under the relevant part of the Gaussian curve. A conventional choice of this probability is 68%.
The Straightforward Example
Apples of different weights; we need to describe the distribution: μ = 68 g, σ = 17 g.
[Plot: weight distribution with the 50–100 g region marked]
• All weights lie between 24 and 167 g (tolerance)
• 90% lie between 50 and 100 g
• 94% are less than 100 g
• 96% are more than 50 g
These are confidence level statements.
Confidence Levels
• Can quote at any level (68%, 95%, 99%, …)
• Upper or lower or two-sided (x < U; x > L; L < x < U)
• Two-sided has a further choice (central, shortest, …)
[Sketch: distributions with an upper limit U and a two-sided interval L to U′]
Maximum Likelihood and Confidence Levels
The ML estimator (large N) has variance given by the MVB:
$$V(\hat a) = \left(-\left.\frac{d^2 \ln L}{da^2}\right|_{\hat a}\right)^{-1}$$
At the peak:
$$\ln L(a) = \ln L_{max} + \frac{1}{2}\left.\frac{d^2 \ln L}{da^2}\right|_{\hat a}(a - \hat a)^2$$
For large N, ln L is a parabola (L is a Gaussian):
$$\ln L(a) = \ln L_{max} - \frac{(a - \hat a)^2}{2\sigma_{\hat a}^2}$$
[Sketch: ln L versus a]
ln L falls by 1/2 at $a = \hat a \pm \sigma_{\hat a}$ and by 2 at $a = \hat a \pm 2\sigma_{\hat a}$.
Read off the 68%, 95% confidence regions.
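The ‘falls by 1/2’ rule is straightforward to apply. Continuing the exponential-lifetime sketch from the ML section (again our example, not the slides'): scan ln L near the maximum and read off where it has dropped by 1/2 to get the 68% interval on τ:

```python
import numpy as np

rng = np.random.default_rng(7)
t = rng.exponential(scale=2.0, size=1000)

def log_l(tau):
    return -np.sum(t / tau + np.log(tau))   # ln L up to a constant

taus = np.linspace(1.5, 2.5, 2001)
lnl = np.array([log_l(tau) for tau in taus])
tau_hat = taus[np.argmax(lnl)]
inside = taus[lnl >= lnl.max() - 0.5]       # where ln L is within 1/2 of max
print(tau_hat, inside.min(), inside.max())  # estimate and 68% CL interval
```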
Monte Carlo Calculations
• The Monte Carlo approach provides a method of solving probability theory problems in situations where the necessary integrals are too difficult to perform.
• Crucial element: random number generator.
An Example
$$I = \int_a^b y(x)\, dx \approx \frac{b-a}{n}\sum_{i=1}^{n} y(x_i)$$
1. $x_i = a + (i - 0.5)\,(b-a)/n$
2. $x_i = a + (b-a)\,r_i$, where the $r_i$ are members of a series of random numbers uniformly distributed in the range 0–1.
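A sketch of both recipes (NumPy assumed; the integrand is our choice), estimating ∫₀^π sin x dx = 2 with the regular grid of step 1 and the random points of step 2; the Monte Carlo error falls like 1/√n, which is why the method wins for the difficult multi-dimensional integrals mentioned above:

```python
import numpy as np

a, b, n = 0.0, np.pi, 10_000
y = np.sin                                  # test integrand, exact I = 2

# 1. regular grid: x_i = a + (i - 0.5)(b - a)/n
i = np.arange(1, n + 1)
x_grid = a + (i - 0.5) * (b - a) / n
print((b - a) / n * np.sum(y(x_grid)))

# 2. Monte Carlo: x_i = a + (b - a) r_i, r_i uniform in (0, 1)
rng = np.random.default_rng(0)
x_mc = a + (b - a) * rng.uniform(0, 1, n)
print((b - a) / n * np.sum(y(x_mc)))        # fluctuates around 2, ~1/sqrt(n)
```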
References
• Lectures and Notes on Statistics in HEP, http://www.ep.ph.bham.ac.uk//group/locdoc/lectures/stats/index.html
• Lecture notes of Prof. Roger Barlow, http://www.hep.man.ac.uk/u/roger/
• Louis Lyons, “Statistics for Nuclear and Particle Physicists”, Cambridge 1986.
• Particle Data Group, http://pdg.lbl.gov/2004/reviews/contents_sports.html#mathtoolsetc