Probability, Statistics and Errors in High Energy Physics

Wen-Chen Chang (章文箴)
Institute of Physics, Academia Sinica (中央研究院 物理研究所)

Page 2: Outline

• Errors

• Probability distribution: Binomial, Poisson, Gaussian

• Confidence Level

• Monte Carlo Method

Page 3: Why do we do experiments?

1. Parameter determination: determine the numerical value of some physical quantity.

2. Hypothesis testing: test whether a particular theory is consistent with our data.

Page 4: Why estimate errors?

• We are concerned not only with the answer but also with its accuracy.

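• For example, take the speed of light, $2.998 \times 10^8$ m/s, and three possible measurements of it:
– $(3.09 \pm 0.15) \times 10^8$ m/s: consistent with the accepted value.
– $(3.09 \pm 0.01) \times 10^8$ m/s: inconsistent with it, so either a discovery or a mistake.
– $(3.09 \pm 2) \times 10^8$ m/s: consistent, but too imprecise to be useful.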
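The three results share the same central value and differ only in the quoted error. A minimal sketch of how that changes the conclusion, expressing the difference from the accepted value in units of the quoted error (the function name `pull` is an illustrative choice, not from the slides):

```python
# Accepted value of the speed of light, in units of 1e8 m/s.
ACCEPTED = 2.998

def pull(value, error, reference=ACCEPTED):
    """Difference between measurement and reference, in units of the error."""
    return (value - reference) / error

# Same central value, three different quoted errors (units of 1e8 m/s).
measurements = [(3.09, 0.15), (3.09, 0.01), (3.09, 2.0)]
pulls = [pull(v, e) for v, e in measurements]

for (v, e), p in zip(measurements, pulls):
    print(f"({v} +/- {e}) x 1e8 m/s is {p:.2f} sigma from the accepted value")
```

A difference of well under one standard deviation is unremarkable, while one of about nine standard deviations cannot be attributed to chance.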

Page 5: Sources of Errors

• Random (statistical) error: the inability of any measuring device to give infinitely accurate answers.

• Systematic error: the uncertainty in the estimate of a systematic effect, which repeating the measurement does not reduce.

Page 6: Systematic Errors

"Systematic effects is a general category which includes effects such as background, scanning efficiency, energy resolution, angle resolution, variation of counter efficiency with beam position and energy, dead time, etc. The uncertainty in the estimation of such a systematic effect is called a systematic error." (Orear)

"Systematic error: reproducible inaccuracy introduced by faulty equipment, calibration, or technique." (Bevington)

Error = mistake?

Error = uncertainty?

Page 7: Experimental Examples

• Energy in a calorimeter: $E = aD + b$, with $a$ and $b$ determined by a calibration experiment.

• Branching ratio: $B = N/(\eta N_T)$, with the efficiency $\eta$ found from Monte Carlo studies.

• Steel rule calibrated at 15 °C but used in a warm lab:
– If not spotted, this is a mistake.
– If the temperature is measured, it is not a problem.
– If the temperature is not measured, guess the uncertainty.
– Repeating measurements doesn't help.

Page 8: The Binomial

A random process with exactly two possible outcomes which occur with fixed probabilities.

n trials, r successes, individual success probability p (failure probability $q = 1 - p$).

$$P(r; n, p) = \frac{n!}{r!\,(n-r)!}\, p^r (1-p)^{n-r}$$

Mean: $\mu = \langle r\rangle = \sum_r r\,P(r) = np$

Variance: $V = \langle (r-\mu)^2\rangle = \langle r^2\rangle - \langle r\rangle^2 = np(1-p)$

[Figure: binomial probability $P(r)$ versus $r$, for $r$ = 0 to 5.]
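The mean and variance quoted for the binomial can be checked by summing over the distribution directly. The sketch below does so for one illustrative choice, n = 10 and p = 0.2 (the values are an assumption made for the example):

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(r; n, p) = n! / (r! (n-r)!) * p^r * (1-p)^(n-r)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

n, p = 10, 0.2  # illustrative values
probs = [binomial_pmf(r, n, p) for r in range(n + 1)]

total = sum(probs)                                   # should be 1
mean = sum(r * P for r, P in enumerate(probs))       # should be n*p
variance = sum(r * r * P for r, P in enumerate(probs)) - mean**2  # n*p*(1-p)

print(f"sum = {total:.6f}")
print(f"mean = {mean:.4f}, np = {n * p:.4f}")
print(f"variance = {variance:.4f}, np(1-p) = {n * p * (1 - p):.4f}")
```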

Page 9: Binomial Examples

[Figure: binomial distributions $P(r)$ versus $r$. Panels: $p = 0.1$ with $n = 5$, $20$, $50$; $n = 10$ with $p = 0.2$, $0.5$, $0.8$.]

Page 10: Poisson

'Events in a continuum': the probability of observing r independent events in a time interval t, when the counting rate is $\mu$ and the expected number of events in the time interval is $\lambda = \mu t$.

$$P(r; \lambda) = \frac{e^{-\lambda}\lambda^r}{r!} = \frac{e^{-\mu t}(\mu t)^r}{r!}$$

It is the limit of the binomial for $N \to \infty$, $p \to 0$, with $Np = \mu t$ held constant.

Mean: $\langle r\rangle = \sum_r r\,P(r) = \lambda$

Variance: $V = \langle (r-\lambda)^2\rangle = \langle r^2\rangle - \langle r\rangle^2 = \lambda$

[Figure: Poisson distribution $P(r)$ versus $r$ for $\lambda = 2.5$.]

Page 11: More about Poisson

• The binomial distribution approaches the Poisson distribution as N increases.

• The mean value of r for a variable with a Poisson distribution is $\lambda$, and so is the variance. This is the basis of the well-known $n \pm \sqrt{n}$ formula that applies to statistical errors in many situations involving the counting of independent events during a fixed interval.

• As $\lambda \to \infty$, the Poisson distribution tends to a Gaussian one.
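A short numerical sketch of the first two points: the binomial with p = λ/N is compared with the Poisson of mean λ for increasing N, and the Poisson mean and variance are summed directly. The value λ = 2.5 and the names `lam` and `max_diff` are illustrative assumptions.

```python
from math import comb, exp, factorial

def binomial_pmf(r, n, p):
    """Binomial probability of r successes in n trials."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

def poisson_pmf(r, lam):
    """Poisson probability exp(-lam) * lam^r / r!."""
    return exp(-lam) * lam**r / factorial(r)

lam = 2.5  # expected number of events (illustrative)

def max_diff(n, r_max=20):
    """Largest |binomial - Poisson| for r = 0..r_max, with p = lam / n."""
    return max(abs(binomial_pmf(r, n, lam / n) - poisson_pmf(r, lam))
               for r in range(r_max + 1))

diffs = {n: max_diff(n) for n in (25, 100, 1000, 10000)}
for n, d in diffs.items():
    print(f"N = {n:5d}: max difference = {d:.2e}")

# Mean and variance of the Poisson distribution, summed far into the tail.
rs = range(60)
mean = sum(r * poisson_pmf(r, lam) for r in rs)
variance = sum(r * r * poisson_pmf(r, lam) for r in rs) - mean**2
print(f"mean = {mean:.6f}, variance = {variance:.6f}")
```

The difference shrinks roughly in proportion to 1/N, and both the mean and the variance come out equal to λ.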

Page 12: Poisson Examples

[Figure: Poisson distributions $P(r)$ versus $r$ for $\lambda = 0.5$, $1.0$, $2.0$, $5.0$, $10$ and $25$.]

Page 13: Examples

• The number of particles detected by a counter in a time t, in a situation where the particle flux and detector are independent of time, and where the counter dead-time $\tau$ is such that $\mu\tau \ll 1$.

• The number of interactions produced in a thin target when an intense pulse of N beam particles is incident on it.

• The number of entries in a given bin of a histogram when the data are accumulated over a fixed time interval.

Page 14: Binomial and Poisson

From an exam paper: A student is standing by the road, hoping to hitch a lift. Cars pass according to a Poisson distribution with a mean frequency of 1 per minute. The probability of an individual car giving a lift is 1%. Calculate the probability that the student is still waiting for a lift:

(a) After 60 cars have passed

(b) After 1 hour

Answers: (a) $0.99^{60} = 0.5472$; (b) $e^{-0.6} \times 0.6^{0}/0! = 0.5488$.
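The two answers can be reproduced in a few lines. Part (a) is a binomial with zero successes in exactly 60 trials. Part (b) is a Poisson with zero events, where the cars that would stop arrive with mean 60 × 0.01 = 0.6 per hour. The variable names are illustrative.

```python
from math import exp

p_lift = 0.01        # probability that an individual car gives a lift
rate_per_min = 1.0   # mean number of cars per minute

# (a) Exactly 60 cars have passed and none stopped: binomial with r = 0.
p_a = (1 - p_lift) ** 60

# (b) One hour has passed: the number of cars is itself Poisson, so the
# number of cars that stop is Poisson with mean 60 * 1 * 0.01 = 0.6.
mean_lifts = rate_per_min * 60 * p_lift
p_b = exp(-mean_lifts)

print(f"(a) after 60 cars: {p_a:.4f}")
print(f"(b) after 1 hour:  {p_b:.4f}")
```

The results differ slightly because in (b) the number of cars passing in the hour fluctuates around 60.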

Page 15: Gaussian (Normal)

Probability density:

$$P(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/2\sigma^2}$$

Mean: $\langle x\rangle = \int x\,P(x)\,dx = \mu$

Variance: $V = \langle (x-\mu)^2\rangle = \langle x^2\rangle - \langle x\rangle^2 = \sigma^2$

Page 16: Different Gaussians

There's only one!

Normalisation (if required): the factor $1/(\sigma\sqrt{2\pi})$

Location change: $\mu$

Width scaling factor: $\sigma$

Falls to $1/e$ of peak at $x = \mu \pm \sqrt{2}\,\sigma$

Page 17: Probability Contents

• 68.27% within $1\sigma$, 95.45% within $2\sigma$, 99.73% within $3\sigma$

• 90% within $1.645\sigma$, 95% within $1.960\sigma$, 99% within $2.576\sigma$, 99.9% within $3.290\sigma$

These numbers apply to Gaussians and only Gaussians.

Other distributions have equivalent values which you could use if you wanted.
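These probability contents follow from the error function: for a Gaussian, the probability of lying within k standard deviations of the mean is erf(k/√2). A minimal sketch using only the standard library (the helper name `prob_within` is an illustrative choice):

```python
from math import erf, sqrt

def prob_within(k):
    """Probability that a Gaussian variable lies within k sigma of its mean."""
    return erf(k / sqrt(2))

for k in (1, 2, 3, 1.645, 1.960, 2.576, 3.290):
    print(f"within {k} sigma: {100 * prob_within(k):.2f}%")
```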

Page 18: Central Limit Theorem

Or: why is the Gaussian Normal?

If a variable x is the sum of independent variables $x_1, x_2, \dots, x_N$ (so that its distribution is the convolution of theirs), then:

I) $\langle x\rangle = \mu_1 + \mu_2 + \dots + \mu_N$

II) $V(x) = V_1 + V_2 + \dots + V_N$

III) $P(x)$ becomes Gaussian for large N
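A toy demonstration of the three statements, assuming each $x_i$ is uniform on [0, 1] (mean 1/2, variance 1/12). The choice of N = 12, the sample size and the seed are arbitrary illustrative values.

```python
import random
from statistics import fmean, pvariance

rng = random.Random(12345)  # fixed seed so the run is reproducible
N = 12                      # number of uniform variables added together
n_samples = 20000

sums = [sum(rng.random() for _ in range(N)) for _ in range(n_samples)]

mean = fmean(sums)
var = pvariance(sums, mean)
expected_mean = N * 0.5     # I)  the means add
expected_var = N / 12.0     # II) the variances add
sigma = expected_var ** 0.5

# III) for a Gaussian, about 68.3% of values lie within one sigma of the mean.
frac_1sigma = sum(abs(s - expected_mean) < sigma for s in sums) / n_samples

print(f"mean = {mean:.3f} (expected {expected_mean})")
print(f"variance = {var:.3f} (expected {expected_var})")
print(f"fraction within 1 sigma = {frac_1sigma:.3f}")
```

Although each uniform variable is far from Gaussian, the one-sigma fraction of the sum is already close to the Gaussian value.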

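Page 19: Multidimensional Gaussian

Uncorrelated:

$$P(x, y; \mu_x, \mu_y, \sigma_x, \sigma_y) = \frac{1}{2\pi\sigma_x\sigma_y}\, e^{-(x-\mu_x)^2/2\sigma_x^2}\, e^{-(y-\mu_y)^2/2\sigma_y^2}$$

With correlation $\rho$:

$$P(x, y; \mu_x, \mu_y, \sigma_x, \sigma_y, \rho) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}}\, \exp\left\{-\frac{1}{2(1-\rho^2)}\left[\frac{(x-\mu_x)^2}{\sigma_x^2} + \frac{(y-\mu_y)^2}{\sigma_y^2} - \frac{2\rho\,(x-\mu_x)(y-\mu_y)}{\sigma_x\sigma_y}\right]\right\}$$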

Page 20: Chi squared

Sum of squared discrepancies, scaled by expected error:

$$\chi^2 = \sum_{i=1}^{n} \left(\frac{x_i - \mu_i}{\sigma_i}\right)^2$$

Integrate all but 1-D of multi-D Gaussian:

$$P(\chi^2; n) = \frac{2^{-n/2}}{\Gamma(n/2)}\, \chi^{\,n-2}\, e^{-\chi^2/2}$$
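A small sketch of both formulas: computing $\chi^2$ for a set of measurements, and checking numerically that the density integrates to 1 and has mean n. The data values and the choice n = 5 are made up for illustration.

```python
from math import exp, gamma

def chi2_value(xs, mus, sigmas):
    """chi^2 = sum over i of ((x_i - mu_i) / sigma_i)^2."""
    return sum(((x - m) / s) ** 2 for x, m, s in zip(xs, mus, sigmas))

def chi2_pdf(chi2, n):
    """P(chi^2; n) = 2^(-n/2) / Gamma(n/2) * chi^(n-2) * exp(-chi^2 / 2)."""
    return 2 ** (-n / 2) / gamma(n / 2) * chi2 ** (n / 2 - 1) * exp(-chi2 / 2)

# Made-up measurements, predictions and errors.
value = chi2_value([1.2, 1.9, 3.4], [1.0, 2.0, 3.0], [0.2, 0.2, 0.2])
print(f"chi2 = {value:.2f}")

# Midpoint-rule check of the density for n = 5, integrating chi^2 from 0 to 80.
n = 5
n_steps = 8000
step = 80.0 / n_steps
grid = [(k + 0.5) * step for k in range(n_steps)]
norm = sum(chi2_pdf(c, n) for c in grid) * step
mean = sum(c * chi2_pdf(c, n) for c in grid) * step
print(f"integral = {norm:.5f}, mean = {mean:.4f}")
```

The mean of the distribution equals the number of degrees of freedom, which is why a $\chi^2$ of about 1 per degree of freedom is expected for a good fit.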

Page 22: About Estimation

Theory → Data (probability calculus): Given these distribution parameters, what can we say about the data?

Data → Theory (statistical inference): Given this data, what can we say about the properties or parameters or correctness of the distribution functions?

Page 23: What is an estimator?

An estimator (written with a hat) is a function of the data whose value, the estimate, is intended as a meaningful guess for the value of the parameter $\theta$ (from PDG).

Examples:

$$\hat\mu\{x\} = \frac{1}{N}\sum_i x_i$$

$$\hat\mu\{x\} = \frac{x_{\max} + x_{\min}}{2}$$

$$\hat V\{x\} = \frac{1}{N}\sum_i (x_i - \hat\mu)^2$$

$$\hat V\{x\} = \frac{1}{N-1}\sum_i (x_i - \hat\mu)^2$$
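The four estimators can be written out directly. The sketch below applies them to a small made-up data set. The function names are illustrative, and the `unbiased` flag switches between the 1/N and 1/(N−1) variance estimators.

```python
def mean_estimate(xs):
    """Sample mean: (1/N) * sum of x_i."""
    return sum(xs) / len(xs)

def midrange_estimate(xs):
    """Alternative estimator of the mean: (x_max + x_min) / 2."""
    return (max(xs) + min(xs)) / 2

def variance_estimate(xs, unbiased=True):
    """Sample variance about the sample mean, with 1/(N-1) or 1/N."""
    m = mean_estimate(xs)
    denom = len(xs) - 1 if unbiased else len(xs)
    return sum((x - m) ** 2 for x in xs) / denom

xs = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]  # made-up data
print("mean      :", mean_estimate(xs))
print("midrange  :", midrange_estimate(xs))
print("V (1/N)   :", variance_estimate(xs, unbiased=False))
print("V (1/(N-1)):", variance_estimate(xs, unbiased=True))
```

All of these are legitimate estimators. They differ in how well they do the job.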

Page 24: What is a good estimator?

A perfect estimator is:

• Consistent: $\lim_{N\to\infty} \hat a = a$

• Unbiased: $\langle \hat a\rangle = \int\!\!\int\!\cdots\, \hat a(x_1, x_2, \dots)\, P(x_1; a)\,P(x_2; a)\,P(x_3; a)\cdots\, dx_1\,dx_2\cdots = a$

• Efficient: $V(\hat a) = \langle (\hat a - \langle\hat a\rangle)^2\rangle$ is a minimum

Minimum Variance Bound:

$$V(\hat a) \ge \frac{-1}{\left\langle \dfrac{d^2 \ln L}{da^2}\right\rangle}$$

One often has to work with less-than-perfect estimators.

Page 25: The Likelihood Function

Set of data {x1, x2, x3, …xN}

Each x may be multidimensional – never mind

Probability depends on some parameter a

a may be multidimensional – never mind

Total probability (density):

P(x1;a) P(x2;a) P(x3;a) … P(xN;a) = L(x1, x2, x3, … xN; a)

This is the likelihood.

Page 26: Maximum Likelihood Estimation

Given data {x1, x2, x3, …xN}, estimate a by maximising the likelihood L(x1, x2, x3, …xN; a):

$$\left.\frac{dL}{da}\right|_{a=\hat a} = 0$$

In practice one usually maximises ln L, as it's easier to calculate and handle; just add the ln P(xi; a).

ML has lots of nice properties.

[Figure: ln L versus a, with its maximum at $\hat a$.]
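As a hedged illustration of the recipe, the sketch below fits the lifetime τ of an exponential decay, P(t; τ) = exp(−t/τ)/τ, to toy data by adding up ln P(t_i; τ) and scanning τ on a grid. The model, the true lifetime, the seed and the grid are assumptions made for the example. For this model the maximum is known analytically to be the sample mean, which gives a check.

```python
import math
import random

def log_likelihood(tau, times):
    """ln L = sum_i ln P(t_i; tau), with P(t; tau) = exp(-t / tau) / tau."""
    return sum(-math.log(tau) - t / tau for t in times)

rng = random.Random(2024)  # fixed seed so the run is reproducible
true_tau = 2.0             # hypothetical lifetime used to generate toy data
times = [rng.expovariate(1.0 / true_tau) for _ in range(200)]

# Crude maximisation: scan tau from 0.5 to 4.5 in steps of 0.002.
grid = [0.5 + 0.002 * k for k in range(2001)]
tau_hat = max(grid, key=lambda tau: log_likelihood(tau, times))

tau_analytic = sum(times) / len(times)  # known ML solution for this model
print(f"tau_hat (scan) = {tau_hat:.3f}, sample mean = {tau_analytic:.3f}")
```

A grid scan is only practical for one parameter. With several, a numerical minimiser applied to −ln L is the usual route.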

Page 27: Properties of ML estimation

• It's consistent (no big deal)

• It's biased for small N: may need to worry

• It is efficient for large N: saturates the Minimum Variance Bound

• It is invariant: if you switch to using u(a), then $\hat u = u(\hat a)$

[Figure: ln L versus a with maximum at $\hat a$, and ln L versus u with maximum at $\hat u$.]

Page 28: More about ML

• It is not ‘right’. Just sensible.

• It does not give the ‘most likely value of a’. It’s the value of a for which this data is most likely.

• Numerical Methods are often needed

• Maximisation / Minimisation in >1 variable is not easy

• Use MINUIT, but remember the minus sign: it is a minimiser, so give it −ln L

Page 29: ML does not give goodness-of-fit

• ML will not complain if your assumed P(x;a) is rubbish

• The value of L tells you nothing

Fitting $P(x) = a_1 x + a_0$ will give $a_1 = 0$, i.e. constant P, with $L = a_0^N$.

Just like you get from fitting [Figures: the data set fitted, and a second data set that gives the same result.]

Page 30: Least Squares

• Measurements of y at various x, with errors $\sigma$ and prediction $f(x; a)$

• Probability: $P \propto e^{-(y - f(x;a))^2/2\sigma^2}$

• Ln L: $\ln L = -\frac{1}{2}\sum_i \left(\frac{y_i - f(x_i; a)}{\sigma_i}\right)^2 + \text{const} = -\frac{1}{2}\chi^2 + \text{const}$

• To maximise ln L, minimise $\chi^2$

[Figure: measured points y versus x.]

So ML 'proves' Least Squares. But what 'proves' ML? Nothing.
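For a straight line, f(x; a) = a0 + a1·x, the minimum of $\chi^2$ can be found in closed form by solving the two normal equations. The sketch below is one such implementation on made-up data. The function name `fit_line` and the data points are assumptions for illustration.

```python
def fit_line(xs, ys, sigmas):
    """Weighted least-squares fit of y = a0 + a1 * x; returns (a0, a1, chi2)."""
    w = [1.0 / s**2 for s in sigmas]
    S = sum(w)
    Sx = sum(wi * x for wi, x in zip(w, xs))
    Sy = sum(wi * y for wi, y in zip(w, ys))
    Sxx = sum(wi * x * x for wi, x in zip(w, xs))
    Sxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
    det = S * Sxx - Sx * Sx
    a1 = (S * Sxy - Sx * Sy) / det
    a0 = (Sxx * Sy - Sx * Sxy) / det
    chi2 = sum(((y - a0 - a1 * x) / s) ** 2 for x, y, s in zip(xs, ys, sigmas))
    return a0, a1, chi2

# Made-up measurements with equal errors.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
sigmas = [0.2] * 5

a0, a1, chi2 = fit_line(xs, ys, sigmas)
ndf = len(xs) - 2  # data points minus fitted parameters
print(f"a0 = {a0:.3f}, a1 = {a1:.3f}, chi2 = {chi2:.3f} for {ndf} degrees of freedom")
```

Here $\chi^2$ per degree of freedom is close to 1, as expected when the model and the quoted errors are reasonable.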

Page 31: Least Squares: The Really Nice Thing

• Should get $\chi^2 \approx 1$ per data point

• Minimising $\chi^2$ makes it smaller: the effect is 1 unit of $\chi^2$ for each variable adjusted. (Dimensionality of the multi-D Gaussian decreased by 1.)

$N_{\text{degrees of freedom}} = N_{\text{data pts}} - N_{\text{parameters}}$

• Provides a 'goodness of agreement' figure which allows for a credibility check

Page 32: Chi Squared Results

Large $\chi^2$ comes from

1. Bad Measurements

2. Bad Theory

3. Underestimated errors

4. Bad luck

Small $\chi^2$ comes from

1. Overestimated errors

2. Good luck

Page 33: Fitting Histograms

Often put {xi} into bins.

Data is then {nj}.

nj given by Poisson, with mean $f(x_j) = P(x_j)\,\Delta x$.

4 techniques:

• Full ML
• Binned ML
• Proper $\chi^2$
• Simple $\chi^2$

[Figure: two panels, each with horizontal axis x.]

Page 34: What you maximise/minimise

• Full ML: $\ln L = \sum_i \ln P(x_i; a)$

• Binned ML: $\ln L = \sum_j \ln \text{Poisson}(n_j; f_j) = \sum_j \left(n_j \ln f_j - f_j\right) + \text{const}$

• Proper $\chi^2$: $\chi^2 = \sum_j \dfrac{(n_j - f_j)^2}{f_j}$

• Simple $\chi^2$: $\chi^2 = \sum_j \dfrac{(n_j - f_j)^2}{n_j}$
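The three binned quantities can be sketched as short functions of the observed counts n_j and predicted contents f_j. The example counts are made up. Skipping empty bins in the simple $\chi^2$ is a choice made here for illustration, since that expression is undefined when n_j = 0.

```python
from math import log

def binned_lnL(n, f):
    """Binned ML: sum_j (n_j ln f_j - f_j), dropping the constant -ln(n_j!)."""
    return sum(nj * log(fj) - fj for nj, fj in zip(n, f))

def proper_chi2(n, f):
    """Proper chi^2: sum_j (n_j - f_j)^2 / f_j (error taken from prediction)."""
    return sum((nj - fj) ** 2 / fj for nj, fj in zip(n, f))

def simple_chi2(n, f):
    """Simple chi^2: sum_j (n_j - f_j)^2 / n_j; empty bins are skipped here."""
    return sum((nj - fj) ** 2 / nj for nj, fj in zip(n, f) if nj > 0)

observed = [8, 12, 10]           # made-up bin contents
predicted = [10.0, 10.0, 10.0]   # made-up prediction

print(f"binned ln L = {binned_lnL(observed, predicted):.4f}")
print(f"proper chi2 = {proper_chi2(observed, predicted):.4f}")
print(f"simple chi2 = {simple_chi2(observed, predicted):.4f}")
```

The two $\chi^2$ forms differ only in whether the predicted or the observed content is used as the variance of each bin. The difference matters most when bin contents are small.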

Page 35: Confidence Level: Meaning of Error Estimates

• How often do we expect to include "the true fixed value of our parameter", $p_0$, within our quoted range, $p \pm \sigma_p$, for a repeated series of experiments?

• For the actual value $p_0$, the probability that a measurement will give us an answer in a specific range of p is given by the area under the relevant part of the Gaussian curve. A conventional choice of this probability is 68%.

Page 36: The Straightforward Example

Apples of different weights. Need to describe the distribution:

$\mu = 68$ g, $\sigma = 17$ g

Confidence level statements:

• All weights between 24 and 167 g (tolerance)
• 90% lie between 50 and 100 g
• 94% are less than 100 g
• 96% are more than 50 g

[Figure: distribution of apple weights, with 50 g and 100 g marked.]

Page 37: Confidence Levels

• Can quote at any level (68%, 95%, 99%…)

• Upper or lower or two-sided ($x < U$, $x > L$, $L < x < U$)

• Two-sided has further choice (central, shortest…)

[Figure: a distribution with the limits L, U and U' marked.]

Page 38: Maximum Likelihood and Confidence Levels

ML estimator (large N) has variance given by the MVB:

$$V(\hat a) = \sigma_{\hat a}^2 = \frac{-1}{\left\langle \dfrac{d^2 \ln L}{da^2} \right\rangle}$$

At the peak, ln L is a parabola (L is a Gaussian):

$$\ln L = \ln L_{\max} + \frac{(a - \hat a)^2}{2}\left.\frac{d^2 \ln L}{da^2}\right|_{a=\hat a}$$

For large N:

$$\left.\frac{d^2 \ln L}{da^2}\right|_{a=\hat a} = \left\langle \frac{d^2 \ln L}{da^2} \right\rangle$$

so that

$$\ln L = \ln L_{\max} - \frac{(a - \hat a)^2}{2\sigma_{\hat a}^2}$$

Falls by ½ at $a = \hat a \pm \sigma_{\hat a}$; falls by 2 at $a = \hat a \pm 2\sigma_{\hat a}$.

Read off 68%, 95% confidence regions.

[Figure: ln L versus a, a parabola peaking at $\hat a$.]
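A sketch of reading off the 68% region from a likelihood scan, assuming Gaussian data with a known resolution so that ln L is exactly parabolic and the answer can be checked against σ/√N. The data values, the resolution and the grid are made-up illustrative choices.

```python
import math

def log_likelihood(mu, xs, sigma):
    """ln L for Gaussian data with known sigma (constant terms dropped)."""
    return -sum((x - mu) ** 2 for x in xs) / (2 * sigma**2)

xs = [9.2, 10.4, 9.9, 10.8, 9.7, 10.3, 10.1, 9.6]  # made-up data
sigma = 0.5                                        # assumed known resolution

# Scan mu from 9.0 to 11.0 in steps of 0.0005.
step = 0.0005
grid = [9.0 + step * k for k in range(4001)]
lnL = [log_likelihood(mu, xs, sigma) for mu in grid]
lnL_max = max(lnL)
mu_hat = grid[lnL.index(lnL_max)]

# 68% region: all mu for which ln L is within 1/2 of its maximum.
inside = [mu for mu, value in zip(grid, lnL) if value >= lnL_max - 0.5]
lo, hi = inside[0], inside[-1]
half_width = (hi - lo) / 2

print(f"mu_hat = {mu_hat:.3f}, 68% region = [{lo:.3f}, {hi:.3f}]")
print(f"half-width = {half_width:.4f}, sigma/sqrt(N) = {sigma / math.sqrt(len(xs)):.4f}")
```

The same scan with a drop of 2 in ln L gives the two-sigma (95%) region.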

Page 39: Monte Carlo Calculations

• The Monte Carlo approach provides a method of solving probability theory problems in situations where the necessary integrals are too difficult to perform.

• Crucial element: random number generator.

Page 40: An Example

$$I = \int_a^b y(x)\,dx \approx \frac{b-a}{n}\sum_{i=1}^{n} y(x_i)$$

1. $x_i = a + (i - 0.5)(b-a)/n$

2. $x_i = a + r_i\,(b-a)$, where the $r_i$ are members of a series of random numbers uniformly distributed in the range 0 to 1.
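Both choices of $x_i$ can be tried on an integral with a known answer. The sketch below uses $\int_0^\pi \sin x\,dx = 2$. The integrand, sample sizes and seed are illustrative assumptions.

```python
import math
import random

def integrate_midpoint(y, a, b, n):
    """Method 1: evenly spaced points x_i = a + (i - 0.5)(b - a)/n, i = 1..n."""
    h = (b - a) / n
    return h * sum(y(a + (i - 0.5) * h) for i in range(1, n + 1))

def integrate_monte_carlo(y, a, b, n, rng):
    """Method 2: random points x_i = a + r_i (b - a), r_i uniform in [0, 1)."""
    return (b - a) / n * sum(y(a + rng.random() * (b - a)) for _ in range(n))

rng = random.Random(7)  # fixed seed so the run is reproducible
exact = 2.0             # integral of sin(x) from 0 to pi

I_mid = integrate_midpoint(math.sin, 0.0, math.pi, 1000)
I_mc = integrate_monte_carlo(math.sin, 0.0, math.pi, 100000, rng)

print(f"midpoint:    {I_mid:.6f} (error {abs(I_mid - exact):.1e})")
print(f"Monte Carlo: {I_mc:.6f} (error {abs(I_mc - exact):.1e})")
```

In one dimension the evenly spaced points converge much faster. The Monte Carlo error falls only as 1/√n, but that rate does not depend on the number of dimensions, which is why the method is preferred for multi-dimensional integrals.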

Page 41: References

• Lectures and Notes on Statistics in HEP, http://www.ep.ph.bham.ac.uk//group/locdoc/lectures/stats/index.html

• Lecture notes of Prof. Roger Barlow, http://www.hep.man.ac.uk/u/roger/

• Louis Lyons, “Statistics for Nuclear and Particle Physicists”, Cambridge 1986.

• Particle Data Group, http://pdg.lbl.gov/2004/reviews/contents_sports.html#mathtoolsetc