Ch10-4page

  • 8/22/2019 Ch10-4page

    10 Hypothesis Testing

    Hypothesis testing is used to investigate whether or not data are

    consistent with some theory, when that theory can be quantified

    through a particular value of a (population) parameter.

    10.1 Four Basic Elements of a Hypothesis Test

    Null hypothesis H0

    Alternative hypothesis Ha

    Test statistic

    Rejection region, RR (also called critical region)

    1. The Null Hypothesis: denoted by H0.

    The null hypothesis H0 is an assertion about the population

    (parameter). The purpose of the hypothesis testing is to test the

    viability of the null hypothesis in the light of experimental data.

    E.g. An experiment on a new antidepressant drug: Ten people suffering from depression were sampled and treated with the new drug, and the level of depression of all subjects was measured after 12 weeks, denoted by $Y_1, \ldots, Y_n$.

    We want to compare the mean depression level $\mu$ of patients taking the drug with that of patients not taking any drug (known to be $\mu_0 = 6$, say).

    The null hypothesis would be designated by the following symbols: H0: $\mu = 6$.

    In this course, we will usually consider the simple null hypothesis; that is, there is only one possible value of $\mu$ under H0.

    2. The Alternative Hypothesis: denoted by Ha.

    The alternative hypothesis Ha describes values of $\mu$ other than those specified in H0. It is usually the hypothesis that we seek to support based on the information contained in the sample data set.

    In the previous example, if the researcher believes that the mean depression level for patients taking the new drug is

    smaller than $\mu_0 = 6$, i.e. the drug is effective in reducing the depression level, the researcher will use Ha: $\mu < 6$ (left-tailed)

    larger than $\mu_0 = 6$, i.e. the drug is not effective, the researcher will use Ha: $\mu > 6$ (right-tailed)

    different from $\mu_0 = 6$, the researcher will use Ha: $\mu \neq 6$ (two-tailed)

    Remark: Notice that H0 being false doesn't necessarily mean that Ha is true (unless the union of H0 and Ha constitutes the entire parameter space).

    In general, the alternative is chosen to reflect the researcher's belief about the parameter. Therefore, Ha is sometimes called the researcher's hypothesis. The aim is to see if the researcher's hypothesis is supported by the data set.

    3. The Test Statistic: the test statistic is a statistic that is used

    to test H0 versus Ha. We make our decision by comparing the

    observed value of the test statistic to its sampling distribution

    under H0. Recall that a statistic is a function of the observed data $Y_1, \ldots, Y_n$, used for inference about (population) parameters.

    For the antidepressant drug example, a test statistic would be based on the unbiased estimator of $\mu$:

    $T = \bar{Y}$

    Most often a test statistic is based on a MVUE or MLE of the

    parameter of interest that describes H0.

    If the observed value of the test statistic is consistent with its sampling distribution under H0, then there is not enough evidence for Ha.

    If the observed value of the test statistic is not consistent with its sampling distribution under H0, and departs from it in the direction specified by Ha, then there is enough evidence to reject H0 (and support Ha).

    4. The Rejection Region (RR): The rejection region (RR)

    specifies the values of the test statistic for which H0 is rejected.

    If $t_{obs} \in RR$, reject H0.

    If $t_{obs} \notin RR$, do not reject H0.

    The RR is usually located in the tails of the sampling distribution of $T$ derived under H0.

    In the depression example, suppose $RR = \{\bar{Y} \le 4\}$ and that $\bar{y} = 4$ is observed. Then we reject H0 and conclude that there is evidence that the drug, on average, reduces depression levels: the sample mean is significantly lower than the hypothesized mean, $\mu = 6$.

    The above four elements (H0, Ha, test statistic $T$, and RR) are the building blocks of a statistical test. Any statistical test should include all four.

    Another Hypothesis Testing Example

    Assessing ESP (Extra-sensory perception) ability

    (fabricated data)

    Hypotheses:

    H0: Rachael does not have ESP (=random guessing)

    Ha: Rachael has ESP

    Experiment:

    A deck of 52 cards has 26 red and 26 black. The cards are shuffled

    and one selected at random. Rachael guesses the color of the card.

    Data from experiment:

    Carry out n = 20 repetitions, shuffling each time, observing correct or

    not correct.

    Test statistic:

    T = Number of correct responses out of n

    Distribution of Test Statistic under H0:

    $T \sim \text{Bin}(n = 20, p)$, where $p$ is the success rate, with H0: $p = 0.5$. (Trials assumed

    independent)

    Restate hypotheses in terms of parameters:

    H0 : p = 0.5 [pure guessing]

    Ha : p > 0.5 [responses informed by ESP]

    Rejection Region: $RR = \{t_{obs} \ge 15\}$.

    Outcome of the experiment: $t_{obs} = 16$

    Conclusion:

    Since $t_{obs}$ falls in RR, i.e. $t_{obs} = 16 \in \{t_{obs} \ge 15\}$, the data contradict the null hypothesis H0. We reject H0 as implausible and conclude that the observed rate of success ($\hat{p} = 0.8$) is significantly higher than $p_0 = 0.5$. There is evidence of ESP.

    Hypothesis Testing: Errors

    Because we are choosing between H0 and Ha based on the sample data, there is a chance that we make an error.

                  tobs in RR           tobs not in RR
                  Reject H0            Do not reject H0
    H0 true       Type I error         Correct decision
    H0 false      Correct decision     Type II error

    Two types of errors can be made in reaching a decision.

    Type I error: Reject H0 when H0 is true.

    Type II error: Fail to reject H0 when H0 is false.

    Error Probabilities

    Probability of making a Type I error:

    $\alpha = P(\text{reject } H_0 \mid H_0 \text{ true})$

    $\alpha$ is called the level of significance or the level of the test.

    Probability of making a Type II error (at a particular/given value $\theta = \theta_a$ in Ha):

    $\beta = \beta(\theta_a) = P(\text{fail to reject } H_0 \mid \theta = \theta_a)$.

    Error Probabilities

    Example 10.1.1 In the ESP example, calculate $\alpha$, and $\beta$ when $p = 0.7$.

    The Type I error probability:

    $\alpha = P(\text{reject } H_0 \mid H_0 \text{ true})$

    The Type II error probability for $p = 0.7$:

    $\beta(0.7) = P(\text{fail to reject } H_0 \mid p = 0.7) = P(T < 15 \mid p = 0.7) = P(T \le 14 \mid p = 0.7) = B(14; n = 20, p = 0.7) = 0.584$.

    Practice: calculate $\beta(0.9)$.
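    The two error probabilities in the ESP example can be checked numerically. The following is a minimal sketch, assuming the rejection region is T >= 15 with n = 20 as stated above; the helper name `binom_cdf` is illustrative and not part of the notes.

```python
from math import comb

def binom_cdf(k, n, p):
    """Return P(T <= k) for T ~ Bin(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))

n, cutoff = 20, 15                           # reject H0 when T >= 15
alpha = 1 - binom_cdf(cutoff - 1, n, 0.5)    # P(T >= 15 | p = 0.5)
beta_07 = binom_cdf(cutoff - 1, n, 0.7)      # P(T <= 14 | p = 0.7)
print(f"alpha = {alpha:.4f}, beta(0.7) = {beta_07:.3f}")
```

    The same function with p = 0.9 gives the practice quantity beta(0.9).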

    Error Probabilities

    Type I error probability (also referred to as the level of significance or, less formally, the false positive rate):

    $\alpha = P(\text{reject } H_0 \mid H_0 \text{ true})$

    Type II error probability (also referred to as the false negative rate):

    $\beta = \beta(\theta_a) = P(\text{fail to reject } H_0 \mid \theta = \theta_a)$.

    Power:

    $\text{Power}(\theta_a) = 1 - \beta(\theta_a) = P(\text{reject } H_0 \mid \theta = \theta_a)$.

    Remark: Ideally we would like to reduce both $\alpha$ and $\beta$. However, with a fixed sample size $n$, we cannot reduce both of them at once. So we generally fix $\alpha$ at a small value (say 0.05 or 0.01) and construct an RR so that $\beta$ is minimized.

    Example 10.1.2 Let $Y \sim \text{Uniform}(\theta, \theta + 1)$. Consider H0: $\theta = 0$ versus Ha: $\theta > 0$. The test procedure: reject H0 if $Y > 0.95$.

    1. Calculate the level of significance $\alpha$ of this test.

    2. Calculate $\beta$ when $\theta = 0.5$.
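    One way to check an answer to this example is to evaluate the uniform CDF directly. This is only a sketch: `unif_cdf` is a hypothetical helper written for this illustration, and it assumes a single observation Y as in the example.

```python
def unif_cdf(y, theta):
    """Return P(Y <= y) for Y ~ Uniform(theta, theta + 1)."""
    return min(max(y - theta, 0.0), 1.0)

cutoff = 0.95                          # reject H0 when Y > 0.95
alpha = 1 - unif_cdf(cutoff, 0.0)      # P(Y > 0.95 | theta = 0)
beta = unif_cdf(cutoff, 0.5)           # P(Y <= 0.95 | theta = 0.5)
print(round(alpha, 2), round(beta, 2))
```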

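    10.2 Large Sample Tests

    Suppose $Y_1, \ldots, Y_n$ is a random sample (rs), with $n$ large, from a distribution with parameter of interest $\theta$. Let $\hat{\theta}$ denote an estimator with large sample distribution $N(\theta, \sigma_{\hat{\theta}}^2)$.

    For example, if $\hat{\theta}$ is a consistent and unbiased estimator of $\theta$, then by the CLT we often have, for large $n$,

    $\hat{\theta} \overset{d}{\approx} N(\theta, \sigma_{\hat{\theta}}^2)$.

    We assume that $\sigma_{\hat{\theta}}^2$ is known or can be estimated by a consistent estimator $\hat{\sigma}_{\hat{\theta}}^2$.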

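    Large Sample Tests

    Notational convention: $SD(\hat{\theta}) = \sigma_{\hat{\theta}}$ and $\widehat{SD}(\hat{\theta}) = \hat{\sigma}_{\hat{\theta}}$, $V(\hat{\theta}) = \sigma_{\hat{\theta}}^2$.

    Null and alternative hypotheses: H0: $\theta = \theta_0$ versus Ha: $\theta > \theta_0$, where the value $\theta_0$ is specified.

    Test statistic: $T = \hat{\theta}$

    $RR = \{\hat{\theta} : \hat{\theta} > k\}$ (value of $k$ to be determined)

    Specify the level of significance $\alpha$ (say 0.05 or 0.01), i.e., the largest Type I error probability that can be tolerated.

    We determine the value of $k$ so that the corresponding Type I error probability equals the pre-specified level $\alpha$. That is, we determine $k$ by solving the equation:

    $\alpha \overset{\text{set}}{=} P(\hat{\theta} > k \mid \theta = \theta_0) = P\left( \frac{\hat{\theta} - \theta_0}{\sigma_{\hat{\theta}}} > \frac{k - \theta_0}{\sigma_{\hat{\theta}}} \,\Big|\, \theta = \theta_0 \right) \approx P\left( Z > \frac{k - \theta_0}{\sigma_{\hat{\theta}}} \right) = P(Z > z_\alpha)$.

    Therefore, to solve this equation for $k$, we need to set

    $\frac{k - \theta_0}{\sigma_{\hat{\theta}}} = z_\alpha \;\Rightarrow\; k = \theta_0 + z_\alpha \sigma_{\hat{\theta}}$.

    Thus,

    $RR = \{\hat{\theta} : \hat{\theta} > \theta_0 + z_\alpha \sigma_{\hat{\theta}}\} = \left\{ \hat{\theta} : \frac{\hat{\theta} - \theta_0}{\sigma_{\hat{\theta}}} > z_\alpha \right\}$.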

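    Large Sample Test: Summary Procedure

    State hypotheses (right-tailed alternative):

    H0: $\theta = \theta_0$ versus Ha: $\theta > \theta_0$

    Test statistic:

    $Z = \frac{\hat{\theta} - \theta_0}{\sigma_{\hat{\theta}}}$, [$Z \overset{\text{approx}}{\sim} N(0, 1)$ when $\theta = \theta_0$]

    Rejection region:

    $RR = \{z_{obs} : z_{obs} > z_\alpha\}$.

    If left-tailed test: H0: $\theta = \theta_0$ versus Ha: $\theta < \theta_0$, then $RR = \{z_{obs} : z_{obs} < -z_\alpha\}$.

    If two-tailed test: H0: $\theta = \theta_0$ versus Ha: $\theta \neq \theta_0$, then $RR = \{z_{obs} : |z_{obs}| > z_{\alpha/2}\}$.

    Note that the alternatives are paired with the corresponding RRs.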

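    Large Sample Test: Unknown Variance

    Suppose that $\sigma_{\hat{\theta}}$ is unknown and that $\hat{\sigma}_{\hat{\theta}} \overset{p}{\to} \sigma_{\hat{\theta}}$, i.e. $\hat{\sigma}_{\hat{\theta}}$ is a consistent estimator of $\sigma_{\hat{\theta}}$. Then

    $Z = \frac{\hat{\theta} - \theta_0}{\hat{\sigma}_{\hat{\theta}}} = \frac{\sigma_{\hat{\theta}}}{\hat{\sigma}_{\hat{\theta}}} \cdot \frac{\hat{\theta} - \theta_0}{\sigma_{\hat{\theta}}}$.

    Since $\frac{\sigma_{\hat{\theta}}}{\hat{\sigma}_{\hat{\theta}}} \overset{p}{\to} 1$ and $\frac{\hat{\theta} - \theta_0}{\sigma_{\hat{\theta}}} \overset{d}{\to} N(0, 1)$ under H0, it follows by Slutsky's Theorem that

    $Z = \frac{\hat{\theta} - \theta_0}{\hat{\sigma}_{\hat{\theta}}} \overset{d}{\to} N(0, 1)$.

    Therefore, the same test procedure follows.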

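    Large Sample Test: Unknown Variance

    Use a consistent estimator of $\sigma_{\hat{\theta}}$, $\hat{\sigma}_{\hat{\theta}}$, in the Z-test statistic.

    Test statistic:

    $Z = \frac{\hat{\theta} - \theta_0}{\widehat{SD}(\hat{\theta})}$, [$Z \overset{\text{approx}}{\sim} N(0, 1)$ when $\theta = \theta_0$]

    Right-tailed test: H0: $\theta = \theta_0$ vs Ha: $\theta > \theta_0$. $RR = \{z_{obs} : z_{obs} > z_\alpha\}$.

    Left-tailed test: H0: $\theta = \theta_0$ vs Ha: $\theta < \theta_0$. $RR = \{z_{obs} : z_{obs} < -z_\alpha\}$.

    Two-tailed test: H0: $\theta = \theta_0$ vs Ha: $\theta \neq \theta_0$. $RR = \{z_{obs} : |z_{obs}| > z_{\alpha/2}\}$.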
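    The three rejection regions above can be collected into one small routine. The following is a sketch only: the function name `z_test`, its argument names, and the numbers in the usage lines are illustrative assumptions, not part of the notes. The standard error `se` stands for $\sigma_{\hat{\theta}}$ or its consistent estimate.

```python
from statistics import NormalDist

def z_test(theta_hat, theta0, se, alpha, tail):
    """Large-sample Z-test; tail is 'right', 'left', or 'two'.

    Returns the observed statistic and whether it falls in the RR.
    """
    z_obs = (theta_hat - theta0) / se
    std_normal = NormalDist()
    if tail == "right":
        reject = z_obs > std_normal.inv_cdf(1 - alpha)         # z_obs > z_alpha
    elif tail == "left":
        reject = z_obs < -std_normal.inv_cdf(1 - alpha)        # z_obs < -z_alpha
    elif tail == "two":
        reject = abs(z_obs) > std_normal.inv_cdf(1 - alpha / 2)  # |z_obs| > z_{alpha/2}
    else:
        raise ValueError("tail must be 'right', 'left', or 'two'")
    return z_obs, reject

# Hypothetical numbers, chosen only to exercise the function.
z_right, reject_right = z_test(1.18, 1.0, 0.1, 0.05, "right")
z_two, reject_two = z_test(1.18, 1.0, 0.1, 0.05, "two")
```

    With these made-up inputs the same observed statistic (about 1.8) is rejected by the right-tailed test but not by the two-tailed test, which illustrates why the alternative must be paired with its own RR.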

    Large Sample Test: Examples

    Example 10.2.1 A company wishes to test its claim that the average lifetime of the tires it sells is 20,000 miles. The experiment yields $n = 36$ observations with the sample mean $\bar{y} = 19{,}375$ miles. Carry out the hypothesis test and conclude at significance level $\alpha = 0.01$.

    Suggested procedure:

    Parameter of interest: $\theta = \mu$ = mean lifetime for the population

    Construct null and alternative hypotheses: H0: $\mu = 20{,}000$ vs Ha: $\mu < 20{,}000$

    Compute the test statistic:

    Construct the rejection region at level 0.01:

    Conclude:

    Q: What if Ha: $\mu \neq 20{,}000$?

    Example 10.2.2 (10.19 of WMS) A sample of 40 independent

    readings on the voltage for this circuit gave a sample mean of 128.6,

    and standard deviation 2.1. Test the hypothesis that the mean output

    voltage is 130 against the alternative that it is less than 130. Use a

    test with level 0.05.
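    The voltage example can be worked as a left-tailed large-sample Z-test with the sample standard deviation standing in for the unknown population value. A minimal sketch under that assumption (variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

n, ybar, s, mu0, alpha = 40, 128.6, 2.1, 130.0, 0.05

z_obs = (ybar - mu0) / (s / sqrt(n))         # large-sample Z statistic
z_crit = -NormalDist().inv_cdf(1 - alpha)    # left-tailed cutoff, about -1.645
reject = z_obs < z_crit
print(round(z_obs, 3), reject)
```

    The observed statistic is far below the cutoff, so under these assumptions H0: mean = 130 would be rejected at level 0.05.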

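    Large Sample Test: Proportions

    Data: $Y \sim \text{Bin}(n, p)$

    Hypotheses: H0: $p = p_0$ vs Ha: $p > p_0$ for a given value of $p_0 \in (0, 1)$. The level $\alpha$ is specified.

    Parameter of interest: $\theta = p$

    Unbiased and consistent estimator: $\hat{\theta} = \hat{p} = Y/n$

    $\sigma_{\hat{\theta}}^2 = V(Y/n) = p(1 - p)/n$. So under H0, $\sigma_{\hat{\theta}}^2 = p_0(1 - p_0)/n$.

    Test statistic:

    $Z = \frac{\hat{\theta} - \theta_0}{\sigma_{\hat{\theta}}} = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$ [by CLT, $Z \overset{d}{\to} N(0, 1)$ under H0]

    $RR = \{z_{obs} : z_{obs} > z_\alpha\}$.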

    Large Sample Test: Proportion

    Example 10.2.3 Each member of a panel of 100 tasters was

    presented with three glasses of beer in random order, one of which was

    different from the other two (e.g. AAB). Each taster was asked to

    identify which beer was different. Let $p = P(\text{B is correctly identified})$. If the tasters are unable to distinguish between the beers we would

    expect p = 1/3. If they are able to distinguish we expect p > 1/3.

    Suppose among 100 tasters, 40 answered correctly. Are tasters able to

    distinguish? Carry out the hypothesis test at level 0.05.

    Construct the appropriate null and alternative hypotheses:

    Calculate the test statistic value:

    Construct the rejection region at level 0.05

    Conclude:

    (BTW, what is the population here?)
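    The steps of the beer tasting test can be sketched numerically, using the proportion Z statistic with the variance evaluated under H0: p = 1/3. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, y, p0, alpha = 100, 40, 1 / 3, 0.05

p_hat = y / n
z_obs = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # variance evaluated under H0
z_crit = NormalDist().inv_cdf(1 - alpha)         # right-tailed cutoff, about 1.645
reject = z_obs > z_crit
print(round(z_obs, 3), reject)
```

    Here the observed statistic is about 1.41, which does not exceed 1.645, so at level 0.05 this sketch does not reject H0.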

    Large Sample Test: Other

    Table 8.1 on page 397:

    Example 10.2.4 Let $X_1, \ldots, X_n$ be a random sample from the exponential distribution with pdf

    $f(x) = \frac{1}{\theta} e^{-x/\theta}, \quad 0 < x < +\infty, \; 0 < \theta < +\infty$.

    1. Find the large sample distribution of the method of moments estimator of $\theta$, $\hat{\theta}$.

    2. Set up a large sample test of H0: $\theta = \theta_0$ versus Ha: $\theta < \theta_0$ using level $\alpha$. Specify the rejection region.

    3. Using a random sample of size $n = 64$ and level $\alpha = 0.05$, test the hypothesis H0: $\theta = 10$ versus Ha: $\theta < 10$ when the sample mean is $\bar{x} = 7.7$. State your conclusion.
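    Part 3 can be sketched as follows. The sketch assumes the method of moments estimator is the sample mean (the exponential mean is theta) and that its standard deviation under H0 is theta0 / sqrt(n), since the exponential variance is theta squared. These are standard facts, but they are supplied here rather than taken from the notes.

```python
from math import sqrt
from statistics import NormalDist

n, xbar, theta0, alpha = 64, 7.7, 10.0, 0.05

theta_hat = xbar                              # method of moments estimator (assumed: sample mean)
se_h0 = theta0 / sqrt(n)                      # SD of the sample mean when theta = theta0
z_obs = (theta_hat - theta0) / se_h0
z_crit = -NormalDist().inv_cdf(1 - alpha)     # left-tailed cutoff, about -1.645
reject = z_obs < z_crit
print(round(z_obs, 2), reject)
```

    The statistic comes out at -1.84, which lies in the left-tailed rejection region at level 0.05.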

    Large Sample Test

    Example: Two-sample Z-test

    Example 10.2.5 Samples of 36 males and 40 females were tested to determine their temperature preference. Assume that the variances are known.

    Samples: $Y_{1,1}, Y_{1,2}, \ldots, Y_{1,n_m}$ and $Y_{2,1}, Y_{2,2}, \ldots, Y_{2,n_f}$.

    Males: $n_m = 36$, $\sigma_m^2 = 4.0$, $\bar{y}_m = 74.6$

    Females: $n_f = 40$, $\sigma_f^2 = 2.5$, $\bar{y}_f = 76.5$

    Do females and males differ with respect to their temperature preferences? Conduct the test at level $\alpha = 0.01$.

    Let $\mu_m$ and $\mu_f$ denote the mean temperature preference of males and

    females, respectively.
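    A possible numerical check of the two-sample Z-test, assuming the two samples are independent so that the variance of the difference in sample means is the sum of the two variances over their sample sizes. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n_m, var_m, ybar_m = 36, 4.0, 74.6
n_f, var_f, ybar_f = 40, 2.5, 76.5
alpha = 0.01

se_diff = sqrt(var_m / n_m + var_f / n_f)        # SD of the difference in sample means
z_obs = (ybar_m - ybar_f - 0) / se_diff          # H0: mu_m - mu_f = 0
z_crit = NormalDist().inv_cdf(1 - alpha / 2)     # two-tailed cutoff, about 2.576
reject = abs(z_obs) > z_crit
print(round(z_obs, 2), reject)
```

    With these inputs the statistic is -4.56, well beyond the two-tailed cutoff at level 0.01.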

    Solution:

    Note that if $\sigma_m^2$ and $\sigma_f^2$ are unknown, since both sample sizes are large, we would substitute their estimators from each sample, $s_m^2$ and $s_f^2$, where, for example, $S_m^2 = \frac{1}{n_m - 1} \sum_{j=1}^{n_m} (Y_{1,j} - \bar{Y}_m)^2$. We may do this because $S_m^2$ (and $S_f^2$) are consistent estimators of $\sigma_m^2$ (and $\sigma_f^2$).

    Difference between two population proportions

    Example: 10.33 of WMS. A political researcher believes that the fraction $p_1$ of Republicans strongly in favor of the death penalty is greater than the fraction $p_2$ of Democrats strongly in favor of the death penalty. He acquired independent random samples of 200 Republicans and 200 Democrats and found 46 Republicans and 34 Democrats strongly favoring the death penalty. Does this evidence provide statistical support for the researcher's belief? Use $\alpha = 0.05$.
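    One common way to carry out this test is a right-tailed Z-test on the difference in sample proportions. The sketch below assumes the pooled proportion is used for the standard error under H0: p1 = p2; the notes do not say which standard error is intended, so treat that choice as an assumption.

```python
from math import sqrt
from statistics import NormalDist

n1, y1 = 200, 46     # Republicans
n2, y2 = 200, 34     # Democrats
alpha = 0.05

p1_hat, p2_hat = y1 / n1, y2 / n2
p_pool = (y1 + y2) / (n1 + n2)                           # pooled estimate (assumed choice)
se_h0 = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z_obs = (p1_hat - p2_hat) / se_h0
reject = z_obs > NormalDist().inv_cdf(1 - alpha)         # Ha: p1 > p2
print(round(z_obs, 2), reject)
```

    Under the pooled-variance assumption the statistic is 1.5, short of the 1.645 cutoff.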

    10.3 Sample Size and Power

    Suppose $Y_1, Y_2, \ldots, Y_n$ is a rs from $N(\mu, \sigma^2)$. We wish to test H0: $\mu = \mu_0$ versus Ha: $\mu > \mu_0$.

    Question: can we find a sample size $n$ which guarantees that the Type I and Type II error probabilities will not exceed $\alpha$ and $\beta$, respectively? Here $\alpha$ and $\beta$ are prespecified values.

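    Recall the level $\alpha$ test procedure for this hypothesis testing problem. Note that under H0: $\mu = \mu_0$,

    $Z = \frac{\bar{Y} - \mu_0}{\sigma/\sqrt{n}} \sim N(0, 1)$.

    Therefore, from the large sample test constructed earlier, we reject H0 when $z_{obs} > z_\alpha$.

    That is, the Rejection Region (RR) is:

    $RR = \{\bar{Y} : \bar{Y} > k\}$, where $k = \mu_0 + z_\alpha \frac{\sigma}{\sqrt{n}}$.

    Therefore, the Type II error probability is:

    $\beta(\mu_a) = P(\text{Do not reject } H_0 \mid \mu = \mu_a) = P(\bar{Y} \le k \mid \mu = \mu_a) = P(\bar{Y} \le k \mid \bar{Y} \sim N(\mu_a, \sigma^2/n)) = P\left( \frac{\bar{Y} - \mu_a}{\sigma/\sqrt{n}} \le \frac{k - \mu_a}{\sigma/\sqrt{n}} \right) = P\left( Z \le \frac{k - \mu_a}{\sigma/\sqrt{n}} \right)$

    (with $\mu_a - \mu_0 > 0$ for a right-tailed test).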

    Sample Size Determination: Summary

    For upper- or lower-tailed level $\alpha$ tests, to control the Type II error probability at $\beta$ when $\mu = \mu_a$ under Ha, the required sample size is

    $n = \left[ \frac{(z_\alpha + z_\beta)\,\sigma}{\mu_0 - \mu_a} \right]^2$.

    For two-tailed level $\alpha$ tests, the required sample size is:

    $n = \left[ \frac{(z_{\alpha/2} + z_\beta)\,\sigma}{\mu_0 - \mu_a} \right]^2$.

    Remark: the above conclusions hold for z-type tests, which apply when the test statistic is normally distributed (or approximately so for large samples).

    Sample Size Determination: Example

    Example 10.3.1 An SAT prep course claims to increase average

    Verbal SAT scores.

    To test the claim, n candidates will be selected at random to receive

    the training and take the test.

    It is known that in the population under study, SAT-V scores are

    distributed

    $N(\mu = 565, \sigma^2 = 40^2)$

    We want the probability of making a Type II error, when there is a 15 point mean increase, to be 0.1 (or less), when the level is $\alpha = 0.05$.

    Determine how many candidates should be selected.
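    The one-tailed sample size formula can be evaluated directly for the SAT example. A minimal sketch; the function name `sample_size_one_tailed` is illustrative, and the result is rounded up because n must be an integer at least as large as the formula value.

```python
from math import ceil
from statistics import NormalDist

def sample_size_one_tailed(alpha, beta, sigma, mu0, mua):
    """Smallest n with Type II error at most beta at mu = mua (one-tailed z-test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(1 - beta)
    return ceil(((z_alpha + z_beta) * sigma / (mua - mu0)) ** 2)

n_sat = sample_size_one_tailed(alpha=0.05, beta=0.10, sigma=40, mu0=565, mua=580)
print(n_sat)
```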

    Power of the Test

    The power of a test at a particular value in the alternative, $\mu = \mu_a$, is defined as

    $\text{Power}(\mu_a) = P\{\text{Reject } H_0 \mid \mu = \mu_a\} = 1 - \beta(\mu_a)$

    In the SAT example, we find that if $n$ is 61 and the true mean SAT-V score of people taking the SAT prep course is 580 (an increase of 15 points), then the power of the test is

    $\text{Power}(580) = 1 - 0.1 = 0.9$

    Power curve for SAT-V example (for n = 61)

    Power curve for SAT-V example (for $\mu_a = 580$)

    Another example: Beer tasting

    Refer to Example 10.2.3: H0: $p = 1/3$ versus Ha: $p > 1/3$.

    RR for a level $\alpha = 0.05$ test is:

    $RR : z_{obs} = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}} > 1.645$,

    where $p_0 = 1/3$, $\hat{p} = Y/n$, and $Y$ is the number of tasters that answered correctly. That is,

    $RR : \hat{p} > k$, where $k = p_0 + z_\alpha \sqrt{\frac{p_0(1 - p_0)}{n}}$.

    Calculate $\beta(0.5)$, the Type II error rate when $p_a = 0.5$. Recall $n = 100$.
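    The Type II error rate at p = 0.5 can be sketched with the normal approximation: find the cutoff k on the sample proportion, then evaluate the probability of staying at or below k when the true proportion is 0.5. This is an approximation under the CLT, not an exact binomial calculation, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, p0, pa, alpha = 100, 1 / 3, 0.5, 0.05
std_normal = NormalDist()

k = p0 + std_normal.inv_cdf(1 - alpha) * sqrt(p0 * (1 - p0) / n)   # RR: p_hat > k
z_at_k = (k - pa) / sqrt(pa * (1 - pa) / n)                        # standardize k under p = pa
beta_05 = std_normal.cdf(z_at_k)                                   # P(p_hat <= k | p = 0.5)
print(round(k, 3), round(beta_05, 3))
```

    Under this approximation the cutoff is about 0.411 and the Type II error rate is roughly 0.037.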

    Solution:

    Q: Calculate the minimum sample size $n$ needed to control $\beta(0.5)$ at 0.01.

    We need to solve for $n$ such that

    $\frac{k - p_a}{\sqrt{p_a(1 - p_a)/n}} = -z_\beta$, (1)

    where $k = p_0 + z_\alpha \sqrt{\frac{p_0(1 - p_0)}{n}}$ also depends on $n$. Solving

    $\frac{p_0 + z_\alpha \sqrt{p_0(1 - p_0)/n} - p_a}{\sqrt{p_a(1 - p_a)/n}} = -z_\beta$

    gives

    $n = \left[ \frac{z_\beta \sqrt{p_a(1 - p_a)} + z_\alpha \sqrt{p_0(1 - p_0)}}{p_0 - p_a} \right]^2$.

    Thus for this example, $p_0 = 1/3$, $p_a = 0.5$, $z_\alpha = 1.645$, $z_\beta = 2.33$, so the formula gives $n \approx 135.6$, which rounds up to $n = 136$.

    Power curve for Beer example (for $n = 100$)

    Type II error rate for Beer example (for $p_a = 0.5$)

    10.4 Test/confidence interval relationship

    Hypothesis Testing and Confidence Intervals

    Hypothesis testing has a close connection to confidence intervals (CI)

    in the sense that confidence intervals are often the

    complement of rejection regions

    The complement of the RR is sometimes called the acceptance

    region

    Consider the problem of two-tailed alternatives:

    H0: $\theta = \theta_0$, Ha: $\theta \neq \theta_0$

    Test statistic: $Z = \frac{\hat{\theta} - \theta_0}{\sigma_{\hat{\theta}}}$

    Rejection region: $RR = \{Z : |Z| > z_{\alpha/2}\}$

    From the above, the Acceptance Region is:

    $\overline{RR} = \{|Z| \le z_{\alpha/2}\} = \{\hat{\theta} - z_{\alpha/2}\sigma_{\hat{\theta}} \le \theta_0 \le \hat{\theta} + z_{\alpha/2}\sigma_{\hat{\theta}}\}$.

    Notice that a $100(1 - \alpha)\%$ CI for $\theta$ is:

    $\hat{\theta} \pm z_{\alpha/2}\sigma_{\hat{\theta}}$.

    Therefore,

    Reject H0 if and only if $\theta_0 \notin$ CI.

    In other words, testing at level $\alpha$ against a two-tailed alternative is equivalent to checking whether the hypothesized value of $\theta$ ($= \theta_0$) lies in the $100(1 - \alpha)\%$ CI for $\theta$.

    A similar relationship exists between one-sided alternative hypotheses and one-sided confidence intervals.

    Test-confidence interval relationship: Example

    Refer to Example 10.2.5 (Male and Female Temperature Preference).

    Males: $n_m = 36$, $\sigma_m^2 = 4.0$, $\bar{y}_m = 74.6$

    Females: $n_f = 40$, $\sigma_f^2 = 2.5$, $\bar{y}_f = 76.5$

    Let $\mu_m$ and $\mu_f$ denote the mean temperature preference of males and females, respectively.

    Construct a 99% confidence interval for $\mu_m - \mu_f$. Is the value $\mu_m - \mu_f = 0$ contained in the confidence interval? Based on the interval, should we reject H0: $\mu_m - \mu_f = 0$?
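    The interval in this example can be sketched as the point estimate plus or minus $z_{\alpha/2}$ times the standard deviation of the difference in sample means, assuming independent samples with the known variances given above. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n_m, var_m, ybar_m = 36, 4.0, 74.6
n_f, var_f, ybar_f = 40, 2.5, 76.5
conf_level = 0.99

diff = ybar_m - ybar_f
se_diff = sqrt(var_m / n_m + var_f / n_f)
z_half = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)    # z_{alpha/2}, about 2.576
lower, upper = diff - z_half * se_diff, diff + z_half * se_diff
contains_zero = lower <= 0 <= upper
print(round(lower, 3), round(upper, 3), contains_zero)
```

    The interval comes out at roughly (-2.97, -0.83). It excludes 0, which agrees with rejecting H0 in the two-tailed test at level 0.01.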

    10.5 The p-value

    Observed significance level

    Often misunderstood

    P(these data or more extreme; H0 is true)

    Reject H0 at level $\alpha$ $\iff$ p-value $< \alpha$

    Definition: the smallest level of significance at which H0 can be rejected.

    Refer to Example 10.2.4.

    H0: $\theta = 10$ versus Ha: $\theta < 10$; $z_{obs} = -1.84$

    If $\alpha = 0.1$, $z_\alpha = 1.28$, RR: $z_{obs} < -1.28$, conclusion: Reject H0

    If $\alpha = 0.05$, $z_\alpha = 1.65$, RR: $z_{obs} < -1.65$, conclusion: Reject H0

    ...

    If $\alpha = 0.025$, $z_\alpha = 1.96$, RR: $z_{obs} < -1.96$, so Do not reject H0

    Questions:

    Calculate $P(Z < -1.28)$ and $P(Z < -1.85)$

    What is the smallest $\alpha$ so that H0 will be rejected?

    For what $\alpha$ values can we reject H0? For any $\alpha > 0.03$

    Normal Calculator:

    http://www.stat.tamu.edu/west/applets/normaldemo.html

    Calculation of p-values

    The p-value is the probability of obtaining a test statistic value at least as extreme as the observed value, calculated assuming H0 is true.

    Consider testing H0: $\theta = \theta_0$. Suppose the test statistic $Z \sim N(0, 1)$ (or approximately) under H0.

    Left-tailed test Ha: $\theta < \theta_0$, p-value $= P(Z < z_{obs})$

    Right-tailed test Ha: $\theta > \theta_0$, p-value $= P(Z > z_{obs})$

    Two-tailed test Ha: $\theta \neq \theta_0$, p-value $= P(Z < -|z_{obs}| \text{ OR } Z > |z_{obs}|) = 2P(Z > |z_{obs}|)$

    The more extreme the observed test statistic value $\Rightarrow$ the smaller the p-value $\Rightarrow$ the more evidence to reject H0
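    The three p-value formulas translate directly into code. A minimal sketch using the standard normal CDF; the function name `z_p_value` is illustrative, and the usage line reuses the observed value $z_{obs} = -1.84$ from Example 10.2.4.

```python
from statistics import NormalDist

def z_p_value(z_obs, tail):
    """p-value of a Z-test; tail is 'left', 'right', or 'two'."""
    std_normal = NormalDist()
    if tail == "left":
        return std_normal.cdf(z_obs)                  # P(Z < z_obs)
    if tail == "right":
        return 1 - std_normal.cdf(z_obs)              # P(Z > z_obs)
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z_obs)))   # 2 P(Z > |z_obs|)
    raise ValueError("tail must be 'left', 'right', or 'two'")

p_left = z_p_value(-1.84, "left")
print(round(p_left, 4))
```

    The left-tailed p-value is about 0.033, so H0 is rejected at level 0.05 but not at level 0.025, matching the level-by-level comparison of rejection regions.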

    Calculation of p-value

    Example 10.5.1 Refer to the Beer Tasting Example 10.2.3.

    H0: $p = 1/3$, Ha: $p > 1/3$, $\alpha = 0.05$

    $y = 40$ out of $n = 100$ correct identifications observed. Calculate the p-value and conclude.

    Example 10.5.2 (10.57 of WMS)

    A publisher of a newsmagazine has found through past experience that 60% of subscribers renew their subscription. In a recent random sample of $n = 200$ subscribers, 108 indicated that they planned to renew. What is the p-value associated with the test that the current rate of renewal differs from the rate previously experienced? State your conclusion using $\alpha = 0.05$. How about $\alpha = 0.1$?

    BTW, does the total number of subscribers, $N$, matter?

    Example 10.5.3 Refer to Example 10.2.5 (Male and Female Temperature Preference).

    Test H0: $\mu_m - \mu_f = 0$ versus Ha: $\mu_m - \mu_f \neq 0$. Calculate the p-value and conclude using $\alpha = 0.01$.

    Males: $n_m = 36$, $\sigma_m^2 = 4.0$, $\bar{y}_m = 74.6$

    Females: $n_f = 40$, $\sigma_f^2 = 2.5$, $\bar{y}_f = 76.5$

    10.6 Testing means in small samples (normal)

    Recall: The z-test

    Testing Means in Normal Samples with Known Variances

    Assumption: $Y_1, \ldots, Y_n$ a rs from $N(\mu, \sigma^2)$, $\sigma$ is known

    H0: $\mu = \mu_0$ versus

    Ha: $\mu \neq \mu_0$ (two-tailed test)

    Ha: $\mu < \mu_0$ (lower-tailed test)

    Ha: $\mu > \mu_0$ (upper-tailed test)

    Test statistic:

    $Z = \frac{\bar{Y} - \mu_0}{\sigma/\sqrt{n}}$

    Under H0: $Z \sim N(0, 1)$

    Two-tailed test: Ha: $\mu \neq \mu_0$

    $RR = \{z_{obs} : |z_{obs}| > z_{\alpha/2}\}$

    p-value $= P(|Z| > |z_{obs}|) = 2P(Z > |z_{obs}|)$

    Right-tailed test: Ha: $\mu > \mu_0$

    $RR = \{z_{obs} : z_{obs} > z_\alpha\}$

    p-value $= P(Z > z_{obs})$

    Left-tailed test: Ha: $\mu < \mu_0$

    $RR = \{z_{obs} : z_{obs} < -z_\alpha\}$

    p-value $= P(Z < z_{obs})$

    Recall: The large-sample z-test

    Testing Means in Large Samples with Unknown Variances

    Assumption: $Y_1, \ldots, Y_n$ a rs with common mean $\mu$ and unknown variance $\sigma^2$, where $n$ is large.

    H0: $\mu = \mu_0$ versus

    Ha: $\mu \neq \mu_0$ (two-tailed test)

    Ha: $\mu < \mu_0$ (left-tailed test)

    Ha: $\mu > \mu_0$ (right-tailed test)

    Test statistic: $Z = \frac{\bar{Y} - \mu_0}{S/\sqrt{n}}$, where $S^2$ is the sample variance.

    Under H0: $Z \overset{\text{approx}}{\sim} N(0, 1)$

    RR and p-value calculations are the same as for the previous z-test.

    The t-test

    Testing Means in Small Normal Samples

    $Y_1, \ldots, Y_n$ a rs from $N(\mu, \sigma^2)$, $\sigma^2$ unknown and $n$ is small

    H0: $\mu = \mu_0$ versus

    Ha: $\mu \neq \mu_0$ (two-tailed test)

    Ha: $\mu < \mu_0$ (lower-tailed test)

    Ha: $\mu > \mu_0$ (upper-tailed test)

    Significance level: $\alpha$

    Test statistic:

    $T = \frac{\bar{Y} - \mu_0}{S/\sqrt{n}}$,

    where $\bar{Y}$ and $S^2$ are the sample mean and variance.

    Under H0, $T \sim t_{n-1}$

    Two-tailed test: Ha: $\mu \neq \mu_0$

    $RR = \{t_{obs} : |t_{obs}| > t_{n-1,\alpha/2}\}$

    p-value $= P(|T_{n-1}| > |t_{obs}|) = 2P(T_{n-1} > |t_{obs}|)$, where $T_{n-1}$ is a rv following the $t_{n-1}$ distribution.

    Right-tailed test: Ha: $\mu > \mu_0$

    $RR = \{t_{obs} : t_{obs} > t_{n-1,\alpha}\}$

    p-value $= P(T_{n-1} > t_{obs})$

    Left-tailed test: Ha: $\mu < \mu_0$

    $RR = \{t_{obs} : t_{obs} < -t_{n-1,\alpha}\}$

    p-value $= P(T_{n-1} < t_{obs})$

    T distribution calculator:

    http://www.stat.tamu.edu/west/applets/tdemo.html

    Example: IQ Test

    Example 10.6.1 Ten sampled students aged 18-21 years received special training. They are given an IQ test that is $N(100, 10^2)$ in the general population. Let $\mu$ be the mean IQ of these students who received special training. The observed IQ scores:

    121, 98, 95, 94, 102, 106, 112, 120, 108, 109

    Test if the special training improves the IQ score using significance level $\alpha = 0.05$.
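    A sketch of the right-tailed t-test for these scores. Python's standard library has no t distribution, so the upper-tail probability is approximated here by integrating the t density with Simpson's rule; `t_pdf` and `t_upper_tail` are helper names introduced for this illustration only.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T_df > t) for t >= 0 via Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

scores = [121, 98, 95, 94, 102, 106, 112, 120, 108, 109]
n, mu0 = len(scores), 100
t_obs = (mean(scores) - mu0) / (stdev(scores) / sqrt(n))   # T statistic, df = n - 1
p_value = t_upper_tail(t_obs, n - 1)                       # right-tailed p-value
print(round(t_obs, 3), round(p_value, 3))
```

    The statistic is about 2.16 on 9 degrees of freedom, which exceeds the tabled critical value $t_{9,0.05} = 1.833$.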

    Solution, continued

    ... $p = .029$, so that at level $\alpha = .05$, H0 is rejected, and the observed sample mean $\bar{y}$ is significantly greater than $\mu_0 = 100$.

    Small Sample Tests: Two-Sample t-test

    Independent random samples of size $n_1$ and $n_2$ from populations $N(\mu_1, \sigma^2)$ and $N(\mu_2, \sigma^2)$, where $\sigma$ is unknown.

    H0: $\mu_1 - \mu_2 = D_0$ (e.g. $D_0 = 0$) versus Ha: $\mu_1 - \mu_2 \neq D_0$ (or $\mu_1 - \mu_2 < D_0$ or $\mu_1 - \mu_2 > D_0$)

    Significance level $\alpha$

    Test statistic:

    $T = \frac{\bar{Y}_1 - \bar{Y}_2 - D_0}{S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$,

    where $\bar{Y}_1$ and $\bar{Y}_2$ are the sample means, $S_1^2$ and $S_2^2$ are the sample variances from the two groups, and the pooled estimator of the common variance $\sigma^2$ is

    $S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$.

    Under H0: $T \sim t_{n_1+n_2-2}$

    Two-tailed test: Ha: $\mu_1 - \mu_2 \neq D_0$

    $RR = \{t_{obs} : |t_{obs}| > t_{n_1+n_2-2,\alpha/2}\}$

    p-value $= P(|T_{n_1+n_2-2}| > |t_{obs}|) = 2P(T_{n_1+n_2-2} > |t_{obs}|)$, where $T_{n_1+n_2-2}$ is a rv following the $t_{n_1+n_2-2}$ distribution.

    Example: Recovery time for new drug

    Example 10.6.2 Twenty subjects were randomized to two groups of $n = 10$ each. The recovery time for patients taking a new drug (or placebo) is measured in days. The data follow:

    with drug (1): 15 10 13 7 9 8 21 9 14 8

    placebo (2): 15 14 12 8 14 7 16 10 15 12

    Assume that the data are normally distributed and that $\sigma_1 = \sigma_2$. Use $\alpha = 0.05$ to test H0: $\mu_1 - \mu_2 = 0$ versus Ha: $\mu_1 - \mu_2 < 0$
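    The pooled two-sample statistic for these data can be computed as follows. This is a sketch: the critical value 1.734 is the tabled $t_{18,0.05}$, hard-coded because the standard library has no t quantile function, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, variance

drug = [15, 10, 13, 7, 9, 8, 21, 9, 14, 8]
placebo = [15, 14, 12, 8, 14, 7, 16, 10, 15, 12]
n1, n2 = len(drug), len(placebo)

# Pooled estimate of the common variance
sp2 = ((n1 - 1) * variance(drug) + (n2 - 1) * variance(placebo)) / (n1 + n2 - 2)
t_obs = (mean(drug) - mean(placebo) - 0) / (sqrt(sp2) * sqrt(1 / n1 + 1 / n2))

t_crit = 1.734                 # tabled t_{18, 0.05} (assumed from a t table)
reject = t_obs < -t_crit       # left-tailed test
print(round(sp2, 2), round(t_obs, 3), reject)
```

    The statistic is about -0.53, well inside the acceptance region, so this sketch does not reject H0 at level 0.05.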

    Two-Sample t-test (Unequal variances)

    Basic Assumptions

    1. $X_1, \ldots, X_m$ is a random sample from $N(\mu_1, \sigma_1^2)$, and $\sigma_1$ is unknown.

    2. $Y_1, \ldots, Y_n$ is a random sample from $N(\mu_2, \sigma_2^2)$, and $\sigma_2$ is unknown.

    3. The X and Y samples are independent of each other.

    The standardized variable

    $\frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sqrt{\frac{S_1^2}{m} + \frac{S_2^2}{n}}}$

    has approximately a t distribution with degrees of freedom $\nu$ estimated from the data by

    $\nu = \frac{\left( \frac{s_1^2}{m} + \frac{s_2^2}{n} \right)^2}{\frac{(s_1^2/m)^2}{m - 1} + \frac{(s_2^2/n)^2}{n - 1}}$ (round down)

    Null hypothesis: H0: $\mu_1 - \mu_2 = D_0$.

    Test statistic value:

    $t_{obs} = \frac{(\bar{x} - \bar{y}) - D_0}{\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}}}$.

    Alternative Hypothesis              Rejection Region for Level $\alpha$ Test
    Ha: $\mu_1 - \mu_2 > D_0$           $t_{obs} \ge t_{\nu,\alpha}$
    Ha: $\mu_1 - \mu_2 < D_0$           $t_{obs} \le -t_{\nu,\alpha}$
    Ha: $\mu_1 - \mu_2 \neq D_0$        $t_{obs} \ge t_{\nu,\alpha/2}$ OR $t_{obs} \le -t_{\nu,\alpha/2}$

    The p-values can be calculated as in the two-sample t-test with equal variance.
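    To illustrate the degrees-of-freedom formula and the unequal-variance statistic, the sketch below reuses the recovery-time data from Example 10.6.2. Applying the unequal-variance procedure to those data is a choice made here for illustration; the notes analyze them under the equal-variance assumption.

```python
from math import floor, sqrt
from statistics import mean, variance

x = [15, 10, 13, 7, 9, 8, 21, 9, 14, 8]       # drug group
y = [15, 14, 12, 8, 14, 7, 16, 10, 15, 12]    # placebo group
m, n = len(x), len(y)

vx, vy = variance(x) / m, variance(y) / n     # s1^2/m and s2^2/n
t_obs = (mean(x) - mean(y) - 0) / sqrt(vx + vy)
nu_raw = (vx + vy) ** 2 / (vx**2 / (m - 1) + vy**2 / (n - 1))
nu = floor(nu_raw)                            # round down, as in the formula
print(round(t_obs, 3), round(nu_raw, 2), nu)
```

    Because the two groups have equal size, the statistic coincides with the pooled one; only the degrees of freedom change (about 16.2, rounded down to 16, instead of 18).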

    Sign Test - Nonparametric Test

    Random sample of size $n$ from a continuous distribution with median $\eta$.

    H0: $\eta = \eta_0$ versus Ha: $\eta > \eta_0$

    Significance level $\alpha$

    Test statistic: $S$ = number of observations $> \eta_0$

    Under H0: $S \sim \text{Bin}(n, 0.5)$

    Form of $RR = \{s_{obs} \ge k\}$

    p-value $= P(S \ge s_{obs})$

    Sign Test

    Example: IQ test

    Refer to Example 10.6.1. The observed IQ scores:

    121, 98, 95, 94, 102, 106, 112, 120, 108, 109

    H0: $\eta = 100$ versus Ha: $\eta > 100$

    $\alpha = 0.05$

    Observed test statistic value: $s_{obs} = 7$

    Under H0: $S \sim \text{Bin}(10, 0.5)$

    p-value $= P(S \ge 7) = 1 - P(S \le 6) = 1 - 0.828 = 0.172$.

    Conclusion: do not reject H0
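    The sign test calculation is short enough to verify exactly with binomial coefficients. A minimal sketch (variable names are illustrative):

```python
from math import comb

scores = [121, 98, 95, 94, 102, 106, 112, 120, 108, 109]
median0 = 100
n = len(scores)

s_obs = sum(1 for v in scores if v > median0)                 # observations above 100
p_value = sum(comb(n, j) for j in range(s_obs, n + 1)) / 2**n  # P(S >= s_obs), S ~ Bin(n, 0.5)
print(s_obs, round(p_value, 3))
```

    The exact tail probability is 176/1024, about 0.172, so H0 is not rejected at level 0.05. The t-test on the same data did reject, which reflects the lower power of the sign test when normality holds.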

    10.7 Testing Variances in Small Normal Samples

    Random sample from $N(\mu, \sigma^2)$. Consider

    H0: $\sigma^2 = \sigma_0^2$ versus Ha: $\sigma^2 \neq \sigma_0^2$ (or $\sigma^2 < \sigma_0^2$ or $\sigma^2 > \sigma_0^2$)

    Significance level $\alpha$

    Test statistic:

    $\chi^2 = \frac{(n - 1)S^2}{\sigma_0^2}$ ($\sim \chi^2_{n-1}$ under H0)

    Two-tailed test:

    Ha: $\sigma^2 \neq \sigma_0^2$

    RR: $\chi^2_{obs} > \chi^2_{\alpha/2,n-1}$ or $\chi^2_{obs} < \chi^2_{1-\alpha/2,n-1}$

    p-value $= 2\min\{P(\chi^2 > \chi^2_{obs}), P(\chi^2 < \chi^2_{obs})\}$

    Right-tailed test:

    Ha: $\sigma^2 > \sigma_0^2$

    RR: $\chi^2_{obs} > \chi^2_{\alpha,n-1}$

    p-value $= P(\chi^2 > \chi^2_{obs})$

    Left-tailed test:

    Ha: $\sigma^2 < \sigma_0^2$

    RR: $\chi^2_{obs} < \chi^2_{1-\alpha,n-1}$

    p-value $= P(\chi^2 < \chi^2_{obs})$

    Example: IQ test

    Example 10.7.1 IQ Example 10.6.1: H0: $\sigma^2 = 100$ versus Ha: $\sigma^2 > 100$. Use significance level $\alpha = 0.05$.

    Recall: $\bar{y} = 106.5$, $s = 9.5$.

    Solution:

    Observed test statistic value:

    Rejection region:

    p-value=

    Two Sample Variance Tests

    Suppose that $S_1^2$ and $S_2^2$ are the sample variances for two independent random samples of size $n_1$ and $n_2$ from distributions $N(\mu_1, \sigma_1^2)$ and $N(\mu_2, \sigma_2^2)$. All parameters are unknown.

    The quantities

    $U_j = \frac{(n_j - 1)S_j^2}{\sigma_j^2}, \quad j = 1, 2$

    are independent $\chi^2$ random variables with $(n_1 - 1)$ and $(n_2 - 1)$ degrees of freedom, respectively, so that

    $F = \frac{U_1/(n_1 - 1)}{U_2/(n_2 - 1)} = \frac{S_1^2 \sigma_2^2}{S_2^2 \sigma_1^2} \sim F_{n_1-1,n_2-1}$.

    Thus, when $\sigma_1^2 = \sigma_2^2$,

    $F = \frac{S_1^2}{S_2^2} \sim F_{n_1-1,n_2-1}$

    To test the equality of the two population variances:

    H0: $\sigma_1^2 = \sigma_2^2$ versus Ha: $\sigma_1^2 \neq \sigma_2^2$ (or $\sigma_1^2 > \sigma_2^2$ or $\sigma_1^2 < \sigma_2^2$)

    Significance level $\alpha$

    Test statistic: $F = S_1^2/S_2^2$

    RR (two-tailed test): $F > F_{n_1-1,n_2-1,\alpha/2}$ or $F < F_{n_1-1,n_2-1,1-\alpha/2} = (F_{n_2-1,n_1-1,\alpha/2})^{-1}$

    RR (right-tailed test): $F > F_{n_1-1,n_2-1,\alpha} = (F_{n_2-1,n_1-1,1-\alpha})^{-1}$

    RR (left-tailed test): $F < F_{n_1-1,n_2-1,1-\alpha} = (F_{n_2-1,n_1-1,\alpha})^{-1}$

    Two Sample Variance Tests: Example

    Example 10.7.2 Compare the variances of the amount of active ingredients in generic and brand-name drugs. Random samples of size 20 (generic) and 30 (brand-name). Data: $s_g^2 = 0.00109\ \text{mg}^2$, $s_b^2 = 0.000384\ \text{mg}^2$. Use level $\alpha = 0.05$ to test

    H0: $\sigma_g^2 = \sigma_b^2$ versus Ha: $\sigma_g^2 > \sigma_b^2$

    10.8 Neyman-Pearson - MP $\alpha$-level test

    Consider a test involving a parameter $\theta$ with test statistic $W$ and rejection region RR. The power of the test:

    $\text{Power}(\theta) = P(\text{reject } H_0 \text{ when the parameter value is } \theta) = P(W \in RR \text{ when the parameter value is } \theta)$

    Relationship between power and $\alpha$, $\beta$:

    Suppose H0: $\theta = \theta_0$, and $\theta_a$ is a parameter value under Ha. Then

    $\text{Power}(\theta_0) = \alpha = P(\text{Reject } H_0 \text{ when } H_0 \text{ is true})$

    $\text{Power}(\theta_a) = 1 - \beta(\theta_a)$

    We would like to choose a level $\alpha$ (Type I error) RR to maximize $\text{Power}(\theta)$ for $\theta$ in Ha, i.e. find the Most Powerful (MP) $\alpha$-level test.

    Neyman-Pearson Lemma

    MP $\alpha$-level Tests

    $Y_1, \ldots, Y_n$ is a rs from a distribution with parameter $\theta$ and likelihood $L(\theta)$. We wish to test:

    H0: $\theta = \theta_0$ versus Ha: $\theta = \theta_a$,

    using level of significance $\alpha$, where $\theta_0$ and $\theta_a$ are given.

    Theorem 1 (The Neyman-Pearson Lemma) For the given level of significance $\alpha$, the test that maximizes $\text{Power}(\theta_a)$ has a RR of the form:

    $RR : \frac{L(\theta_0)}{L(\theta_a)} < k$,

    where $k$ is chosen to ensure that the level (Type I error probability) is $\alpha$. The Most Powerful (MP) $\alpha$-level test is sometimes called the best test.


    Remark: The Neyman-Pearson Lemma gives the Rejection Region, RR, that maximizes the power:

    $$\mathrm{Power}(\theta_a) = P(RR \mid \theta = \theta_a)$$

    given that

    $$P(RR \mid \theta = \theta_0) = \alpha$$


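    MP $\alpha$-level Test

    Example: Beta($\theta$, 1) ($n = 1$)

    Example 10.8.1 One observation ($n = 1$), $Y$ from Beta($\theta$, 1) with pdf:

    $$f(y \mid \theta) = \theta y^{\theta - 1}, \quad 0 < y < 1$$

    (a) Use the N-P lemma to find the $\alpha = .05$ MP test of

    $$H_0 : \theta = 2 \quad \text{versus} \quad H_a : \theta = 1$$

    (b) For the MP 0.05-level test derived above, calculate $\mathrm{Power}(1)$.

    Solution:

    Likelihood: $L(\theta) = f(y \mid \theta) = \theta y^{\theta - 1}$. Therefore, in this case

    $$\frac{L(\theta_0)}{L(\theta_a)} = \frac{L(2)}{L(1)} = \frac{2y}{1 \cdot y^0} = 2y, \quad 0 < y < 1$$

    By the N-P Lemma, the rejection region for the MP test:

    $$RR = \{2Y < k\} \text{ or } \{Y < k/2\}$$

    Determining $k$:

    $$\alpha = P(RR \mid \theta = \theta_0 = 2) = P(Y < k/2 \mid \theta = 2) = \int_0^{k/2} 2y \, dy = y^2 \Big|_0^{k/2} = (k/2)^2$$

    $$\Rightarrow k/2 = \sqrt{\alpha} \Rightarrow k = 2\sqrt{\alpha}$$

    Therefore, for this problem, the MP 0.05-level test has the rejection region

    $$RR : Y < \sqrt{0.05} = 0.2236$$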


    (b) Solution (compute $\mathrm{Power}(1)$):

    $$\mathrm{Power}(1) = P(RR \mid \theta = \theta_a = 1) = P(Y < 0.2236 \mid \theta = 1) = \int_0^{0.2236} 1 \cdot y^0 \, dy = \int_0^{0.2236} 1 \, dy = y \Big|_0^{0.2236} = 0.2236.$$
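The level and power in Example 10.8.1 can be double-checked with the Beta($\theta$, 1) CDF, $F(y \mid \theta) = y^\theta$. The following is a minimal sketch; the function name `beta_theta1_cdf`, the seed, and the simulation size are illustrative choices rather than anything from the slides.

```python
import math
import random

def beta_theta1_cdf(y, theta):
    """CDF of Beta(theta, 1) on (0, 1): F(y | theta) = y ** theta."""
    return y ** theta

alpha = 0.05
cutoff = math.sqrt(alpha)               # RR: Y < sqrt(alpha)

level = beta_theta1_cdf(cutoff, 2)      # P(RR | theta = 2), should equal alpha
power = beta_theta1_cdf(cutoff, 1)      # P(RR | theta = 1)

# Monte Carlo check of the level: if U ~ Uniform(0, 1), then U ** (1 / theta) ~ Beta(theta, 1).
rng = random.Random(0)
n_sim = 200_000
sim_level = sum(rng.random() ** 0.5 < cutoff for _ in range(n_sim)) / n_sim

print(f"cutoff = {cutoff:.4f}, level = {level:.4f}, power = {power:.4f}, simulated level = {sim_level:.4f}")
```

The power is low because a single observation carries little information for separating $\theta = 2$ from $\theta = 1$.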


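    MP $\alpha$-level Test

    Normal Sample, $\sigma^2$ known

    $Y_1, \ldots, Y_n$ is a random sample from $N(\mu, \sigma^2)$, with $\sigma^2$ known. Test $H_0 : \mu = \mu_0$ versus $H_a : \mu = \mu_a$, where $\mu_a > \mu_0$.

    The pdf of $Y_i$ is

    $$f(y \mid \mu) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{ -\frac{(y - \mu)^2}{2\sigma^2} \right\}, \quad -\infty < y < +\infty.$$

    Use the N-P lemma to find the MP $\alpha$-level test procedure.

    Solution:

    From the pdf, we obtain the likelihood for $\mu$:

    $$L(\mu) = f(y_1 \mid \mu) f(y_2 \mid \mu) \cdots f(y_n \mid \mu) = \left( \frac{1}{\sqrt{2\pi}\,\sigma} \right)^n \exp\left\{ -\sum_{i=1}^n \frac{(y_i - \mu)^2}{2\sigma^2} \right\}$$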


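    By the N-P Lemma, the form of the MP $\alpha$-level test rejection region is:

    $$RR : \frac{L(\mu_0)}{L(\mu_a)} < k.$$

    Substituting the likelihood:

    $$\frac{L(\mu_0)}{L(\mu_a)} = \frac{\left( \frac{1}{\sqrt{2\pi}\,\sigma} \right)^n \exp\left\{ -\sum_{i=1}^n \frac{(y_i - \mu_0)^2}{2\sigma^2} \right\}}{\left( \frac{1}{\sqrt{2\pi}\,\sigma} \right)^n \exp\left\{ -\sum_{i=1}^n \frac{(y_i - \mu_a)^2}{2\sigma^2} \right\}} < k$$

    $$\Leftrightarrow \exp\left\{ -\frac{1}{2\sigma^2} \left[ \sum_{i=1}^n (y_i - \mu_0)^2 - \sum_{i=1}^n (y_i - \mu_a)^2 \right] \right\} < k$$

    $$\Leftrightarrow -\frac{1}{2\sigma^2} \left[ \sum_{i=1}^n (y_i - \mu_0)^2 - \sum_{i=1}^n (y_i - \mu_a)^2 \right] < \ln(k)$$

    $$\Leftrightarrow \sum_{i=1}^n (y_i - \mu_0)^2 - \sum_{i=1}^n (y_i - \mu_a)^2 > -2\sigma^2 \ln(k)$$

    Expanding the squares:

    $$\left[ \sum_{i=1}^n y_i^2 - 2n\bar{y}\mu_0 + n\mu_0^2 \right] - \left[ \sum_{i=1}^n y_i^2 - 2n\bar{y}\mu_a + n\mu_a^2 \right] > -2\sigma^2 \ln(k)$$

    $$\Leftrightarrow 2n\bar{y}(\mu_a - \mu_0) + n\mu_0^2 - n\mu_a^2 > -2\sigma^2 \ln(k)$$

    $$\Leftrightarrow 2n\bar{y}(\mu_a - \mu_0) > -2\sigma^2 \ln(k) - n\mu_0^2 + n\mu_a^2$$

    Since $\mu_a - \mu_0 > 0$,

    $$RR : \bar{y} > \frac{-2\sigma^2 \ln(k) - n\mu_0^2 + n\mu_a^2}{2n(\mu_a - \mu_0)}.$$

    Note that the right-hand side of the above does not involve the data, so the inequality is equivalent to $\bar{y} > k^*$ for some constant $k^*$. So the form of the RR becomes:

    $$RR : \bar{y} > k^*$$

    To determine $k^*$, we set

    $$\alpha = P(\bar{Y} \in RR \mid \mu = \mu_0) = P(\bar{Y} > k^* \mid \mu = \mu_0) = P\left( \frac{\bar{Y} - \mu_0}{\sigma/\sqrt{n}} > \frac{k^* - \mu_0}{\sigma/\sqrt{n}} \,\Big|\, \mu = \mu_0 \right) = P\left( Z > \frac{k^* - \mu_0}{\sigma/\sqrt{n}} \right)$$

    Therefore,

    $$\frac{k^* - \mu_0}{\sigma/\sqrt{n}} = z_\alpha \Rightarrow k^* = \mu_0 + z_\alpha \frac{\sigma}{\sqrt{n}}$$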


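    Thus, the MP $\alpha$-level test of

    $$H_0 : \mu = \mu_0 \quad \text{versus} \quad H_a : \mu = \mu_a, \text{ where } \mu_a > \mu_0$$

    has the rejection region:

    $$RR : \bar{y} > \mu_0 + z_\alpha \frac{\sigma}{\sqrt{n}}$$

    or equivalently:

    $$RR : Z = \frac{\bar{y} - \mu_0}{\sigma/\sqrt{n}} > z_\alpha.$$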
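To make the rejection region concrete, the sketch below computes the critical value $\mu_0 + z_\alpha \sigma/\sqrt{n}$ and the resulting power with the standard library's `statistics.NormalDist`. The numbers ($\mu_0 = 0$, $\sigma = 1$, $n = 25$, $\mu_a = 0.5$) and the function names are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def mp_critical_value(mu0, sigma, n, alpha):
    """Critical value for RR: ybar > mu0 + z_alpha * sigma / sqrt(n)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # upper-alpha standard normal quantile
    return mu0 + z_alpha * sigma / sqrt(n)

def power(mu, crit, sigma, n):
    """P(Ybar > crit) when Ybar ~ N(mu, sigma^2 / n)."""
    return 1 - NormalDist(mu, sigma / sqrt(n)).cdf(crit)

# Hypothetical inputs, not from the slides
mu0, sigma, n, alpha = 0.0, 1.0, 25, 0.05

crit = mp_critical_value(mu0, sigma, n, alpha)
level = power(mu0, crit, sigma, n)       # equals alpha by construction
power_at_half = power(0.5, crit, sigma, n)

print(f"critical value = {crit:.4f}, level = {level:.4f}, Power(0.5) = {power_at_half:.4f}")
```

The critical value is computed without reference to $\mu_a$; only the power depends on it.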


    Uniformly Most Powerful (UMP) $\alpha$-level test

    In the normal sample example with known $\sigma$, note that the test does not depend on $\mu_a$ (except that we need the assumption $\mu_a > \mu_0$).

    Because of this, the same test is the most powerful level-$\alpha$ test for every $H_a : \mu = \mu_a$ with $\mu_a > \mu_0$. We call such a test the Uniformly Most Powerful (UMP) $\alpha$-level test of

    $$H_0 : \mu = \mu_0 \quad \text{versus} \quad H_a : \mu > \mu_0.$$

    Simple hypothesis: a hypothesis that uniquely specifies the distribution of the population from which the sample is taken.

    Composite hypothesis: a hypothesis that is not simple. E.g. for the above normal sample example with known $\sigma$, $H : \mu = \mu_0$ is simple, while $H : \mu > \mu_0$ is composite.

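    MP $\alpha$-level test

    Exponential Sample

    Example 10.8.2 $Y_1, \ldots, Y_n$ is a random sample from Exp($\theta$) with pdf:

    $$f(y) = \frac{1}{\theta} e^{-y/\theta}, \quad 0 < y < +\infty$$

    (1) Use the N-P Lemma to construct the MP $\alpha$-level test for

    $$H_0 : \theta = \theta_0 \quad \text{versus} \quad H_a : \theta = \theta_a, \text{ where } \theta_a > \theta_0$$

    Hint: $\sum_{i=1}^n Y_i \sim \mathrm{Gamma}(n, \theta)$.

    (2) Construct the UMP $\alpha$-level test for

    $$H_0 : \theta = \theta_0 \quad \text{versus} \quad H_a : \theta > \theta_0.$$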
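The slides leave parts (1) and (2) to be worked out. One way the argument can go, following the same steps as the normal example, is sketched below; the constant $k^*$ is notation introduced here for the critical value on the scale of $\sum y_i$.

```latex
\begin{align*}
\frac{L(\theta_0)}{L(\theta_a)}
  &= \frac{\theta_0^{-n} \exp\{-\sum_{i=1}^n y_i / \theta_0\}}
          {\theta_a^{-n} \exp\{-\sum_{i=1}^n y_i / \theta_a\}}
   = \left(\frac{\theta_a}{\theta_0}\right)^{n}
     \exp\left\{-\left(\frac{1}{\theta_0} - \frac{1}{\theta_a}\right)\sum_{i=1}^n y_i\right\} < k \\
  &\Longleftrightarrow
     -\left(\frac{1}{\theta_0} - \frac{1}{\theta_a}\right)\sum_{i=1}^n y_i
     < \ln(k) - n \ln\left(\frac{\theta_a}{\theta_0}\right) \\
  &\Longleftrightarrow
     \sum_{i=1}^n y_i > k^{*}
     \qquad \text{since } \frac{1}{\theta_0} - \frac{1}{\theta_a} > 0 \text{ when } \theta_a > \theta_0,
\end{align*}
where $k^{*}$ satisfies
$\alpha = P\left(\sum_{i=1}^n Y_i > k^{*} \mid \theta = \theta_0\right)$
with $\sum_{i=1}^n Y_i \sim \mathrm{Gamma}(n, \theta_0)$ under $H_0$.
```

Because $k^*$ is determined by $\theta_0$, $n$, and $\alpha$ alone, the same rejection region is most powerful against every $\theta_a > \theta_0$, which gives the UMP test asked for in part (2).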


    (3) For $n = 36$, we wish to test

    $$H_0 : \theta = 1 \quad \text{versus} \quad H_a : \theta = 2 \ (\text{or } \theta > 1)$$

    using level $\alpha = 0.01$. Use StaTable to find the critical value at http://www.cytel.com/Products/StaTable/ or http://mcsp.wartburg.edu/nmb/fall10/math313/seeingstats/Chpt4/gammaProb.html

    For $\mathrm{Gamma}(\alpha, \beta)$, $\alpha$ is the shape parameter and $\beta$ is the scale parameter.

    (4) (Large Sample Approach) Use the CLT to obtain an approximate test for the hypotheses in (3).
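As an alternative to the online calculators, the critical values for parts (3) and (4) can be approximated with the standard library. This is a sketch under the assumption that the test rejects for large $\sum Y_i$ (equivalently large $\bar{Y}$); the helper names `erlang_sf` and `upper_quantile` are hypothetical. For integer shape $n$, the Gamma upper tail has the closed form $P(S > x) = \sum_{j=0}^{n-1} e^{-x/\theta} (x/\theta)^j / j!$, and the quantile is found by bisection.

```python
from math import exp, sqrt
from statistics import NormalDist

def erlang_sf(x, n, theta):
    """P(S > x) for S ~ Gamma(shape=n, scale=theta), integer n."""
    u = x / theta
    term = exp(-u)          # j = 0 term
    total = term
    for j in range(1, n):
        term *= u / j       # builds e^{-u} u^j / j! iteratively
        total += term
    return total

def upper_quantile(n, theta, alpha):
    """Value c with P(S > c) = alpha, found by bisection (the tail is decreasing in c)."""
    lo, hi = 0.0, 10.0 * n * theta
    for _ in range(200):
        mid = (lo + hi) / 2
        if erlang_sf(mid, n, theta) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n, theta0, theta_a, alpha = 36, 1.0, 2.0, 0.01

# Part (3): exact critical value for sum(Y_i) under H0, and power at theta = 2
exact_sum_crit = upper_quantile(n, theta0, alpha)
exact_power = erlang_sf(exact_sum_crit, n, theta_a)

# Part (4): CLT approximation, Ybar ~ N(theta0, theta0^2 / n) under H0
z_alpha = NormalDist().inv_cdf(1 - alpha)
clt_mean_crit = theta0 + z_alpha * theta0 / sqrt(n)

print(f"exact: reject if sum(Y) > {exact_sum_crit:.3f} (Ybar > {exact_sum_crit / n:.4f})")
print(f"CLT:   reject if Ybar > {clt_mean_crit:.4f}")
print(f"power at theta = 2 (exact test): {exact_power:.4f}")
```

The CLT cutoff on $\bar{Y}$ comes out somewhat smaller than the exact one because the Gamma distribution is right-skewed, so the approximate test has a true level slightly above 0.01.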


    Summary: Neyman-Pearson Lemma

    MP $\alpha$-level Tests

    The N-P Lemma provides the test statistic and the form of the RR for the MP test. The constant (critical value) must be determined to ensure that the test is level $\alpha$.

    It is not always possible to find a UMP test.

    The N-P Lemma cannot be applied if there are unknown parameters other than $\theta$.

    If the rvs in the random sample are discrete, then it is usually not possible to achieve a given level of significance $\alpha$ exactly.


    10.9 Likelihood Ratio Test

    Likelihood Ratio Test (LRT)

    An approach for developing tests when either or both of the hypotheses are composite.

    Can be used when the model for the data has more than one parameter: $\theta_1, \theta_2, \ldots, \theta_k$.

    We will refer to the vector of parameters:

    $$\Theta = (\theta_1, \theta_2, \ldots, \theta_k)$$

    The likelihood of the random sample:

    $$L(\theta_1, \theta_2, \ldots, \theta_k) = L(\Theta)$$


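    LRT: Normal Example

    Let $Y_1, \ldots, Y_n$ be a random sample from $N(\mu, \sigma^2)$, where $\sigma^2$ is unknown. Then

    $$\Theta = (\mu, \sigma^2).$$

    We wish to test

    $$H_0 : \mu = \mu_0 \quad \text{versus} \quad H_a : \mu \neq \mu_0$$

    Note that the null hypothesis is actually:

    $$H_0 : \mu = \mu_0,\ \sigma^2 > 0,$$

    i.e. $H_0$ is not simple (it is composite).

    In this situation, $H_0$ states that the parameters fall in a particular set, i.e. $\Theta$ is in the set $\Omega_0$. We write $\Theta \in \Omega_0$, where $\Omega_0$ is the parameter space under the null hypothesis.

    For our example,

    $$H_0 : \Theta \in \Omega_0 = \{(\mu, \sigma^2) : \mu = \mu_0,\ \sigma^2 > 0\}$$

    The alternative states:

    $$H_a : \Theta \in \{(\mu, \sigma^2) : \mu \neq \mu_0,\ \sigma^2 > 0\} = \Omega_a,$$

    where $\Omega_a$ denotes the parameter space under the alternative hypothesis.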


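    We denote the union of the sets $\Omega_0$ and $\Omega_a$ by $\Omega$, i.e.

    $$\Omega_0 \cup \Omega_a = \Omega$$

    In our example:

    $$\Omega = \Omega_0 \cup \Omega_a = \{(\mu, \sigma^2) : \mu = \mu_0,\ \sigma^2 > 0\} \cup \{(\mu, \sigma^2) : \mu \neq \mu_0,\ \sigma^2 > 0\} = \{(\mu, \sigma^2) : -\infty < \mu < +\infty,\ \sigma^2 > 0\}$$

    That is, $\Omega$ = set of all possible values of the parameters, without regard to the hypotheses.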


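    Likelihood Ratio Test (LRT)

    Notations:

    $$L(\hat{\Omega}_0) = \max_{\Theta \in \Omega_0} L(\Theta)$$

    denotes the maximum of the likelihood over the parameter values in $\Omega_0$ (under $H_0$).

    $$L(\hat{\Omega}) = \max_{\Theta \in \Omega} L(\Theta)$$

    denotes the maximum of the likelihood over the parameter values in $\Omega$.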


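    Likelihood Ratio Test (LRT)

    This is always true:

    $$L(\hat{\Omega}_0) \le L(\hat{\Omega}),$$

    since the space $\Omega$ contains $\Omega_0$. If the maximum over $\Omega$ falls in $\Omega_0$, then

    $$L(\hat{\Omega}_0) = L(\hat{\Omega}).$$

    Evidence that $H_0$ is false (and $H_a$ is true) is that

    $$L(\hat{\Omega}_0) < L(\hat{\Omega}),$$

    i.e. that the likelihood ratio $\lambda = L(\hat{\Omega}_0) / L(\hat{\Omega})$ is small.

    For the normal example above,

    $$\Omega_0 = \{(\mu, \sigma^2) : \mu = \mu_0,\ \sigma^2 > 0\}, \quad \Omega = \{(\mu, \sigma^2) : -\infty < \mu < +\infty,\ \sigma^2 > 0\}$$

    Use the LRT method to find the RR for a level $\alpha$ test.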
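For the normal example, the maximizers are the usual MLEs: over $\Omega$, $\hat{\mu} = \bar{y}$ and $\hat{\sigma}^2 = \frac{1}{n}\sum (y_i - \bar{y})^2$; over $\Omega_0$, $\mu = \mu_0$ and $\hat{\sigma}_0^2 = \frac{1}{n}\sum (y_i - \mu_0)^2$. The sketch below evaluates $\lambda$ numerically and checks the standard identity $\lambda^{2/n} = 1 / (1 + t^2/(n-1))$, which shows that rejecting for small $\lambda$ is the same as rejecting for large $|t|$. The data and $\mu_0$ are made up for illustration.

```python
from math import exp, log, pi, sqrt

def normal_loglik(ys, mu, sigma2):
    """Log-likelihood of an i.i.d. N(mu, sigma2) sample."""
    n = len(ys)
    return -0.5 * n * log(2 * pi * sigma2) - sum((y - mu) ** 2 for y in ys) / (2 * sigma2)

# Hypothetical data and null value, not from the slides
ys = [5.1, 6.3, 4.8, 5.9, 6.7, 5.4, 6.1, 5.0]
mu0 = 6.0

n = len(ys)
ybar = sum(ys) / n
sigma2_hat = sum((y - ybar) ** 2 for y in ys) / n    # MLE of sigma^2 over Omega
sigma2_hat0 = sum((y - mu0) ** 2 for y in ys) / n    # MLE of sigma^2 over Omega_0

# ln(lambda) = ln L(Omega_0 hat) - ln L(Omega hat)
log_lambda = normal_loglik(ys, mu0, sigma2_hat0) - normal_loglik(ys, ybar, sigma2_hat)

# One-sample t statistic with the usual (n - 1)-denominator sample variance
s = sqrt(n * sigma2_hat / (n - 1))
t = (ybar - mu0) / (s / sqrt(n))

lhs = exp(2 * log_lambda / n)          # lambda^(2/n)
rhs = 1 / (1 + t ** 2 / (n - 1))

print(f"lambda = {exp(log_lambda):.4f}, t = {t:.4f}, lambda^(2/n) = {lhs:.6f}, 1/(1+t^2/(n-1)) = {rhs:.6f}")
```

Since $\lambda$ is a decreasing function of $t^2$, the exact level-$\alpha$ LRT here coincides with the two-sided one-sample t-test.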


    LRT: Large Sample RR

    Theorem 2 Let $Y_1, \ldots, Y_n$ have a joint likelihood $L(\Theta)$. Let

    $r_0$ = number of free parameters in $\Omega_0$,

    $r$ = number of free parameters in $\Omega$.

    Assuming that certain regularity conditions hold, then under $H_0$ and for large $n$, $-2\ln(\lambda)$ has approximately a $\chi^2$ distribution with $r - r_0$ degrees of freedom.


    LRT: Normal Example

    Use Theorem 2 to derive the level $\alpha$ large-sample RR.


    LRT Example 2: Poisson Dispersion Test

    Let $X_1, X_2, \ldots, X_n$ be independent random variables with $X_i \sim \mathrm{Poisson}(\lambda_i)$, $i = 1, 2, \ldots, n$, with pmf:

    $$P(X_i = x_i) = \frac{e^{-\lambda_i} \lambda_i^{x_i}}{x_i!}, \quad x_i = 0, 1, \ldots;\ \lambda_i > 0$$

    We wish to test

    $$H_0 : \lambda_i = \lambda,\ i = 1, 2, \ldots, n \quad \text{versus} \quad H_a : \lambda_i \text{ are not all equal.}$$

    Assume that $n$ is large. Construct an approximate level $\alpha$ LRT.

    Example 2: Poisson Dispersion Test

    NIST test of asbestos fibers: number of fibers on 23 squares of a grid:

    31 29 19 18 31 28 34 27 34 30 16 18 26 27 27 18 24 22 28 24 21 17 24
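A sketch of the large-sample LRT for these counts follows. Under $H_0$ the MLE is $\hat{\lambda} = \bar{x}$, while under $\Omega$ each $\hat{\lambda}_i = x_i$, which gives $-2\ln(\text{LR}) = 2\sum x_i \ln(x_i / \bar{x})$ with $r - r_0 = n - 1$ degrees of freedom. The helper `chi2_sf_even_df` is a hypothetical name; it uses the closed-form chi-square tail that is available when the degrees of freedom are even, which happens to hold here (22).

```python
from math import exp, log

counts = [31, 29, 19, 18, 31, 28, 34, 27, 34, 30, 16, 18,
          26, 27, 27, 18, 24, 22, 28, 24, 21, 17, 24]

n = len(counts)
xbar = sum(counts) / n

# -2 ln(LR) = 2 * sum x_i * ln(x_i / xbar); a count of 0 would contribute 0
stat = 2 * sum(x * log(x / xbar) for x in counts if x > 0)
df = n - 1          # r = n free parameters in Omega, r0 = 1 in Omega_0

def chi2_sf_even_df(x, df):
    """P(chi2_df > x) for even df: e^{-x/2} * sum_{j < df/2} (x/2)^j / j!."""
    if df % 2 != 0:
        raise ValueError("closed form used here requires even df")
    u = x / 2
    term = exp(-u)
    total = term
    for j in range(1, df // 2):
        term *= u / j
        total += term
    return total

p_value = chi2_sf_even_df(stat, df)
alpha = 0.05        # illustrative level; the slide leaves alpha general
reject = p_value < alpha

print(f"xbar = {xbar:.3f}, -2 ln(LR) = {stat:.3f}, df = {df}, p-value = {p_value:.3f}, reject H0: {reject}")
```

At the illustrative 0.05 level the test does not reject, so these counts give no strong evidence against a common Poisson rate.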


    Example 3: Binomials

    Let $X_i \sim \mathrm{Binomial}(n_i, p_i)$, $i = 1, 2$, be independent. We wish to test

    $$H_0 : p_1 = p_2 \quad \text{versus} \quad H_a : p_1 \neq p_2$$

    $$\Omega_0 = \{(p_1, p_2) : 0 < p_1 = p_2 < 1\}$$

    $$\Omega_a = \{(p_1, p_2) : 0 < p_1 < 1,\ 0 < p_2 < 1,\ p_1 \neq p_2\}$$

    Both hypotheses are composite. Suppose the $n_i$ are large. Carry out an approximate level $\alpha$ LRT.


    Example 3: Binomials

    Clinical Trial: Allergy Medicine versus Placebo

    Randomization of 3774 subjects:

    Allergy medicine group: $n_1 = 2103$, $x_1 = 547$ reported headaches

    Placebo group: $n_2 = 1671$, $x_2 = 368$ reported headaches

    Test whether the proportion of those reporting headaches is different in the two groups, using significance level $\alpha = 0.05$.

    Ref: Michael Sullivan, III. (2004) Statistics: Informed Decisions Using Data.
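The approximate LRT for the clinical trial data can be sketched as follows. Under $H_0$ the common proportion is estimated by the pooled $\hat{p} = (x_1 + x_2)/(n_1 + n_2)$; under $\Omega$ each $\hat{p}_i = x_i / n_i$; and $r - r_0 = 2 - 1 = 1$. Because a $\chi^2_1$ variable is the square of a standard normal, the critical value and p-value are computed from `statistics.NormalDist`. The function name `binom_loglik` is illustrative.

```python
from math import log, sqrt
from statistics import NormalDist

n1, x1 = 2103, 547    # allergy medicine group
n2, x2 = 1671, 368    # placebo group
alpha = 0.05

p1_hat = x1 / n1
p2_hat = x2 / n2
p_pool = (x1 + x2) / (n1 + n2)    # MLE of the common p under H0

def binom_loglik(x, n, p):
    """Binomial log-likelihood without the log C(n, x) term, which cancels in the ratio."""
    return x * log(p) + (n - x) * log(1 - p)

loglik_null = binom_loglik(x1, n1, p_pool) + binom_loglik(x2, n2, p_pool)
loglik_full = binom_loglik(x1, n1, p1_hat) + binom_loglik(x2, n2, p2_hat)

stat = -2 * (loglik_null - loglik_full)              # -2 ln(lambda), approx chi2 with 1 df

crit = NormalDist().inv_cdf(1 - alpha / 2) ** 2      # chi2_{1, alpha} = z_{alpha/2}^2
p_value = 2 * (1 - NormalDist().cdf(sqrt(stat)))     # P(chi2_1 > stat)
reject = stat > crit

print(f"p1_hat = {p1_hat:.4f}, p2_hat = {p2_hat:.4f}, -2 ln(lambda) = {stat:.3f}, "
      f"critical value = {crit:.3f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

The statistic is close to the square of the familiar two-sample z statistic for proportions, as expected for large $n_i$.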


    Summary: LRT

    The likelihood ratio approach does not guarantee an optimum test (unlike the N-P Lemma).

    Using the likelihood ratio approach will customarily provide an acceptable test.

    Unlike the N-P Lemma, the likelihood ratio approach can be applied where the underlying model has nuisance parameters (parameters not of particular interest).

    Summary of Chapter 10

    Four Basic Elements of a Statistical Test: (1) $H_0$; (2) $H_a$; (3) test statistic; (4) rejection region

    Error probabilities:

    Type I error probability (level): $\alpha = P(\text{reject } H_0 \mid H_0 \text{ is true})$ (sending an innocent person to jail)

    Type II error probability: $\beta(\theta_a) = P(\text{fail to reject } H_0 \mid H_a \text{ is true with } \theta = \theta_a)$ (setting a guilty person free)

    $\mathrm{Power}(\theta) = P(\text{reject } H_0 \text{ with parameter value } \theta)$

    Large sample tests

    Suppose $\hat{\theta} \sim N(\theta, \sigma_{\hat{\theta}}^2)$, exactly or approximately. For instance, apply the CLT when $\hat{\theta}$ is consistent and unbiased for $\theta$, and $n$ is large.

    $\sigma_{\hat{\theta}}^2$ is known or can be estimated consistently by $\hat{\sigma}_{\hat{\theta}}^2$.

    $H_0 : \theta = \theta_0$ versus $H_a : \theta > \theta_0$

    Test statistic: $T = \hat{\theta}$ or $Z = (\hat{\theta} - \theta_0) / \sigma_{\hat{\theta}}$

    RR: $(\hat{\theta} - \theta_0) / \sigma_{\hat{\theta}} > z_\alpha$, or equivalently $\hat{\theta} > \theta_0 + z_\alpha \sigma_{\hat{\theta}}$

    P-value $= P(Z > z_{obs})$ (for right-tailed z-test)

    Typical examples: test on population means (one-sample z-test); test on two population mean difference (two-sample z-test); test on population proportions (one-sample z-test); test on two proportion difference (two-sample z-test)

    Calculation of $\beta(\theta_a)$ and Power for a level $\alpha$ test procedure

    Determination of sample size to control the Type II error probability at level $\beta$ for a level $\alpha$ test procedure. Examples discussed include: test on means with one-sample z-test (SAT prep course); test on means with two-sample z-test (WMS 10.44); test on population proportions with one-sample z-test (Beer-tasting)

    Test-CI relationship: CIs are complements of RRs; these complements are also called acceptance regions

    p-value: observed significance level; the smallest level of significance $\alpha$ for which the observed value indicates that $H_0$ should be rejected; the tail area captured by the observed test statistic. Reject $H_0$ when p-value $< \alpha$

    Testing mean in small normal samples: one-sample t-test; two-sample t-test (assuming equal variance)


    Testing variance in small normal samples: $\chi^2$-test for one population variance; F-test for testing the equality of two population variances

    Neyman-Pearson Lemma

    Use the N-P Lemma to construct the MP $\alpha$-level test for testing simple hypotheses $H_0$ and $H_a$

    Based on the MP test, construct the uniformly MP $\alpha$-level test for composite hypotheses

    Only one unknown parameter is involved

    Likelihood-ratio test

    Test statistic: $\lambda = L(\hat{\Omega}_0) / L(\hat{\Omega})$

    $L(\hat{\Omega}_0)$ (restricted likelihood): the maximum value of the likelihood when the parameters are restricted (and reduced in number) based on the assumption of $H_0$

    $L(\hat{\Omega})$ (unrestricted likelihood): the maximum value of the likelihood when the parameters are unrestricted, i.e. obtained over the entire parameter space

    Construct the RR for a level $\alpha$ test based on the sampling distribution of $\lambda$

    Large sample procedure: for large $n$, when the distribution satisfies some regularity conditions, $-2\ln(\lambda)$ is approximately $\chi^2_{r - r_0}$ distributed, where

    $r$: number of free parameters in the whole parameter space (unrestricted)

    $r_0$: number of free parameters in the restricted parameter space (under $H_0$)