12주차

84
Introduction to Probability and Statistics 12 th Week (5/31) Hypothesis Testing (II)

description

 

Transcript of 12주차

Page 1: 12주차

Introduction to Probability and Statistics12th Week (5/31)

Hypothesis Testing (II)

Page 2: 12주차

중간고사

Page 3: 12주차

중간고사

Page 4: 12주차

Statistical Hypotheses and Null Hypotheses

Statistical hypotheses: Assumptions or guesses about the populations involved. (Such assumptions, which may or may not be true)

Null hypotheses (H0): Hypothesis that there is no difference between the procedures. We formulate it if we want to decide whether one procedure is better than another.

Alternative hypotheses (H1): Any hypothesis that differs from a given null hypothesis

Example 1. For example, if the null hypothesis is p = 0.5, possible alternative hypotheses are p =0.7, or p ≠ 0.5.

Page 5: 12주차

Table 7-2 Type I and Type II Errors

True State of Nature

We decide to

reject the

null hypothesis

We fail to

reject the

null hypothesis

The null

hypothesis is

true

The null

hypothesis is

false

Type I error

(rejecting a true

null hypothesis)

Type II error

(failing to reject

a false null

hypothesis)

Correct

decision

Correct

decision

Decision

Page 6: 12주차

P Value

Small P values provide evidence for rejecting the null hypothesis in favor of the alternative hypothesis, and large P values provide evidence for not rejecting the null hypothesis in favor of the alternative hypothesis.

The P value and the level of significance do not provide criteria for rejecting or not rejecting the null hypothesis by itself, but for rejecting or not rejecting the null hypothesis in favor of the alternative hypothesis.

When the test statistic S is the standard normal random variable, the table in Appendix C is sufficient to compute the P value, but when S is one of the t, F, or chi-square random variables, all of which have different distributions depending on their degrees of freedom, either computer software or more extensive tables will be needed to compute the P value.

Page 7: 12주차

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 11.7

Concepts of Hypothesis Testing…

For example, if we’re trying to decide whether the mean is not equal to 350, a large value of (say, 600) would provide enough evidence.

If is close to 350 (say, 355) we could not say that this provides a great deal of evidence to infer that the population mean is different than 350.

Page 8: 12주차

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 11.8

Concepts of Hypothesis Testing (4)…The two possible decisions that can be made:

Conclude that there is enough evidence to support the alternative hypothesis

(also stated as: reject the null hypothesis in favor of the alternative)

Conclude that there is not enough evidence to support the alternative hypothesis

(also stated as: failing to reject the null hypothesis in favor of the alternative)

NOTE: we do not say that we accept the null hypothesis if a statistician is around…

Page 9: 12주차

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 11.9

Concepts of Hypothesis Testing (2)…

The testing procedure begins with the assumption that the null hypothesis is true.

Thus, until we have further statistical evidence, we will assume:

H0: = 350 (assumed to be TRUE)

The next step will be to determine the sampling distribution of the sample mean assuming the true mean is 350.

is normal with 350

75/SQRT(25) = 15

Page 10: 12주차

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 11.10

Is the Sample Mean in the Guts of the Sampling Distribution??

Page 11: 12주차

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 11.11

Three ways to determine this: First way

1. Unstandardized test statistic: Is in the guts of the sampling distribution? Depends on what you define as the “guts” of the sampling distribution.

If we define the guts as the center 95% of the distribution [this means = 0.05], then the critical values that define the guts will be 1.96 standard deviations of X-Bar on either side of the mean of the sampling distribution [350], or

UCV = 350 + 1.96*15 = 350 + 29.4 = 379.4

LCV = 350 – 1.96*15 = 350 – 29.4 = 320.6

Page 12: 12주차

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 11.12

1. Unstandardized Test Statistic Approach

Page 13: 12주차

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 11.13

Three ways to determine this: Second way2. Standardized test statistic: Since we defined the “guts” of the sampling distribution to be the center 95% [ = 0.05],

If the Z-Score for the sample mean is greater than 1.96, we know that will be in the reject region on the right side or

If the Z-Score for the sample mean is less than -1.97, we know that will be in the reject region on the left side.

Z = ( - )/ = (370.16 – 350)/15 = 1.344

Is this Z-Score in the guts of the sampling distribution???

Page 14: 12주차

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 11.14

2. Standardized Test Statistic Approach

Page 15: 12주차

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 11.15

Three ways to determine this: Third way

3. The p-value approach (which is generally used with a computer and statistical software): Increase the “Rejection Region” until it “captures” the sample mean.

For this example, since is to the right of the mean, calculate

P( > 370.16) = P(Z > 1.344) = 0.0901

Since this is a two tailed test, you must double this area for the p-value.

p-value = 2*(0.0901) = 0.1802

Since we defined the guts as the center 95% [ = 0.05], the reject region is the other 5%. Since our sample mean, , is in the 18.02% region, it cannot be in our 5% rejection region [ = 0.05].

Page 16: 12주차

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 11.16

3. p-value approach

Page 17: 12주차

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 11.17

Statistical Conclusions:

Unstandardized Test Statistic:

Since LCV (320.6) < (370.16) < UCV (379.4), we reject the null hypothesis at a 5% level of significance.

Standardized Test Statistic:

Since -Z/2(-1.96) < Z(1.344) < Z/2 (1.96), we fail to reject the null hypothesis at a 5% level of significance.

P-value:

Since p-value (0.1802) > 0.05 [], we fail to reject the hull hypothesis at a 5% level of significance.

Page 18: 12주차

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 11.18

Example 11.1…

A department store manager determines that a new billing system will be cost-effective only if the mean monthly account is more than $170.

A random sample of 400 monthly accounts is drawn, for which the sample mean is $178. The accounts are approximately normally distributed with a standard deviation of $65.

Can we conclude that the new system will be cost-effective?

Page 19: 12주차

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 11.19

Example 11.1…

The system will be cost effective if the mean account balance for all customers is greater than $170.

We express this belief as a our research hypothesis, that is:

H1: > 170 (this is what we want to determine)

Thus, our null hypothesis becomes:

H0: = 170 (this specifies a single value for the parameter of interest) – Actually H0: μ < 170

Page 20: 12주차

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 11.20

Example 11.1…

What we want to show:

H1: > 170

H0: < 170 (we’ll assume this is true)

Normally we put Ho first.

We know:

n = 400,

= 178, and

= 65

= 65/SQRT(400) = 3.25

= 0.05

Page 21: 12주차

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 11.21

Example 11.1… Rejection Region…

The rejection region is a range of values such that if the test statistic falls into that range, we decide to reject the null hypothesis in favor of the alternative hypothesis.

is the critical value of to reject H0.

Page 22: 12주차

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 11.22

Example 11.1…At a 5% significance level (i.e. =0.05), we get [all in one tail]

Z = Z0.05 = 1.645

Therefore, UCV = 170 + 1.645*3.25 = 175.35Since our sample mean (178) is greater than the critical value we calculated (175.35), we reject the null hypothesis in favor of H1

OR

(>1.645) Reject null

OR

p-value = P( > 178) = P(Z > 2.46) = 0.0069 < 0.05 Reject null

Page 23: 12주차

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 11.23

Example 11.1… The Big Picture…

=175.34

=178

H1: > 170

H0: = 170

Reject H0 in favor of

Page 24: 12주차

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 11.24

Conclusions of a Test of Hypothesis…

If we reject the null hypothesis, we conclude that there is enough evidence to infer that the alternative hypothesis is true.

If we fail to reject the null hypothesis, we conclude that there is not enough statistical evidence to infer that the alternative hypothesis is true. This does not mean that we have proven that the null hypothesis is true!

Keep in mind that committing a Type I error OR a Type II error can be VERY bad depending on the problem.

Page 25: 12주차

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 11.25

One tail test with rejection region on right The last example was a one tail test, because the rejection region is located in only one tail of the sampling distribution:

More correctly, this was an example of a right tail test.

H1: μ > 170

H0: μ < 170

Page 26: 12주차

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 11.26

One tail test with rejection region on leftThe rejection region will be in the left tail.

Page 27: 12주차

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 11.27

Two tail test with rejection region in both tailsThe rejection region is split equally between the two tails.

Page 28: 12주차

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 11.28

Example 11.2… Students work

AT&T’s argues that its rates are such that customers won’t see a difference in their phone bills between them and their competitors. They calculate the mean and standard deviation for all their customers at $17.09 and $3.87 (respectively). Note: Don’t know the true value for σ, so we estimate σ from the data [σ ~ s = 3.87] – large sample so don’t worry.

They then sample 100 customers at random and recalculate a monthly phone bill based on competitor’s rates.

Our null and alternative hypotheses are

H1: ≠ 17.09. We do this by assuming that:

H0: = 17.09

Page 29: 12주차

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 11.29

Example 11.2…

The rejection region is set up so we can reject the null hypothesis when the test statistic is large or when it is small.

That is, we set up a two-tail rejection region. The total area in the rejection region must sum to , so we divide by 2.

stat is “small” stat is “large”

Page 30: 12주차

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 11.30

Example 11.2…

At a 5% significance level (i.e. = .05), we have

/2 = .025. Thus, z.025 = 1.96 and our rejection region is:

z < –1.96 -or- z > 1.96

z-z.025 +z.0250

Page 31: 12주차

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 11.31

Example 11.2…

From the data, we calculate = 17.55

Using our standardized test statistic:

We find that:

Since z = 1.19 is not greater than 1.96, nor less than –1.96 we cannot reject the null hypothesis in favor of H1. That is “there is insufficient evidence to infer that there is a difference between the bills of AT&T and the competitor.”

Page 32: 12주차

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 11.32

Probability of a Type II Error –

A Type II error occurs when a false null hypothesis is not rejected or “you accept the null when it is not true” but don’t say it this way if a statistician is around.

In practice, this is by far the most serious error you can make in most cases, especially in the “quality field”.

Page 33: 12주차

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 11.33

Probability you ship pills whose mean amount of medication is 7 mg approximately 67%

Page 34: 12주차

Special Tests of Significance for Large Samples: Means

Page 35: 12주차

Special Tests of Significance for Large Samples: Means

Page 36: 12주차

Special Tests of Significance for Large Samples: Means

Page 37: 12주차

Special Tests of Significance for Large Samples: Means

Page 38: 12주차

Special Tests of Significance for Large Samples: Proportions

Page 39: 12주차

Special Tests of Significance for Large Samples: Proportions

Page 40: 12주차

Special Tests of Significance for Large Samples: Difference of Means

Page 41: 12주차

Special Tests of Significance for Large Samples: Difference of Means

Page 42: 12주차

Special Tests of Significance for Large Samples: Difference of Means

Page 43: 12주차

Special Tests of Significance for Large Samples: Difference of Means

Page 44: 12주차

Special Tests of Significance for Large Samples: Difference of Means

Page 45: 12주차

Special Tests of Significance for Large Samples: Difference of Means

Page 46: 12주차

Special Tests of Significance for Large Samples: Difference of Proportions

Page 47: 12주차

Special Tests of Significance for Large Samples: Difference of Proportions

Page 48: 12주차

Special Tests of Significance for Large Samples: Difference of Proportions

Page 49: 12주차

Special Tests of Significance for Small Samples: Means

Page 50: 12주차

Special Tests of Significance for Small Samples: Means

Page 51: 12주차

Special Tests of Significance for Small Samples: Means

Page 52: 12주차

Special Tests of Significance for Small Samples: Means

Page 53: 12주차

Using the Student’s t Distribution for Small Samples (One Sample T-Test) When the sample size is small

(approximately < 100) then the Student’s t distribution should be used (see Appendix B)

The test statistic is known as “t”. The curve of the t distribution is flatter than

that of the Z distribution but as the sample size increases, the t-curve starts to resemble the Z-curve (see text p. 230 for illustration)

Page 54: 12주차

Degrees of Freedom

The curve of the t distribution varies with sample size (the smaller the size, the flatter the curve)

In using the t-table, we use “degrees of freedom” based on the sample size.

For a one-sample test, df = N – 1. When looking at the table, find the t-value for

the appropriate df = N-1. This will be the cutoff point for your critical region.

Page 55: 12주차

Formula for one sample t-test:

1

NS

t

Page 56: 12주차

Example

A random sample of 26 sociology graduates scored 458 on the GRE advanced sociology test with a standard deviation of 20. Is this significantly different from the population average (µ = 440)?

Page 57: 12주차

Solution (using five step model) Step 1: Make Assumptions and Meet Test

Requirements:

1. Random sample 2. Level of measurement is interval-ratio 3. The sample is small (<100)

Page 58: 12주차

Solution (cont.)

Step 2: State the null and alternate hypotheses.

H0: µ = 440 (or H0: = μ)

H1: µ ≠ 440

Page 59: 12주차

Solution (cont.) Step 3: Select Sampling Distribution and

Establish the Critical Region

1. Small sample, I-R level, so use t distribution.

2. Alpha (α) = .05

3. Degrees of Freedom = N-1 = 26-1 = 25

4. Critical t = ±2.060

Page 60: 12주차

Solution (cont.)

Step 4: Use Formula to Compute the Test Statistic

5.4

12620

440458

1

NS

t

Page 61: 12주차

Looking at the curve for the t distribution Alpha (α) = .05

t= -2.060

c

t = +2.060

c t= +4.50 I

Page 62: 12주차

Step 5 Make a Decision and Interpret Results

The obtained t score fell in the Critical Region, so we

reject the H0 (t (obtained) > t (critical)

If the H0 were true, a sample outcome of 458

would be unlikely. Therefore, the H0 is false and must be rejected.

Sociology graduates have a GRE score that is significantly different from the general student body (t = 4.5, df = 25, α = .05).

Page 63: 12주차

Testing Sample Proportions:

When your variable is at the nominal (or ordinal) level the one sample z-test for proportions should be used.

If the data are in % format, convert to a proportion first.

The method is the same as the one sample Z-test for means (see above)

Page 64: 12주차

Special Tests of Significance for Small Samples: Variance

Page 65: 12주차

Special Tests of Significance for Small Samples: Variance

Page 66: 12주차

Special Tests of Significance for Small Samples: Variance

Page 67: 12주차

Special Tests of Significance for Small Samples: Difference of Means

Page 68: 12주차

Special Tests of Significance for Small Samples: Difference of Means

Page 69: 12주차

Special Tests of Significance for Small Samples: Ratios of Variances

Page 70: 12주차

Special Tests of Significance for Small Samples: Ratios of Variances

Page 71: 12주차

Special Tests of Significance for Small Samples: Ratios of Variances

Page 72: 12주차

Summary

Page 73: 12주차

Example #1

Page 74: 12주차

Example #1 - Answer

Page 75: 12주차

Example #2

Page 76: 12주차

Example #2 - Answer

Page 77: 12주차

Example #3

Page 78: 12주차

Example #3 - Answer

Page 79: 12주차

Example #4

Page 80: 12주차

Example #4

Page 81: 12주차

Example #5

Page 82: 12주차

Example #5 – Answer

Page 83: 12주차

Example #6

Ex18) A test of the breaking strengths of 6 ropes manufactured by a company showed a

mean breaking strength of 7750 lb and a standard deviation of 145 lb, whereas the

manufacturer claimed a mean breaking strength of 8000 lb. Can we support the

manufacturér’s claim at a level of significance of (a) 0.05, (b) 0.01? (c) What is the P value

of the test?

Page 84: 12주차

Example #6 - Answer