• F-Distribution
• Types of Variation
• One-Factor ANOVA
• ANOVA Table
• Multiple Comparisons
One-Factor Analysis of Variance
Lecture 16
Sections 20.6–20.8
Motivation: One-Factor ANOVA
• Scenario: Comparing average math SAT scores for 5 students in three different majors: computer science, economics, and history
• Question: Are the mean math SAT scores equal across all three majors?
• Answer: From the side-by-side boxplots:
• Means appear to be __________________________
• But there is a great deal of _________________________________________
• Sample size is ___________________
• Takeaway: Need an inferential technique that can compare 3 means simultaneously
One-Factor Analysis of Variance (ANOVA)
• One-Factor Analysis of Variance (ANOVA): statistical technique used to compare the means of three or more populations
• Uses two sources of variability to compare means
• Between Group Variation: measures the amount of variability between the sample means of individual groups
• "How different are the sample means from one another?"
• Within Group Variation: measures the amount of variability that exists within the samples
• "How different are the individual observations from one another within each group?"
Comparing Types of Variation
Small Between Group Variation
• Means _____________________ (___, ___, and ___)
Large Within Group Variation
• Observations within groups _______________
• Range for each sample is about ____
Large Between Group
• Means ____________________ (___, ___, and ___)
Small Within Group
• Observations within groups ________________________
• Range for each sample is about ____
One-Factor ANOVA: Hypotheses and Conditions
• Hypotheses: Let $k$ be the number of groups being compared
• $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$
• $H_A$: At least two means are not equal
• Assumptions and Conditions:
• Independence: Both the groups being compared and the individuals sampled must be independent of one another
• Randomization: Subjects come from a random sample
• Equal Variance: Variances of the populations from which the samples have been drawn are approximately equal
• Nearly Normal: Distributions of all sample means are approximately normal
Example: One-Factor ANOVA
• Scenario: Comparing average math SAT scores for 5 students in three different majors: computer science, economics, and history
• Question: Are the mean math SAT scores equal across all three majors?
• Hypotheses:
• $H_0$: ________________________
• $H_A$: ______________________________________________________________________________
Example: One-Factor ANOVA
• Scenario: Comparing average math SAT scores for 5 students in three different majors: computer science, economics, and history
• Question: Are the mean math SAT scores equal across all three majors?
• Conditions:
• Independence: Groups are __________________ unless a student is __________ ____________; while at the __________________, students are likely independent
• Randomization: Subjects randomly sampled ______________________________
• Equal Variance: Boxplots show _____________________________________; largest variance is ______________________________________________________________________
• Nearly Normal: All boxplots look _________________ so distribution of all ____________________ are ________________________________
Grand Mean
• Grand Mean: the mean of all observations, disregarding the group from which the observations were sampled
$$\bar{\bar{y}} = \frac{n_1 \bar{y}_1 + n_2 \bar{y}_2 + \cdots + n_k \bar{y}_k}{n_1 + n_2 + \cdots + n_k}$$
where $\bar{y}_i$ is the mean of the observations from group $i$ and $n_i$ is the number of observations sampled from group $i$
• Used in the calculation of the between group variation because it helps us understand how different the sample means are.
Warning: Do not average the sample means! This tactic to find
the grand mean only works if all of the sample sizes are the same.
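The warning can be checked numerically. The sketch below is only an illustration: the function name `grand_mean` is made up here, and the numbers are the group means and sample sizes from the coupon and SAT examples that appear later in these notes.

```python
def grand_mean(means, sizes):
    """Weighted grand mean: sum(n_i * ybar_i) / sum(n_i)."""
    weighted_total = sum(n * m for n, m in zip(sizes, means))
    return weighted_total / sum(sizes)


# Coupon example (15%, 20%, 25%, 30% off): unequal sample sizes
coupon_means = [149.56, 140.64, 142.71, 165.30]
coupon_sizes = [25, 21, 16, 10]

weighted = grand_mean(coupon_means, coupon_sizes)    # correct grand mean
unweighted = sum(coupon_means) / len(coupon_means)   # naive average of the means

# SAT example: equal sample sizes, so both approaches agree
sat_grand = grand_mean([720, 640, 536], [5, 5, 5])

print(round(weighted, 2), round(unweighted, 2), sat_grand)
```

With unequal sample sizes the two numbers differ (about 147.62 versus 149.55 here); with equal sizes, as in the SAT data, the plain average of the means happens to give the same answer.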
One-Factor ANOVA: Types of Variation
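• Between Group Variation: How different are the sample means?
$$SSTR = n_1(\bar{y}_1 - \bar{\bar{y}})^2 + n_2(\bar{y}_2 - \bar{\bar{y}})^2 + \cdots + n_k(\bar{y}_k - \bar{\bar{y}})^2$$
where $n_i$ is the sample size and $\bar{y}_i$ is the sample mean of group $i$
• Within Group Variation: How different are the observations within each group?
$$SSE = (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_k - 1)s_k^2$$
where $n_i$ is the sample size and $s_i^2$ is the sample variance of group $i$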
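As a rough sketch of the two sums of squares in code: the helper name `sums_of_squares` and the raw observations below are invented for illustration (they are not the lecture's SAT data). The sketch also computes the total sum of squares so the between and within pieces can be compared against it.

```python
from statistics import mean, variance


def sums_of_squares(groups):
    """Return (between, within, total) sums of squares for raw observations."""
    all_obs = [y for g in groups for y in g]
    grand = mean(all_obs)  # grand mean of every observation
    # Between group: n_i * (group mean - grand mean)^2, summed over groups
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Within group: (n_i - 1) * sample variance, summed over groups
    within = sum((len(g) - 1) * variance(g) for g in groups)
    # Total: squared distance of every observation from the grand mean
    total = sum((y - grand) ** 2 for y in all_obs)
    return between, within, total


# Hypothetical observations for three groups of unequal size
groups = [[10, 12, 14], [20, 22, 24, 26], [30, 31]]
between, within, total = sums_of_squares(groups)
print(between, within, total)
```

For these made-up numbers the between and within sums of squares add up to the total, which is the relationship the ANOVA table relies on.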
Example: One-Factor ANOVA
• Grand Mean:
$\bar{\bar{y}}$ = _____________________________________ = __________
• Between Group Variation:
$SSTR$ = __________________________________________________________________
= _______________________________________
= _____________
• Within Group Variation:
$SSE$ = ___________________________________________________________________
= ________________________________________
= ______________
F-Distribution and Test Statistic
• F-Distribution: continuous probability distribution that has the following properties:
• Unimodal and right-skewed
• Always non-negative
• Two parameters for degrees of freedom
• One for numerator and one for denominator
• Used to compare the ratio of two sources of variability
• Test Statistic:
$$F_{k-1,\,n-k} = \frac{MSTR}{MSE} = \frac{SSTR/(k-1)}{SSE/(n-k)}$$
where $k$ is the number of categories and $n$ is the total sample size. The numerator measures between group variation and the denominator measures within group variation.
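A small sketch of the test statistic, using the sums of squares from the SAT ANOVA table given later in these notes (3 majors, 15 students). The function name `f_statistic` is illustrative, and the degrees of freedom follow the ANOVA table: k − 1 for the numerator and n − k for the denominator.

```python
def f_statistic(ss_between, ss_within, k, n):
    """F = (SS_between / (k - 1)) / (SS_within / (n - k))."""
    df_between = k - 1                      # numerator degrees of freedom
    df_within = n - k                       # denominator degrees of freedom
    ms_between = ss_between / df_between    # mean squared treatment
    ms_within = ss_within / df_within       # mean squared error
    return ms_between / ms_within, df_between, df_within


# SAT example: between = 85,120 and within = 60,520
f_value, df1, df2 = f_statistic(ss_between=85_120, ss_within=60_520, k=3, n=15)
print(round(f_value, 2), df1, df2)
```

The p-value still has to come from software; assuming SciPy is installed, `scipy.stats.f.sf(f_value, df1, df2)` is one way to get the right-tail area.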
Example: One-Factor ANOVA
• Degrees of Freedom:
• Numerator: ________________________
• Denominator: _______________________
• Mean Squared Treatment:
$MSTR$ = _________________ = ____________
• Mean Squared Error:
$MSE$ = _________________ = ____________
Example: One-Factor ANOVA
• Scenario: Comparing average math SAT scores for 5 students in three different majors: computer science, economics, and history
• Question: Are the mean math SAT scores equal across all three majors?
• Mechanics:
• Test Statistic:
$F$ = _______________ = _________
• Degrees of Freedom: ________________
• P-Value: ________ (Using software)
______
______
Example: One-Factor ANOVA
• Scenario: Comparing average math SAT scores for 5 students in three different majors: computer science, economics, and history
• Question: Are the mean math SAT scores equal across all three majors?
• Conclusion: With a p-value of ______, we _____________________________ and conclude that the mean math SAT scores are ___________________ ___________________________________________________________________________.
Limitation: One-Factor ANOVA can only determine whether a significant
difference exists between at least two means, not where that difference lies.
ANOVA Table
• ANOVA Table: summary of the sums of squares, degrees of freedom, and mean squared terms from an ANOVA
Source          Sums of Squares   DF        Mean Squares            Test Statistic
Between Group   $SSTR$            $k - 1$   $MSTR = SSTR/(k - 1)$   $F = MSTR/MSE$
Within Group    $SSE$             $n - k$   $MSE = SSE/(n - k)$
Total           $SST$             $n - 1$
Note 1: Between group and within group sums of squares sum to total sums of squares
Note 2: Degrees of freedom in numerator and denominator sum to total degrees of freedom
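The two notes can be verified with a short sketch that assembles the table from the two sums of squares. The function `anova_table` and its dictionary layout are assumptions made for this illustration; the inputs are the coupon example's values (4 coupon groups, 72 customers) from later in these notes.

```python
def anova_table(ss_between, ss_within, k, n):
    """Rows of a one-factor ANOVA table as a list of dictionaries."""
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return [
        {"source": "Between Group", "ss": ss_between, "df": k - 1,
         "ms": ms_between, "f": ms_between / ms_within},
        {"source": "Within Group", "ss": ss_within, "df": n - k,
         "ms": ms_within},
        # Total row: both the sums of squares and the df simply add up
        {"source": "Total", "ss": ss_between + ss_within, "df": n - 1},
    ]


table = anova_table(ss_between=4_627, ss_within=74_206, k=4, n=72)
for row in table:
    print(row)
```

The total row reproduces the coupon table's 78,833 and 71, and the F value comes out close to the tabled 1.414 (small differences are due to rounding of the mean squares in the slide).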
Example: One-Factor ANOVA
• Scenario: Large department store gives out scratch-off coupons at the door for 15%, 20%, 25%, or 30% off the entire purchase. Randomly sample 72 customers and record the total amount of their purchase before the coupon.
• Question: Do customers spend different amounts before applying the coupon depending on the percentage off they received?
• Hypotheses:
• $H_0$: ____________________________________________
• $H_A$: _______________________________________________________________________________
Using Excel
Example: One-Factor ANOVA
• Scenario: Large department store gives out scratch-off coupons at the door for 15%, 20%, 25%, or 30% off the entire purchase. Randomly sample 72 customers and record the total amount of their purchase before the coupon.
• Question: Do customers spend different amounts before applying the coupon depending on the percentage off they received?
• Task: Complete the ANOVA table
Source          Sums of Squares   DF   Mean Squares   Test Statistic
Between Group   4,627             3    1,543          1.414
Within Group    74,206            68   1,091
Total           78,833            71
Using Excel
Critical value assuming
5% level of significance
Example: One-Factor ANOVA
• Scenario: Large department store gives out scratch-off coupons at the door for 15%, 20%, 25%, or 30% off the entire purchase. Randomly sample 72 customers and record the total amount of their purchase before the coupon.
• Question: Do customers spend different amounts before applying the coupon depending on the percentage off they received?
• Mechanics:
• Degrees of Freedom: _________________
• Test Statistic: _______________
• P-Value: __________
Example: One-Factor ANOVA
• Scenario: Large department store gives out scratch-off coupons at the door for 15%, 20%, 25%, or 30% off the entire purchase. Randomly sample 72 customers and record the total amount of their purchase before the coupon.
• Question: Do customers spend different amounts before applying the coupon depending on the percentage off they received?
• Conclusion: With a p-value of ________, we ___________________________ _____________ and conclude that the amount of money people spend at this department store _______________________________________________ _____________________________________.
Drawback of ANOVA
• When a significant difference is found, the only conclusion that can be drawn at this point is that at least two means are not equal.
• Problem: Many different ways of rejecting $H_0$
___________ of means not equal:
• $\mu_1 \ne \mu_2$, $\mu_1 = \mu_3$, $\mu_2 = \mu_3$
• $\mu_1 = \mu_2$, $\mu_1 \ne \mu_3$, $\mu_2 = \mu_3$
• $\mu_1 = \mu_2$, $\mu_1 = \mu_3$, $\mu_2 \ne \mu_3$
____________ of means not equal:
• $\mu_1 \ne \mu_2$, $\mu_1 \ne \mu_3$, $\mu_2 = \mu_3$
• $\mu_1 \ne \mu_2$, $\mu_1 = \mu_3$, $\mu_2 \ne \mu_3$
• $\mu_1 = \mu_2$, $\mu_1 \ne \mu_3$, $\mu_2 \ne \mu_3$
____________ are equal:
• $\mu_1 \ne \mu_2 \ne \mu_3$
• Solution: ______________________________ will tell us which of these scenarios is true
Number of possibilities
increases exponentially as
the number of groups
being compared increases
Multiple Comparisons
• Multiple Comparisons: procedure used to determine exactly which pairs of means are significantly different
• Extension of ANOVA
• Calculate a confidence interval for each pair of means, but…
• …makes adjustment to the confidence interval based on how many comparisons need to be made
• Many different techniques
• Fisher's Least Significant Difference Method
• Bonferroni Adjustment Method
• Tukey's Multiple Comparison Method
Note: Your textbook mentions the Bonferroni adjustment
method in passing but does a poor job of elaborating. Do
not rely on the textbook for notes on multiple comparisons.
Fisher’s Least Significant Difference (LSD) Method
• Fisher’s LSD Method: for each possible pairing of means, calculate the following confidence interval:
$$(\bar{y}_i - \bar{y}_j) \pm t_{n-k} \sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$
where $n_i$ and $n_j$ are the sample sizes of the groups being compared, $MSE$ is the mean squared error from the ANOVA, and $n - k$ is the denominator df from the ANOVA
• Interval does not contain zero: Conclude means are significantly different
• Interval contains zero: Conclude means not significantly different
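A minimal sketch of the interval calculation, applied to the SAT summary statistics given later in these notes. Several things here are assumptions rather than lecture content: the function name `lsd_interval`, the 95% confidence level, and the hard-coded critical value 2.179 (the approximate 97.5th percentile of a t-distribution with 12 df, which would normally come from software or a table).

```python
from itertools import combinations
from math import sqrt


def lsd_interval(mean_i, mean_j, n_i, n_j, mse, t_crit):
    """Fisher's LSD interval for the difference of two group means."""
    diff = mean_i - mean_j
    margin = t_crit * sqrt(mse * (1 / n_i + 1 / n_j))
    return diff - margin, diff + margin


# SAT example: (sample mean, sample size) for each major
groups = {"Comp. Sci.": (720, 5), "Economics": (640, 5), "History": (536, 5)}
MSE = 60_520 / 12   # within group sum of squares / denominator df
T_CRIT = 2.179      # ASSUMPTION: ~97.5th percentile of t with 12 df (95% level)

results = {}
for (a, (mean_a, n_a)), (b, (mean_b, n_b)) in combinations(groups.items(), 2):
    low, high = lsd_interval(mean_a, mean_b, n_a, n_b, MSE, T_CRIT)
    results[(a, b)] = (low, high)
    verdict = "contains zero" if low <= 0 <= high else "does not contain zero"
    print(f"{a} vs. {b}: ({low:.1f}, {high:.1f}) {verdict}")
```

Because all three groups have the same sample size, every interval shares the same margin of error; only the centers differ.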
Difference of Two Means vs. Multiple Comparisons
Difference of Two Means
$$t_{n-2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
• More degrees of freedom leads to smaller multiplier
• Weighted sum of two variances
Fisher’s LSD Method
$$t_{n-k} \sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$
• Fewer degrees of freedom leads to larger multiplier
• MSE is combination of 3 or more variances
Takeaway: Because $t_{n-2} < t_{n-k}$ and $\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} < MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)$, the margin of
error for Fisher’s LSD Method will always be wider than that of a confidence
interval for the difference of two means.
Review: SAT ANOVA Example
• Scenario: Comparing average math SAT scores for 5 students in three different majors: computer science, economics, and history
• Question: Are the mean math SAT scores equal across all three majors?
• Conclusion: Strong evidence at least two means differed
• Statistics:
ANOVA Table:
Source    SS        DF   MS       F
Between   85,120    2    42,560   8.44
Within    60,520    12   5,043
Total     145,640   14
Group        Mean   Sample Size
Comp. Sci.   720    5
Economics    640    5
History      536    5
Example: Multiple Comparisons on SAT Data
• Scenario: Comparing average math SAT scores for 5 students in three different majors: computer science, economics, and history
• Question: How many confidence intervals do we need to calculate?
• Answer: ____
• ________________________________________
• ________________________________________
• ________________________________________
Note: This number grows quickly as the number of groups being
compared increases: $k$ groups require $k(k-1)/2$ intervals. Use software to find
these confidence intervals.
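How fast the number of intervals grows can be seen with a short sketch. Each interval corresponds to one unordered pair of groups, so k groups need "k choose 2" = k(k − 1)/2 intervals; the function name below is made up for this illustration.

```python
from math import comb


def num_pairwise_intervals(k):
    """Number of pairwise confidence intervals needed for k groups."""
    return comb(k, 2)  # same as k * (k - 1) // 2


counts = {k: num_pairwise_intervals(k) for k in (3, 4, 5, 10)}
print(counts)
```

Three groups (the SAT majors) need 3 intervals and four groups (the coupon levels) need 6, which matches the six rows of the interval-width table at the end of these notes.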
Example: Multiple Comparisons on SAT Data
• Scenario: Comparing average math SAT scores for 5 students in three different majors: computer science, economics, and history
• Question: What is the margin of error for the confidence intervals?
• Answer:
• t-Statistic: $t_{12}$ = ____________
• Mean Squared Error: _________
• Sample Sizes: ____________________________
• Margin of Error:
_____________________________________________
Important Note: The margins of
error will only be the same if the
sample sizes taken from each
group are ___________. Otherwise,
each confidence interval will
have its own _____________________
Example: Multiple Comparisons on SAT Data
• Scenario: Comparing average math SAT scores for 5 students in three different majors: computer science, economics, and history
• Question: What are the confidence intervals for the difference in means between each pair?
• Answer:
• CS vs. Economics: _______________________________ = _____________________
• CI __________________________
• CS vs. History: _______________________________ = _____________________
• CI __________________________
• Economics vs. History: _______________________________ = _____________________
• CI __________________________
Example: Multiple Comparisons on SAT Data
• Scenario: Comparing average math SAT scores for 5 students in three different majors: computer science, economics, and history
• Question: What conclusions can be drawn about the mean math SAT scores of the three majors?
• Answer:
• Both ____________________ and _____________ majors score significantly higher than __________ majors on the math portion of the SAT.
• While __________________________ majors had a larger sample mean than _____________ majors, the difference between them was ______________________
Review: Coupon ANOVA Example
• Scenario: Comparing average amount spent for coupon discounts of 15%, 20%, 25%, and 30% off entire purchase
• Question: Do customers spend different amounts before applying the coupon depending on the percentage off they received?
• Conclusion: Little to no evidence of a significant difference
• Statistics:
ANOVA Table:
Source    SS       DF   MS      F
Between   4,627    3    1,543   1.414
Within    74,206   68   1,091
Total     78,833   71
Group     Mean     Sample Size
15% Off   149.56   25
20% Off   140.64   21
25% Off   142.71   16
30% Off   165.30   10
Example: Mult. Comp. When Failing to Reject ANOVA
• Scenario: Large department store gives out scratch-off coupons at the door for 15%, 20%, 25%, or 30% off the entire purchase. After doing the ANOVA, we concluded there was no evidence any of the mean amounts differed
• Question: Without calculating out the multiple comparisons confidence intervals, what do we know about all of them?
• Answer: _____________________________
• ANOVA allowed us to conclude ________________________________________
• Takeaway: If we fail to reject the null hypothesis in ANOVA, there is _________________________________________________________________________
Using Excel
Formulas for 15% vs. 20% in cells C13, D13, and
E13. Change the rows to get confidence intervals
for other groups
Example: Group Sample Sizes
• Scenario: Large department store gives out scratch-off coupons at the door for 15%, 20%, 25%, or 30% off the entire purchase
• Question: What relationship exists with the widths of the intervals?
• Answer: Confidence intervals comparing groups with _______________________________________________________
Groups        Sample Sizes   Interval Width
15% vs. 20%   25 and 21      39.025
15% vs. 25%   25 and 16      42.209
15% vs. 30%   25 and 10      49.329
20% vs. 25%   21 and 16      43.749
20% vs. 30%   21 and 10      50.654
25% vs. 30%   16 and 10      53.145
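The widths in this table can be reproduced, up to rounding, from the coupon ANOVA table. The sketch below assumes that each width is twice the Fisher's LSD margin of error at the 5% level of significance and hard-codes 1.9955 as the approximate 97.5th percentile of a t-distribution with 68 df; both the critical value and the helper name `interval_width` are assumptions for this illustration.

```python
from itertools import combinations
from math import sqrt

# Coupon example: sample size for each discount level
sizes = {"15%": 25, "20%": 21, "25%": 16, "30%": 10}
MSE = 74_206 / 68   # within group sum of squares / denominator df
T_CRIT = 1.9955     # ASSUMPTION: ~97.5th percentile of t with 68 df


def interval_width(n_i, n_j, mse=MSE, t_crit=T_CRIT):
    """Full width (twice the margin of error) of a Fisher's LSD interval."""
    return 2 * t_crit * sqrt(mse * (1 / n_i + 1 / n_j))


widths = {
    (a, b): interval_width(sizes[a], sizes[b])
    for a, b in combinations(sizes, 2)
}
for (a, b), width in widths.items():
    print(f"{a} vs. {b}: {sizes[a]} and {sizes[b]} -> {width:.3f}")
```

Only the sample sizes change from row to row, since the MSE and the critical value are shared by all six intervals.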