1/54 Statistics Descriptive Statistics— Tables and Graphics.

Date posted: 19-Dec-2015

Page 1: 1/54 Statistics Descriptive Statistics— Tables and Graphics.

1/54

Statistics

Descriptive Statistics—Tables and Graphics

Contents

• Summarizing Qualitative Data
• Summarizing Quantitative Data
• The Stem-and-Leaf Display
• Cross-tabulations and Scatter Diagrams

STATISTICS in PRACTICE

The Colgate-Palmolive Company uses statistics in its quality assurance program for home laundry detergent products.

One concern is customer satisfaction with the quantity of detergent in a carton. To control the problem of heavy detergent powder, limits are placed on the acceptable range of powder density.

STATISTICS in PRACTICE

Statistical samples are taken and the density of each powder sample is measured.

Data summaries are then provided for operating personnel so that corrective action can be taken, if necessary, to keep the density within the desired quality limits.

Summarizing Qualitative Data

• Frequency Distribution
• Relative Frequency Distributions
• Percent Frequency Distributions
• Bar Graphs
• Pie Charts

A frequency distribution is a tabular summary of data showing the frequency (or number) of items in each of several nonoverlapping classes.

The objective is to provide insights about the data that cannot be quickly obtained by looking only at the original data.

Summarizing Qualitative Data -- Frequency Distribution

Example: Data from a sample of 50 Soft Drink Purchases

Summarizing Qualitative Data -- Frequency Distribution

Frequency Distribution

Soft Drink      Frequency
Coke Classic    19
Diet Coke        8
Dr. Pepper       5
Pepsi-Cola      13
Sprite           5
Total           50

Summarizing Qualitative Data -- Frequency Distribution

Example: Marada Inn. Guests staying at Marada Inn were asked to rate the quality of their accommodations as excellent, above average, average, below average, or poor.

Summarizing Qualitative Data -- Frequency Distribution

The ratings provided by a sample of 20 guests are:

Below Average, Above Average, Above Average, Average, Above Average,
Average, Above Average, Average, Above Average, Below Average,
Poor, Excellent, Above Average, Average, Above Average,
Above Average, Below Average, Poor, Above Average, Average

Summarizing Qualitative Data -- Frequency Distribution

Rating           Frequency
Poor              2
Below Average     3
Average           5
Above Average     9
Excellent         1
Total            20

Summarizing Qualitative Data -- Frequency Distribution
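
The tallying behind a frequency distribution can be sketched in a few lines of Python. This is only an illustration: the `ratings` list is built to match the Marada Inn tallies in the table (the order of the individual responses does not matter), and the names `ORDER` and `frequency_distribution` are made up for this sketch.

```python
from collections import Counter

# Class labels in the order used on the slide.
ORDER = ["Poor", "Below Average", "Average", "Above Average", "Excellent"]

# 20 guest ratings, constructed to match the tallies in the table.
ratings = (["Poor"] * 2 + ["Below Average"] * 3 + ["Average"] * 5
           + ["Above Average"] * 9 + ["Excellent"] * 1)

def frequency_distribution(data, classes):
    """Count the items in each class, keeping the given class order."""
    counts = Counter(data)
    return {c: counts.get(c, 0) for c in classes}

freq = frequency_distribution(ratings, ORDER)
for rating, count in freq.items():
    print(f"{rating:<15}{count:>3}")
print(f"{'Total':<15}{sum(freq.values()):>3}")
```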

The relative frequency of a class is the fraction or proportion of the total number of data items belonging to the class.

$$\text{Relative frequency of a class} = \frac{\text{Frequency of the class}}{n}$$

where n is the total number of data items.

Summarizing Qualitative Data – Relative Frequency

A relative frequency distribution is a tabular summary of a set of data showing the relative frequency for each class.

Summarizing Qualitative Data – Relative Frequency Distribution

Example: Relative Frequency Distribution of Soft Drink Purchases

Soft Drink      Relative Frequency
Coke Classic    .38
Diet Coke       .16
Dr. Pepper      .10
Pepsi-Cola      .26
Sprite          .10
Total           1.00

Summarizing Qualitative Data – Relative Frequency Distribution

Summarizing Qualitative Data – Percent Frequency Distribution

The percent frequency of a class is the relative frequency multiplied by 100.

A percent frequency distribution is a tabular summary of a set of data showing the percent frequency for each class.

Example: Percent Frequency Distribution of Soft Drink Purchases

Soft Drink      Percent Frequency
Coke Classic     38
Diet Coke        16
Dr. Pepper       10
Pepsi-Cola       26
Sprite           10
Total           100

Summarizing Qualitative Data – Percent Frequency Distribution

Rating           Relative Frequency   Percent Frequency
Poor             .10                   10
Below Average    .15                   15
Average          .25                   25
Above Average    .45                   45
Excellent        .05                    5
Total            1.00                 100

For example, the relative frequency of Excellent is 1/20 = .05, and the percent frequency of Poor is .10(100) = 10.

Summarizing Qualitative Data – Relative and Percent Frequency Distribution
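
As a rough check of the two definitions, the relative and percent frequencies for the Marada Inn ratings can be recomputed from the frequency counts. The variable names below are illustrative only.

```python
# Frequency counts from the Marada Inn table.
freq = {"Poor": 2, "Below Average": 3, "Average": 5,
        "Above Average": 9, "Excellent": 1}

n = sum(freq.values())  # total number of data items (20)

# Relative frequency = frequency of the class / n
relative = {c: f / n for c, f in freq.items()}

# Percent frequency = relative frequency * 100
percent = {c: 100 * r for c, r in relative.items()}

for c in freq:
    print(f"{c:<15}{relative[c]:>6.2f}{percent[c]:>6.0f}")
```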

Summarizing Qualitative Data – Bar Graph

A bar graph is a graphical device for depicting qualitative data.

On one axis (usually the horizontal axis), we specify the labels that are used for each of the classes.

A frequency, relative frequency, or percent frequency scale can be used for the other axis (usually the vertical axis).

Using a bar of fixed width drawn above each class label, we extend the height appropriately.

Example: Bar Graph of Soft Drink Purchases

Summarizing Qualitative Data – Bar Graph

[Figure: bar graph "Marada Inn Quality Ratings"; horizontal axis: Rating (Poor, Below Average, Average, Above Average, Excellent); vertical axis: Frequency, 1 to 10]

Summarizing Qualitative Data – Bar Graph

Summarizing Qualitative Data – Pie Chart

The pie chart is a commonly used graphical device for presenting relative frequency distributions for qualitative data.

First draw a circle; then use the relative frequencies to subdivide the circle into sectors that correspond to the relative frequency for each class.

Since there are 360 degrees in a circle, a class with a relative frequency of .25 would consume .25(360) = 90 degrees of the circle.
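
The sector sizes follow directly from the relative frequencies. A small sketch using the Marada Inn relative frequencies (the dictionary name `angles` is chosen here for illustration):

```python
# Relative frequencies from the Marada Inn ratings.
relative = {"Poor": 0.10, "Below Average": 0.15, "Average": 0.25,
            "Above Average": 0.45, "Excellent": 0.05}

# Each class takes its share of the 360 degrees in a circle.
angles = {c: r * 360 for c, r in relative.items()}

for c, a in angles.items():
    print(f"{c:<15}{a:>6.1f} degrees")
```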

Example: Pie Chart of Soft Drink Purchases

Summarizing Qualitative Data – Pie Chart

[Figure: pie chart "Marada Inn Quality Ratings": Poor 10%, Below Average 15%, Average 25%, Above Average 45%, Excellent 5%]

Summarizing Qualitative Data – Pie Chart--Example

Insights Gained from the Preceding Pie Chart

• One-half of the customers surveyed gave Marada a quality rating of “above average” or “excellent” (looking at the left side of the pie). This might please the manager.
• For each customer who gave an “excellent” rating, there were two customers who gave a “poor” rating (looking at the top of the pie). This should displease the manager.

Summarizing Quantitative Data

• Frequency Distribution
• Relative Frequency and Percent Frequency Distributions
• Dot Plot
• Histogram
• Cumulative Distributions
• Ogive (line graph)

The three steps necessary to define the classes for a frequency distribution with quantitative data are:

1. Determine the number of nonoverlapping classes. We recommend using between 5 and 20 classes.
2. Determine the width of each class.
3. Determine the class limits.

$$\text{Approximate class width} = \frac{\text{Largest data value} - \text{Smallest data value}}{\text{Number of classes}}$$

Summarizing Quantitative Data-- Frequency Distribution

Summarizing Quantitative Data-- Frequency Distribution

• Use between 5 and 20 classes.
• Data sets with a larger number of elements usually require a larger number of classes.
• Smaller data sets usually require fewer classes.

Guidelines for Selecting Width of Classes

Use classes of equal width.

$$\text{Approximate class width} = \frac{\text{Largest data value} - \text{Smallest data value}}{\text{Number of classes}}$$

Summarizing Quantitative Data-- Frequency Distribution

Example: These data show the time in days required to complete year-end audits for a sample of 20 clients of Sanderson and Clifford, a small public accounting firm, with the data rounded to the nearest day.

YEAR-END AUDIT TIMES (IN DAYS)
12 14 19 18
15 15 18 17
20 27 22 23
22 21 33 28
14 18 16 13

Summarizing Quantitative Data-- Frequency Distribution

Example:

1. Number of classes = 5.
2. The class-width formula gives an approximate class width of (33 - 12)/5 = 4.2.
3. We therefore round up and use a class width of five days in the frequency distribution.

Audit Time (days)   Frequency
10-14                4
15-19                8
20-24                5
25-29                2
30-34                1
Total               20

Summarizing Quantitative Data-- Frequency Distribution
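
The three steps can be traced in code for the audit-time data. This is a sketch under one assumption not stated on the slides: the first lower class limit (10) is picked by hand as a convenient value at or below the smallest observation.

```python
import math

# Year-end audit times in days (20 clients).
audit_days = [12, 14, 19, 18, 15, 15, 18, 17, 20, 27,
              22, 23, 22, 21, 33, 28, 14, 18, 16, 13]

# Step 1: choose the number of classes.
num_classes = 5

# Step 2: approximate class width, then round up to a convenient value.
approx_width = (max(audit_days) - min(audit_days)) / num_classes  # (33 - 12) / 5
width = math.ceil(approx_width)

# Step 3: class limits. The starting limit is an assumption of this sketch.
first_lower_limit = 10

freq = {}
for k in range(num_classes):
    lo = first_lower_limit + k * width
    hi = lo + width - 1  # data are whole days, so classes end one unit below the next
    freq[f"{lo}-{hi}"] = sum(lo <= x <= hi for x in audit_days)

print(approx_width, width)
print(freq)
```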

The manager of Hudson Auto would like to have a better understanding of the cost of parts used in the engine tune-ups performed in the shop. She examines 50 customer invoices for tune-ups. The costs of parts, rounded to the nearest dollar, are listed on the next slide.

Summarizing Quantitative Data-- Frequency Distribution

Sample of Parts Cost for 50 Tune-ups

 91  78  93  57  75  52  99  80  97  62
 71  69  72  89  66  75  79  75  72  76
104  74  62  68  97 105  77  65  80 109
 85  97  88  68  83  68  71  69  67  74
 62  82  98 101  79 105  79  69  62  73

Summarizing Quantitative Data-- Frequency Distribution

For Hudson Auto Repair, if we choose six classes:

Approximate Class Width = (109 - 52)/6 = 9.5 ≈ 10

Parts Cost ($)   Frequency
50-59             2
60-69            13
70-79            16
80-89             7
90-99             7
100-109           5
Total            50

Summarizing Quantitative Data-- Frequency Distribution

Parts Cost ($)   Relative Frequency   Percent Frequency
50-59            .04                    4
60-69            .26                   26
70-79            .32                   32
80-89            .14                   14
90-99            .14                   14
100-109          .10                   10
Total            1.00                 100

For the 50-59 class, the relative frequency is 2/50 = .04 and the percent frequency is .04(100) = 4.

Summarizing Quantitative Data-- Relative Frequency and Percent Frequency Distributions

Insights Gained from the Percent Frequency Distribution

• Only 4% of the parts costs are in the $50-59 class.
• 30% of the parts costs are under $70.
• The greatest percentage (32%, or almost one-third) of the parts costs are in the $70-79 class.
• 10% of the parts costs are $100 or more.

Summarizing Quantitative Data-- Relative Frequency and Percent Frequency Distributions

Dot Plot

• One of the simplest graphical summaries of data is a dot plot.
• A horizontal axis shows the range of data values.
• Then each data value is represented by a dot placed above the axis.

Example: Dot Plot for the Audit Time Data

Dot Plot

Dot Plot

[Figure: dot plot "Tune-up Parts Cost"; horizontal axis: Cost ($), 50 to 110]

Histogram

• Another common graphical presentation of quantitative data is a histogram.
• The variable of interest is placed on the horizontal axis.
• A rectangle is drawn above each class interval with its height corresponding to the interval’s frequency, relative frequency, or percent frequency.
• Unlike a bar graph, a histogram has no natural separation between rectangles of adjacent classes.

Histogram

Example: Histogram for The Audit Time Data

Histogram

[Figure: histogram "Tune-up Parts Cost"; horizontal axis: Parts Cost ($), one rectangle per class from 50-59 upward; vertical axis: Frequency, 2 to 18]

Histogram

Histogram provides information about the shape.

• Symmetric
• Left tail is the mirror image of the right tail
• Examples: heights and weights of people

[Figure: symmetric histogram; vertical axis: Relative Frequency, 0 to .35]

Histogram

• Moderately Skewed Left
• A longer tail to the left
• Example: exam scores

[Figure: histogram moderately skewed left; vertical axis: Relative Frequency, 0 to .35]

Histogram

• Moderately Skewed Right
• A longer tail to the right
• Example: housing values

[Figure: histogram moderately skewed right; vertical axis: Relative Frequency, 0 to .35]

Histogram

• Highly Skewed Right
• A very long tail to the right
• Example: executive salaries

[Figure: histogram highly skewed right; vertical axis: Relative Frequency, 0 to .35]

Cumulative Distributions

• Cumulative frequency distribution: shows the number of items with values less than or equal to the upper limit of each class.
• Cumulative relative frequency distribution: shows the proportion of items with values less than or equal to the upper limit of each class.
• Cumulative percent frequency distribution: shows the percentage of items with values less than or equal to the upper limit of each class.

Cumulative Distributions

Example: Cumulative Frequency, Cumulative Relative Frequency, and Cumulative Percent Frequency Distributions for the Audit Data.

Cumulative Distributions

Hudson Auto Repair

Cost ($)   Cumulative Frequency   Cumulative Relative Frequency   Cumulative Percent Frequency
≤ 59        2                     .04                               4
≤ 69       15                     .30                              30
≤ 79       31                     .62                              62
≤ 89       38                     .76                              76
≤ 99       45                     .90                              90
≤ 109      50                     1.00                            100

For the ≤ 69 row: cumulative frequency = 2 + 13 = 15; cumulative relative frequency = 15/50 = .30; cumulative percent frequency = .30(100) = 30.
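
The running totals in the Hudson Auto Repair table can be reproduced from the class frequencies. A minimal sketch; the list names are chosen here for illustration.

```python
from itertools import accumulate

# Upper class limits and class frequencies for the parts-cost data.
upper_limits = [59, 69, 79, 89, 99, 109]
freq = [2, 13, 16, 7, 7, 5]
n = sum(freq)  # 50 tune-ups

# Running total of the frequencies up to and including each class.
cum_freq = list(accumulate(freq))
cum_rel = [c / n for c in cum_freq]
cum_pct = [100 * c / n for c in cum_freq]

for u, cf, cr, cp in zip(upper_limits, cum_freq, cum_rel, cum_pct):
    print(f"<= {u:<5}{cf:>4}{cr:>7.2f}{cp:>6.0f}")
```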

Ogive

• An ogive is a graph of a cumulative distribution.
• The data values are shown on the horizontal axis.
• Shown on the vertical axis are the:
  • cumulative frequencies, or
  • cumulative relative frequencies, or
  • cumulative percent frequencies

Ogive

• The frequency (one of the above) of each class is plotted as a point.

• The plotted points are connected by straight lines.

Ogive

Hudson Auto Repair

Because the class limits for the parts-cost data are 50-59, 60-69, and so on, there appear to be one-unit gaps from 59 to 60, 69 to 70, and so on.

These gaps are eliminated by plotting points halfway between the class limits.

Thus, 59.5 is used for the 50-59 class, 69.5 is used for the 60-69 class, and so on.
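
The halfway points can be generated from the class limits. The sketch below pairs each with its cumulative percent frequency; the extra starting point at 49.5 with a cumulative value of 0 is an assumption added here so the curve begins at zero, and it is not stated on the slides.

```python
# Class limits and cumulative percent frequencies for the parts-cost data.
class_limits = [(50, 59), (60, 69), (70, 79), (80, 89), (90, 99), (100, 109)]
cum_pct = [4, 30, 62, 76, 90, 100]

# Half of the one-unit gap between adjacent classes (59 to 60 -> 0.5).
half_gap = (class_limits[1][0] - class_limits[0][1]) / 2

# Assumed starting point: nothing lies below the first class.
points = [(class_limits[0][0] - half_gap, 0)]

# One point per class, plotted halfway between that class and the next.
points += [(hi + half_gap, p) for (_, hi), p in zip(class_limits, cum_pct)]

print(points)
```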

Ogive Example: Ogive for the Audit Time Data.

Ogive with Cumulative Percent Frequencies

[Figure: ogive "Tune-up Parts Cost"; horizontal axis: Parts Cost ($), 50 to 110; vertical axis: Cumulative Percent Frequency, 20 to 100; the point (89.5, 76) is labeled]

Exploratory Data Analysis

The techniques of exploratory data analysis consist of simple arithmetic and easy-to-draw pictures that can be used to summarize data quickly. One such technique is the stem-and-leaf display.

Stem-and-Leaf Display

A stem-and-leaf display shows both the rank order and shape of the distribution of the data.

It is similar to a histogram on its side, but it has the advantage of showing the actual data values.

The first digits of each data item are arranged to the left of a vertical line.

Stem-and-Leaf Display

To the right of the vertical line we record the last digit for each item in rank order.

Each line in the display is referred to as a stem.

Each digit on a stem is a leaf.

Stem-and-Leaf Display

Example: Number of Questions Answered Correctly on An Aptitude Test

Stem-and-Leaf Display

Stem: The numbers to the left of the vertical line (6, 7, 8, 9, 10, 11, 12, 13, and 14).

Leaf: each digit to the right of the vertical line.

Stem-and-Leaf Display

Although the stem-and-leaf display may appear to offer the same information as a histogram, it has two primary advantages.

• The stem-and-leaf display is easier to construct by hand.
• Within a class interval, the stem-and-leaf display provides more information than the histogram because the stem-and-leaf shows the actual data.

Example: Hudson Auto Repair

The manager of Hudson Auto would like to have a better understanding of the cost of parts used in the engine tune-ups performed in the shop. She examines 50 customer invoices for tune-ups. The costs of parts, rounded to the nearest dollar, are listed on the next slide.

Example: Hudson Auto Repair

Sample of Parts Cost for 50 Tune-ups

 91  78  93  57  75  52  99  80  97  62
 71  69  72  89  66  75  79  75  72  76
104  74  62  68  97 105  77  65  80 109
 85  97  88  68  83  68  71  69  67  74
 62  82  98 101  79 105  79  69  62  73

Stem-and-Leaf Display

 5 | 2 7
 6 | 2 2 2 2 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3 5 8 9
 9 | 1 3 7 7 7 8 9
10 | 1 4 5 5 9

Each row of the display is a stem; each digit to the right of the vertical line is a leaf.
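
The display can be built mechanically by splitting each value into its leading digits and its last digit. A minimal sketch for whole-number data with a leaf unit of 1; the function name `stem_and_leaf` is made up for this example.

```python
from collections import defaultdict

# Parts cost for the 50 Hudson Auto tune-ups.
parts_cost = [91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
              71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
              104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
              85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
              62, 82, 98, 101, 79, 105, 79, 69, 62, 73]

def stem_and_leaf(values):
    """Group the last digit (leaf) of each value under its leading digits (stem)."""
    stems = defaultdict(list)
    for v in sorted(values):  # sorting puts the leaves in rank order
        stem, leaf = divmod(v, 10)
        stems[stem].append(leaf)
    return dict(stems)

display = stem_and_leaf(parts_cost)
for stem in sorted(display):
    print(f"{stem:>2} | {' '.join(map(str, display[stem]))}")
```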

Stretched Stem-and-Leaf Display

If we believe the original stem-and-leaf display has condensed the data too much, we can stretch the display by using two stems for each leading digit(s).

Whenever a stem value is stated twice, the first value corresponds to leaf values of 0 - 4, and the second value corresponds to leaf values of 5 - 9.

Stretched Stem-and-Leaf Display

 5 | 2
 5 | 7
 6 | 2 2 2 2
 6 | 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4
 7 | 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3
 8 | 5 8 9
 9 | 1 3
 9 | 7 7 7 8 9
10 | 1 4
10 | 5 5 9
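
Stretching only changes how leaves are grouped: each stem is split into a low half (leaves 0 to 4) and a high half (leaves 5 to 9). A sketch with an illustrative function name, using the same parts-cost data:

```python
parts_cost = [91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
              71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
              104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
              85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
              62, 82, 98, 101, 79, 105, 79, 69, 62, 73]

def stretched_stem_and_leaf(values):
    """Key each row by (stem, half): half 0 holds leaves 0-4, half 1 holds 5-9."""
    rows = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        half = 0 if leaf <= 4 else 1
        rows.setdefault((stem, half), []).append(leaf)
    return rows

rows = stretched_stem_and_leaf(parts_cost)
for (stem, half) in sorted(rows):
    print(f"{stem:>2} | {' '.join(map(str, rows[(stem, half)]))}")
```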

Stem-and-Leaf Display

Leaf Units

• A single digit is used to define each leaf.
• In the preceding example, the leaf unit was 1.
• Leaf units may be 100, 10, 1, 0.1, and so on.
• Where the leaf unit is not shown, it is assumed to equal 1.

Example: Leaf Unit = 0.1

If we have data with values such as

8.6 11.7 9.4 9.1 10.2 11.0 8.8

a stem-and-leaf display of these data will be

Leaf Unit = 0.1
 8 | 6 8
 9 | 1 4
10 | 2
11 | 0 7

Example: Leaf Unit = 10

If we have data with values such as

1806 1717 1974 1791 1682 1910 1838

a stem-and-leaf display of these data will be

Leaf Unit = 10
16 | 8
17 | 1 9
18 | 0 3
19 | 1 7

The 82 in 1682 is rounded down to 80 and is represented as an 8.
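
For a leaf unit of 10, the digits below the leaf unit are dropped before the value is split into stem and leaf. A short sketch of that rounding-down step for whole-number data (variable names are illustrative):

```python
values = [1806, 1717, 1974, 1791, 1682, 1910, 1838]
leaf_unit = 10

rows = {}
for v in sorted(values):
    # Integer division discards digits below the leaf unit (1682 -> 168),
    # then the last remaining digit is the leaf (stem 16, leaf 8).
    stem, leaf = divmod(v // leaf_unit, 10)
    rows.setdefault(stem, []).append(leaf)

print(f"Leaf Unit = {leaf_unit}")
for stem in sorted(rows):
    print(f"{stem} | {' '.join(map(str, rows[stem]))}")
```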

Cross-tabulations and Scatter Diagrams

Thus far we have focused on methods that are used to summarize the data for one variable at a time.

Often a manager is interested in tabular and graphical methods that will help understand the relationship between two variables.

Cross-tabulation and a scatter diagram are two methods for summarizing the data for two (or more) variables simultaneously.

Crosstabulation

A crosstabulation is a tabular summary of data for two variables.

Crosstabulation can be used when:
• one variable is qualitative and the other is quantitative,
• both variables are qualitative, or
• both variables are quantitative.

The left and top margin labels define the classes for the two variables.

Crosstabulation

Example: Data from Zagat’s Restaurant Review. Data on a restaurant’s quality rating and typical meal price are reported.

• Quality rating is a qualitative variable with rating categories of good, very good, and excellent.
• Meal price is a quantitative variable that ranges from $10 to $49.

Crosstabulation

Crosstabulation

Example: Crosstabulation of Quality Rating and Meal Price for 300 Los Angeles Restaurants

Crosstabulation

Example: Finger Lakes Homes

The number of Finger Lakes homes sold for each style and price for the past two years is shown below.

                       Home Style
Price Range    Colonial   Log   Split   A-Frame   Total
≤ $99,000         18       6     19       12        55
> $99,000         12      14     16        3        45
Total             30      20     35       15       100

Price range is the quantitative variable; home style is the qualitative variable.

Crosstabulation

Insights Gained from the Preceding Crosstabulation

• Only three homes in the sample are an A-Frame style and priced at more than $99,000.
• The greatest number of homes in the sample (19) are a split-level style and priced at less than or equal to $99,000.

Crosstabulation

                       Home Style
Price Range    Colonial   Log   Split   A-Frame   Total
≤ $99,000         18       6     19       12        55
> $99,000         12      14     16        3        45
Total             30      20     35       15       100

The Total column is the frequency distribution for the price variable. The Total row is the frequency distribution for the home style variable.

Crosstabulation: Row or Column Percentages

Converting the entries in the table into row percentages or column percentages can provide additional insight about the relationship between the two variables.

Crosstabulation: Row Percentages

                       Home Style
Price Range    Colonial    Log     Split   A-Frame   Total
≤ $99,000       32.73     10.91    34.55    21.82     100
> $99,000       26.67     31.11    35.56     6.67     100

Note: row totals are actually 100.01 due to rounding.

(Colonial and > $99K)/(All > $99K) × 100 = (12/45) × 100

Crosstabulation: Column Percentages

                       Home Style
Price Range    Colonial    Log     Split   A-Frame
≤ $99,000       60.00     30.00    54.29    80.00
> $99,000       40.00     70.00    45.71    20.00
Total            100       100      100      100

(Colonial and > $99K)/(All Colonial) × 100 = (12/30) × 100
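
Both percentage tables come from the same Finger Lakes counts, divided by either the row total or the column total. A sketch of the two conversions; the dictionary layout and names are choices made for this example.

```python
styles = ["Colonial", "Log", "Split", "A-Frame"]

# Counts of homes sold, by price range (rows) and home style (columns).
counts = {
    "<= $99,000": [18, 6, 19, 12],
    "> $99,000": [12, 14, 16, 3],
}

# Row percentages: each cell divided by its row total.
row_pct = {r: [round(100 * c / sum(cs), 2) for c in cs]
           for r, cs in counts.items()}

# Column percentages: each cell divided by its column total.
col_totals = [sum(col) for col in zip(*counts.values())]
col_pct = {r: [round(100 * c / t, 2) for c, t in zip(cs, col_totals)]
           for r, cs in counts.items()}

print(row_pct)
print(col_pct)
```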

Crosstabulation: Simpson’s Paradox

Data in two or more crosstabulations are often aggregated to produce a summary crosstabulation.

We must be careful in drawing conclusions about the relationship between the two variables in the aggregated crosstabulation.

Simpson’s Paradox: In some cases the conclusions based upon an aggregated crosstabulation can be completely reversed if we look at the unaggregated data.
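
A small numerical illustration may make the reversal concrete. The counts below are entirely hypothetical (they do not come from the slides): option A has the higher success rate within each group, yet option B has the higher rate once the two groups are aggregated.

```python
# HYPOTHETICAL data: (successes, trials) for options A and B in two groups.
groups = {
    "Group 1": {"A": (8, 10), "B": (70, 100)},
    "Group 2": {"A": (20, 100), "B": (1, 10)},
}

# Success rate within each group (the unaggregated view).
per_group = {g: {opt: s / t for opt, (s, t) in d.items()}
             for g, d in groups.items()}

# Success rate after pooling the groups (the aggregated view).
aggregated = {}
for opt in ("A", "B"):
    successes = sum(groups[g][opt][0] for g in groups)
    trials = sum(groups[g][opt][1] for g in groups)
    aggregated[opt] = successes / trials

print(per_group)   # A is ahead in both groups
print(aggregated)  # B is ahead overall
```

The reversal arises because the two options are spread very unevenly across the groups, so the aggregated rates are dominated by different groups.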

Scatter Diagram and Trendline

• A scatter diagram is a graphical presentation of the relationship between two quantitative variables.
• One variable is shown on the horizontal axis and the other variable is shown on the vertical axis.

Scatter Diagram and Trendline

• The general pattern of the plotted points suggests the overall relationship between the variables.
• A trendline is an approximation of the relationship.

Scatter Diagram and Trendline

A Positive Relationship

[Figure: scatter diagram of y against x showing a positive relationship]

Scatter Diagram and Trendline

A Negative Relationship

[Figure: scatter diagram of y against x showing a negative relationship]

Scatter Diagram and Trendline

No Apparent Relationship

[Figure: scatter diagram of y against x showing no apparent relationship]

Example: Panthers Football Team

Scatter Diagram

The Panthers football team is interested in investigating the relationship, if any, between interceptions made and points scored.

x = Number of Interceptions   y = Number of Points Scored
1                             14
3                             24
2                             18
1                             17
3                             30
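
The slides do not say how a trendline is fitted. One common choice, assumed here purely for illustration, is the least-squares line; the sketch below fits it to the five Panthers games.

```python
# Panthers data: interceptions (x) and points scored (y) in five games.
interceptions = [1, 3, 2, 1, 3]
points_scored = [14, 24, 18, 17, 30]

def least_squares_line(x, y):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

slope, intercept = least_squares_line(interceptions, points_scored)
print(f"trendline: y = {intercept:.2f} + {slope:.2f} x")
```

A positive slope agrees with the positive relationship visible in the scatter diagram.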

Scatter Diagram

[Figure: scatter diagram; horizontal axis: Number of Interceptions (x), 0 to 4; vertical axis: Number of Points Scored (y), 0 to 35]

Insights Gained from the Preceding Scatter Diagram

• The scatter diagram indicates a positive relationship between the number of interceptions and the number of points scored.
• Higher points scored are associated with a higher number of interceptions.
• The relationship is not perfect; not all plotted points in the scatter diagram lie on a straight line.

Example: Panthers Football Team

Tabular and Graphical Procedures

Data

• Qualitative Data
  • Tabular Methods
    • Frequency Distribution
    • Rel. Freq. Dist.
    • Percent Freq. Distribution
    • Crosstabulation
  • Graphical Methods
    • Bar Graph
    • Pie Chart
• Quantitative Data
  • Tabular Methods
    • Frequency Distribution
    • Rel. Freq. Dist.
    • Cum. Freq. Dist.
    • Cum. Rel. Freq. Distribution
    • Stem-and-Leaf Display
    • Crosstabulation
  • Graphical Methods
    • Dot Plot
    • Histogram
    • Ogive
    • Scatter Diagram