ch13-solns-all_skuce_2e

8/12/2019 ch13-solns-all_skuce_2e

Instructor's Solutions Manual - Chapter 13

Chapter 13 Solutions

Develop Your Skills 13.1

1. The scatter diagram is shown below.

[Scatter diagram: Hendrick Software Sales. Total sales ($000) plotted against number of sales contacts, with fitted line y = 6.6519x + 4.7013.]

The least-squares regression line is:
total sales ($000) = 6.6519(number of sales contacts) + 4.7013

Interpretation: Each new sales contact results in an increase in sales of approximately $6,652.

The y-intercept should not be interpreted, since the sample data did not contain any observations of 0 sales contacts.

2. The equation of the least-squares regression line is:
monthly spending on restaurant meals = 0.024144(monthly income) + $44.90

Interpretation: Each new dollar in monthly income increases spending on restaurant meals by about 2.4 cents.

Copyright 2011 Pearson Canada Inc. 351


3. A scatter diagram is shown below.

[Scatter diagram: Smith and Klein Manufacturing. Sales plotted against promotion expenditure, with fitted line y = 30.21x - 148770.]

The least-squares regression line is:
annual sales = 30.21(annual promotion spending) - $148,770

Interpretation: Each new dollar in promotion spending results in an increase in annual sales of approximately $30.21.

The y-intercept should not be interpreted, since the sample data did not contain any observations of $0 annual promotion spending.

4. The response variable is the semester average mark, and the explanatory variable is the total number of hours spent working during the semester. The relationship is unlikely to be positive.

y = 0.1535x + 90.241

suggests that a student who worked no hours would get a mark of 90%, which seems a little high (but this intercept may not be reasonable to interpret this way, depending on the range of hours worked in the sample data).

It also suggests that for each hour worked, the student's mark would increase by 0.1535, which seems unlikely. It is more likely that the student's mark would decrease for each hour worked.



5. Because of the way the researcher has posed the question, the response variable is revenues, and the explanatory variable is the number of employees.

The scatter diagram is shown below:

[Scatter diagram: Top 25 Global Research Organizations, 2007. Global research revenues (US$ millions) plotted against full-time employees, with fitted line y = 0.1338x + 140.56.]

The least-squares regression line is:
revenue (US$ millions) = 0.1338(number of full-time employees) + US$140.56 million

Interpretation: Each additional employee results in increased revenue of US$0.1338 million (or US$133,800).

The y-intercept should not be interpreted, since the sample data did not contain any observations of 0 employees.



Develop Your Skills 13.2

6. The scatter diagram showed an apparently linear relationship between software sales and the number of sales contacts (see Develop Your Skills 13.1, Exercise 1).

[Residual plot: residuals plotted against number of sales contacts.]

The residual plot shows residuals centred on zero, with fairly constant variability. There is no indication that the error terms are not independent. The data were collected over a random sample of months, but the dates of collection are not included, so it is not possible to check for independence of the residuals over time.

A histogram of the residuals appears to be approximately normal.

[Histogram: Hendrick Software Sales residuals, frequency by residual.]

A check of the scatter diagram and the standardized residuals does not reveal any outliers. There are no obvious influential observations. It appears that the sample data meet the requirements of the theoretical model.

7. The scatter diagram does not contain much of a pattern, but if there is a relationship, it appears to be linear.

[Scatter diagram: Spending on Restaurant Meals and Income. Monthly spending on restaurant meals plotted against monthly income, with fitted line y = 0.0241x + 44.903.]

[Residual plot: residuals plotted against monthly income.]

The residual plot shows a fairly constant variability, although the residuals appear to be a little larger on the positive side (except in the area of monthly incomes of around $3,500). There is no obvious dependence among the residuals.

A histogram of the residuals appears to be approximately normal.

[Histogram: residuals for the model of restaurant spending and monthly income, frequency by residual.]

A check of the scatter diagram and the standardized residuals reveals six points that could be considered outliers. They are circled on the scatter diagram below.

[Scatter diagram: Spending on Restaurant Meals and Income. Monthly spending on restaurant meals plotted against monthly income, with the possible outliers circled.]

However, at this point in the analysis, it would be useful to go back to the beginning. It does not appear that monthly income is a strong predictor of monthly restaurant spending. There is too much variability in the restaurant spending data, for the various income levels, for us to develop a useful model.

8. The scatter diagram shows the points arranged in a linear fashion. However, the scatter around the regression line appears to widen as the amount of promotional spending increases. This shows quite clearly in the residual plot.

[Residual plot: residuals plotted against promotion expenditure.]

At this point, it is clear that the data do not meet the requirements of the theoretical model. [For completeness, we will continue to check the other requirements.]

This is time-series data, and so the residuals should be plotted against time. The resulting plot shows a definite pattern over time, with the residuals widening in more recent years. This again indicates a problem; the current model does not meet the requirements of the theoretical model.

[Plot: Residuals Over Time, Smith and Klein Manufacturing. Residuals plotted by year, 1980 to 2010.]

At this point, it is clear that the model should be re-specified. Introducing time as an explanatory variable would probably be of interest.

9. With the two erroneous data points removed, the scatter diagram looks as shown below.

[Scatter diagram: Hours of Work and Semester Marks. Semester average mark plotted against total hours at paid job during semester, with fitted line y = -0.144x + 89.175.]

The relationship appears to be linear.

The residual plot is shown below.

[Residual plot: residuals plotted against total hours at paid job during semester.]

The residuals appear centred on zero, with fairly constant variability, although variability seems greatest in the middle of the range of hours worked.

There is no indication that the residuals are dependent.

A histogram of the residuals is shown below.

[Histogram: residuals for the semester mark and hours of work data, frequency by residual.]

The histogram is quite normal in shape.

A check of the standardized residuals does not reveal any that are ≤ -2 or ≥ +2, although there is one observation with a standardized residual of -1.99. This is the observation (72, 65). [If we could, we would check this data point to make sure that it is accurate.] This point is quite obvious in both the scatter diagram and residual plot (the point is circled in these two graphs).
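The screening rule used here (flag any standardized residual at or beyond -2 or +2) can be sketched in Python. This is only an illustration: the residuals below are hypothetical, not the semester-mark data, and dividing each residual by the standard error of the estimate, sqrt(SSE / (n - 2)), is one common definition of a standardized residual that may differ slightly from the value Excel reports.

```python
import math

def standardized_residuals(residuals):
    """Divide each residual by the standard error of the estimate.

    Assumes a simple (one-predictor) regression, so SSE has n - 2
    degrees of freedom. Other software may scale residuals differently.
    """
    n = len(residuals)
    sse = sum(e * e for e in residuals)
    s_e = math.sqrt(sse / (n - 2))
    return [e / s_e for e in residuals]

def flag_outliers(residuals, cutoff=2.0):
    """Return positions whose standardized residual is <= -cutoff or >= +cutoff."""
    z = standardized_residuals(residuals)
    return [i for i, value in enumerate(z) if abs(value) >= cutoff]

# Hypothetical residuals, made up for illustration only.
example_residuals = [1, -2, 1, -1, 2, -1, 0, -1, 1, -8, 2, 1, 2, 1, 2]
flagged = flag_outliers(example_residuals)
print(flagged)  # positions of possible outliers
```

A value such as -1.99 falls just inside the cutoff, which is why the solution treats the observation (72, 65) as an almost-outlier rather than an outlier.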

There are no obvious influential observations, except perhaps for the almost-outlier. Removing this point from the data set does not affect the least squares regression line significantly.

Despite the one troublesome point, the data set does appear to meet the requirements of the theoretical model.



10. The relationship between revenues and number of employees appears to be linear.

The residual plot is shown below.

[Residual plot: residuals plotted against full-time employees.]

The residuals do not appear to be centred on zero, and the variability is not constant. At this point, it appears that this sample data set does not meet the requirements of the theoretical model.


[Histogram: residuals for Top 25 Global Research Organizations, frequency by residual.]



The histogram of residuals confirms what we saw in the residual plot. The residuals are highly skewed to the right.

There is one observation with a standardized residual of 3.8. The corresponding point is circled on the residual plot above.


11. Since the sample data meet the requirements, it is acceptable to proceed with the hypothesis test.

H0: β1 = 0 (that is, there is no linear relationship between the number of sales contacts and sales)
H1: β1 > 0 (that is, there is a positive linear relationship between the number of sales contacts and sales)
α = 0.05

From the Excel output, t = 7.64.

The p-value is 9.38E-08, which is very small. The p-value for the one-tailed test is only half of this value, and is certainly < α. In other words, there is almost no chance of getting sample results like these, if in fact there is no linear relationship between the number of sales contacts and sales. Therefore, we can (with confidence) reject the null hypothesis and conclude there is evidence of a positive linear relationship between the number of sales contacts and sales data for the Hendrick Software Sales Company.
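The halving step can be written out explicitly. The sketch below uses a hypothetical helper (the name `one_tailed_p` is not from the text). It assumes the software reports a two-tailed p-value for the slope, as Excel's regression output does, and that halving applies only when the t statistic falls on the side named in H1.

```python
def one_tailed_p(two_tailed_p, t_stat, alternative):
    """Convert a two-tailed p-value for a slope into a one-tailed p-value.

    alternative: "greater" for H1: slope > 0, "less" for H1: slope < 0.
    If t points the wrong way for H1, the one-tailed p-value is 1 - p/2.
    """
    if alternative not in ("greater", "less"):
        raise ValueError("alternative must be 'greater' or 'less'")
    supports_h1 = t_stat > 0 if alternative == "greater" else t_stat < 0
    half = two_tailed_p / 2
    return half if supports_h1 else 1 - half

ALPHA = 0.05

# Values reported in the solution to Exercise 11.
p_one_tailed = one_tailed_p(9.38e-08, 7.64, "greater")
print(p_one_tailed, p_one_tailed < ALPHA)
```

The same helper covers a left-tailed test such as H1: slope < 0 by passing "less" together with a negative t statistic.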

12. We already expect that the model will not be particularly useful. The number of data points with standardized residuals either ≥ +2 or ≤ -2 is a concern. However, the hypothesis test provides some evidence that there is a linear relationship between monthly income and monthly spending on restaurant meals.

H0: β1 = 0 (that is, there is no linear relationship between monthly income and monthly spending on restaurant meals)
H1: β1 > 0 (that is, there is a positive linear relationship between monthly income and monthly spending on restaurant meals)
α = 0.05

From the Excel output, t = 4.6. The p-value on the output is 1.338E-05, and the p-value for the one-tailed test is half of this. Reject H0 and conclude there is evidence of a positive linear relationship between monthly income and monthly spending on restaurant meals.

13. Since the sample data do not meet the requirements of the theoretical model, it is not appropriate to conduct a hypothesis test.

14. Since the sample data meet the requirements, it is acceptable to proceed with the hypothesis test.

H0: β1 = 0 (that is, there is no linear relationship between the number of hours worked during the semester and the semester average grade)
H1: β1 < 0 (that is, there is a negative linear relationship between the number of hours worked during the semester and the semester average grade)
α = 0.05

From the Excel output, t = -10.01.

The p-value is 2.47086E-12, which is very small. The p-value for the one-tailed test is only half of this value, and is certainly < α. In other words, there is almost no chance of getting sample results like these, if in fact there is no linear relationship between the number of hours worked during the semester and the semester average grade. Therefore, we can (with confidence) reject the null hypothesis and conclude there is evidence of a negative linear relationship between the number of hours worked during the semester and the semester average grade.

15. Since the sample data do not meet the requirements of the theoretical model, it is not appropriate to conduct a hypothesis test.


16. From the Excel output, R² = 0.72. This means that 72% of the variation in sales is explained by the number of sales contacts. This suggests a fairly strong linear association between the two variables, which is not surprising.

Assuming the original data was collected correctly, it is possible that the other factors affecting sales have been randomized. In such a case, it would seem reasonable to conclude that increasing sales contacts would lead to increased sales. However, there will likely be limits to the positive impact that could be created. Presumably, salespeople contact their best prospective clients first, so additional contacts may not be as productive. As well, increasing the number of contacts may reduce the quantity of time spent with each contact, which could have a detrimental effect on sales.

17. The R² value for this data set is only 0.18. This is not surprising, because the scatter diagram of the relationship revealed scarcely any perceivable pattern. Only 18% of the variation in monthly spending on restaurant meals is explained by income. Earlier investigations suggested this model was not worth pursuing, and the low R² value reinforces that.

18. The R² value is fairly high, at 0.83. This means that 83% of the variation in Smith and Klein's sales is explained by sales promotion spending. However, while there is a strong association between the two variables, the linear regression model is not a good one.

19. The R² value, at 0.72, suggests that 72% of the variation in semester average marks is explained by hours spent working during the semester. (Note that this is for the amended data set, where the two erroneous grades have been removed; see Develop Your Skills 13.2, Exercise 9.) Obviously, there are many factors that affect semester average marks, for example, ability, study habits, past educational experience, and so on. If the original data were collected in a truly random fashion, these factors may have been randomized.

It seems reasonable to conclude that students who work less will have more time for their studies, and it seems reasonable to think that marks improve with time spent studying. However, this data set does not guarantee that reducing work will lead to improved marks.

20. The R² value is 0.93. Notice that this value looks very promising. Remember, though, that the model did not meet the requirements of the theoretical model. A high R² value does not guarantee a cause-and-effect relationship, or a useful model.


21. Since the requirements are met, it is appropriate to create a confidence interval. The Excel output is shown below.

Confidence Interval and Prediction Intervals: Calculations
Confidence Level (%): 98
Point Number: 1
Number of Sales Contacts: 10

                      Lower limit    Upper limit
Prediction Interval   44.96826       97.471443
Confidence Interval   66.068659      76.37104

With 98% confidence, the interval ($66,069, $76,371) contains the average sales for 10 sales contacts.
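As a rough cross-check, both intervals should be centred on the point estimate from the fitted line in Exercise 1. The sketch below assumes the rounded coefficients reported there, so small rounding differences are expected; the function and variable names are illustrative only.

```python
# Coefficients as reported for Exercise 1 (sales in $000).
SLOPE = 6.6519
INTERCEPT = 4.7013

def predict_sales(contacts):
    """Point estimate of total sales ($000) from the fitted line."""
    return SLOPE * contacts + INTERCEPT

def centre_and_half_width(interval):
    """Return the midpoint and half-width of a (lower, upper) interval."""
    lower, upper = interval
    return (lower + upper) / 2, (upper - lower) / 2

# Interval limits copied from the Excel output for 10 sales contacts.
prediction_interval = (44.96826, 97.471443)
confidence_interval = (66.068659, 76.37104)

point = predict_sales(10)
pi_centre, pi_half = centre_and_half_width(prediction_interval)
ci_centre, ci_half = centre_and_half_width(confidence_interval)

print(f"point estimate:      {point:.2f}")
print(f"prediction interval: {pi_centre:.2f} +/- {pi_half:.2f}")
print(f"confidence interval: {ci_centre:.2f} +/- {ci_half:.2f}")
```

The two centres agree with the point estimate, and the prediction interval has by far the larger half-width, because it describes sales in a single month rather than average sales.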

22. We have already established this is not a good model. However, even if it were a good model, we would not use it to predict monthly spending on restaurant meals based on a monthly income of $6,000. The highest monthly income in the sample data set is $4,056, and so we should not rely on our model to make predictions for a monthly income of $6,000.
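The rule against extrapolation can be built into a prediction helper. The following is a minimal sketch with hypothetical names; it uses the restaurant-spending coefficients from Exercise 2 and the stated sample maximum of $4,056 only to illustrate the guard, not to endorse the model, which the solution has already judged to be poor.

```python
def predict_in_range(x, slope, intercept, observed_min=None, observed_max=None):
    """Evaluate the fitted line, refusing values outside the sampled x-range.

    observed_min / observed_max are the smallest and largest x-values in
    the sample; pass None when a bound is not known.
    """
    if observed_min is not None and x < observed_min:
        raise ValueError(f"x = {x} is below the smallest sampled value {observed_min}")
    if observed_max is not None and x > observed_max:
        raise ValueError(f"x = {x} is above the largest sampled value {observed_max}")
    return slope * x + intercept

# Coefficients from Exercise 2; only the sample maximum is stated in the text.
RESTAURANT = dict(slope=0.024144, intercept=44.90, observed_max=4056)

within_range = predict_in_range(3000, **RESTAURANT)
print(f"{within_range:.2f}")

try:
    predict_in_range(6000, **RESTAURANT)
except ValueError as err:
    print("refused:", err)
```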

23. Since the requirements are not met, it is not appropriate to create a confidence interval.

24. The Excel output is shown below (note that this is for the amended data set, where the two erroneous grades have been removed; see Develop Your Skills 13.2, Exercise 9).

Confidence Interval and Prediction Intervals: Calculations
Confidence Level (%): 95

                      Lower limit    Upper limit
Prediction Interval   46.027952      74.74128231
Confidence Interval   58.1586403     62.61059452

With 95% confidence, the interval (58.2, 62.6) contains the average semester average mark, when students work 200 hours in paid employment during the semester.

25. Since the requirements are not met, it is not appropriate to construct a prediction interval.

Chapter Review Exercises

1. The hypothesis test is only valid if the required conditions are met. If you don't check conditions, you may rely on a hypothesis test when it is misleading.

2. Regression prediction intervals are wider than confidence intervals because the interval has to account for the distribution of y-values around the regression line. The regression confidence interval has to take into account only that the sample regression line may not match the true population regression line.
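In symbols, this difference shows up as a single extra term. The formulas below are the standard textbook ones for simple linear regression; the notation is assumed here rather than taken from the solution: $\hat{y}_0 = b_0 + b_1 x_0$ is the point estimate at $x_0$, $s_e$ is the standard error of the estimate, and $n$ is the number of observations.

```latex
\begin{align*}
\text{Confidence interval (mean of } y \text{ at } x_0\text{):}\quad
  & \hat{y}_0 \pm t_{\alpha/2,\,n-2}\; s_e
    \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum (x_i - \bar{x})^2}} \\[6pt]
\text{Prediction interval (individual } y \text{ at } x_0\text{):}\quad
  & \hat{y}_0 \pm t_{\alpha/2,\,n-2}\; s_e
    \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum (x_i - \bar{x})^2}}
\end{align*}
```

The extra 1 under the square root represents the scatter of individual y-values around the line, so the prediction interval is always the wider of the two.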

3. A lower standard error means that confidence and prediction intervals will be narrower. Predictions made with the model will therefore be more useful.

4. You should not make predictions outside the range of the sample data on which the regression relationship is based because the relationship may be very different there. For example, a linear model may provide a good approximation of a portion of a relationship that is actually a curved line. However, if the line is extended beyond this portion, it could be quite misleading.

5. It is always tempting to just remove problem data points. However, if you do this, you will often find that the remaining data points also have outliers. If you persist in the practice of removing troublesome data points, you may not have much data left!

Careful thinking is a better approach. The outlier may be telling you something really important about the actual relationship between the explanatory and response variables. You wouldn't want to miss this important clue to what is really going on.



6. The scatter diagram is shown below.

[Scatter diagram: List Price and Odometer Reading for 2006 Honda Civic Sedan (as of Fall 2008). List price plotted against odometer reading, with fitted line y = -0.0374x + 18017.]

The relationship is:
list price ($) = -0.0374(odometer reading in kilometres) + $18,017

For this small car, the base asking price is $18,017, which is reduced by about 3.7 cents for every kilometre on the odometer. However, note that this base asking price should not be trusted for any cars with fewer than 8,600 kilometres, since no cars in the data set had odometer readings below that.



7. We have already examined the scatter diagram, which suggests a negative linear relationship.

The residual plot is shown below. It has the desired appearance of constant variability, with the residuals centred on zero.

[Residual plot: residuals plotted against odometer reading.]

A histogram of the residuals is shown below. The distribution is not perfectly normal, but it is approximately so.

[Histogram: residuals for the Honda Civic list price model, based on odometer, frequency by residual.]



There are no standardized residuals ≥ +2 or ≤ -2.

It appears the sample data meet the requirements of the theoretical model, and so it would be appropriate to use odometer readings to predict the list prices of these used cars.

A 95% prediction interval for the list price for one of these cars with 50,000 kilometres on the odometer is ($12,683, $19,608). The Excel output is shown below.

Confidence Interval and Prediction Intervals: Calculations
Confidence Level (%): 95

                      Lower limit    Upper limit
Prediction Interval   12683.4909     19607.9242
Confidence Interval   15259.8312     17031.584

8. A scatter diagram showing the two stock market indexes is shown below. Note that the data used are the "adjusted close" figures. You must take care to match the dates: there are a few instances when one market is open and the other is not. Observations that did not have a match were removed from the data set.

[Scatter diagram: TSX and DJI, January to June 2009. S&P/TSX Composite Index plotted against Dow Jones Industrial Average, with fitted line y = 1.2553x - 894.84.]

The estimated relationship is as follows:
TSX Composite Index = 1.255(DJI) - 895

Note that the choice of variable on the x or y axis is somewhat arbitrary here. Because Canada's economy is so dependent on exports to the US, the DJI is placed as the "explanatory" variable, but the cause and effect is not direct.

9. The coefficient of determination for the TSX and the DJI over the first six months of 2009 is 0.72. This measure suggests that 72% of the variation in the TSX is explained by variation in the DJI.

10. This data set is not a random sample, because it includes all matched observations over the period studied. Could this be considered a random sample? Probably not. The credit crisis and the recession that were having impacts on the stock markets in the first six months of 2009 made this period unreliable as a model of how the two indexes behave during more normal times. However, it is interesting to examine the patterns in the indexes over the period.

The indexes were more closely related at the beginning of 2009 than they were later in the period. A time-series plot reveals this quite clearly.

[Time-series plot: TSX and DJI, January to June 2009. Index values for the DJI and the TSX plotted by date.]

The required conditions are not met (as we might expect, given the graph above).

The residual plot clearly shows non-constant variability.

[Residual plot: residuals plotted against DJI.]

As well, the histogram of residuals shows marked negative skewness.

[Histogram: residuals, TSX and DJI data, January to June 2009, frequency by residual.]



A plot of the residuals over time clearly shows a time-related pattern.

[Plot: Residuals Over Time, TSX and DJI Data, January to June 2009. Residuals plotted by date.]



11. A scatter diagram is shown below.

[Scatter diagram: Student Marks in Statistics. Mark on final exam plotted against mark on Test #2, with fitted line y = 0.9586x + 0.4464.]

The estimated relationship is as follows:
Mark on final exam = 0.9586(Mark on Test #2) + 0.4464

In other words, it appears the mark on the final exam is about 96% of the mark on Test #2.



12. The residual plot has the desired appearance.

[Residual plot: residuals plotted against mark on Test #2.]

A histogram of the residuals appears approximately normally-distributed.

[Histogram: residuals for the final exam marks prediction model, frequency by residual.]

There are no obvious influential observations or outliers. It appears that the sample data conform to the requirements of the theoretical model.

13. Since the sample data meet the requirements, it is acceptable to proceed with the hypothesis test.

H0: β1 = 0 (that is, there is no linear relationship between the mark on Test #2 and the final exam mark in Statistics)
H1: β1 > 0 (that is, there is a positive linear relationship between the mark on Test #2 and the final exam mark in Statistics)
α = 0.05

From the Excel output, t = 16.5.

The p-value is 2.96E-14, which is very small. The p-value for the one-tailed test is only half of this value, and is certainly < 5%. In other words, there is almost no chance of getting sample results like these, if in fact there is no linear relationship between the mark on Test #2 and the final exam mark in Statistics. Therefore, reject H0 and conclude there is strong evidence of a positive linear relationship between the mark on Test #2 and the final exam mark in Statistics.

14a. The Excel output is shown below.

                      Lower limit    Upper limit
Prediction Interval   51.78719489    73.7293732
Confidence Interval   60.5028627     65.013705

b. The 95% confidence interval estimate for the average exam mark of students who had a mark of 65% on the second test in the Statistics course is (60.5, 65.0).

c. The 95% prediction interval estimate for the exam mark of a student who had a mark of 65% on the second test in the Statistics course is (51.8, 73.7). This interval is wider, because it has to take into account the variability in individual marks of the students. The regression prediction interval is always wider than the confidence interval. The prediction interval has to take account of the distribution of exam marks around the regression line.
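The size of that extra variability can be estimated from the two intervals themselves. Under the standard formulas for simple regression (assumed here, not quoted from the text), the prediction half-width is t·s_e·sqrt(1 + h) and the confidence half-width is t·s_e·sqrt(h) for the same h, so the difference of their squares is (t·s_e)². A rough sketch using the limits above:

```python
import math

# Limits copied from the Excel output for a Test #2 mark of 65.
prediction_interval = (51.78719489, 73.7293732)
confidence_interval = (60.5028627, 65.013705)

def half_width(interval):
    """Half the distance between the lower and upper limits."""
    lower, upper = interval
    return (upper - lower) / 2

pi_half = half_width(prediction_interval)
ci_half = half_width(confidence_interval)

# Under the assumed formulas:
# (t * s_e)^2 = (prediction half-width)^2 - (confidence half-width)^2
t_times_se = math.sqrt(pi_half ** 2 - ci_half ** 2)

print(f"prediction half-width: {pi_half:.2f}")
print(f"confidence half-width: {ci_half:.2f}")
print(f"implied t * s_e:       {t_times_se:.2f}")
```

Most of the prediction interval's width therefore comes from the spread of individual exam marks around the line, not from uncertainty about where the line is.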



16. As the scatter diagram created for Exercise 15 indicates, there appears to be a fairly strong positive linear relationship between the recorded and audited inventory values.


[Residual plot: residuals plotted against recorded parts inventory value.]

The residual plot shows residuals fairly randomly distributed around zero, with about the same variability for all x-values. There are two residuals that show unusual variability. They are circled in the plot.

The data were all collected at about the same point in time, so there is no need to check residuals against time.

A review of the standardized residuals reveals two outliers, observation #1 and observation #25 (these are the two points that are circled in the residual plot). Since the auditor has realized that he misread the written records for both data points, we will amend the data, and re-do the analysis.



The residual plot for the amended data set is shown below.

[Residual plot (amended data): residuals plotted against recorded parts inventory value.]

The residual plot for the amended data set looks acceptable.

A histogram of the residuals for the amended data set is shown below.

[Histogram: residuals for the Aries Car Parts model, frequency by residual.]

The histogram of residuals shows some positive skewness, and this is a cause for concern, suggesting caution in the use of the model.

A check of the standardized residuals does not reveal any outliers. There are no obviously influential observations. It appears the corrected data set meets the requirements for the linear regression model, although the distribution of the residuals is not as normal in shape as is desired.

17. While we have some concern about the distribution of residuals, we will proceed with the hypothesis test.

H0: β1 = 0 (that is, there is no linear relationship between the recorded inventory values and the audited inventory values)
H1: β1 ≠ 0 (that is, there is a linear relationship between the recorded inventory values and the audited inventory values)
α = 0.05

An excerpt of Excel's regression output is shown below.

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.995213711
R Square             0.99045033
Adjusted R Square    0.990160946
Standard Error       16.61634358
Observations         35

ANOVA
              df    SS             MS             F
Regression    1     944994.372     944994.372     3422.616936
Residual      33    9111.394836    276.1028738
Total         34    954105.7668

                                  Coefficients    Standard Error    t Stat         P-value
Intercept                         25.22708893     8.612571593       2.929100636    0.006122286
Recorded Parts Inventory Value    0.978281557     0.016721865       58.50313612    6.47389E-35

From the Excel output, t = 58.503.

The p-value is 6.47389E-35, which is very small, and certainly < 5%. In other words, there is almost no chance of getting sample results like these, if in fact there is no linear relationship between the recorded inventory values and the audited inventory values. Therefore, reject the null hypothesis and conclude there is evidence of a linear relationship between the recorded and audited inventory values.
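The figures in the regression output are tied together by a few identities for simple linear regression, and recomputing them is a quick way to confirm that a table has been read correctly. The sketch below copies its inputs from the output above; the variable names are illustrative only.

```python
import math

# Values copied from the Excel regression output above.
n = 35
ss_regression = 944994.372
ss_residual = 9111.394836
ss_total = 954105.7668
slope = 0.978281557
slope_standard_error = 0.016721865

# R Square: share of total variation explained by the regression.
r_squared = ss_regression / ss_total

# Adjusted R Square for one explanatory variable.
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - 2)

# Standard Error of the estimate: square root of the residual mean square.
ms_residual = ss_residual / (n - 2)
standard_error = math.sqrt(ms_residual)

# F: regression mean square (SS / 1 df) over residual mean square.
f_stat = ss_regression / ms_residual

# t Stat for the slope; in simple regression, t squared equals F.
t_stat = slope / slope_standard_error

print(f"R Square          {r_squared:.6f}")
print(f"Adjusted R Square {adjusted_r_squared:.6f}")
print(f"Standard Error    {standard_error:.4f}")
print(f"F                 {f_stat:.2f}")
print(f"t Stat            {t_stat:.3f}")
print(f"t Stat squared    {t_stat ** 2:.2f}")
```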



20. A scatter diagram for the data is shown below.

[Scatter diagram: Performance of Graduates on Test Given During Job Interview. Score on test given during job interview plotted against final overall average grade, with fitted line y = 0.6421x + 4.9775 and R² = 0.7989.]

It appears there is a positive linear relationship between the final overall average grade and the score on the test given during the job interview. The regression relationship is as follows:

score on test given during job interview = 0.6421(final overall average grade) + 4.98

This is promising. Since the grades are marked out of 100, and the test scores are out of 70, the slope would be 0.70 if the relationship was perfect.



21. As discussed in Exercise 20 above, there appears to be a positive linear relationship between the final overall average grade and the score on the test given during the job interview.


[Residual plot: residuals plotted against final average mark.]

The residuals appear randomly distributed around zero, with the same variability for all x-values.


[Histogram: residuals for the test score model, frequency by residual.]

The residuals appear approximately normally distributed.



24. Refer back to the output shown above in the solution to Exercise 23.

With 98% confidence, we estimate that the interval (44.0, 62.3) contains the test score of a student with an overall average mark of 75.

It is difficult to decide if the company should continue to administer its own test. The answer depends on how reliable a predictor of future performance the test has been, and what the costs of administering the tests have been. If the company test makes a major distinction between the predicted performance of someone with a test score of 44 and someone with a test score of 62, then the overall average grade may not be a good substitute. However, there is a fairly strong relationship between the two variables. Perhaps the company could pilot using the overall average grade with a random sample of graduates, to see how well they do.

25. No, it would not be appropriate to use package weight as a predictor of shipping cost. We can see from the residual plot that variability increases as package weight increases.

26. It is often suggested that the Canadian stock market is very closely tied to the price of oil. A data set of weekly values for the Toronto Stock Exchange Composite Index (TSX) and the Canadian spot price of oil in dollars per barrel for the period from January 2000 to June 2009 was examined. The scatter diagram (shown below) suggests that while there may be a relationship between the two variables, it is not linear.

[Scatter diagram: TSX and Canadian Oil Prices, January 2007 to June 2009. S&P/TSX Composite Index plotted against weekly Canadian par spot price (dollars per barrel), with fitted line y = 76.584x + 6039.5 and R² = 0.6902.]



The non-linearity is evident in the residual analysis, as well.

[Residual plot: residuals plotted against weekly Canadian par spot price FOB (dollars per barrel).]

[Plot: Residuals Over Time, TSX and Oil Price Model. Residuals plotted by date, 2000 to 2009.]

There appears to be a time-related pattern in the residuals. This is also apparent in the patterns of extreme residuals (those with standardized residuals either ≥ +2 or ≤ -2). They predictably occur in the period of August in 2000, January to July 2007, July 2008 and September-October 2008. While the model could probably be improved by the addition of a time variable, it is not clear how this could be used for predictive purposes. It would probably be more useful to investigate what other explanatory variables were affecting the stock market over this period. As well, non-linear models could be explored.
