통자발표2

Factors That Can Affect Model Performance

Seonwoo Lee

Contents

1. Introduction

2. Type III Errors

3. Measurement Errors

4. Discretizing Continuous Outcomes

5. When Should You Trust Your Model’s Prediction?

6. The Impact of a Large Sample


Several of the preceding chapters have focused on technical pitfalls of predictive models,

such as over-fitting and class imbalances.

True success may depend on aspects of the problem that are not directly related to the model itself.

This chapter discusses several important aspects of creating and maintaining predictive

models.

Introduction


Type III Error

One of the most common mistakes in modeling is to develop a model that answers the

wrong question, otherwise known as a Type III error (Kimball, 1957).

There can be a tendency to focus on the technical details and inadvertently overlook the true nature of the problem.

It is very important to focus on the overall strategy of the problem at hand and not just

the technical tactics of the potential solution.

Type III Errors


Example Business Application

In business, the main goal is almost always to maximize profit.

When the outcome is categorical (e.g., purchase / no-purchase or churn / retention), it is

key to tie the model performance and class prediction back to the expected profit.
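As a minimal sketch of tying class predictions back to expected profit, the Python snippet below totals profit over the four cells of a two-class confusion matrix. The counts and the per-customer dollar values are hypothetical and are not taken from the slides.

```python
# Hypothetical confusion-matrix counts for a purchase / no-purchase model,
# keyed by (predicted class, observed class).
counts = {
    ("predicted_purchase", "purchase"): 120,
    ("predicted_purchase", "no_purchase"): 380,
    ("predicted_no_purchase", "purchase"): 40,
    ("predicted_no_purchase", "no_purchase"): 460,
}

# Hypothetical profit per customer in each cell (dollars).
# Assumption: contacting a predicted purchaser costs 2, a sale earns 30,
# and customers who are not contacted contribute nothing.
profit_per_customer = {
    ("predicted_purchase", "purchase"): 30.0 - 2.0,
    ("predicted_purchase", "no_purchase"): -2.0,
    ("predicted_no_purchase", "purchase"): 0.0,
    ("predicted_no_purchase", "no_purchase"): 0.0,
}

def total_profit(counts, profit_per_customer):
    """Sum count * profit over every (predicted, observed) cell."""
    return sum(n * profit_per_customer[cell] for cell, n in counts.items())

def accuracy(counts):
    """Share of customers whose predicted class matches the observed class."""
    correct = (counts[("predicted_purchase", "purchase")]
               + counts[("predicted_no_purchase", "no_purchase")])
    return correct / sum(counts.values())

profit = total_profit(counts, profit_per_customer)
acc = accuracy(counts)
```

Because each cell carries a different value, two models with the same accuracy can produce quite different profit.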


Example Response Modeling

Recall the direct marketing example discussed in Chapter 11.

This campaign did not sample from the appropriate population.

It only utilized customers who had been contacted.

Any model built from these data is limited to predicting the probability of a purchase given that the customer is contacted.



Siegel (2011) outlines four possible cases:

                              No contact
                         Response    Non response
Contact   Response          A             B
          Non response      C             D

To increase profits, a model that accurately predicts which customers are in cell B is the

most useful.



Techniques that attempt to understand the impacts of customer response are known by several names:

1. Uplift modeling

2. True lift modeling

3. Net lift modeling

4. Incremental lift modeling

5. True response modeling
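All of these names refer to estimating how much contact changes the chance of a response. A minimal sketch, assuming a hypothetical randomized campaign with a contacted group and an uncontacted control group, estimates uplift as the difference in response rates. The segment names and counts are made up for illustration.

```python
def response_rate(responders, total):
    """Observed share of customers who responded."""
    return responders / total

def uplift(contact_resp, contact_n, control_resp, control_n):
    """Estimated uplift: response rate with contact minus response rate without."""
    return response_rate(contact_resp, contact_n) - response_rate(control_resp, control_n)

# Hypothetical counts per customer segment:
# (responders when contacted, contacted, responders when not contacted, not contacted)
segments = {
    "segment_1": (90, 1000, 85, 1000),  # responds either way: little uplift
    "segment_2": (60, 1000, 10, 1000),  # responds mainly when contacted
}

uplift_by_segment = {name: uplift(*c) for name, c in segments.items()}
best_segment = max(uplift_by_segment, key=uplift_by_segment.get)
```

In this made-up data, segment_1 has the higher response rate among contacted customers, yet segment_2 has the larger uplift. A plain response model would favor the first; a model aimed at cell B would favor the second.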


Measurement Errors

Measurement Error

Measurement error is the difference between a measured value of a quantity and its true value.

Measurement error can be divided into two components:

1. Random error

2. Systematic error


Measurement Error in the Outcome

This type of error gives rise to an upper bound on model performance that no pre-processing, model complexity, or tuning can overcome.

If a measured categorical outcome is mislabeled in the training data 10% of the time, it is

unlikely that any model could truly achieve more than a 90% accuracy rate.
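The 10% example can be checked with a small simulation. This is a sketch under stated assumptions: the labels are binary, each recorded label is flipped independently with probability 0.10, and the same labelling error affects the data used to score the model. The "model" is an idealized one that always predicts the true label.

```python
import random

rng = random.Random(0)
n = 20000
flip_rate = 0.10  # share of recorded labels that are wrong

true_labels = [rng.random() < 0.5 for _ in range(n)]

# Recorded labels: each true label is flipped with probability flip_rate.
recorded_labels = [(not y) if rng.random() < flip_rate else y for y in true_labels]

# An idealized model that always predicts the true label.
predictions = true_labels

# Accuracy as it would be measured against the recorded (noisy) labels.
measured_accuracy = sum(p == r for p, r in zip(predictions, recorded_labels)) / n
```

Even this idealized model is scored at roughly 90%, because it disagrees with every mislabeled sample.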


Example Linear Regression Model

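$y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip} + \epsilon_i$, where $\epsilon_i \sim \text{i.i.d. } N(0, \sigma^2)$.

If we knew the true model structure, then $\sigma^2$ would represent the lowest possible error achievable, or the irreducible error.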

We do not usually know the true model structure, so this value becomes inflated to

include model error.


Example Linear Regression Model

If the outcome contains significant measurement error, the irreducible error is increased.

The RMSE and $R^2$ have respective lower and upper bounds due to this error.

The better we understand the measurement system and its limits, the better we can

foresee the limits of model performance.
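A worked version of these bounds, assuming the recorded outcome is the true outcome plus independent additive measurement error with variance $\tau^2$. The symbols $y_i^{*}$, $\delta_i$, and $\tau^2$ are introduced here for illustration and do not appear in the slides, and the bounds hold in expectation rather than for every finite test set.

```latex
\begin{aligned}
y_i &= y_i^{*} + \delta_i, \qquad \delta_i \sim \text{i.i.d. } (0, \tau^2), \\
y_i^{*} &= \beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip} + \epsilon_i,
  \qquad \epsilon_i \sim \text{i.i.d. } N(0, \sigma^2), \\
\text{RMSE} &\ge \sqrt{\sigma^2 + \tau^2}, \qquad
R^2 \le 1 - \frac{\sigma^2 + \tau^2}{\operatorname{Var}(y_i)} .
\end{aligned}
```

Under these assumptions, a larger $\tau^2$ raises the floor on RMSE and lowers the ceiling on $R^2$, whatever model is used.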


Measurement Error in the Outcome

There are two important take-aways:

1. No model can predict this type of error.

2. As error increases, the models become virtually indistinguishable in terms of their

predictive performance.


Measurement Error in the Predictors

Since many predictors are measured, they can contain some level of measurement error

associated with the measurement system.

Any error in the predictors is likely to be propagated directly through the model prediction equation and to result in poor performance.



The effect of randomness in the predictors can be drastic, depending on several factors:

1. The amount of randomness

2. The importance of the predictors

3. The type of model being used
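The first two factors can be illustrated with a small simulation. This is a sketch under stated assumptions: a hypothetical one-predictor linear mechanism (y = 2x + noise), a least-squares line fitted on cleanly measured predictors, and future predictor values recorded with increasing amounts of random error. None of the numbers come from the slides.

```python
import random

rng = random.Random(42)

def simulate(n):
    """Hypothetical data: y = 2x + noise, with x measured without error."""
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0, 0.5) for x in xs]
    return xs, ys

def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope

def rmse(ys, preds):
    """Root mean squared error between observed and predicted outcomes."""
    return (sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)) ** 0.5

train_x, train_y = simulate(2000)
intercept, slope = fit_line(train_x, train_y)

test_x, test_y = simulate(2000)
rmse_by_noise = {}
for noise_sd in (0.0, 0.5, 1.0):
    # Future predictor values are recorded with random measurement error.
    noisy_x = [x + rng.gauss(0, noise_sd) for x in test_x]
    preds = [intercept + slope * x for x in noisy_x]
    rmse_by_noise[noise_sd] = rmse(test_y, preds)
```

The error grows with the amount of noise, and it grows faster the larger the slope, which is one way to read "the importance of the predictors".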



Measurement error in the predictors can cause considerable issues, especially in terms

of reproducibility of the results on future data sets.

Future results may be poor because the underlying predictor data are different from the values used in the training set.


Introduction

In many fields, even if the original response is on a continuous scale, it may be desirable

to work with a categorical response.

This could be due to the fact that the underlying distribution of the response is truly

bimodal.

Discretizing Continuous Outcomes



The left histogram is symmetric; the right histogram is clearly bimodal.


Introduction

When the response is bimodal (or multimodal), categorizing the response is appropriate.

If the response follows a continuous distribution with no distinct groups, then categorizing it is difficult and induces a loss of information.
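One way to see the loss of information is to compare how strongly a predictor tracks the response before and after the response is cut into two classes. This is a sketch on simulated data, assuming a hypothetical unimodal response and a median cut point; it is not an analysis from the slides.

```python
import random

rng = random.Random(7)

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

# Hypothetical unimodal response driven by a single predictor.
x = [rng.gauss(0, 1) for _ in range(5000)]
y = [xi + rng.gauss(0, 1) for xi in x]

# Discretize the response at its median into two classes (0 / 1).
cutoff = sorted(y)[len(y) // 2]
y_class = [1.0 if yi > cutoff else 0.0 for yi in y]

corr_continuous = pearson(x, y)
corr_discretized = pearson(x, y_class)
```

The correlation with the two-class version is weaker, since every response on the same side of the cut point is treated as identical.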



Reasons for Discretization

1. Practical reason

Decision makers may prefer to know whether or not a compound is predicted to be

soluble enough rather than the compound’s predicted log solubility value.

2. High degree of error

Scientists may believe that the continuous response contains a high degree of error, so much so that only response values in either extreme of the distribution are likely to be correctly categorized.



Working with the original scale provides

more accurate predictions for all models.

Introduction

The predictive modeling process assumes that the mechanism that generated the

current, existing data will continue to generate new data.

The new data will have similar characteristics and will occupy similar parts of the

predictor space as the data on which the model was built.

We have taken appropriate steps to create test sets that had similar properties across

the predictor space as the training set (Section 4.3).

When Should You Trust Your Model’s Prediction?


Introduction

If the new data are generated by the same mechanism as the training set, we can be confident that the model will make sensible predictions for the new data.

If the new data are not generated by the same mechanism, or if the training set was too small or sparse to adequately cover the range of the predictor space, then predictions from the model may not be trustworthy.



Extrapolation

Extrapolation is defined as using a model to predict samples that are outside the range

of the training data (Armitage and Berry, 1994).

There may also be regions within the predictors' range where no training data exist, and predicting in those regions is extrapolation as well.

Extrapolated predictions may not be trustworthy and can lead to poor decision making.
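The range-based definition can be turned into a simple check. This is a minimal sketch with hypothetical data and function names: it flags predictors whose value in a new sample lies outside the range seen in training.

```python
def predictor_ranges(train_rows):
    """Per-predictor (min, max) observed in the training set."""
    columns = list(zip(*train_rows))
    return [(min(col), max(col)) for col in columns]

def out_of_range(sample, ranges):
    """Indices of predictors whose value falls outside the training range."""
    return [i for i, (v, (lo, hi)) in enumerate(zip(sample, ranges))
            if v < lo or v > hi]

# Hypothetical training set with two predictors.
train_rows = [(1.0, 10.0), (2.0, 12.0), (3.0, 30.0), (4.0, 32.0)]
ranges = predictor_ranges(train_rows)

flag_inside = out_of_range((2.5, 20.0), ranges)   # within both ranges
flag_outside = out_of_range((5.0, 20.0), ranges)  # first predictor too large
```

Note the limit of this check: the sample (2.5, 20.0) passes even though no training point has a second predictor anywhere near 20. A per-predictor range test cannot see empty regions inside the range.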



Similarity of the New Data to the Training Data

Often, though, the practitioner does not know whether the mechanism that generated the new data is the same as the one that generated the training data.

There are a few tools that can be employed to understand the similarity.



Applicability Domain

The applicability domain of a model is the region of predictor space where the model

makes predictions with a given reliability (Netzeva et al., 2005).

If the new data being predicted are similar enough to the training set, the assumption

would be that these points would have reliability that is characterized by the model

performance estimates.



Dimension Reduction Techniques

A gross comparison of the space covered by the predictors from the training set and the

new set can be made using routine dimension reduction techniques such as principal

components analysis or multidimensional scaling (Davison, 1983).

If the training data and new data are generated from the same mechanism, then the

projection of these data will overlap.
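The overlap idea can be sketched without any libraries by projecting onto the first principal component only. Assumptions: the data are simulated from a hypothetical three-predictor mechanism, the leading direction is found by power iteration rather than a full PCA, and "overlap" is measured crudely as the share of new scores inside the range of the training scores.

```python
import random

rng = random.Random(3)

def make_rows(n, shift=0.0):
    """Hypothetical mechanism: three predictors driven by one latent factor."""
    rows = []
    for _ in range(n):
        z = rng.gauss(0, 1)
        rows.append([z + rng.gauss(0, 0.3) + shift for _ in range(3)])
    return rows

def column_means(rows):
    return [sum(col) / len(col) for col in zip(*rows)]

def center(rows, means):
    """Subtract the (training) column means from every row."""
    return [[v - m for v, m in zip(r, means)] for r in rows]

def first_component(rows, iters=200):
    """Leading principal direction of centered rows, found by power iteration."""
    n, p = len(rows), len(rows[0])
    cov = [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(p)]
           for i in range(p)]
    v = [1.0] * p
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

def scores(rows, direction):
    """Project each centered row onto the given direction."""
    return [sum(a * b for a, b in zip(r, direction)) for r in rows]

def share_inside(new_scores, train_scores):
    """Share of new scores that fall within the range of the training scores."""
    lo, hi = min(train_scores), max(train_scores)
    return sum(lo <= s <= hi for s in new_scores) / len(new_scores)

train = make_rows(300)
means = column_means(train)
direction = first_component(center(train, means))
train_scores = scores(center(train, means), direction)

new_same = make_rows(100)                 # same mechanism as the training set
new_shifted = make_rows(100, shift=8.0)   # different mechanism (shifted predictors)

overlap_same = share_inside(scores(center(new_same, means), direction), train_scores)
overlap_shifted = share_inside(scores(center(new_shifted, means), direction), train_scores)
```

New data from the same mechanism land almost entirely inside the training range, while the shifted data do not. In practice one would plot the first two components for both sets rather than summarize with a single number.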



Quantifying the Likelihood

When projecting many predictors into two dimensions, intricate predictor relationships as

well as sparse and dense pockets of space can be masked.

Hastie et al. (2008) describe an approach for quantifying the likelihood that a new

sample is a member of the training data.

Rather than use the method of Hastie et al. unchanged, the authors propose two slight alterations to it.
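A minimal sketch of the general idea of scoring training-set membership, not a reproduction of the method of Hastie et al. or of the authors' alterations. It assumes a reference set built by shuffling each predictor column independently, and it uses a k-nearest-neighbour vote as a stand-in classifier. The data, the function names, and the value of `k` are hypothetical.

```python
import random

def permuted_copy(rows, rng):
    """Shuffle each column independently, breaking relationships between predictors."""
    cols = [list(c) for c in zip(*rows)]
    for c in cols:
        rng.shuffle(c)
    return [list(r) for r in zip(*cols)]

def train_membership_prob(train, reference, x, k=15):
    """Share of the k nearest neighbours of x that come from the real training set."""
    labelled = [(r, 1) for r in train] + [(r, 0) for r in reference]

    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    nearest = sorted(labelled, key=lambda rl: sq_dist(rl[0], x))[:k]
    return sum(label for _, label in nearest) / k

rng = random.Random(1)

# Hypothetical training set: two strongly correlated predictors.
train = []
for _ in range(300):
    a = rng.gauss(0, 1)
    train.append([a, a + rng.gauss(0, 0.1)])

reference = permuted_copy(train, rng)

# A sample that respects the training correlation, and one that violates it
# while staying inside each predictor's individual range.
p_typical = train_membership_prob(train, reference, [1.0, 1.0])
p_unusual = train_membership_prob(train, reference, [1.5, -1.5])
```

The second sample would pass a per-predictor range check, yet it receives a low membership score because the combination of values is unlike anything in the training set.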



Introduction

An underlying presumption is that the more samples we have, the better the model we can produce.

A large number of samples can be beneficial, especially if the samples contain

information throughout the predictor space.

Measurement errors can minimize any advantages that may be brought by an increase

in the number of samples.

An increase in the number of samples can also have less positive consequences.

The Impact of a Large Sample


Less Positive Consequences

1. Many of the predictive models have significant computational burdens as the number of

samples and predictors grows.

A single tree

Ensembles of trees

2. There are diminishing returns on adding more of the same data from the same

population.
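One familiar illustration of diminishing returns, offered here as an analogy rather than something taken from the slides, is the standard error of a sample mean, with $\sigma$ the population standard deviation and $n$ the number of samples:

```latex
\operatorname{SE}(\bar{x}) = \frac{\sigma}{\sqrt{n}},
\qquad
\frac{\operatorname{SE}(\bar{x}) \text{ at } 4n}{\operatorname{SE}(\bar{x}) \text{ at } n} = \frac{1}{2}
```

Halving the standard error always requires four times as many samples, so going from 100 to 400 samples buys the same relative improvement as going from 10,000 to 40,000.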
