Hypothesis Testing and Model Complexity

Post on 11-Jan-2016



Domain of groundwater model: topographic contours, a dam, irrigated area, channel system, extraction bores, native woodland, observation bores.

Supplied "from outside" the model:

Inflow from uphill

Groundwater interaction with dam

Groundwater interaction with rivers

Leakage from channels

Groundwater recharge

Aquifer extraction

More often than not, a definitive model cannot be built.

Recognize this, focus on the question that is being asked and, if necessary, use the model for hypothesis testing.

Remember that model calibration is a form of data interpretation. The whole modelling process is simply advanced data processing.


Cattle Creek Catchment

Soils and current land use

Model grid; fixed head and drainage cells shown coloured

Groundwater levels in June 1996

Groundwater levels in January 1991

Modelled and observed water levels after model calibration.

Calibrated transmissivities (figure: zoned values ranging from about 2 to 1000)

Cattle Creek Catchment

CANE EXPANSION (figure: current and new development areas)

Predictive runs for increased cane production. Leakage from balancing storage at calibration and for prediction:

Run        At calibration   For prediction   Notes
46R10P8    2.5 mm/d         2.5 mm/d
46R15P8    2.5 mm/d         2.5 mm/d
48R14P8    2.5 mm/d         2.5 mm/d         Zone 17 absent
46R3P7     0.0 mm/d         0.0 mm/d
46R4P7     0.0 mm/d         0.0 mm/d
48R8P7     0.0 mm/d         0.0 mm/d         Zone 17 absent
46R10P10   2.5 mm/d         2.5 mm/d
46R11P10   2.5 mm/d         2.5 mm/d
48R14P10   2.5 mm/d         2.5 mm/d         Zone 17 absent

Simple Model (figure: P, E, d, M, Ks, runoff)

• M: Soil Moisture Capacity (mm/m depth)
• d: Effective Rooting Depth
• Ki: Initial loss
• fcap: Field Capacity
• Ks: Saturated Hydraulic Conductivity

"Fixing" a parameter (figure: a probability contour in (p1, p2) parameter space)

This has the potential to introduce bias into key model predictions.

Also, what if this parameter is partly a surrogate for an unrepresented process?

• Not only does uncertainty arise from parameter nonuniqueness; it also arises from lack of certainty in model inputs/outputs and model boundary conditions.

• The model can be used as an instrument for data interpretation, allowing various hypotheses concerning inputs/outputs and boundary conditions to be tested.

• Where did the idea ever come from that there should be one calibrated model?

modeller → construction → calibration → prediction → "the deliverable"

"Dual calibration"

A River Valley (figure: observation bore and pumped bore; two zones with K = 5, Sy = 0.1 and one with K = 25, Sy = 0.3; inflow = 2750; fixed head = 50)

[Figures, 0–300 days: recharge rate (×10⁻³), discharge, pumping rate, water levels; borehole hydrographs]

The finite-difference grid and parameter zonation

Calibrated parameters: K = 5, Sy = 0.1; K = 5, Sy = 0.1; K = 25, Sy = 0.3

Field and model-generated borehole hydrographs (field data and model-calculated)

Calibrated parameters: K = 10.2, Sy = 0.21; K = 10.2, Sy = 0.21; K = 18.8, Sy = 0.21

Field and model-generated borehole hydrographs (field data and model-calculated)

Simulation of Drought Conditions

• Decrease inflow from left from 2750 to 2200 m³/day.

• Increase pumping from left bore from (1500, 1000, 0, 1500) to 2000 m³/day.

• Increase pumping from right bore from (2000, 1000, 500, 1500) to 3000 m³/day.

• Run model for 91 days.

• Same initial heads, i.e. 54 m.

For “true parameters”, water level in right bore after this run is 43.9m.

Is it possible that the water level in the left bore will be as low as 42m?

Use PEST with “model” comprised of two MODFLOW runs, one under calibration conditions and one under predictive conditions.

In the latter case there is only one "observation", viz. that the water level in the right pumped cell is 42 m at the end of the run (its weight is the sum of the weights used for all water levels over the calibration period).

Methodology (figure: PEST writes model input files and reads model output files; the composite "model" comprises one run under calibration conditions and one under predictive conditions)
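The two-run composite scheme can be sketched numerically. The following is a minimal illustration in the spirit of PEST's predictive analyser, using a hypothetical two-parameter linear model rather than MODFLOW (all names and numbers here are invented for illustration): calibration data constrain one combination of the parameters, and a single heavily weighted "predictive observation" pushes the poorly constrained combination toward the value being tested.

```python
import numpy as np

# Hypothetical two-parameter model: calibration observations depend
# on p1 + p2, while the prediction of interest depends on p1 - p2
# and is therefore poorly constrained by the calibration data.
rng = np.random.default_rng(0)
p_true = np.array([3.0, 1.0])
obs = (p_true[0] + p_true[1]) + rng.normal(0.0, 0.1, size=20)

def phi_cal(p):
    """Calibration objective: sum of squared misfits."""
    return float(np.sum((p[0] + p[1] - obs) ** 2))

def predict(p):
    """The prediction whose extremes we want to explore."""
    return p[0] - p[1]

# One extra "observation": can the prediction be as low as -2?
# Its weight equals the sum of the calibration weights, as in the text.
target = -2.0
w_pred = float(len(obs))

def phi_dual(p):
    return phi_cal(p) + w_pred * (predict(p) - target) ** 2

# Crude coordinate search standing in for PEST's gradient-based scheme.
p = np.zeros(2)
for _ in range(50):
    for j in range(2):
        grid = p[j] + np.linspace(-0.5, 0.5, 101)
        candidates = [np.array([g, p[1]]) if j == 0 else np.array([p[0], g])
                      for g in grid]
        p = min(candidates, key=phi_dual)

# p now fits the calibration data nearly as well as the best-fit
# parameters, while driving the prediction to the target value.
```

If the composite objective can be driven low, a parameter set exists that respects the calibration data while producing the extreme prediction; if it cannot, the extreme prediction can be judged incompatible with the data.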

Calibrated parameters: K = 22, Sy = 0.14; K = 16, Sy = 0.16; K = 9.8, Sy = 0.28

Field and model-generated borehole hydrographs over calibration period (field data and model-calculated)

Water level in right pumped bore at end of drought = 42 m.

Is it possible that the water level in the left bore will be as low as 40m?

Calibrated parameters: K = 22, Sy = 0.14; K = 16, Sy = 0.16; K = 9.8, Sy = 0.28

Field and model-generated borehole hydrographs over calibration period (field data and model-calculated)

Water level in right pumped bore at end of drought = 40 m.

Calibrated parameters: K = 5, Sy = 0.099; K = 14, Sy = 0.11; K = 20, Sy = 0.32; K = 4.6, Sy = 0.090

Field and model-generated borehole hydrographs over calibration period (field data and model-calculated)

Water level in right pumped bore at end of drought = 40 m.

Is it possible that the water level in the left bore will be as low as 36m?

Calibrated parameters: K = 8.8, Sy = 0.13; K = 15, Sy = 0.14; K = 18, Sy = 0.29; K = 2.7, Sy = 0.19

Field and model-generated borehole hydrographs over calibration period (field data and model-calculated)

Water level in right pumped bore at end of drought = 36 m.

We are not calibrating a groundwater model. We are calibrating our regularisation methodology.

Some Lessons

• if possible, include in the calibration dataset measurements of the type that you need to predict

• intuition and knowledge of an area play just as important a part in modelling as does the model itself

• focus on what the model needs to predict when building the model…..

There should be no such thing as a model for an area, only for a specific problem.

So how should we model?

A model area (figure: open cut mines, underground mines, waterholes, extraction bores, monitoring bores, gauging stations)

Sources of Uncertainty Close to Waterholes

• conductance of bed (and heterogeneity thereof)

• change in bed conductance with wetted perimeter

• change in bed conductance with inflow/outflow and season

• relationship between area and level

• relationship between level and flow

• rate of evaporation

• hydraulic properties of rocks close to ponds

• behaviour during flood events

• change in hydraulic characteristics after flood events

• uncertainty in future flows

• inflow to ponds from neighbouring surface catchment

• lack of borehole data to define groundwater mounds

• uncertainties in streamflow

Let’s start again…..

Complexity leads to parameter uncertainty.

Parameter correlation can be enormous due to inadequate data.

Parameter uncertainty may lead to predictive uncertainty.

The more that the prediction depends on system “fine detail”, the more this is likely to occur.

Predictive uncertainty must be analysed.

Complexity must be “focussed” - dispense with non-essential complexity.

No model should be built independently of the prediction which it has to make.

Sensitive area (figure: the model area with open cut mines, underground mines, waterholes, and a sensitive area highlighted)

A model is not a database! A model is a data processor.

Ubiquitous complexity in a "do-everything model"

Focussed complexity in a prediction-specific model

Model Complexity

For reasons which we have already discussed, a complex model is really a simple model in disguise.

Complex models:
• have more parameters
• have longer run times
• are more prone to numerical instability
• are more costly
• destroy the user's intuition

The level of complexity is set by system properties to which the prediction is most sensitive.

Objective function contours, linear model (figure: the objective function minimum and probability contours in (p1, p2) space, with the contour's principal axes labelled p1+p2 and p1-p2)
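The elongated probability contour can be reproduced with a toy linear analysis (illustrative numbers only, not from any real model): when nearly every observation is sensitive only to p1 + p2, the parameter covariance matrix has a short axis along p1 + p2 and a long axis along p1 - p2.

```python
import numpy as np

# Toy probability-contour calculation with made-up numbers: 20
# observations, each sensitive (almost) only to p1 + p2.
J = np.ones((20, 2))          # Jacobian: every row sees p1 + p2 equally
J[0, 1] = 0.99                # almost, but not exactly, degenerate
cov = np.linalg.inv(J.T @ J)  # linear posterior covariance (unit noise)

eigvals, eigvecs = np.linalg.eigh(cov)
axis_ratio = eigvals[-1] / eigvals[0]   # long axis vs short axis

# Correlation between p1 and p2 is close to -1: the data fix the sum
# p1 + p2 tightly while leaving the difference p1 - p2 almost free.
corr = cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])
long_axis = eigvecs[:, -1]    # points along the (1, -1) direction
```

The near-perfect negative correlation is exactly the "enormous parameter correlation due to inadequate data" discussed later in these notes.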

Ideally, simplification of a model should be done in such a way that only the parameters that “don’t matter” are dispensed with.

There are many cases where a specific prediction depends on the values of individual parameters, the very parameters that cannot be resolved by the parameter estimation process.

In fact, that is often why we are using a physically based model; if calibration alone sufficed for full parameterisation, then a black box would be all we need.
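A toy calculation with hypothetical numbers makes the bias mechanism concrete: fix one member of a highly correlated pair at a wrong value, and the calibration still looks excellent because the other member compensates, but any prediction that depends on the individual parameter values inherits the error.

```python
import numpy as np

# Hypothetical demonstration that "fixing" a correlated parameter can
# bias a prediction. Calibration data inform only p1 + p2; the
# prediction of interest is p1 itself.
rng = np.random.default_rng(1)
p1_true, p2_true = 3.0, 1.0
obs = (p1_true + p2_true) + rng.normal(0.0, 0.05, size=50)

# Fix p2 at a plausible-looking but wrong value, then "calibrate" p1.
p2_fixed = 0.0
p1_fit = obs.mean() - p2_fixed   # least-squares estimate of p1 given p2

fit_misfit = float(np.mean((p1_fit + p2_fixed - obs) ** 2))  # fit is excellent...
prediction_bias = p1_fit - p1_true                           # ...prediction is biased
```

The misfit is at the noise level, so nothing in the calibration warns of the roughly one-unit error that p1 has absorbed on behalf of the fixed p2.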

Over-simplified model design introduces bias, for we are effectively assuming values for unrepresented parameters.

"Fixing" a parameter (figure: a probability contour in (p1, p2) space)

Increasing model complexity (figures: potential error in prediction versus complexity, with curves for bias and predictive uncertainty; at the point marked, these levels are equal)

But we don't know how much bias we are introducing.

The point where no further complexity is warranted is the point where the uncertainty of a specific model prediction no longer rises.

Essential and non-essential complexity are prediction-dependent.

Complexity does not guarantee the “right answer” - it guarantees that the right answer will lie within the limits of predictive uncertainty.

Complexity without uncertainty analysis is a waste of time. A complex model can be just as biased as a simple model.

Use a simple model and add the “predictive noise” – far cheaper.

A complex model allows you to replace "predictive noise" with science. But if you don't do that, what is the point of a complex model?

An Example….

NORTH CAROLINA: Neuse River basin, Contentnea Creek watershed (figure: NC county boundaries; catchments Sandy Run (77 km²), Middle Swamp (140 km²), Little Contentnea (470 km²), Contentnea (2600 km²), Neuse (14500 km²))

[Figures: observed and modelled flows, 1-Jan-83 to 1-Jan-84; observed and modelled monthly volumes, 1970–1986; observed and modelled exceedence fractions versus flow (cu ft/sec)]

Parameter values:
LZSN    2.0
UZSN    2.0
INFILT  0.0526
BASETP  0.200
AGWETP  0.00108
LZETP   0.50
INTFW   10.0
IRC     0.677
AGWRC   0.983

[Figures: observed and modelled flows, 1-Jan-83 to 1-Jan-84; observed and modelled monthly volumes, 1970–1986; observed and modelled exceedence fractions versus flow (cu ft/sec)]

Parameter  Set 1    Set 2   Set 3   Set 4   Set 5   Set 6
LZSN       2.0      2.0     2.0     2.0     2.0     2.0
UZSN       2.0      1.79    2.0     2.0     1.76    2.0
INFILT     0.0526   0.0615  0.0783  0.0340  0.0678  0.0687
BASETP     0.200    0.182   0.199   0.115   0.179   0.200
AGWETP     0.00108  0.0186  0.0023  0.0124  0.0247  0.0407
LZETP      0.50     0.50    0.20    0.72    0.50    0.50
INTFW      10.0     3.076   1.00    4.48    4.78    2.73
IRC        0.677    0.571   0.729   0.738   0.759   0.320
AGWRC      0.983    0.981   0.972   0.986   0.981   0.966
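Treating the model as a data processor, the six equally acceptable parameter sets above can themselves be interrogated. A short script (values transcribed from the table) shows which parameters the calibration data actually pin down:

```python
# Six equally well calibrated parameter sets, transcribed from the
# table above; the max/min ratio per parameter shows how loosely
# each one is constrained by the calibration data.
sets = {
    "LZSN":   [2.0, 2.0, 2.0, 2.0, 2.0, 2.0],
    "UZSN":   [2.0, 1.79, 2.0, 2.0, 1.76, 2.0],
    "INFILT": [0.0526, 0.0615, 0.0783, 0.0340, 0.0678, 0.0687],
    "BASETP": [0.200, 0.182, 0.199, 0.115, 0.179, 0.200],
    "AGWETP": [0.00108, 0.0186, 0.0023, 0.0124, 0.0247, 0.0407],
    "LZETP":  [0.50, 0.50, 0.20, 0.72, 0.50, 0.50],
    "INTFW":  [10.0, 3.076, 1.00, 4.48, 4.78, 2.73],
    "IRC":    [0.677, 0.571, 0.729, 0.738, 0.759, 0.320],
    "AGWRC":  [0.983, 0.981, 0.972, 0.986, 0.981, 0.966],
}
spread = {name: max(v) / min(v) for name, v in sets.items()}
```

INTFW ranges over a factor of ten between equally good calibrations, while LZSN and AGWRC barely move: a prediction sensitive to INTFW is highly uncertain despite the good fits.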

[Figures: observed and modelled flows over validation period, 1-Jan-93 to 1-Jan-94; observed and modelled monthly volumes over validation period, 1986–1995; observed and modelled exceedence fractions over validation period versus flow (cu ft/sec)]

Parameterisation using PEST's predictive analyser

[Figures: observed and modelled flows over the validation period, 1-Jan-93 to 1-Jan-94, and over the calibration period, 1-Jan-83 to 1-Jan-84]

Adjustable parameters: LZSN, UZSN, INFILT, BASETP, AGWETP, LZETP, INTFW, IRC, AGWRC; then with DEEPFR added.