VRT
Variance reduction techniques
Introduction
• Simulation models should be coded so that they run efficiently.
• Programming efficiency means fast execution and low storage requirements.
• "Statistical" efficiency relates to the variance of the output random variables from a simulation.
• If we can somehow reduce the variance of an output random variable of interest without disturbing its expectation, we obtain greater precision.
• That is, smaller confidence intervals for the same amount of simulating, or, alternatively, a desired precision with less simulating.
• The method of applying Variance Reduction Techniques
(VRTs) usually depends on the particular model(s) of interest.
• Hence, a complete understanding of the model is required for
proper use of VRTs.
• Typically, it is impossible to know beforehand how great a variance reduction might be realized.
• Or worse, whether the variance will be reduced at all in comparison with straightforward simulation.
• If affordable, preliminary runs should be made to compare the results of applying a VRT with those from straightforward simulation.
• Some VRTs themselves increase computing costs, and this decrease in computational efficiency must be traded off against the potential gain in statistical efficiency.
• Almost all VRTs require some extra effort from the analyst, even if only to understand the technique.
Common random numbers (CRN)
• Probably the most popular and widely used VRT.
• More commonly used for comparing multiple systems than for analyzing a single system.
• Basic principle: “we should compare alternate configurations
under similar experimental conditions.”
• Hence we will be more confident that the observed differences
are due to differences in the system configuration rather than
the fluctuations of the “experimental conditions.”
• In our simulation experiment, these experimental conditions
are the generated random variates that are used to drive the
model through the simulated time.
• The technique takes its name from the possibility, in many situations, of using the same basic U(0,1) random numbers to drive each of the alternative configurations through time.
• To see the rationale for the use of CRN, consider two systems to be compared with output performance parameters X1j and X2j, respectively for replication j.
• Let µ i = E[Xi] be the expected output measure for system i.
• We are interested in ζ = µ1 - µ2.
• Let Zj = X1j – X2j for j = 1,2 … n.
• Then E[Zj] = ζ. That is,

$$\bar{Z}(n) = \frac{1}{n}\sum_{j=1}^{n} Z_j$$

is an unbiased estimator of ζ.
• Since the Zj’s are IID variables:

$$\operatorname{Var}[\bar{Z}(n)] = \frac{\operatorname{Var}(Z_j)}{n} = \frac{\operatorname{Var}(X_{1j}) + \operatorname{Var}(X_{2j}) - 2\operatorname{Cov}(X_{1j}, X_{2j})}{n}.$$

• If the simulations of the two configurations are done independently, with different random numbers, X1j and X2j will be independent, so the covariance will be zero.
• If we could somehow simulate the two configurations so that X1j and X2j are positively correlated, then Cov(X1j, X2j) > 0, so the variance of the sample mean is reduced.
• Its value is then closer to the population parameter ζ.
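The effect is easy to see numerically. Below is a minimal sketch (not a full simulation model): the two "configurations" are hypothetical exponential outputs with made-up rates 1.0 and 1.2, generated by inverse transform so that each output is monotone in its driving uniform. Feeding both configurations the same uniforms induces the positive correlation:

```python
import math
import random
import statistics

def output(u, rate):
    # Inverse-transform exponential variate: monotone in u
    return -math.log(1.0 - u) / rate

def differences(n, common):
    # Z_j = X_1j - X_2j for two hypothetical configurations (rates 1.0 and 1.2)
    rng1 = random.Random(42)
    rng2 = random.Random(42 if common else 99)
    return [output(rng1.random(), 1.0) - output(rng2.random(), 1.2)
            for _ in range(n)]

z_crn = differences(10_000, common=True)    # same random numbers
z_ind = differences(10_000, common=False)   # independent random numbers

# Both sample means estimate the same zeta, but CRN gives a much smaller variance
print(statistics.variance(z_crn) < statistics.variance(z_ind))  # True
```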
• CRN is a technique where we try to introduce this positive correlation by using the same random numbers to simulate all configurations.
• However, success of using CRN is not guaranteed.
• We can see that as long as the output performance measures X1j and X2j react monotonically to the common random numbers, CRN works.
• However, if X1j and X2j react in opposite directions to the random numbers, CRN backfires.
• Another drawback of CRN is that formal statistical analyses can be complicated by the induced correlation.
Synchronization
• To implement CRN, we must match up, or synchronize, the random numbers across the different configurations on a particular replication.
• Ideally, a random number used for a specific purpose in one configuration should be used for exactly the same purpose in all configurations.
• For example, say we are comparing different configurations of a queuing system.
• If a random number is used to generate service time for one system, the same random number should be used to generate service times for the other systems.
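A common way to achieve this synchronization is to dedicate a separate random-number stream to each purpose. The toy sketch below (the seeds and stream roles are made up, and no actual queue is simulated) shows why this helps: with dedicated generators, the jth service-time number is the same in every configuration, even when the configurations consume different numbers of arrival draws:

```python
import random

def run(n_customers, seed_arrival=1, seed_service=2):
    # One generator per purpose, seeded identically across configurations,
    # so the j-th service draw is the same number in every configuration
    arrivals = random.Random(seed_arrival)
    services = random.Random(seed_service)
    a = [arrivals.random() for _ in range(n_customers)]
    s = [services.random() for _ in range(5)]
    return a, s

_, s_cfg1 = run(3)   # configuration 1 consumes 3 arrival numbers
_, s_cfg2 = run(8)   # configuration 2 consumes 8

# The service streams stay synchronized regardless of the arrival draws
print(s_cfg1 == s_cfg2)  # True
```

With a single shared generator, the extra arrival draws in configuration 2 would shift every later service number out of alignment.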
Antithetic variates
• Antithetic variates (AV) is a VRT that is more applicable in simulating a single system.
• As with CRN, we try to induce correlation between separate runs, but now we seek negative correlation.
• Basic idea: Make pairs of runs of the model such that a "small" observation on one run of the pair tends to be offset by a "large" observation on the other.
• So the two observations are negatively correlated.
• Then, if we use the average of the two observations in the pair as a basic data point for analysis, it will tend to be closer to the common expectation µ of an observation than it would be if the two observations in the pair were independent.
• AV induces negative correlation by using complementary
random numbers to drive the two runs of the pair.
• If U is a particular random number used for a particular
purpose in the first run, we use 1 – U for the same purpose in
the second run.
• This number 1 – U is valid because if U ~ U(0,1) then
(1 – U) ~ U(0,1).
• Note that synchronization is required in AV too – use of
complementary random numbers for the same purpose in a
pair.
• Suppose that we make n pairs of runs of the simulation model, resulting in observations (X₁⁽¹⁾, X₁⁽²⁾), …, (Xₙ⁽¹⁾, Xₙ⁽²⁾), where Xⱼ⁽¹⁾ is from the first run of the jth pair and Xⱼ⁽²⁾ is from the antithetic run of the jth pair.
• Both Xⱼ⁽¹⁾ and Xⱼ⁽²⁾ are legitimate observations; that is, E[Xⱼ⁽¹⁾] = E[Xⱼ⁽²⁾] = µ.
• Also, each pair is independent of every other pair.
• The total number of replications is thus 2n.
• For j = 1, 2, …, n, let Xⱼ = (Xⱼ⁽¹⁾ + Xⱼ⁽²⁾)/2, and let the average X̄(n) of the Xⱼ's be the unbiased point estimator of the population mean µ:

$$\mu = E[X_j^{(1)}] = E[X_j] = E[\bar{X}(n)].$$
• Since the Xⱼ's are IID,

$$\operatorname{Var}[\bar{X}(n)] = \frac{\operatorname{Var}(X_j)}{n} = \frac{\operatorname{Var}(X_j^{(1)}) + \operatorname{Var}(X_j^{(2)}) + 2\operatorname{Cov}(X_j^{(1)}, X_j^{(2)})}{4n}.$$

• If the two runs within a pair were made independently, then Cov(Xⱼ⁽¹⁾, Xⱼ⁽²⁾) = 0.
• On the other hand, if we could induce negative correlation between Xⱼ⁽¹⁾ and Xⱼ⁽²⁾, then Cov(Xⱼ⁽¹⁾, Xⱼ⁽²⁾) < 0, which reduces Var[X̄(n)].
• This is the goal of AV.
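Continuing the toy response X = U² (again just an assumption for illustration), the paired-average estimator built from an antithetic pair has visibly smaller variance than the average of two independent runs:

```python
import random
import statistics

rng = random.Random(1)

def pair_average_av():
    # One antithetic pair: runs driven by U and 1 - U, then averaged
    u = rng.random()
    return (u ** 2 + (1 - u) ** 2) / 2

def pair_average_independent():
    # Two independent runs, averaged, for comparison
    return (rng.random() ** 2 + rng.random() ** 2) / 2

av = [pair_average_av() for _ in range(20_000)]
ind = [pair_average_independent() for _ in range(20_000)]

# Same expectation (1/3) either way, but the antithetic pairs vary far less
print(statistics.variance(av) < statistics.variance(ind))  # True
```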
• Like CRN, AV doesn’t guarantee that the method will work every time.
• For AV to work, the response to a random number used for a particular purpose needs to be monotonic – in either direction.
• How about combining CRN and AV?
• We can drive each configuration using AV and then use CRN to simulate the multiple configurations under similar conditions.
Control variates
• The method of Control Variates (CV) attempts to take
advantage of correlation between certain random variables to
obtain a variance reduction.
• Suppose we are interested in the output parameter X. In particular, we want µ = E[X].
• Let Y be another random variable involved in the same simulation that is thought to be correlated with X – either positively or negatively.
• Also suppose that we know the value ν = E[Y].
• If X and Y are positively correlated, then in a particular simulation run, Y > ν makes it likely that X > µ as well.
• Thus, if in a run, we notice that Y > ν, we might suspect that X
is above its expectation µ as well, and accordingly we adjust X
downward by some amount.
• Alternatively, if we find Y < ν, then we would suspect X < µ
as well and adjust X upward accordingly.
• This way, we use our knowledge of Y’s expectation to pull X
(up or down) towards its expected value µ , thus reducing
variability about µ from one run to next.
• We call Y a control variate of X.
• Unlike CRN or AV, success of CV does not depend on the
correlation being of a particular sign.
• If Y and X are negatively correlated, CV would still work.
• Now, we would simply adjust X upward if Y > ν and
downward if Y < ν.
• To implement this idea, we need to quantify the amount of the
upward or downward adjustment of X.
• We will express this quantification in terms of the deviation Y – ν of Y from its expectation.
• Let a be a constant with the same sign as the correlation between X and Y.
• We use a to scale the deviation Y – ν to arrive at an adjustment
to X and thus define the “controlled” estimator:
Xc = X – a(Y – ν).
• Since µ = E[X] and ν = E[Y], for any real number a, E[Xc] = µ.
• So Xc is an unbiased estimator of µ and may have lower variance:
Var(Xc) = Var(X) + a² Var(Y) – 2a Cov(X, Y).
• So Xc has less variance than X if and only if: a² Var(Y) < 2a Cov(X, Y).
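A small sketch with a hypothetical example: X = e^U, so µ = e − 1, and control variate Y = U with known ν = 1/2. The coefficient a = 1.69 below is just a hand-picked value sharing the sign of Cov(X, Y), not a computed optimum:

```python
import math
import random
import statistics

rng = random.Random(7)
us = [rng.random() for _ in range(10_000)]

xs = [math.exp(u) for u in us]   # X: output of interest, with mu = e - 1
ys = us                          # Y: control variate with known nu = 0.5

a = 1.69  # hand-picked constant with the same sign as Cov(X, Y)
xc = [x - a * (y - 0.5) for x, y in zip(xs, ys)]  # controlled estimator

mu = math.e - 1
print(abs(statistics.mean(xc) - mu) < 0.01)               # still centered on mu
print(statistics.variance(xc) < statistics.variance(xs))  # with reduced variance
```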
• We need to choose the value of a carefully so that this condition is satisfied. The optimal value is:

$$a^* = \frac{\operatorname{Cov}(X, Y)}{\operatorname{Var}(Y)}.$$

• In practice, though, it is a bit more difficult than it appears!
• Depending on the source and nature of the control variate Y, we may not know Var(Y), and we will certainly not know Cov(X, Y).
• Hence obtaining the optimal value of a might be difficult.