Statistical Analysis of the CAPM
I. Sharpe–Linter CAPM
Brief Review of the Sharpe–Lintner CAPM
• The Sharpe–Lintner CAPM assumes that
(i) all investors act according to the µ− σ rule,
(ii) can lend and borrow any desired amount at a
common risk–free rate rf ,
(iii) and exhibit perfect agreement with respect to the
probability distribution of asset returns.
• Under these (key) assumptions, the market port-
folio is mean–variance efficient, implying that it is
characterized by weight vector
xm =Σ−1(µ− rf1N)
1′NΣ−1(µ− rf1N)
. (1)
1
• The central equation of the Sharpe–Lintner CAPM
is a direct consequence of (1) and is given by
µi − rf = βi(µm − rf), i = 1, . . . , N, (2)
where
– rf is the risk–free rate,
– µm is the expected return of the market portfolio,
and
– βi = COV (Ri, Rm)/σ2m, where
– Ri is the return of asset i and Rm is the return
of the market portfolio.
• Equation (2) states that there is a linear relation
between the excess return of asset i (over the
risk–free) rate and the excess return of the market
portfolio, with zero intercept.
• Equation (2) also implies efficiency.
2
Framework for Estimation and Testing
• The CAPM relationship (2) is expressed in terms of
expected values, which are not observable.
• To obtain a model with observable quan-
tities, we describe excess returns using the
excess return market model :
rit = αi + βirm,t + ϵit i = 1, . . . , N (3)
E(ϵit) = 0, i = 1, . . . , N (4)
E(ϵitϵjt′) =
σij if t = t′
0 if t = t′i, j = 1, . . . , N (5)
E(rm,tϵi,t) = 0, i = 1, . . . , N. (6)
• Here rit is the excess return on asset i in period t
(over risk–free rate), and rm,t is the excess return
on the market portfolio in period t (over risk–free
rate).
3
• At first glance, the market model we will be using
looks similar to the Single–Index Model (SIM), but
there are important differences:
– All returns involved are excess returns over the
risk–free rate rf .
– According to equation (5), the asset–specific error
terms may be correlated. Thus, we allow for a
non-diagonal covariance matrix, Σ, of the vector
ϵt = [ϵ1t, . . . , ϵNt]′,
COV (ϵt) = Σ =
σ21 σ12 · · · σ1N
σ12 σ22 · · · σ2N
... ... . . . ...
σ1N σ2N · · · σ2N
Conditional on the excess return of the market,
we then also have
COV (rt) = Σ, (7)
where rt = [r1t, . . . , r2t]′.
4
– Note, however, that we still assume that there
is no correlation over time, i.e. E(ϵtϵ′t′) = 0
for t = t′, and that the covariance matrix Σ is
constant over time.
• We will assume that the betas are constant over
time.1
• We will also assume that the error terms follow a
multivariate normal distribution, i.e.,
ϵtiid∼ N(0,Σ). (8)
• The Sharpe–Lintner CAPM implies that the inter-
cept in the excess return market model is zero, i.e.,
α = 0.
1This is by no means self-evident and can, in principle, be tested using
econometric techniques for detecting structural breaks. For an overview with
a view towards CAPM applications, see Schmid/Trede: Finanzmarktstatistik,
Springer.
5
• That is, a test of this model corresponds to a test
of the hypothesis
H0 : αi = 0, i = 1, . . . , N. (9)
• To perform such a test, it is necessary to estima-
te the parameters of the model and to derive an
appropriate test statistic.
• Write our excess return market model as
rt = α+ βrm,t + ϵt, t = 1, . . . , T,
ϵtiid∼ N(0,Σ),
where α = [α1, . . . , αN ]′, and β = [β1, . . . , βN ]′.
6
• The density of excess returns, conditional on themarket return, rm,t, is
f(rt|rm,t)
=exp
−1
2(rt − α − βrm,t)′Σ−1(rt − α − βrm,t)
(2π)N/2|Σ|1/2
,
and the joint density is
f(r1, . . . , rT |rm,1, . . . , rT,1) (10)
=T∏
t=1
f(rt|rm,t)
=
exp
−1
2
T∑t=1
(rt − α − βrm,t)′Σ−1(rt − α − βrm,t)
(2π)NT/2|Σ|T/2
• To estimate the unknown parameters, α, β, and
Σ, of this density, we use the method of maximum
likelihood.
• To do so, we define the log–likelihood function, i.e.,
the log of the joint density viewed as a function of
the unknown parameters.
7
• The maximum likelihood estimator is then found
by maximizing this function with respect to its
arguments, i.e., the unknown parameters.
• From (10), the log–likelihood function is
logL(α,β,Σ) (11)
= −NT
2log(2π)− T
2log |Σ|
−1
2
T∑t=1
(rt −α− βrm,t)′Σ−1
×(rt −α− βrm,t),
which we want to maximize with respect to α, β
and Σ.
8
• From (11), it is clear that the estimates of α and β
are determined by minimizing
S =
T∑t=1
(rt −α− βrm,t)′Σ−1(rt −α− βrm,t)
=
T∑t=1
r′tΣ
−1rt − 2r′tΣ−1(α+ βrm,t)
+(α+ βrm,t)′Σ−1(α+ βrm,t)
.
• The first order conditions are
∂S
∂α= −2Σ−1
T∑t=1
(rt −α− βrm,t) = 0,
∂S
∂β= −2Σ−1
T∑t=1
rm,t(rt −α− βrm,t) = 0.
• We get the standard OLS estimators.
9
• That is,
α = r − βrm
and
β =
∑Tt=1(rt − r)(rm,t − rm)∑T
t=1(rm,t − rm)2
=
∑Tt=1(rt − r)(rm,t − rm)
T σ2m
=
∑Tt=1(rm,t − rm)rt
T σ2m
where
r =1
T
T∑t=1
rt, rm =1
T
T∑t=1
rm,t,
σ2m =
1
T
T∑t=1
(rm,t − rm)2.
10
• To find the MLE of Σ, we make use of the following
differentiation rules:
(i) Let X and A be n× n matrices, so that
tr(XA) =n∑
i=1
n∑j=1
xijaji. (12)
For symmetric X (xij = xji), we therefore have
∂tr(XA)
∂xij=
aii i = j
aij + aji i = j.(13)
Hence
∂tr(XA)
∂X= A+A′ − diag(A),
and∂tr(XA)
∂X= 2A− diag(A), (14)
if A is also symmetric.
11
(ii) To find an expression for the derivative of |X|,recall that
|X| = xi1Ci1 + xi2Ci2 + · · ·+ xinCin, (15)
where Cij is the cofactor of xij in X, i, j =
1, . . . , n.
Again for symmetric X, we thus have
∂ log |X|∂X
=1
|X|∂|X|∂X
= 2X−1 − diag(X−1). (16)
12
The log–likelihood function can be written as
logL = −NT
2log(2π)− T
2log |Σ|
−1
2
T∑t=1
tr(ϵ′Σ−1ϵ
)= −NT
2log(2π) +
T
2log |Σ−1| (17)
−1
2
T∑t=1
tr(Σ−1ϵϵ′
), (18)
where
– ϵt = rt − α− βrm,t,
– (17) uses |A−1| = |A|−1, and
– (18) uses the permutation rule tr(ABC) =
tr(BCA).
13
Thus, using (14) and (16), we require
∂ logL
∂Σ−1 =T
2[2Σ− diag(Σ)]
−1
2
[2
T∑t=1
ϵtϵ′t − diag
(T∑
t=1
ϵtϵ′t
)]= 0,
implying
Σ =1
T
T∑t=1
ϵtϵ′t (19)
=1
T
T∑t=1
(rt − α− βrm,t)(rt − α− βrm,t)′.
• The OLS estimators of α and β are unbiased and
normally distributed with covariance matrices
14
COV (β) =1
T σ4m
COV
T∑
t=1
(rm,t − rm)rt
=1
T σ4m
T∑t=1
(rm,t − rm)2COV (rt)
=1
T σ2m
Σ, and
COV (α) = COV (r − βrm)
= COV
1
T
T∑t=1
rt − rm
T∑t=1
(rm,t − rm)rtT σ2
m
=1
T 2σ4m
T∑t=1
[σ2m − rm(rm,t − rm)]2COV (rt)
=1
T 2σ4m
T [σ4m + r2mσ2
m]Σ
=1
T
(1 +
r2mσ2m
)Σ. (20)
15
• It can be shown that T Σ has a Wishart distribution,
WN(T − 2,Σ), which is a matrix generalization of
the χ2 distribution.2
Moreover, Σ is independent of both α and β.
2See, for example, Zellner (1971).
16
Testing for mean–variance efficiency (α = 0)
• We discuss two tests of the null hypothesis α = 0,
in historical order.
• The first is a likelihood ratio (LR) test relying on
asymptotic arguments,
• The second is an exact finite–sample F-test. Sub-
sequently, the relation between the tests will be
considered.
Likelihood Ratio (LR) Test
• To conduct the likelihood ratio test, we first com-
pute the Maximum Likelihood Estimator under the
null hypothesis that α = 0, which is a regressi-
on through the origin. Denote the corresponding
estimators by β0 and Σ0. They are given by
β0 =
∑Tt=1 rtrm,t∑Tt=1 r
2m,t
, (21)
17
and
Σ0 =1
T
T∑t=1
ϵ0t ϵ0t (22)
=1
T
T∑t=1
(rt − β0rm,t)(rt − β0rm,t)′,
where ϵ0t = rt − β0rm,t.
• The Likelihood Ratio Test is based on the com-
parison between the log–likelihood values of the
unconstrained model and the constrained model.
• More precisely, the LR test statistic is given by
LR = −2(logL0 − logL1), (23)
where logL0 is the log–likelihood function of the
constrained model, and logL1 is the log–likelihood
function of the unconstrained model, each evaluated
at the respective MLEs.
18
• The asymptotic distribution of LR defined in (23)
is χ2 with degrees of freedom equal to the num-
ber of parameter restrictions implied by the null
hypothesis.
• In our situation, this corresponds to N degrees of
freedom (N is the number of assets), because the
CAPM implies that αi = 0 for i = 1, . . . , N .
19
• Now
logL1 = −NT
2log(2π)− T
2log |Σ1|
−1
2
T∑t=1
tr(Σ
−1
1 ϵϵ′)
= −NT
2log(2π)− T
2log |Σ1|
−1
2
T∑t=1
tr
(1
T
T∑t=1
ϵtϵ′t
)−1
ϵϵ′
= −NT
2log(2π)− T
2log |Σ1|
−T
2tr
(
T∑t=1
ϵtϵ′t
)−1 T∑t=1
ϵϵ′
= −NT
2log(2π)− T
2log |Σ1| −
T
2tr(IN)
= −NT
2(log(2π) + 1)− T
2log |Σ1|.
20
• By the same line of arguments,
logL0 = −NT
2(log(2π) + 1)− T
2log |Σ0|.
Consequently,
LR = T[log |Σ0| − log |Σ1|
]asy∼ χ2(N). (24)
21
F Test
• The finite–sample F test is based on the following
result:
Result: If N–dimensional random variable X
is N(0,Ω), the N × N random matrix A is
Wishart(T,Ω), and X and A are independent, then
T −N + 1
NX ′A−1X ∼ FN,T−N+1, (25)
i.e., the quantity [(T −N + 1)/N ]X ′A−1X has an
F distribution with N degrees of freedom in the
numerator and T −N +1 degrees of freedom in the
denominator.
22
• Using, in (25),
X =√T [1 + r2m/σ2
m]−1/2α (26)
and
A = T Σ, (27)
and recalling the results we have for α (in particular,
normality and (20)), the statistic
J =T −N − 1
N
(1 +
r2mσ2m
)−1
α′Σ−1
α (28)
has an F distribution with N degrees of freedom in
the numerator and T − N − 1 degrees of freedom
in the denominator, i.e.,3
J ∼ FN,T−N−1. (29)
3Gibbons/Ross/Shanken (1989): A Test of the Efficiency of a Given
Portfolio. Econometrica 57, 1121-1152.
23
Economic Interpretation of the CAPM F Test
• Apart from following a known finite–sample distri-
bution, the test statistic J defined in (28) also has
economic interpretation.
• Recall that the key testable implication of the CAPM
is that the market portfolio is a µ − σ efficient
portfolio.
• In the presence of a risk–free rate, this means that
the market portfolio is the tangency portfolio.
24
• It can be shown that4
J =
(T −N − 1
N
)θ⋆2 − θ2m
1 + θ2m, (30)
where θ⋆ is the Sharpe ratio of the ex post (i.e.,
using the sample mean vector and the sample co-
variance matrix) efficient portfolio formed from the
risky assets under study (including our market proxy)
and θm is the Sharpe ratio of the portfolio used as
a market proxy in our analysis.
• Equation (30) is particularly interesting because it
uncovers what we are actually testing: We test
whether our market proxy is so far away from the
ex post efficient portfolio that we are not willing to
believe that it is the population tangency portfolio,
where the distance is measured in terms of the
Sharpe Ratio.
4Gibbons/Ross/Shanken (1989): A Test of the Efficiency of a Given
Portfolio. Econometrica 57, 1121-1152.
25
Proof of (30)
• Comparing (28) and (30), the equality between
these quantities follows if we show that α′Σ−1
α =
θ⋆2 − θ2m.
• Let r = [rm, r′]′. The (sample) covariance matrix of
these variables is
V =
[σ2m σ2
mβ′
σ2mβ Σ + σ2
mββ′
]. (31)
• We know that the efficient portfolio using the assets
in r is characterized by the weight vector
w =V −1r
1′V −1r, (32)
and, thus, it has squared Sharpe ratio
θ⋆2 =(w′r)2
w′V w=
(r′V −1r)2
r′V −1r= r′V −1r. (33)
26
• Next, it is easily checked that the inverse of (31) is
V −1 =
[σ−2m + β′Σ−1β −β′Σ−1
−Σ−1β Σ−1
](34)
• Using (34), we get by straightforward computation,
and using (33),
θ⋆2 = r′V −1r = [rm, r′]V −1[rm, r′]′
=r2mσ2m
+ (r − βrm)′Σ−1(r − βrm)
= θ2m + α′Σ−1α,
recalling that α = r − βrm.
27
Relation between F and LR tests
• The finite–sample F test can also be interpreted as
a likelihood ratio test.
• To see this, first note that for the unconstrained
MLE of β, denoted by β1,
β1 =
∑t(rm,t − rm)rt
T σ2m
=
∑t rm,trt − rm
∑t rt
T σ2m
=r2mσ2m
β0 −rmr
σ2m
=r2mσ2m
β0 −rmσ2m
(r − β1rm)− r2mσ2m
β1
=r2mσ2m
β0 −rmσ2m
α− r2mσ2m
β1.
Rearranging and using the basic identity
σ2m = r2m − r2m =
1
T
T∑t=1
r2m,t − r2m,
28
shows that
β0 = β1 +rm
σ2m + r2m
α. (35)
Inserting (35) into Σ0 (see equation (22)) and
noting that the normal equations (12) and (12)
imply
T∑t=1
(rt − α− β1rm,t)′(1− rmrm,t
r2m + σ2m
)α = 0,
we arrive at
Σ0 = Σ1 +
(σ2m
r2m + σ2m
)αα′. (36)
• The Sherman–Morrison formula for the determinant
is as follows: For nonsingular A and conformable
vectors u and v
|A+ uv′| = |A|(1 + v′A−1u). (37)
29
• Formula (37) can be shown as follows.
• Consider first the case A = I.
• Then, since
(I u
0 1 + v′u
)=
(I 0
v′ 1
)(I + uv′ u
0 1
)(I 0
−v′ 1
),
we have
det(I + uv′) = 1 + v′u. (38)
• Next, recalling that det(AB) = det(A) det(B),
det(A+ uv′) = det(A)(I +A−1uv′)
= det(A)(I + (A−1u)v′)
= det(A)(1 + v′A−1u
).
30
When this formula is applied to (36), we obtain
|Σ0| = |Σ1|[1 +
σ2m
r2m + σ2m
α′Σ−1
1 α
]Thus, (24) may be written as
LR = T log|Σ0||Σ1|
= T log
[1 +
σ2m
r2m + σ2m
α′Σ−1
1 α
]= T log
[N
T −N − 1J + 1
],
where J is the F–statistic given by (28), or, equi-
valently,
J =T −N − 1
N
[exp
LRT
− 1
], (39)
which, as (39) is a monotonic transformation of
LR, shows that J may also be interpreted as a
likelihood ratio test.
31
• As the F–test based on (28) is exact, it is, for
realistic sample sizes, clearly preferable compared
to the likelihood ratio test relying on asymptotic
arguments.
• However, for the zero–beta version, an exact test is
much more difficult to obtain, and it may be useful
to consider what is lost when relying on asymptotic
arguments.
32
Finite–sample size of likelihood ratio test fornominal size 5%
N = number of assets,
T = sample size
For example, for N = 10, and T = 60, the critical
value for a LRT with asymptotic size 5% is 18.307,
which corresponds to a critical value of the exact F–
test of
cF =T −N − 1
N
(exp
LRT
T
− 1
)=
49
10
(exp
18.307
60
− 1
)= 1.748.
The actual size of the LRT in this situation is therefore
1− F cdf(1.748; 49, 10) = 0.096.
33
T N = 10 N = 20 N = 40
60 0.096 0.211 0.805
120 0.070 0.105 0.275
180 0.062 0.082 0.164
240 0.059 0.073 0.124
360 0.056 0.064 0.092
• The table shows that the finite–sample size of the
tests is larger than the asymptotic size of 5%.
• As a consequence, the large–sample tests will reject
the null hypothesis too often.
34
Roll’s Critique
• Roll (1977)5 emphasizes that tests of the CAPM
really only reject the mean–variance efficiency of
the market proxy we use in the test (recall equation
(30)).
• This implies that the CAPM is essentially untesta-
ble, because “the theory is not testable unless the
exact composition of the true market portfolio is
known and used in the tests. This implies that the
theory is not testable unless all individual assets are
included in the sample”.
5R. Roll (1977). A Critique of the Asset Pricing Theory’s Tests. Part I: On
Past and Potential Testability of the Theory. Journal of Financial Economics
4, 129-176.
35
• Roll argues that using a proxy for the market port-
folio is subject to two difficulties: “First, the proxy
itself might be mean–variance efficient even when
the true market portfolio is not. This is a real dan-
ger since every sample will display efficient portfolios
that satisfy perfectly all of the theory’s implicati-
ons. (...) On the other hand, the chosen proxy may
turn out to be inefficient; but obviously, this alone
implies nothing about the true market portfolio’s
efficiency”.
• Thus, what we essentially test is a joint hypothesis:
The CAPM and the hypothsis that the portfolio
used in the tests as the market proxy is the true
market portfolio.
• Clearly, it is extremely difficult to measure the “mar-
ket portfolio”, because this entity can, in principle,
include not just traded financial assets, but also
consumer durables, real estate, and human capital.
36
• On the other hand, often our interest is not to test
the CAPM (i.e., efficiency of the market portfolio)
but simply whether a specific portfolio is mean–
variance efficient within a given universe of assets.
37
References
• Campbell/Lo/MacKinlay (1997). The Econometrics
of Financial Markets. Princeton University Press:
Princeton.
• E. F. Fama and K. R. French (2004). The Capital
Asset Pricing Model: Theory and Evidence. Journal
of Economic Perspectives, 18, 25–46.
• F. Schmid and M. Trede (2006(?)). Finanzmarkt-
statistik, Springer, Kapitel 7.
• Zellner, A. (1971). Introduction to Bayesain Infe-
rence in Econometrics. New York: John Wiley &
Sons.
38
Top Related