SAM_2012_Pollak
Theoretical background for portfolio theory
COVARIANCE ESTIMATION AND RELATED PROBLEMS IN PORTFOLIO OPTIMIZATION
Ilya Pollak
Purdue University
School of Electrical and Computer Engineering
West Lafayette, IN 47907
USA
ABSTRACT
This overview paper reviews covariance estimation problems and re-
lated issues arising in the context of portfolio optimization. Given
several assets, a portfolio optimizer seeks to allocate a fixed amount
of capital among these assets so as to optimize some cost function.
For example, the classical Markowitz portfolio optimization frame-
work defines portfolio risk as the variance of the portfolio return,
and seeks an allocation which minimizes the risk subject to a target
expected return. If the mean return vector and the return covariance
matrix for the underlying assets are known, the Markowitz problem
has a closed-form solution.
In practice, however, the expected returns and the covariance
matrix of the returns are unknown and are therefore estimated from
historical data. This introduces several problems which render the
Markowitz theory impracticable in real portfolio management appli-
cations. This paper discusses these problems and reviews some of
the existing literature on methods for addressing them.
Index Terms— Covariance, estimation, portfolio, market, fi-
nance, Markowitz
1. INTRODUCTION
The return of a security between trading day t1 and trading day t2
is defined as the change in the closing price over this time period,
divided by the closing price on day t1. For example, the daily (i.e.,
one-day) return on trading day t is defined as (p(t)−p(t−1))/p(t−
1) where p(t) is the closing price on day t and p(t−1) is the closing
price on the previous trading day. Note that if t is a Monday or the
day after a holiday, the previous trading day will not be the same as
the previous calendar day.
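As a small illustration of the return definition above, the following sketch computes one-day returns from a list of closing prices. The function name `simple_returns` and the price values are hypothetical and chosen only for the example; the list is assumed to contain one closing price per trading day, so weekends and holidays are already skipped.

```python
def simple_returns(prices):
    """One-period returns (p(t) - p(t-1)) / p(t-1) for consecutive closing prices."""
    return [(p1 - p0) / p0 for p0, p1 in zip(prices[:-1], prices[1:])]


# Hypothetical closing prices on three consecutive trading days.
closing_prices = [100.0, 102.0, 99.96]
daily = simple_returns(closing_prices)  # a 2% gain followed by a 2% loss
```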
Suppose an investment is made into N assets whose return vec-
tor is R, modeled as a random vector with expected return µ =
E[R] and covariance matrix Λ = E[(R − µ)(R − µ)T ]. In other
words, R = (R(1), . . . , R(N))T where R(n) is the return of the n-th
asset. It is assumed throughout the paper that the covariance matrix
Λ is invertible. This assumption is realistic, since it is quite unusual
in practice to have a set of assets with a nontrivial linear combination
whose return has exactly zero variance. Even if an investment universe contained
such a set, the number of assets in the universe could be reduced to
eliminate the linear dependence and make the covariance matrix in-
vertible.
Out of these N assets, a portfolio is formed with allocation
weights w = (w(1), . . . , w(N))T . The n-th weight is defined as the
amount invested into the n-th asset, as a fraction of the overall invest-
ment into the portfolio: if the overall investment into the portfolio is
$D, and $D(n) is invested into the n-th asset, then w(n) = D(n)/D.
Therefore, by definition, the weights sum to one:
$$ w^T \mathbf{1} = 1, \qquad (1) $$
where 1 is an N -vector of ones. Note that some of the weights may
be negative, signifying short positions obtained by selling borrowed
units of an asset. It is assumed throughout the paper that short selling
can be done freely for all N assets.
It is easily shown then that the total portfolio return is wT R.
The expected portfolio return is therefore wT µ, and the variance of
the return is wT Λw.
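The three quantities above can be checked numerically. The following is a minimal sketch in which the mean vector, the covariance matrix, and the weights are made-up numbers, not estimates from any data set; the third weight is negative to illustrate a short position.

```python
import numpy as np

# Hypothetical inputs for N = 3 assets.
mu = np.array([0.08, 0.05, 0.03])            # expected returns
cov = np.array([[0.04, 0.01, 0.00],
                [0.01, 0.02, 0.00],
                [0.00, 0.00, 0.01]])         # return covariance matrix
w = np.array([0.5, 0.7, -0.2])               # weights; -0.2 is a short position

weights_sum = w.sum()            # normalization constraint (1): should equal 1
expected_return = w @ mu         # w^T mu
return_variance = w @ cov @ w    # w^T Lambda w
```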
1.1. Classical Markowitz Portfolio Optimization
The classical Markowitz portfolio framework [37] defines portfo-
lio risk as the variance of the portfolio return, and seeks a portfolio
weight vector w which minimizes the portfolio risk subject to a tar-
get expected return µtgt:
Find $w^*$ to minimize $w^T \Lambda w$ (2)
subject to $w^T \mu = \mu_{\mathrm{tgt}}$ (3)
Using Lagrange multipliers to perform minimization (2) subject to
the constraints (1) and (3) yields [37, 45]:
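$$ w^* = \Lambda^{-1} m A^{-1} c, \qquad (4) $$
where $m = (\mu \;\; \mathbf{1})$ is the $N \times 2$ matrix whose columns are µ and 1,
$A = m^T \Lambda^{-1} m$, and $c = (\mu_{\mathrm{tgt}} \;\; 1)^T$.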
The global minimum-variance portfolio (GMVP) is obtained by
dropping the mean constraint (3) and instead minimizing portfolio
risk (2) subject only to the weight normalization constraint (1). This
yields the following weight vector:
$$ w_{\mathrm{gmvp}} = \frac{\Lambda^{-1} \mathbf{1}}{\mathbf{1}^T \Lambda^{-1} \mathbf{1}}. \qquad (5) $$
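Eqs. (4) and (5) translate directly into a few lines of linear algebra. The sketch below assumes the mean vector and covariance matrix are known exactly; the function names and the numerical inputs are hypothetical. Linear systems are solved instead of forming the inverse explicitly, which is a numerical choice and not part of the formulas themselves.

```python
import numpy as np


def markowitz_weights(mu, cov, mu_tgt):
    """Eq. (4): minimum-variance weights with target mean mu_tgt and unit sum."""
    n = len(mu)
    m = np.column_stack([mu, np.ones(n)])   # N x 2 matrix (mu, 1)
    cov_inv_m = np.linalg.solve(cov, m)     # Lambda^{-1} m
    a = m.T @ cov_inv_m                     # 2 x 2 matrix A = m^T Lambda^{-1} m
    c = np.array([mu_tgt, 1.0])
    return cov_inv_m @ np.linalg.solve(a, c)


def gmvp_weights(cov):
    """Eq. (5): global minimum-variance portfolio weights."""
    ones = np.ones(cov.shape[0])
    x = np.linalg.solve(cov, ones)          # Lambda^{-1} 1
    return x / (ones @ x)


# Hypothetical inputs for three assets.
mu = np.array([0.08, 0.05, 0.03])
cov = np.array([[0.04, 0.01, 0.00],
                [0.01, 0.02, 0.00],
                [0.00, 0.00, 0.01]])
w_star = markowitz_weights(mu, cov, mu_tgt=0.06)
w_gmvp = gmvp_weights(cov)
```

Both weight vectors sum to one, and `w_star @ mu` equals the target; the GMVP variance is never larger than that of the mean-constrained portfolio, since it solves the same problem with one constraint fewer.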
1.2. Practical Difficulties with the Framework
In practice, the expected returns and the covariance matrix of the
returns are not known and are therefore estimated from historical
data [14, 25, 26, 27, 32, 42, 7, 33, 2, 34, 35, 10, 53]. This introduces
three well-known problems:
• Over long time periods, financial data is typically nonstation-
ary. This limits the amount of data that can be used to mean-
ingfully estimate the mean and the covariance of the asset re-
turn vector. On the other hand, sample covariance has many
parameters and requires large amounts of data to estimate.
For example, if a portfolio includes 1000 stocks, then sample
covariance has roughly 500,000 parameters to be estimated.
Therefore, alternative estimators of the covariance are often
required.
• The optimal portfolio weights are very sensitive to the esti-
mated means and covariances [25, 26, 27, 39, 3, 28, 9, 4].
In other words, a small change of the estimates may lead to
a drastic change of portfolio weights computed by replacing
the mean vector and the covariance matrix in Eqs.(4) or (5)
with their estimates.
• The optimal portfolio tends to amplify large estimation errors
in certain directions [25, 39]. This stems from the fact that if
the variance of an asset (or a sub-portfolio of assets) is sig-
nificantly underestimated and thus appears to be small, the
optimal portfolio will assign a large weight to it. Similarly,
a large weight will be assigned if the mean return of an asset
or a sub-portfolio appears to be large as a result of being sig-
nificantly overestimated. As a result, the risk of the estimated
optimal portfolio is typically underpredicted and its return is
overpredicted [30, 31].
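The sensitivity described in the second item can be seen already with two assets. The following toy sketch (all numbers hypothetical) computes the GMVP of Eq. (5) for two highly correlated assets, first with equal volatilities and then with the first volatility estimate raised from 20% to 21%.

```python
import numpy as np


def gmvp_weights(cov):
    """Global minimum-variance weights, Eq. (5)."""
    ones = np.ones(cov.shape[0])
    x = np.linalg.solve(cov, ones)
    return x / (ones @ x)


def two_asset_cov(sigma1, sigma2, rho):
    """Covariance matrix of two assets with given volatilities and correlation."""
    c = rho * sigma1 * sigma2
    return np.array([[sigma1 ** 2, c],
                     [c, sigma2 ** 2]])


# Hypothetical estimates: correlation 0.99, volatilities of about 20%.
w_base = gmvp_weights(two_asset_cov(0.20, 0.20, 0.99))  # equal weights
w_pert = gmvp_weights(two_asset_cov(0.21, 0.20, 0.99))  # first volatility up by 0.01
```

With these inputs the first weight moves from 0.5 to roughly -1.7, so a small change in one volatility estimate turns a balanced portfolio into a leveraged long-short position.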
In Section 2, we review a number of covariance estimation meth-
ods that have been used in the literature to overcome the first of these
three problems. However, regardless of which covariance estimation
method is used, the second and third problems remain. Several
methods for addressing them are reviewed in Section 3.
2. COVARIANCE ESTIMATION METHODS FOR
MARKOWITZ PORTFOLIOS
Apart from using the sample covariance, the current portfolio opti-
mization literature has three main types of approaches for estimating
the covariance matrix: imposing a parametric model; bootstrapping;
and shrinkage methods. To empirically evaluate a covariance esti-
mation method, typically the following testing procedure is adopted:
1: Two parameters are selected, the length of the training win-
dow L and the length of the holding window K.
2: Mean returns and the covariance matrix are estimated from
the historical data for the time period [t − L, t].
3: Portfolio weights are calculated based on the estimated co-
variance and mean returns, and the resulting portfolio is held
for the time period [t, t + K].
4: Steps 2 and 3 are repeated with t replaced by t + K, t +
2K, . . . , t + pK.
5: The mean and standard deviation of the portfolio return are
estimated based on the entire testing period.
Using this procedure, several covariance estimators described below
are evaluated in [41].
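The five-step procedure above can be sketched as follows. This is an illustration under several assumptions that are not taken from [41]: returns are simulated i.i.d. Gaussian data, the portfolio is the GMVP of Eq. (5) built from the sample covariance, windows are half-open index ranges, and the weights are held fixed for every day of the holding window (which implicitly assumes daily rebalancing). The names `rolling_backtest` and `weight_fn` are hypothetical.

```python
import numpy as np


def gmvp_weights(cov):
    """Global minimum-variance weights, Eq. (5)."""
    ones = np.ones(cov.shape[0])
    x = np.linalg.solve(cov, ones)
    return x / (ones @ x)


def rolling_backtest(returns, L, K, weight_fn):
    """Steps 2-5: estimate on L periods, hold for K periods, slide forward by K."""
    T = returns.shape[0]
    realized = []
    for t in range(L, T - K + 1, K):
        train = returns[t - L:t]                # training window
        w = weight_fn(train)                    # weights from estimated parameters
        realized.extend(returns[t:t + K] @ w)   # portfolio returns while holding
    realized = np.asarray(realized)
    return realized.mean(), realized.std(ddof=1), realized


# Simulated data standing in for historical returns (T = 500 days, N = 10 assets).
rng = np.random.default_rng(0)
returns = rng.normal(0.0005, 0.01, size=(500, 10))
mean_ret, std_ret, realized = rolling_backtest(
    returns, L=120, K=20,
    weight_fn=lambda x: gmvp_weights(np.cov(x, rowvar=False)))
```

Swapping a different covariance estimator into `weight_fn` while keeping L and K fixed gives a like-for-like comparison of estimators on the same testing period.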
2.1. Simple Estimators
Several basic covariance estimators are often used as benchmarks in
the literature.
Diagonal estimators assume that the asset returns are pairwise
uncorrelated. The variances of individual asset returns are usually
estimated as sample variances. An even simpler multiple-of-identity
model is one of the approaches used in [34] where the estimate of
the covariance matrix is a scalar multiple of the identity matrix, with
the mean of the sample variances used as the diagonal entry.
The constant correlation estimator assumes that every pair of
stocks in the portfolio has the same correlation coefficient [14]. For
a portfolio with N assets, this model has N +1 parameters: N return
variances and one correlation, all of which are estimated using the
corresponding sample quantities.
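The diagonal, multiple-of-identity, and constant correlation estimators can be sketched as below. The data are simulated, the function names are hypothetical, and the common correlation is taken to be the average of the off-diagonal sample correlations, which is one natural reading of "corresponding sample quantities" and is stated here as an assumption.

```python
import numpy as np


def diagonal_estimator(returns):
    """Sample variances on the diagonal, zero correlations."""
    return np.diag(np.var(returns, axis=0, ddof=1))


def identity_estimator(returns):
    """Scalar multiple of the identity; the scalar is the mean sample variance."""
    n = returns.shape[1]
    return np.mean(np.var(returns, axis=0, ddof=1)) * np.eye(n)


def constant_correlation_estimator(returns):
    """Sample variances plus a single common correlation (assumed: the average)."""
    s = np.cov(returns, rowvar=False)
    std = np.sqrt(np.diag(s))
    corr = s / np.outer(std, std)
    n = s.shape[0]
    rho = (corr.sum() - n) / (n * (n - 1))   # mean off-diagonal correlation
    est = rho * np.outer(std, std)
    np.fill_diagonal(est, std ** 2)
    return est


# Simulated returns: 100 periods, 8 assets.
rng = np.random.default_rng(1)
returns = rng.normal(0.0, 0.01, size=(100, 8))
cov_diag = diagonal_estimator(returns)
cov_ident = identity_estimator(returns)
cov_cc = constant_correlation_estimator(returns)
```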
Both the constrained and unconstrained optimal weights require
inverse covariance matrices, as evident from Eqs. (4,5). In many
practical situations the number of stocks is larger than the number of
historical returns per stock, resulting in a singular sample covariance
matrix. The pseudoinverse method estimates the inverse covariance
matrix as the pseudoinverse of the sample covariance [34].
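A short sketch of the singular case, using simulated data with more assets than observations. The dimensions and the cutoff passed as `rcond` are arbitrary choices for the example.

```python
import numpy as np

# 20 historical returns per asset, 50 assets: fewer observations than assets.
rng = np.random.default_rng(2)
returns = rng.normal(0.0, 0.01, size=(20, 50))

sample_cov = np.cov(returns, rowvar=False)        # 50 x 50, rank at most 19
rank = np.linalg.matrix_rank(sample_cov)

# Moore-Penrose pseudoinverse used in place of the (nonexistent) inverse.
# Singular values below rcond times the largest one are treated as zero.
cov_pinv = np.linalg.pinv(sample_cov, rcond=1e-10)
```

The pseudoinverse inverts the sample covariance only on the subspace spanned by the observed returns and is zero on its orthogonal complement.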
2.2. Covariance Estimation via Linear Factor Models
Linear factor models reduce the number of model parameters by rep-
resenting the return of each asset in the portfolio as a linear combina-
tion of a small number of factors such as the return of a market index,
the return of an industry index, etc. Specifically, consider a portfo-
lio with N assets whose return vector is R = (R(1), . . . , R(N))T .
Let J be the total number of factors used in the model, and denote
their return vector by $R_f = (R_f^{(1)}, \ldots, R_f^{(J)})^T$. The corresponding
linear factor model is:
$$ R^{(n)} = \alpha_n + \sum_{j=1}^{J} \beta_{nj} R_f^{(j)} + U_n, \quad \text{for } n = 1, \ldots, N. \qquad (6) $$
Here, the random variables Un are assumed to be zero-mean and un-
correlated both with each other and with the factor returns Rf . The
parameters of this model are the non-random coefficients αn and
βnj , as well as the standard deviations σn of the random variables
Un.
Let α = (α1, . . . , αN )T , U = (U1, . . . , UN )T , and let B be
the N × J matrix whose (n, j)-th entry is βnj . With this notation,
the N scalar equations of (6) can be written as the following vector
equation:
R = α + BRf + U. (7)
Since α and B are deterministic, and since U is assumed to be un-
correlated with Rf , taking covariance matrices of both sides of the
above equation yields the following:
$$ \mathrm{cov}(R) = B \, \mathrm{cov}(R_f) \, B^T + \mathrm{cov}(U). \qquad (8) $$
Estimating the covariance matrix of R therefore involves estimating
the N J entries of the matrix B, the J(J + 1)/2 distinct entries of
the factor covariance matrix, and the N parameters of the diagonal
covariance of U, for a total of (N + J/2)(J + 1) parameters. If
J << N , this number of parameters is significantly smaller than
the N (N + 1)/2 parameters of the sample covariance.
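One way to turn Eq. (8) into an estimator is sketched below, assuming that the factor returns are observed and that α and B are fitted by ordinary least squares, one regression per asset. The fitting method, the function name, and the simulated data are assumptions made for the example; the text above does not prescribe how the parameters are estimated.

```python
import numpy as np


def factor_model_covariance(returns, factor_returns):
    """Estimate cov(R) = B cov(R_f) B^T + cov(U) with diagonal cov(U), Eq. (8)."""
    T, J = factor_returns.shape
    X = np.column_stack([np.ones(T), factor_returns])    # intercept + factors
    coef, *_ = np.linalg.lstsq(X, returns, rcond=None)   # (J+1) x N coefficients
    B = coef[1:].T                                       # N x J loadings beta_nj
    resid = returns - X @ coef                           # estimates of U_n
    resid_var = resid.var(axis=0, ddof=J + 1)            # sigma_n^2
    cov_f = np.atleast_2d(np.cov(factor_returns, rowvar=False))
    return B @ cov_f @ B.T + np.diag(resid_var)


# Simulated data: T = 250 days, N = 30 assets, J = 3 factors.
rng = np.random.default_rng(3)
T, N, J = 250, 30, 3
factors = rng.normal(0.0, 0.01, size=(T, J))
B_true = rng.normal(1.0, 0.3, size=(N, J))
returns = factors @ B_true.T + rng.normal(0.0, 0.02, size=(T, N))
cov_hat = factor_model_covariance(returns, factors)
```

Because the residual variances on the diagonal are strictly positive, the resulting estimate is invertible even when N exceeds T.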
For example, consider a portfolio with N = 1000 assets. The
sample covariance matrix for such a portfolio has 500,500 distinct
parameters to be estimated. A linear factor model with 10 factors
has only (1000 + 5)(10 + 1) = 11,055 parameters. Consider the
fact that one year’s worth of daily returns for 1000 assets amounts to
about 250,000 data points, since there are approximately 250 trad-
ing days in a year. This is far too little data to be able to estimate the
500,500 distinct entries of the sample covariance, yet may be suffi-
cient to construct reasonable estimates for the 11,055 parameters of
the factor model.
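The arithmetic in this example is easy to reproduce. The helper names below are hypothetical; the formulas are the parameter counts given above.

```python
def sample_cov_params(n_assets):
    """Distinct entries of a symmetric N x N covariance matrix: N(N+1)/2."""
    return n_assets * (n_assets + 1) // 2


def factor_model_params(n_assets, n_factors):
    """N*J loadings + J(J+1)/2 factor covariances + N residual variances."""
    return (n_assets * n_factors
            + n_factors * (n_factors + 1) // 2
            + n_assets)


n_sample = sample_cov_params(1000)         # 500500
n_factor = factor_model_params(1000, 10)   # 11055
n_data = 1000 * 250                        # one year of daily returns: 250000
```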
2.3. Examples of Linear Factor Models
In order for a factor model to work, it must effectively capture the
behavior of the returns of many assets in a portfolio using a small
number of factor returns. Many research efforts have therefore fo-
cused on developing effective factor models [24]. The earliest of
these [48] is a model consisting of a single market factor. A broad
market index, such as S&P 500, is usually taken as a proxy for the
market return.
Factors defined by industries or industry sectors are also widely
used. For example, [34] uses a model with 49 factors: the mar-
ket factor and 48 industry factors corresponding to the 48 industries
defined in [16]. The return for an industry factor is defined as the
return of an equal-weighted portfolio composed of all the stocks in
that industry. This model is effectively a two-factor model, since
the industry factor coefficient βnj is zero unless stock n belongs to
industry j. Hence, the sum in Eq. (6) only has two nonzero terms:
one for the market factor and one for the industry that the n-th stock
belongs to. A similar model is used in [1]: all the stocks are grouped
into 15 sectors, each sector is associated to an exchange-traded fund,
and the return for a sector factor is taken to be the return of the
corresponding exchange-traded fund.
In addition to clustering by industry or sector, many other ways
of grouping stocks have been used to construct factors. A widely
used method is to classify companies according to their size and
book-to-market ratio. For example, in addition to the market fac-
tor, Fama and French [15] use two more portfolios as factors. One
portfolio is long high book-to-market stocks (so-called value stocks)
and short low book-to-market stocks (so-called growth stocks), and
the other portfolio is long small companies and short large compa-
nies, as measured by the companies’ market capitalization.1
The Chen-Roll-Ross model [8], in addition to the market factor,
contains a number of macroeconomic factors as well as indicators
from the bond market: the inflation rate, industrial production,
consumption growth rate, oil prices, one-month US Treasury Bill
rate (TB), return on long-term US government bonds less TB, and
others.
Another widely used method for constructing factor models is to
perform the Karhunen-Loève transform, also known in the literature
as the principal component analysis (PCA), and only keep a small
number of components that correspond to the largest eigenvalues of
the covariance matrix of the return vector [43, 34, 1, 50]. Simply ze-
roing out the smaller eigenvalues, however, would result in a singular
covariance matrix, and therefore care must be taken when applying
this approach to portfolio optimization and other applications that
require inverting the estimated covariance matrix [1, 50].
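The sketch below illustrates one simple remedy for the singularity: keep the J largest principal components and replace the discarded part by a diagonal matrix chosen so that each asset keeps its sample variance. This particular construction is an assumption made for illustration and is not claimed to be the method of [1] or [50]; the data are simulated and the function name is hypothetical.

```python
import numpy as np


def pca_covariance(returns, n_components):
    """Top principal components plus a diagonal residual (illustrative remedy)."""
    s = np.cov(returns, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(s)            # eigenvalues in ascending order
    top_vals = eigvals[-n_components:]
    top_vecs = eigvecs[:, -n_components:]
    low_rank = (top_vecs * top_vals) @ top_vecs.T   # rank-J part of the covariance
    # The discarded part s - low_rank is positive semidefinite, so its diagonal
    # is nonnegative in exact arithmetic; it is kept as asset-specific variance.
    resid_var = np.diag(s) - np.diag(low_rank)
    return low_rank + np.diag(resid_var)


# Simulated returns with fewer observations (60) than assets (100).
rng = np.random.default_rng(4)
returns = rng.normal(0.0, 0.01, size=(60, 100))
cov_pca = pca_covariance(returns, n_components=5)
```

Here the sample covariance itself is singular, while the estimate built from five components and the diagonal residual is positive definite.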
2.4. Shrinkage Methods
Shrinkage methods [34, 46] strive to achieve a compromise between
the instability of the sample covariance estimator and the biases in-
troduced by model-based estimators. A shrinkage estimator is a con-
vex combination of the sample covariance and a so-called shrinkage
target which can be, for example, any one of the estimators discussed
above. The mixing weight, also called shrinkage intensity, can be
obtained through cross-validation.
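A minimal sketch of a shrinkage estimator follows. The shrinkage target is the multiple-of-identity estimator of Section 2.1, and the intensity is picked from a small grid by a single train/validation split that scores each candidate by the out-of-sample variance of the resulting GMVP. The split, the grid, the scoring rule, and the function names are all assumptions for the example; the cited works derive their intensities differently.

```python
import numpy as np


def shrink_covariance(sample_cov, target, intensity):
    """Convex combination: intensity * target + (1 - intensity) * sample_cov."""
    if not 0.0 <= intensity <= 1.0:
        raise ValueError("intensity must lie in [0, 1]")
    return intensity * target + (1.0 - intensity) * sample_cov


def gmvp_weights(cov):
    """Global minimum-variance weights, Eq. (5)."""
    ones = np.ones(cov.shape[0])
    x = np.linalg.solve(cov, ones)
    return x / (ones @ x)


def pick_intensity(train, valid, grid):
    """Choose the intensity whose GMVP has the lowest variance on held-out data."""
    s = np.cov(train, rowvar=False)
    target = np.mean(np.diag(s)) * np.eye(s.shape[0])
    scores = [np.var(valid @ gmvp_weights(shrink_covariance(s, target, a)), ddof=1)
              for a in grid]
    return grid[int(np.argmin(scores))]


# Simulated returns: 80 periods, 60 assets, split into two halves.
rng = np.random.default_rng(5)
data = rng.normal(0.0, 0.01, size=(80, 60))
train, valid = data[:40], data[40:]
grid = [0.1, 0.3, 0.5, 0.7, 0.9, 1.0]   # 0 is excluded: sample_cov is singular here
best = pick_intensity(train, valid, grid)

S = np.cov(train, rowvar=False)
shrunk = shrink_covariance(S, np.mean(np.diag(S)) * np.eye(60), best)
```

Any positive intensity makes the estimate invertible in this example, because the target is a positive multiple of the identity.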
3. MITIGATING PORTFOLIO SENSITIVITY TO
COVARIANCE ESTIMATES
All covariance estimation methods produce estimation errors,
making their direct use in the Markowitz framework problematic.
This is due to the sensitivity of the optimal portfolio weights to
estimation errors, discussed above in Section 1.2. To reduce this
sensitivity, two types of strategies have been used: modifying the
optimization criterion and combining several portfolio weight estimates.
Examples of the former strategy are robust portfolios and
norm-constrained portfolios; examples of combining weight estimates are
resampled portfolios and portfolio mixtures.
Footnote 1: Market capitalization is equal to the price per share times the
number of shares outstanding.
1. Robust portfolios replace the original optimization problem
(1,2,3) with various robustified versions [19, 52, 6, 18]. For
example, in [19] it is proposed to select the portfolio weight
vector w to minimize
$$ \max_{\Lambda \in S} w^T \Lambda w, $$
subject to the weight normalization constraint (1) and the fol-
lowing constraint:
$$ \min_{\mu \in M} w^T \mu \ge \mu_{\mathrm{tgt}}, $$
where S and M are uncertainty sets for the covariance ma-
trix and the mean return, respectively. This framework intro-
duces the additional difficulty of having to estimate the uncer-
tainty sets. This is done in [19] using the confidence intervals
around the estimates of the covariance and the mean.
2. Norm-constrained portfolios add a constraint on the norm
of the portfolio weights to the optimization problem (1,2,3)
[23, 22, 12, 5, 21], such as ‖w‖ ≤ δ where δ is a parameter.
The basic idea is that an explicit constraint on the weights
would reduce the amount of possible change of the weight
vector in response to the changes of the parameter estimates.
3. Resampled portfolios randomize the portfolio selection pro-
cedure and compute the average weights from many random-
ized simulations [25, 28, 17, 47, 40].
4. Portfolio mixtures combine the estimated optimal portfolio
with either a fixed portfolio or a portfolio which depends on a
small number of estimated parameters [20, 29, 10, 11, 51, 44].
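The last two strategies in the list can be sketched in a few lines. The example below averages GMVP weights over bootstrap resamples of the return history and then mixes the result with the equal-weighted portfolio. The nonparametric bootstrap, the choice of the 1/N portfolio as the fixed portfolio, the mixing coefficient, and the function names are assumptions for illustration; the cited papers use their own randomization schemes and mixing rules.

```python
import numpy as np


def gmvp_weights(cov):
    """Global minimum-variance weights, Eq. (5)."""
    ones = np.ones(cov.shape[0])
    x = np.linalg.solve(cov, ones)
    return x / (ones @ x)


def resampled_gmvp(returns, n_boot, rng):
    """Average the GMVP weights over bootstrap resamples of the rows of returns."""
    T = returns.shape[0]
    weights = []
    for _ in range(n_boot):
        idx = rng.integers(0, T, size=T)        # sample T periods with replacement
        weights.append(gmvp_weights(np.cov(returns[idx], rowvar=False)))
    return np.mean(weights, axis=0)


def mix_with_equal_weight(w, lam):
    """Portfolio mixture: lam * w + (1 - lam) * (1/N, ..., 1/N)."""
    n = len(w)
    return lam * w + (1.0 - lam) * np.ones(n) / n


# Simulated returns: 250 periods, 5 assets.
rng = np.random.default_rng(6)
returns = rng.normal(0.0005, 0.01, size=(250, 5))
w_resampled = resampled_gmvp(returns, n_boot=200, rng=rng)
w_mixture = mix_with_equal_weight(w_resampled, lam=0.5)
```

Both operations are averages of weight vectors that sum to one, so the normalization constraint (1) is preserved.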
4. CONCLUSIONS
This paper has described several covariance estimation methods uti-
lized in the field of portfolio optimization. Despite a large body of
existing literature, many open problems remain, both in the area of
designing better covariance estimators, and in developing portfolio
construction algorithms which are less sensitive to parameter esti-
mation errors and are hence more practicable.
5. REFERENCES
[1] M. Avellaneda and J.-H. Lee. “Statistical arbitrage in the US equities
market”, Quantitative Finance, 10(7):761–782, 2010.
[2] C. Bengtsson and J. Holst. “On portfolio selection: Improved covariance
matrix estimation for Swedish asset returns”, Working paper, Lund Uni-
versity and Lund Institute of Technology, 2002.
[3] M.J. Best and R.R. Grauer. “On the sensitivity of mean-variance-
efficient portfolios to changes in asset means: Some analytical and com-
putational results”, The Review of Financial Studies, 4(2):315–342,
1991.
[4] M. Broadie. “Computing efficient frontiers using estimated parameters”,
Annals of Operations Research, 45(1):21–58, 1993.
[5] J. Brodie, I. Daubechies, C. De Mol, D. Giannone, and I. Loris. “Sparse
and stable Markowitz portfolios”, Proceedings of the National Academy
of Sciences, 106(30):12267–12272, Jul. 2009.
[6] S. Ceria and R.A. Stubbs. “Incorporating estimation errors into portfo-
lio selection: Robust portfolio construction”, Axioma Research Paper
No. 003, May 2006.
[7] L.K.C. Chan, J. Karceski, and J. Lakonishok. “On portfolio optimiza-
tion: Forecasting covariances and choosing the risk model”, The Review
of Financial Studies, 12(5):937–974, Winter 1999.
[8] N.F. Chen, R. Roll, and S.A. Ross. “Economic forces and the stock
market”, Journal of Business, 59(3):383-403, 1986.
[9] V.K. Chopra and W.T. Ziemba. “The effects of error in means, variances,
and covariances on optimal portfolio choice”, The Journal of Portfolio
Management, 19(2):6–11, 1993.
[10] D.J. Disatnik and S. Benninga. “Shrinking the Covariance Matrix”,
Journal of Portfolio Management, 33(4):55-63, 2007.
[11] V. DeMiguel, L. Garlappi, and R. Uppal. “Optimal versus naive diver-
sification: How inefficient is the 1/N portfolio strategy?” The Review of
Financial Studies, 22(5):1915–1953, 2009.
[12] V. DeMiguel, L. Garlappi, F.J. Nogales, and R. Uppal. “A gener-
alized approach to portfolio optimization: Improving performance by
constraining portfolio norms”, Management Science, 55(5):798–812,
May 2009.
[13] R. Duchin and H. Levy. “Markowitz Versus the Talmudic Portfo-
lio Diversification Strategies”, The Journal of Portfolio Management,
35(2):71–74, Winter 2009.
[14] E. J. Elton and M. J. Gruber. “Estimating the dependence structure of
share prices”, The Journal of Finance, 28(5):1203–1232, Dec. 1973.
[15] E. F. Fama and K. R. French. “Size and book-to-market factors in earn-
ings and returns”, The Journal of Finance, 50(1):131–155, Mar. 1995.
[16] E. F. Fama and K. R. French. “Industry costs of equity”, Journal of
Financial Economics, 43(2):153–193, Feb. 1997.
[17] J. Fletcher and J. Hillier. “An examination of resampled portfolio effi-
ciency”, Financial Analysts Journal, 57(5):66–74, Sep.–Oct. 2001.
[18] L. Garlappi, R. Uppal, and T. Wang. “Portfolio selection with param-
eter and model uncertainty: A multi-prior approach”, The Review of
Financial Studies, 20(1):41–81, Jan. 2007.
[19] D. Goldfarb and G. Iyengar. “Robust portfolio selection problems”,
Mathematics of Operations Research, 28(1):1–38, Feb. 2003.
[20] V. Golosnoy and Y. Okhrin. “Multivariate shrinkage for optimal portfolio
weights”, European Journal of Finance, 13(5):441-458, Jul. 2007.
[21] J. Gotoh and A. Takeda. “On the role of norm constraints in portfolio
selection”, Department of Industrial and Systems Engineering Discus-
sion Paper Series No. 09–03, Chuo University, Apr. 2010.
[22] R. Jagannathan and T. Ma. “Risk reduction in large portfolios: Why
imposing the wrong constraints helps”, The Journal of Finance,
58(4):1651–1684, Aug. 2003.
[23] R. Jansen and R. van Dijk. “Optimal benchmark tracking with small
portfolios”, The Journal of Portfolio Management, 28(2):33–39, Winter
2002.
[24] E. Jay, P. Duvaut, S. Darolles, and A. Chretien. “Multi-factor models
and signal processing techniques: survey and example”, IEEE Signal
Processing Magazine, 28(5):61–71, Sep. 2011.
[25] J.D. Jobson and B. Korkie. “Putting Markowitz theory to work”, The
Journal of Portfolio Management, 7(4):70–74, Summer 1981.
[26] P. Jorion. “International portfolio diversification with estimation risk”,
The Journal of Business, 58(3):259–278, Jul. 1985.
[27] P. Jorion. “Bayes-Stein estimation for portfolio analysis”, The Journal
of Financial and Quantitative Analysis, 21(3):279–292, Sep. 1986.
[28] P. Jorion. “Portfolio optimization in practice”, Financial Analysts Jour-
nal, 48(1):68–74, Jan.–Feb. 1992.
[29] R. Kan and G. Zhou. “Optimal portfolio choice with parameter uncer-
tainty”, Journal of Financial and Quantitative Analysis, 42(3):621-656,
Sep. 2007.
[30] N. El Karoui. “High-dimensionality effects in the Markowitz problem
and other quadratic programs with linear equality constraints: Risk un-
derestimation”, Preprint, 2009.
[31] N. El Karoui. “On the realized risk of Markowitz portfolios”, Preprint,
2009.
[32] L. Laloux, P. Cizeau, J.-P. Bouchaud, and M. Potters. “Noise dressing
of financial correlation matrices”, Physical Review Letters, 83(7):1467–
1470, 1999.
[33] L. Laloux, P. Cizeau, J.-P. Bouchaud, and M. Potters. “Random matrix
theory and financial correlations”, International Journal of Theoretical
and Applied Finance, 3(3):391-398, 2000.
[34] O. Ledoit and M. Wolf. “Improved estimation of the covariance matrix
of stock returns with an application to portfolio selection”, Journal of
Empirical Finance, 10(5):603–621, Dec. 2003.
[35] O. Ledoit and M. Wolf. “A well-conditioned estimator for large-
dimensional covariance matrices”, Journal of Multivariate Analysis,
88(2):365–411, Feb. 2004.
[36] H. Levy and R. Duchin. “Markowitz’s Mean-Variance Rule and the Tal-
mudic Diversification Recommendation”, In The Handbook of Portfo-
lio Construction: Contemporary Applications of Markowitz Techniques,
J.B. Guerard, Jr., Ed., Springer, 2009.
[37] H. M. Markowitz. “Portfolio selection”, The Journal of Finance,
7(1):77–91, 1952.
[38] D.S. Matteson and D. Ruppert. “GARCH Models of Dynamic Volatil-
ity and Correlation”, IEEE Signal Processing Magazine, 28(5):72–82,
Sep. 2011.
[39] R.O. Michaud. “The Markowitz optimization enigma: Is ’optimized’
optimal?” Financial Analysts Journal, 45(1):31–42, Jan.–Feb. 1989.
[40] R.O. Michaud and R.O. Michaud. Efficient Asset Management: A Prac-
tical Guide to Stock Portfolio Optimization and Asset Allocation, 2nd
Edition. Oxford University Press, 2008.
[41] K.K. Ng, P. Agarwal, N. Mullen, D. Du, and I. Pollak. “Comparison of
several covariance matrix estimators for portfolio optimization”, In Pro-
ceedings of the IEEE International Conference on Acoustics, Speech,
and Signal Processing, May 22-27, 2011, Prague, Czech Republic.
[42] V. Plerou, P. Gopikrishnan, B. Rosenow, L.A.N. Amaral, and H.E. Stan-
ley. “Universal and non-universal properties of cross-correlations in fi-
nancial time series”, Physical Review Letters, 83(7):1471-1474, 1999.
[43] V. Plerou, P. Gopikrishnan, B. Rosenow, L.A.N. Amaral, T. Guhr, and
H.E. Stanley. “Random matrix approach to cross correlations in financial
data”, Physical Review E, 65:066126-1–066126-18, Jun. 2002.
[44] I. Pollak. “Weight shrinkage for portfolio optimization”, In Proceed-
ings of the Fourth International Workshop on Computational Advances
in Multi-Sensor Adaptive Processing, December 13-16, 2011, San Juan,
Puerto Rico.
[45] D. Ruppert. Statistics and Data Analysis for Financial Engineering.
Springer, 2010.
[46] J. Schäfer and K. Strimmer. “A Shrinkage Approach to Large-Scale Co-
variance Matrix Estimation and Implications for Functional Genomics”,
Statistical Applications in Genetics and Molecular Biology, Volume 4,
Issue 1, Article 32, 2005.
[47] B. Scherer. “Portfolio resampling: Review and critique”, Financial
Analysts Journal, 58(6):98–109, Nov.–Dec. 2002.
[48] W. F. Sharpe. “A simplified model for portfolio analysis”, Management
Science, 9(2):277–293, Jan. 1963.
[49] W. F. Sharpe. “Mutual fund performance”, Journal of Business,
39(1):119–138, 1966.
[50] M.U. Torun, A.N. Akansu, and M. Avellaneda. “Portfolio Risk in Multiple
Frequencies”, IEEE Signal Processing Magazine, 28(5):61–71,
Sep. 2011.
[51] J. Tu and G. Zhou. “Markowitz meets Talmud: A combination of
sophisticated and naive diversification strategies”, Journal of Financial
Economics, 99:204-215, 2011.
[52] R.H. Tütüncü and M. Koenig. “Robust asset allocation”, Annals of
Operations Research, 132:157–187, 2004.
[53] J.-H. Won, J. Lim, S.-J. Kim, and B. Rajaratnam. “Maximum likeli-
hood covariance estimation with a condition number constraint”, Tech-
nical Report No. 2009-10, Department of Statistics, Stanford University,
Aug. 2009.