Minimum Variance Portfolio

In the previous post, we introduced the advantages of diversification and defined the relation to Markowitz Portfolio Theory. We also plotted minimum-variance frontiers for different correlations of two assets. In this post, we apply Markowitz's theory by deriving the Minimum Variance Portfolio and implementing it in R for a larger set of stocks.

Assumptions

So let's start with some assumptions. If investors have mean-variance preferences and homogeneous beliefs about the distribution of returns, choosing among two distinct frontier portfolios is as good as choosing among the original \(n\) assets. The implicit assumption behind mean-variance preferences is that individual returns are normally distributed, i.e. \(X \sim N(\mu_X, \sigma_X^2)\). If two such random variables \(X\) and \(Y\) are also independent, then their sum \(Z = X + Y\) is again normal, with \(Z \sim N(\mu_X + \mu_Y, \sigma_X^2 + \sigma_Y^2)\)¹.

Even though there will be a set of portfolio weights which minimizes variance regardless of the underlying distributions, correlation is only a complete measure of association if the joint multivariate distribution is normal. In other words, covariance is only an exhaustive measure of co-movement if the joint distribution is itself normal. To see this, note that under joint normality (written here with zero means) the probability of any region of outcomes of \(X\) and \(Y\) is fully determined by \(\sigma_X\), \(\sigma_Y\) and the correlation \(\rho\): \[{\frac {1}{2\pi \sigma _{X}\sigma _{Y}{\sqrt {1-\rho ^{2}}}}}\iint _{X\,Y}\exp \left[-{\frac {1}{2(1-\rho ^{2})}}\left({\frac {X^{2}}{\sigma _{X}^{2}}}+{\frac {Y^{2}}{\sigma _{Y}^{2}}}-{\frac {2\rho XY}{\sigma _{X}\sigma _{Y}}}\right)\right]\,\mathrm {d} X\,\mathrm {d} Y\]
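A small simulation can illustrate why correlation is not an exhaustive measure outside the normal case. The sketch below uses made-up data (it is not part of the S&P 500 data set): `y` is a deterministic function of `x`, yet their linear correlation is close to zero.

```r
# Zero correlation without independence (illustrative simulation only)
set.seed(1)
n <- 1e5
x <- rnorm(n)   # standard normal draws
y <- x^2        # fully determined by x, hence strongly dependent on it

rho_xy  <- cor(x, y)        # close to 0: no linear association
rho_abs <- cor(abs(x), y)   # clearly positive: the dependence shows up here

print(c(rho_xy = rho_xy, rho_abs = rho_abs))
```

The pair `(x, y)` is not jointly normal, so a covariance matrix alone would miss the link between the two series.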

So while the portfolio covariance matrix can always be computed, to the extent that the underlying asset returns are not normal, the optimization is likely to result in spuriously optimal weights.

In this case, mean-variance preferences can be modeled by a quadratic utility function whose first derivative is strictly greater than zero and whose second derivative is strictly less than zero. Thus, investors' utility increases in the mean and decreases with higher variance. Higher moments do not matter.
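To make this explicit, one common parametrization (an assumption here, the post does not fix a specific functional form) is \(U(W) = W - \frac{\gamma}{2} W^2\) with \(\gamma > 0\) and wealth restricted to \(W < 1/\gamma\). Writing \(\mu = \mathbb{E}[W]\) and \(\sigma^2 = \mathrm{Var}(W)\), we get: \[\begin{aligned} U'(W) &= 1 - \gamma W > 0, \qquad U''(W) = -\gamma < 0 \\ \mathbb{E}[U(W)] &= \mathbb{E}[W] - \frac{\gamma}{2}\mathbb{E}[W^2] = \mu - \frac{\gamma}{2}\left(\sigma^2 + \mu^2\right) \end{aligned}\] Expected utility therefore depends on the first two moments only: it increases in \(\mu\) (as long as \(\mu < 1/\gamma\)) and decreases in \(\sigma^2\).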

Obviously, a smaller set \(n\) of assets leads to lower transaction costs as long as we buy each individual asset instead of a fund. Again, the assumptions made are quite restrictive. Specifically, assuming that all investors have the same expectations (homogeneous beliefs) and that only the first two moments are considered is a strong simplification.

Back to Markowitz Portfolio Theory. The two fund theorem states that a (convex) linear combination of any two distinct MV (efficient) portfolios is again an MV (efficient) portfolio. This follows from the fact that any two distinct frontier portfolios span the entire portfolio frontier. So in this framework an investment can be carried out in two steps:

1. Identify any two frontier portfolios
2. Find the optimal combination between these two portfolios
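The two steps can be illustrated on toy numbers. The sketch below assumes three hypothetical assets (the values of `mu` and `Sigma` are made up) and uses the standard closed-form frontier weights \(w(r) = \Sigma^{-1}(\lambda \mu + \gamma \textbf{1})\) with \(\lambda = (c r - b)/(ac - b^2)\) and \(\gamma = (a - b r)/(ac - b^2)\), where \(a = \mu'\Sigma^{-1}\mu\), \(b = \textbf{1}'\Sigma^{-1}\mu\) and \(c = \textbf{1}'\Sigma^{-1}\textbf{1}\). It checks that mixing two frontier portfolios gives exactly the frontier portfolio for the mixed target return.

```r
# Two fund theorem on toy inputs (all numbers are hypothetical)
mu    <- c(0.05, 0.08, 0.12)                    # assumed expected returns
Sigma <- matrix(c(0.040, 0.006, 0.010,
                  0.006, 0.090, 0.020,
                  0.010, 0.020, 0.160), 3, 3)   # assumed covariance matrix
ones   <- rep(1, 3)
SigInv <- solve(Sigma)

a  <- drop(t(mu)   %*% SigInv %*% mu)
b  <- drop(t(ones) %*% SigInv %*% mu)
cc <- drop(t(ones) %*% SigInv %*% ones)   # named cc to avoid masking base::c
d  <- a * cc - b^2

# Minimum-variance weights for a given target return r
frontier_weights <- function(r) {
  lambda <- (cc * r - b) / d
  gamma  <- (a - b * r) / d
  drop(SigInv %*% (lambda * mu + gamma * ones))
}

# Step 1: two distinct frontier portfolios
w1 <- frontier_weights(0.06)
w2 <- frontier_weights(0.10)

# Step 2: combine them and compare with the frontier portfolio
# that targets the combined return directly
alpha    <- 0.3
w_mix    <- alpha * w1 + (1 - alpha) * w2
r_mix    <- sum(w_mix * mu)
w_direct <- frontier_weights(r_mix)

print(all.equal(w_mix, w_direct))
```

Because the frontier weights are affine in the target return, any mix of `w1` and `w2` whose weights sum to one lands on the frontier again.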

In order to start with the first step we need to identify the efficient frontier (EF). There are two ways to construct the EF: either by minimizing the variance for a given expected return or by maximizing the expected return for a given variance. Mathematically these two approaches need to yield the same outcome. However, as we use ex post returns as a proxy for expected returns and to compute the covariance matrix \(\Sigma\), the estimation errors are considered to be lower in the minimum variance (MV) optimization problem, even if there is no effort to maximize risk-adjusted returns. The larger errors in the mean-variance optimization come from the fact that future mean returns are often harder to predict than future variances, and therefore minimizing the variance tends to lead to better results out-of-sample. For this reason, we concentrate only on min-variance portfolio optimization.

########################################################
#####                   Data                      ######
########################################################
Packages2Load <- c('Matrix', 'quadprog', 'corrplot', 'tseries',
  'fPortfolio', 'matrixcalc', 'metricsgraphics', 'dplyr', "DT", "data.table")
#getPackages(Packages2Load) #helper from my blog RBase; the line below loads the same packages
invisible(lapply(Packages2Load, library, character.only = TRUE))

#Load data
link <- "https://raw.githubusercontent.com/gabrielkai/Portfolio-Management/master/SP500_Matrix.csv"
returns <- read.csv(link, header = TRUE, sep = ",")

Minimum Variance Portfolio:

We need to assume that the covariance matrix \(\Sigma\) is non-singular, i.e. that it has an inverse, in order to calculate the optimal weights, as you will see later. A square matrix is non-singular iff its determinant is nonzero. A covariance matrix is symmetric and positive semi-definite, so it is non-singular exactly when all its eigenvalues are strictly positive. Such a matrix is called positive definite. In order to obtain a positive definite matrix, we need to omit assets that are linear combinations of others, such that the covariance matrix consists of linearly independent assets only and thus has full rank. Another way of thinking about this is that we just need to get rid of assets that do not provide new information to help span the mean-variance frontier. Unfortunately, this restriction is far from negligible, as we will see later.

To obtain an MV portfolio, we either minimize the variance of the portfolio or maximize the expected return. In both cases, the estimation of the expected returns and the covariances is the most inaccurate part. In this world, we only consider risky assets. The global minimum variance portfolio solves \[\min_{w \in \mathbb{R}^{n}} \frac{1}{2} w'\Sigma w \ \text{ subject to } \ w'\textbf{1} = 1\] This optimization problem is solved by using the Lagrange function: \[L(w, \lambda) = \frac{1}{2} w'\Sigma w +\lambda (1 - w'\textbf{1})\] The first order conditions (FOC) with respect to \(w\) and \(\lambda\) are given by: \[\begin{aligned} \frac{\partial L(w, \lambda)}{\partial w} &= \Sigma w - \lambda \textbf{1} = 0 \Rightarrow w = \lambda \Sigma^{-1}\textbf{1} \\ \frac{\partial L(w, \lambda)}{\partial \lambda} &= 1 - w'\textbf{1} = 0 \Rightarrow \textbf{1}'w = 1 \end{aligned}\] If we plug the first into the second we get \(\lambda \textbf{1}'\Sigma^{-1}\textbf{1} = 1\), i.e. \(\lambda = \frac{1}{ \textbf{1}' \Sigma^{-1} \textbf{1}}\), which leads to the following optimal weight: \[w^{*} = \frac{\Sigma^{-1}\textbf{1}}{\textbf{1}'\Sigma^{-1}\textbf{1}}\]
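As a quick numerical check of the closed-form weights \(w^{*}\), here is a minimal sketch for two hypothetical assets (the covariance matrix is made up for illustration). It also verifies the first order condition, i.e. that \(\Sigma w^{*}\) is a constant vector equal to \(\lambda \textbf{1}\).

```r
# Closed-form global minimum variance weights on a toy covariance matrix
Sigma <- matrix(c(0.04, 0.01,
                  0.01, 0.09), 2, 2)   # hypothetical covariance matrix
ones  <- rep(1, 2)

# Sigma^{-1} 1, obtained by solving a linear system instead of inverting Sigma
SigInv1 <- solve(Sigma, ones)
lambda  <- 1 / sum(SigInv1)            # lambda = 1 / (1' Sigma^{-1} 1)
w_star  <- lambda * SigInv1            # w* = lambda * Sigma^{-1} 1

var_star <- drop(t(w_star) %*% Sigma %*% w_star)   # portfolio variance
foc      <- drop(Sigma %*% w_star)                 # should equal lambda * 1

print(w_star)     # weights 8/11 and 3/11
print(var_star)   # equals lambda
print(foc)
```

Note that the minimized variance equals \(\lambda\) itself, since \(w^{*\prime}\Sigma w^{*} = \lambda\, w^{*\prime}\textbf{1} = \lambda\).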

This can be implemented as follows, although there are several ways to do so. The first function computes the global min-var portfolio with short-selling allowed, using the closed form solution. The second function can impose short-selling restrictions and uses quadratic optimization instead. These different approaches are for illustration purposes.

#########################################################
#####                 MV function                  ######
#########################################################
MinVarPortfolio_ClosedForm <-
  function(returns #M(any time unit) x N matrix
           )
  {
    N <- ncol(returns)
    names <- colnames(returns)
    sigma <- cov(returns)      #compute sample covariance
    #needs to be symmetric, non-singular, positive definite
    #(after rounding, eigenvalues below 5e-5 count as non-positive)
    conditions <- isSymmetric.matrix(sigma) &&
      !is.singular.matrix(sigma) &&
      all(round(eigen(sigma)$values, digits = 4) > 0)
    if(!conditions) {
      #find the nearest positive definite matrix as a viable alternative
      sigma <- as.matrix(nearPD(sigma)$mat) #convert back to a base matrix
      print("Covariance matrix is not positive definite. Compute the nearest positive definite matrix.")
    }

    expReturns <- colMeans(returns)
    sigmaInv <- solve(sigma)
    #We compute w* as derived above
    iota <- matrix(1, N, 1)
    iotaSigmaInv <- crossprod(iota, sigmaInv)     #1' Sigma^-1
    b <- as.vector(iotaSigmaInv %*% expReturns)   #b = 1' Sigma^-1 E[R]
    c <- as.vector(iotaSigmaInv %*% iota)         #c = 1' Sigma^-1 1
    rg <- b/c                                     #return of the global MV pf
    sigma2g <- 1/c                                #variance of the global MV pf
    wg <- sigma2g * sigmaInv %*% iota
    #trivial efficient pf
    a <- as.vector(crossprod(expReturns, sigmaInv) %*% expReturns) #a = E[R]' Sigma^-1 E[R]
    rt <- a/b
    sigma2t <- a/b^2                              #variance of the pf with return a/b
    #results
    global.Min.Var.Pf <- list(
                                  ExpRet  = expReturns
                                , ExpStdv = sqrt(diag(sigma))
                                , weights = data.frame(Names = names, Weights = as.vector(wg))
                                , ExpRet.t = rt
                                , ExpStdv.t = sqrt(sigma2t)
                                )

    return(list(global.Min.Var.Pf = global.Min.Var.Pf, SigmaInv = sigmaInv))
}

MinVarPortfolio_optim <-
  function(returns #M(any time unit) x N matrix
           , ShortSelling=TRUE
           )
  {#solve.QP form is min(-d^T b + 1/2 b^T D b) with the constraints A^T b >= b_0.
    N <- ncol(returns)
    names <- colnames(returns)
    sigma <- cov(returns)
    Dmat <- 2 * sigma
    dvec <- rep.int(0, N) #linear term not needed, set to 0
    Amat <- cbind(rep.int(1, N)) #first constraint: sum of the weights
    bvec <- 1 #weights sum up to 1 (an equality because of meq=1)
    if(!ShortSelling) {
      Amat <- cbind(Amat, diag(1,N)) #each weight needs to be nonnegative
      bvec <- c(bvec, rep(0, N))
    }
    result <- quadprog::solve.QP(Dmat=Dmat, dvec=dvec, Amat=Amat, bvec=bvec, meq=1)
    wg  <- matrix(result$solution, N, 1)
    rownames(wg) <- names
    return(wg)
}
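To see what the short-selling restriction does, here is a minimal self-contained sketch on two hypothetical, highly correlated assets (the covariance matrix is made up, and the `quadprog` package is assumed to be installed). The closed form shorts the riskier asset, while the constrained quadratic program puts everything into the less risky one.

```r
# Effect of the no-short-selling constraint on a toy example
library(quadprog)

Sigma <- matrix(c(0.040, 0.054,
                  0.054, 0.090), 2, 2)   # hypothetical covariance, correlation 0.9
N <- ncol(Sigma)

# Unconstrained closed form: w* = Sigma^{-1} 1 / (1' Sigma^{-1} 1)
w_closed <- solve(Sigma, rep(1, N))
w_closed <- w_closed / sum(w_closed)

# Quadratic program: first column is the budget constraint (equality, meq = 1),
# the remaining columns require each weight to be non-negative
Amat <- cbind(rep(1, N), diag(N))
bvec <- c(1, rep(0, N))
w_long <- solve.QP(Dmat = 2 * Sigma, dvec = rep(0, N),
                   Amat = Amat, bvec = bvec, meq = 1)$solution

print(round(w_closed, 4))   # second weight is negative (short position)
print(round(w_long, 4))     # long-only weights
```

With short-selling allowed, the negative position in the second asset hedges the first one; once it is ruled out, the best the investor can do here is to hold the lower-variance asset alone.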

The portfolio with the smallest risk is the global MV portfolio, indicated by the largest dot in the graph below. The efficient frontier starts at that point and extends up and to the right. Every frontier portfolio on the lower half of the bullet, below the global MV portfolio, is called MV inefficient. The frontier is the border of the set of all feasible portfolios. The two fund theorem implies that every linear combination of two MV portfolios is again an MV portfolio. With that in mind, we can move along the upper side of the bullet according to our risk appetite by holding a convex combination of only two efficient portfolios. A convex combination is a linear combination of points where all coefficients are non-negative and sum to one.

Since we consider the entire SP500 for a period of 5 years, we have more stocks than time points, \(N \gg T\). As a consequence, the sample covariance matrix is singular, as the matrix of returns does not have full column rank. There are ways to overcome this problem, for instance Principal Component Analysis. For now, we choose a random subset of stocks to avoid linear dependence and construct the Global Minimum-variance Portfolio.
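The rank problem is easy to reproduce with simulated data. The sketch below uses random numbers rather than the S&P 500 returns, with hypothetical dimensions of 10 observations and 20 assets. After demeaning, the sample covariance matrix can have at most \(T - 1\) nonzero eigenvalues.

```r
# Sample covariance is singular when there are more assets than observations
set.seed(42)
T_obs <- 10   # number of time points (hypothetical)
N     <- 20   # number of assets (hypothetical)
R <- matrix(rnorm(T_obs * N, sd = 0.02), nrow = T_obs, ncol = N)

S  <- cov(R)                                              # N x N sample covariance
ev <- eigen(S, symmetric = TRUE, only.values = TRUE)$values

# Count eigenvalues that are numerically different from zero
n_pos <- sum(ev > 1e-10 * max(ev))

print(n_pos)   # T_obs - 1 = 9, far below N = 20
```

Since 11 of the 20 eigenvalues are numerically zero, `solve(S)` would fail or return meaningless values, which is why a subset of assets is used here.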

#########################################################
#####                  Global MV                   ######
#########################################################
set.seed(12)
#since the return matrix does not have full rank, the cov matrix would be singular
index <- sample.int(ncol(returns), 50, replace = FALSE)  
globalMin <- MinVarPortfolio_ClosedForm(returns[, index])
all.equal(globalMin[[1]]$weights[, 2], as.numeric(MinVarPortfolio_optim(returns[, index])))

## [1] TRUE

The following table presents the results in detail:

In the next step we need to identify a second distinct MV portfolio. Usually, people just take the portfolio which maximizes the Sharpe ratio as the second distinct frontier portfolio and then construct the frontier from linear combinations of these two portfolios. So we could generate a grid of weights \(\alpha \in [-1, 1]\) with a step size of 0.001, put \(\alpha\) into one portfolio and \(1-\alpha\) into the other, and obtain a sufficient set of combinations to draw the MV line. On the other hand, we could also use the closed form solution for the MV frontier without a risk-less asset, defined as: \[\begin{aligned} \sigma_p &= \sqrt{\frac{a-2 r_p b + r_p^2 c}{ac-b^2}} \quad \text{with} \\ a &= \mathbb{E}[R]'\Sigma^{-1}\mathbb{E}[R] \\ b &= \textbf{1}'\Sigma^{-1}\mathbb{E}[R] \\ c &= \textbf{1}'\Sigma^{-1}\textbf{1} \end{aligned}\]
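The closed form can be evaluated on a grid of target returns to trace the frontier. The sketch below is a minimal illustration with three hypothetical assets (`mu` and `Sigma` are made-up inputs, not estimates from the S&P 500 data). It also checks that the smallest standard deviation is attained at \(r_p = b/c\) with variance \(1/c\).

```r
# Tracing the minimum-variance frontier from the closed form (toy inputs)
mu    <- c(0.05, 0.08, 0.12)                    # assumed expected returns
Sigma <- matrix(c(0.040, 0.006, 0.010,
                  0.006, 0.090, 0.020,
                  0.010, 0.020, 0.160), 3, 3)   # assumed covariance matrix
ones   <- rep(1, 3)
SigInv <- solve(Sigma)

a  <- drop(t(mu)   %*% SigInv %*% mu)
b  <- drop(t(ones) %*% SigInv %*% mu)
cc <- drop(t(ones) %*% SigInv %*% ones)   # named cc to avoid masking base::c

# Standard deviation of the frontier portfolio with expected return r
frontier_sd <- function(r) sqrt((a - 2 * b * r + cc * r^2) / (a * cc - b^2))

r_grid  <- seq(0.02, 0.15, by = 0.001)
sd_grid <- frontier_sd(r_grid)

r_g  <- b / cc          # return of the global minimum variance portfolio
sd_g <- sqrt(1 / cc)    # its standard deviation

print(c(r_g = r_g, sd_g = sd_g, frontier_sd_at_r_g = frontier_sd(r_g)))
```

Calling `plot(sd_grid, r_grid, type = "l")` afterwards would draw the familiar bullet shape, with its tip at `(sd_g, r_g)`.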

However, a simple portfolio that is MV efficient is the one with \(r_p=a/b\); after plugging this in and some rearrangements, we obtain \(\sigma^2_p=a/b^2\). Together with the Global Minimum Variance Portfolio, which has \(r_G=b/c\) and \(\sigma^2_G=1/c\), we can span the entire frontier according to the two fund theorem. The formula above lets us construct every point on the MV frontier. For the moment, find below the scatterplot of the individual assets and the global MV portfolio. The larger the dot, the larger the asset's weight in the minimum variance portfolio. The global minimum variance portfolio itself is drawn with the largest dot, just for visualization.
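For reference, the rearrangement behind the variance of the portfolio with \(r_p = a/b\) can be written out by squaring the frontier formula and substituting: \[\sigma_p^2 = \frac{a - 2\frac{a}{b}b + \frac{a^2}{b^2}c}{ac-b^2} = \frac{\frac{a^2 c}{b^2} - a}{ac-b^2} = \frac{a\,(ac-b^2)}{b^2\,(ac-b^2)} = \frac{a}{b^2}\] The same substitution with \(r_p = b/c\) gives \(\sigma_G^2 = \frac{a - b^2/c}{ac-b^2} = \frac{1}{c}\).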