Wednesday, July 25, 2012

Universal portfolio, part 9

Part 8 was discussing the distribution of the absolute wealth of the Universal Portfolio across all possible tuples of length 2, 3 and 4.

However, comparing the absolute wealth against some reference, especially against simple portfolio selection algorithm provides a better view of the exact performance of the Universal algorithm.  Because we want to compute and compute multiple references, the code uses functions to be more generic.  The code remains terse (88 lines, including comments) and could be even terser.

Writing the code in this fashion shows again the strength of R, some specific aspects used here:

  • Passing functions as parameters of other functions
  • Using frames to access information outside the function scope, including assigning new values.
  • Use of ... to write variadic functions (I am still learning that, the current code works but is not elegant)
  • Use of ::: to access non exported functions from a package
The code recalculates the value of Universal final wealth across all 4-tuples and thus takes one day to run, be warned.






The first graph shows the final wealth of Universal versus the final wealth of the best stock in the tuple. Contrary to the absolute wealth, the relative wealth decreases as the number of stocks in the tuple increases and is also generally significantly less than 1.  The red curve shows that for about 70% of the 4-tuples, Universal final wealth is below the wealth of the best stock in the tuple.


Obviously, it is impossible to know in advance which stock will have the best final wealth, so the comparison is slightly unfair.  The next two comparisons however are against two of the simplest causal selection algorithms:

  • Equally weighted buy and hold, UBH for short, U for uniform
  • Equally weighted Constant Rebalanced Portfolio, usually UCRP for Uniform CRP.  This one is interesting as Universal itself is based on a weighted combination of all CRP.
As BH is by construction worse than the best stock, we expect BH to fare better and indeed it does.  First Universal is generally better than BH, and second the relative performance increases as the number  of stocks in the tuple increase.  In general, Universal is much better than UBH.  The cumulative distribution does still shows a rather long and thin tail.


Unfortunately this does not carry to UCRP, UCRP happens to be a very good performer, and a very tough reference to beat for any portfolio selection algorithm.


Against UCRP:
  • Universal generally loses to UCRP (around 90% of the time) and sometimes pretty badly
  • Increasing the tuple size increases the relative performance when UCRP is better than Universal and decreases the performance when the opposite is true.
Interestingly, the textual output shows that the composition of the best absolute or relative Universal portfolios differ significantly for the comparison against UCRP.

Max final wealth 78.4742 for stocks: comme kinar
Max final wealth 111.6039 for stocks: comme kinar meico
Max final wealth 138.5075 for stocks: comme espey kinar meico
Max final relative wealth 4.3379 for stocks: iroqu kinar
Max final relative wealth 5.6186 for stocks: espey iroqu kinar
Max final relative wealth 5.2758 for stocks: coke espey iroqu kinar
Max final relative wealth 5.9302 for stocks: iroqu kinar
Max final relative wealth 8.6374 for stocks: espey iroqu kinar
Max final relative wealth 8.8390 for stocks: espey iroqu kinar meico
Max final relative wealth 1.3043 for stocks: dupont morris
Max final relative wealth 1.1980 for stocks: dupont morris sears
Max final relative wealth 1.1330 for stocks: dupont morris schlum sears

Wednesday, July 18, 2012

Universal portfolio, part 8

We extend the analysis of part 7 by calculating the final wealth for all tuples of 3 and 4 stocks, this is a simple extension but it also shows the most important problem of the Universal portfolio algorithm, its exponential complexity in the number of stocks.

The first part shows the cumulative distribution of the absolute final wealth for all possible tuples of 2, 3 and 4 stocks.  The results are qualitatively the same as before, a thin tail of good results.  It also a shows increase in absolute wealth as the number of stocks increase.

The code remains quite compact, R is simply wonderful for this type of analysis.


library(logopt)
x <- coredata(nyse.cover.1962.1984)
w <- logopt:::x2w(x)
nDays <- dim(x)[1]
nStocks <- dim(x)[2] 
Days <- 1:nDays
iWin <- 1 ; plot(1:10)


FinalWealth <- function(cols) {
xij <- x[,cols]
uc <- universal.cover(xij, 20)
return(uc[length(uc)])
}


TupleSizes <- c(2,3,4)
if (exists("lws") == FALSE) {
lws <- list()
for (i in 1:length(TupleSizes)) {
TupleSize <- TupleSizes[i]
ws <- combn(1:nStocks, TupleSize, FinalWealth)
lines(ecdf(ws), pch=".", col=Colors[i])
lws[[i]] <- ws
}
}


# show the ecdf 
if(length(dev.list()) < iWin) { x11() } ; iWin <- iWin + 1 ; dev.set(iWin) ;
plot(ecdf(lws[[3]]), pch=".", col="red", main="Cumulative probability of final wealth",
xlab="Final wealth", ylab="Cumulative probability")
lines(ecdf(lws[[2]]), pch=".", col="green")
lines(ecdf(lws[[1]]), pch=".", col="blue")
legend("bottomright", legend=c("2 stocks","3 stocks","4 stocks"), fill=c("blue","green","red"))
grid()


# show best absolute wealth and its composition
for (i in 1:length(TupleSizes)) {
TupleSize <- TupleSizes[i]
BestTuple <- which.max(lws[[i]])
BestStocks <- combn(1:nStocks, TupleSize)[,BestTuple]
cat(sprintf("Max final wealth %.4f for stocks: ", lws[[i]][BestTuple]))
cat(colnames(x)[BestStocks]) ; cat("\n")
}


This gives the following output and graph


Max final wealth 78.4742 for stocks: comme kinar
Max final wealth 111.6039 for stocks: comme kinar meico
Max final wealth 138.5075 for stocks: comme espey kinar meico



The highlighted line of code is there for a reason, calculating the final wealth for all 4-tuples takes about   one full day on my machine.  So the code is only run once, then the magic of automatically saving the environment takes care of avoiding recalculating the result.


But why is the running time so long?  The main problem of the Universal Portfolio algorithm is that it needs to evaluate an integral on the simplex, and the curse of dimensionality rears its ugly head.  The general approach for a numerical integration in n dimensions is exponential in the number of dimensions.  Furthermore we calculate the wealth for all possible n-tuples and this increases also exponentially in n when n is much smaller than the total number of stocks concerned.


As usual R code can provide a more vivid representation, the following code generates a plot showing the complexity evolution as the number of stocks increase for 36 stocks and a grid spacing of 0.05.  The code also produces a table for small the small values of n.  The code uses the ::: operator, it was new to me and allows to call package functions that are not part of the exposed interface.


library(logopt)
x <- coredata(nyse.cover.1962.1984)
nStocks <- dim(x)[2] 


Sizes <- 1:nStocks
nCs <- 0 * Sizes 
nSs <- 0 * Sizes


for (i in 1:length(Sizes)) {
n <- Sizes[i]
nCs[i] <- choose(nStocks, n) 
nSs[i] <- logopt:::count.grid.points(n, 20)
}


plot(nCs * nSs, type="l", col="blue", log="y",ylim=range(nCs,nSs,nCs*nSs))
lines(nCs, col="red")
lines(nSs, col="green")
grid()


cat(sprintf("n    C(36,n)    S(n,20) C(36,n)*S(n,20)   Ratios to previous value\n"))
for(i in 2:6) {
n <- Sizes[i]
cat(sprintf("%d %10d %10d %15.0f",n,nCs[i],nSs[i], nCs[i] * nSs[i]))
if(i > 2) {
cat(sprintf("%7.1f %7.1f %7.1f", nCs[i]/nCs[i-1], nSs[i]/nSs[i-1],
                        (nCs[i] * nSs[i]) / (nCs[i-1] * nSs[i-1])))
}
cat("\n")
}

n    C(36,n)    S(n,20) C(36,n)*S(n,20)   Ratios to previous value
2        630         21           13230
3       7140        231         1649340   11.3    11.0   124.7
4      58905       1771       104320755    8.2     7.7    63.2
5     376992      10626      4005916992    6.4     6.0    38.4
6    1947792      53130    103486188960    5.2     5.0    25.8

The exponential complexity makes the Universal Porfolio algorithm impractical for large portfolio, a serious limitation.  Some authors have devised more efficient approaches, that either try to approximate Universal Portfolio at reduced complexity or are inspired by the ideas of Universal but with less complexity and general either no bounds or worse bounds.


Saturday, July 7, 2012

Universal portfolio, part 7

After reproducing all original figures and tables from Universal Portfolios, R coupled with modern processors allows to perform some more analysis.

First we calculate the final wealth of the universal portfolio for all possible pairs of stocks, and show the corresponding cumulative probability function, and which pair is the best.  This shows how concise R code can be.


library(logopt)
x <- coredata(nyse.cover.1962.1984)
w <- logopt:::x2w(x) 
nDays <- dim(x)[1]
nStocks <- dim(x)[2] 
Days <- 1:nDays


FinalWealth <- function(cols) {
xij <- x[,cols]
uc <- universal.cover(xij, 20)
return(uc[length(uc)])
}


ws <- combn(1:nStocks, 2, FinalWealth)
plot(ecdf(ws),col="blue",pch=19,cex=0.5)
grid()


# show the best pair
BestIdx <- combn(1:nStocks,2)[,which.max(ws)]
cat(sprintf("Max final wealth: %.4f for pair (%s,%s)\n", max(ws), colnames(x)[BestIdx[1]], colnames(x)[BestIdx[2]]))

The result is the figure below, plus this line

Max final wealth: 78.4742 for pair (comme,kinar)

There are two interesting observations to make:
  • There is a long, thin tail.  In other words, there are some pairs giving good results, but they are relatively rare.
  • The best pair happens to be one of the examples in Cover's article, probably not a coincidence.
An even better metric is the ratio between the final wealth and the wealth of the best stock in the pair.  It is expected that the universal portfolio gets better performance when the underlying stocks perform better.  The code below is very similar to the code above, but calculating the ratio of wealth as a final step.  It needs the same prolog as above (not shown).

WealthRatio <- function(cols) {
xij <- x[,cols]
uc <- universal.cover(xij, 20)
return(uc[length(uc)]/max(w[length(uc),cols]))
}

wrs <- combn(1:nStocks, 2, WealthRatio)
plot(ecdf(wrs),col="blue",pch=19,cex=0.5)
grid()

# show the best and worst pairs
BestIdx <- combn(1:nStocks,2)[,which.max(wrs)]
cat(sprintf("Max increase from best stock: %.4f for pair (%s,%s)\n", max(wrs), colnames(x)[BestIdx[1]], colnames(x)[BestIdx[2]]))
WorstIdx <- combn(1:nStocks,2)[,which.min(wrs)]
cat(sprintf("Min ratio from best stock: %.4f for pair (%s,%s)\n", min(wrs), colnames(x)[WorstIdx[1]], colnames(x)[WorstIdx[2]]))

The result is the figure below, plus these lines

Max increase from best stock: 4.3379 for pair (iroqu,kinar)
Min ratio from best stock: 0.3773 for pair (dupont,morris)
And we can do similar observations:
  • There is a long, thin tail, even thinner than for the wealth itself.  In other words, there are some pairs giving good results, but they are relatively rare.
  • In the majority of the case (about 60%), the wealth of universal portfolio is below the wealth of the best stock in the pair, sometimes significantly so.
  • The best pair happens to be another one of the examples in Cover's article, probably not a coincidence either.
The general feeling is that universal portfolio is a valid approach, but maybe not as performing as could be deduced from the original article that cherry picked the examples.