Harvesting Predictability

This diagram reflects the findings from my SVR. For each year, I trained it on the previous year's stock prices and then called on it using "current" data to predict the stock price of IBM for the "next day". The blue dots are the mean squared error of the SVM's predicted stock price. If you look closely, you can see them rise dramatically in 1987, as a result of Black Monday, and then again in the late '90s. The best-fit line for these dots is shown in blue.
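For readers who want a feel for the "train on last year, predict this year" loop, here is a minimal sketch using scikit-learn. It is not the notebook's actual code: the feature choice (the previous 5 closing prices), the kernel, the `C` and `epsilon` values, and the helper names `make_windows` and `yearly_mse` are all assumptions made for illustration, and the data is a synthetic random walk rather than IBM prices.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def make_windows(prices, window=5):
    """Turn a price series into (last `window` prices -> next price) pairs.

    The 5-day window is an assumed feature set, not the one from the paper.
    """
    prices = np.asarray(prices, dtype=float)
    X = np.array([prices[i:i + window] for i in range(len(prices) - window)])
    y = prices[window:]
    return X, y


def yearly_mse(prices_by_year, window=5):
    """Train on one year, predict next-day prices in the following year.

    Returns {year: mean squared error of the predictions for that year}.
    Consecutive entries of the sorted keys are treated as adjacent years.
    """
    years = sorted(prices_by_year)
    results = {}
    for prev, cur in zip(years, years[1:]):
        X_train, y_train = make_windows(prices_by_year[prev], window)
        X_test, y_test = make_windows(prices_by_year[cur], window)
        # Hyperparameters below are placeholders, not tuned values.
        model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
        model.fit(X_train, y_train)
        predictions = model.predict(X_test)
        results[cur] = float(np.mean((predictions - y_test) ** 2))
    return results


# Synthetic stand-in data: three "years" of 250 trading days each.
rng = np.random.default_rng(0)
prices_by_year = {
    year: 100 + np.cumsum(rng.normal(0, 1, 250)) for year in (1990, 1991, 1992)
}
print(yearly_mse(prices_by_year))
```

Each value in the returned dictionary corresponds to one blue dot in the diagram: the error for a year when the model only ever saw the year before it.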

The red dots represent the multiplier on their investment that someone would get if they simply bought IBM at the beginning of the year and sold it at the end. This can be seen as a baseline to beat. Its best-fit line is shown in red, and is fractionally higher than 1, about 1.07 (as we would expect). The green dots represent the multiplier on investment that one would get if they bought on any day the SVM said the stock would go up, and shorted the stock on any day the SVM said it would go down. We can see some absurdly successful years for this strategy, including one year where it would have produced returns of 15x on any money invested. The green best-fit line clearly demonstrates how this predictability was harvested out of the market in the '90s and was entirely gone by the mid-2000s.
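To make the two multipliers concrete, here is a small sketch of how they could be computed. It rests on assumptions that are mine, not the paper's: returns compound daily, there are no transaction or borrowing costs, a short position earns the mirror image of the stock's daily return, and the function names are hypothetical.

```python
def buy_and_hold_multiplier(prices):
    """Multiplier from buying on the first day and selling on the last."""
    return prices[-1] / prices[0]


def long_short_multiplier(prices, predicted_next):
    """Multiplier from going long or short each day based on a forecast.

    predicted_next[i] is the forecast, made on day i, for prices[i + 1].
    Assumes daily compounding and no trading costs.
    """
    multiplier = 1.0
    for today, tomorrow, forecast in zip(prices, prices[1:], predicted_next):
        daily = tomorrow / today
        if forecast > today:
            multiplier *= daily        # long: earn the stock's daily return
        elif forecast < today:
            multiplier *= 2.0 - daily  # short: earn the opposite daily return
        # If the forecast equals today's price, stay out of the market.
    return multiplier


# Toy example with perfect forecasts: up 10%, down 10%, up 10%.
prices = [100.0, 110.0, 99.0, 108.9]
perfect_forecasts = [110.0, 99.0, 108.9]
print(buy_and_hold_multiplier(prices))                   # approximately 1.089
print(long_short_multiplier(prices, perfect_forecasts))  # approximately 1.331
```

Even in this toy case, a forecaster that is right every day turns a choppy 1.089x year into 1.331x, which is why a model with only a modest edge can produce the outsized green dots in the early years.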

For those interested in reading more about this particular experiment, you can read my Machine Learning Nano-Degree Capstone Paper. It's about 14 pages long, with lots and lots of pretty graphs and sort of dry and awkward writing [3].

If you are interested in playing with the actual code, you can download it. It's fairly easy to understand and takes the form of a Python-based Jupyter notebook. It runs quite quickly on a modern laptop.

  • 1. A market is fascinatingly complicated, and it definitely creates self-defeating prophecies in some cases, as well as "self-fulfilling prophecies" in others.
  • 2. SVMs close to their current form were first introduced in a paper at the COLT 1992 conference (Boser, Guyon and Vapnik 1992): http://www.svms.org/history.html
  • 3. It's dry and awkward because it was written to a very specific format that didn't exactly fit this research. Such is life.