When to ‘Buy the Dip’

A Gentle Introduction to Hidden Markov Models for Volatility Regime Detection

Osho Jha, Apr 23

Trading, much like sailing, in choppy waters is a recipe for nausea.

Motivation: “Buy the dip” — it’s a frustratingly simple piece of advice.

Like most pieces of advice, it’s easier said than done and the giver of such advice has probably not attempted to practice what they preach.

It induces FOMO, which leads to the “hope trade”, and when the “hope trade” goes awry you’re stuck as the “long term investor” who “really believes in the company’s mission”.

Buying the dip is even more difficult during times such as these when the S&P is near/at historic highs.

Looking back at December of 2018, the sell-off in stocks presented an amazing buying opportunity for any investor or trader willing to put their capital in play during a rocky time.

The problem of understanding when to ‘buy a dip’ vs. ‘sell a rally’ poses an interesting area of study for algorithm development/data science in finance as there are many ways to define the problem and approach it.

One such approach is the use of Hidden Markov Models (HMMs) to determine periods of high and low volatility of returns.

Using a regime detection methodology to amplify signals from momentum indicators (such as moving average crossovers) allows for more confidence in trade sizing.

For example, you can buy dips more confidently when you know whether or not you are in an uptrend.

Another use case is the ability to understand structural changes in your data, allowing for more complex modeling techniques or better application of simpler statistical models such as regression analysis.

Regime detection is key to this as financial time series are not stationary and, as we’ll see in our own example below, the mean and variance of returns change during different periods, i.e. the distribution changes.

Hidden Markov Models: Hidden Markov Models are…complicated.

I’ve studied them in depth with a world expert on the matter, only to have what I consider a “half way decent” understanding of what I’m doing regarding the math behind them.

HMMs have been used to solve problems in earthquake forecasting, machine based speech translation, and even in financial markets with some positing that hierarchical HMMs are responsible for some of Renaissance Tech’s successful trading algorithms.

In order to understand a Hidden Markov Model, we can first try to understand a Markov Chain.

A Markov Chain is a stochastic process which satisfies the Markov Property — that is to say it is memoryless and the probability of an event depends only on the state attained in the previous event.

It is part of a broader class of state space models and can be formulated in both discrete and continuous time models.

For those who are interested in math history, something which I always found to be both inspiring and a useful contextualization tool, the following video explains the history of the Markov Chain:

[Embedded video] Hopefully this embedded correctly…

The best way to understand a Markov Chain is by drawing one.

Consider the image below:

[Figure: An…oddly…shaped Markov Chain.]

[Figure: The transition matrix for the Markov Chain above.]

The diagram shows 3 distinct states: Bull Market, Bear Market, and Stagnant Market.

The arrows show the possible ways that one state can transition to another.

The matrix P is called a transition matrix and displays the probabilities of moving from one state to another.

For example, the probability of moving from Bull Market to a Bull Market is 0.9, the probability of moving from Bull Market to Bear Market is 0.075, and the probability of moving from Bull Market to Stagnant Market is 0.025.

An important feature of any transition matrix P is that the sum of the transition probabilities in each row will be 1.
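As a rough sketch of how such a chain behaves, the snippet below builds a transition matrix, checks the row-sum property, and simulates a short path. Only the Bull Market row is taken from the text (with the Stagnant entry set so the row sums to 1); the Bear and Stagnant rows, the `simulate` helper, and the seed are assumptions made for illustration.

```python
import numpy as np

states = ["Bull", "Bear", "Stagnant"]

# Row i holds the probabilities of moving from state i to each state.
# Only the Bull row follows the text; the other two rows are assumed.
P = np.array([
    [0.90, 0.075, 0.025],  # from Bull
    [0.15, 0.80, 0.05],    # from Bear (assumed)
    [0.25, 0.25, 0.50],    # from Stagnant (assumed)
])

# Every row of a transition matrix must sum to 1.
assert np.allclose(P.sum(axis=1), 1.0)


def simulate(P, start, n_steps, seed=0):
    """Sample a path; the next state depends only on the current one."""
    rng = np.random.default_rng(seed)
    path = [start]
    for _ in range(n_steps):
        path.append(int(rng.choice(len(P), p=P[path[-1]])))
    return path


# Distribution over states two steps after starting in Bull.
start_dist = np.array([1.0, 0.0, 0.0])
two_step = start_dist @ np.linalg.matrix_power(P, 2)

path = simulate(P, start=0, n_steps=10)
print([states[i] for i in path])
print(dict(zip(states, two_step.round(5))))
```

Multiplying a starting distribution by powers of P gives the state probabilities further out, which is the memoryless property in action: nothing beyond the current state is needed.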

For a Hidden Markov Model, we consider a Markov Process with unobservable states.

The intuition behind this is that there is some unobserved variable that is impacting measurements of an observed variable.

Our goal is to derive hidden volatility regimes that are impacting observed returns.

As an intuitive example, let’s consider a scenario (I took this from Wikipedia because why reinvent the wheel):

[Embedded Wikipedia excerpt]

An important aspect to note is that Alice believes there are discrete hidden states, namely, rainy and sunny.

In our market model we are also assuming discrete hidden states, namely, low volatility regimes, high volatility regimes, and neutral volatility regimes.

Suppose instead, that Alice wanted to infer an actual temperature for the day instead of type of weather — this would be a continuous hidden state.

While we won’t go into much detail on that, hidden-state models are useful for continuous hidden states as well, and in particular Kalman Filters are used for solving these problems.

For a set of discrete hidden states, the Viterbi Algorithm is a very commonly used way to solve for the most likely sequence of hidden states and has found many applications in day-to-day technologies.
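To make the Viterbi idea concrete, here is a minimal sketch on a two-state weather model with rainy and sunny hidden states. The observed activities (walk, shop, clean) and every probability below are assumptions chosen for illustration, not values estimated from data.

```python
import numpy as np

states = ["Rainy", "Sunny"]
activities = ["walk", "shop", "clean"]

# All probabilities are illustrative assumptions.
start_p = np.array([0.6, 0.4])                 # P(first hidden state)
trans_p = np.array([[0.7, 0.3],                # Rainy -> Rainy, Sunny
                    [0.4, 0.6]])               # Sunny -> Rainy, Sunny
emit_p = np.array([[0.1, 0.4, 0.5],            # Rainy -> walk, shop, clean
                   [0.6, 0.3, 0.1]])           # Sunny -> walk, shop, clean


def viterbi(obs, start_p, trans_p, emit_p):
    """Return the most likely hidden state sequence for the observations."""
    n_obs, n_states = len(obs), len(start_p)
    log_prob = np.full((n_obs, n_states), -np.inf)
    back = np.zeros((n_obs, n_states), dtype=int)
    log_prob[0] = np.log(start_p) + np.log(emit_p[:, obs[0]])
    for t in range(1, n_obs):
        # scores[i, j]: best log-probability of being in i at t-1, then moving to j
        scores = log_prob[t - 1][:, None] + np.log(trans_p)
        back[t] = scores.argmax(axis=0)
        log_prob[t] = scores.max(axis=0) + np.log(emit_p[:, obs[t]])
    # Trace the best path backwards from the most likely final state.
    path = [int(log_prob[-1].argmax())]
    for t in range(n_obs - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]


obs_seq = [0, 1, 2]  # walk, shop, clean
best = viterbi(obs_seq, start_p, trans_p, emit_p)
print([states[i] for i in best])  # ['Sunny', 'Rainy', 'Rainy']
```

Working in log space keeps the products of many small probabilities from underflowing on longer sequences.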

A full treatment of Hidden Markov Models is well outside the scope of this piece, however, the above should give some intuitive understanding of the problem we are attempting to solve.

Obtaining Data: Now that we have some intuitive basis for understanding our problem, we proceed to collect some market data.

Unfortunately, many older Python tutorials that are oriented towards financial time series analysis use pandas-datareader or other packages pointing to now-deprecated free APIs such as Google Finance, Yahoo Finance, and Quandl.

Hunting around, I found that IEX is probably the best source of free daily trading bars.

Intraday data is not free but their prices are very cheap especially compared to the other options available.

While there are ways to use pandas-datareader to connect to IEX, we use the iex-api-python library that can be found here: http://www.danielecook.com/iex-api-python/

The code below shows how to use this package to pull data for SPY (further tickers can be configured).

As an added variable for our modeling we choose to calculate a percentage for daily range as this could give us some insights into underlying volatility regimes:

[Embedded code]

The output from the code above is a data frame with time series data for SPY, the SPDR (“Spider”) ETF which tracks the S&P 500, as well as some calculated columns for daily return and trading range:

[Embedded table]

For this particular example, we only pulled down 5 years of history.
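The calculated columns can be sketched without touching the IEX API at all. The snippet below uses synthetic daily bars as a stand-in for the downloaded data; the column names `high`, `low`, and `close`, and the exact definition of `range` (high relative to low), are assumptions rather than the original code.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for daily bars; real data would come from the API.
rng = np.random.default_rng(0)
dates = pd.bdate_range("2019-01-01", periods=250)
close = 250 * np.exp(np.cumsum(rng.normal(0.0003, 0.01, len(dates))))
spread = np.abs(rng.normal(0.0, 0.005, len(dates)))
all_historic_data = pd.DataFrame(
    {"high": close * (1 + spread), "low": close * (1 - spread), "close": close},
    index=dates,
)

# Daily return: percentage change in the close from the previous day.
all_historic_data["return"] = all_historic_data["close"].pct_change()

# Daily range as a percentage: how far the high sits above the low (assumed definition).
all_historic_data["range"] = all_historic_data["high"] / all_historic_data["low"] - 1

# The first row has no previous close, so its return is NaN and is dropped.
all_historic_data = all_historic_data.dropna()
print(all_historic_data[["close", "return", "range"]].head())
```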

Next, we partition this data into a training and test set.

The sets are selected at random — the benefit of this approach is that it allows the model to see a wider variety of data as opposed to one long stretch of volatility regimes.

Since we utilize the rand function, the results here will differ from any successive runs of code.

For better model diagnostics, the best approach would be to output the train and test sets for repeated tweaking but be careful not to overfit on the sample data…use it more as a guide to understand how changes in parameters are changing results.
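If repeatable splits are wanted for that kind of tweaking, one option is to seed the random generator so the same rows land in train and test on every run. This is a sketch under assumptions: the placeholder frame, the seed value, and the use of `np.random.default_rng` are illustrative choices.

```python
import numpy as np
import pandas as pd

# Placeholder frame standing in for all_historic_data.
all_historic_data = pd.DataFrame({"close": np.linspace(250.0, 300.0, 1000)})

# Seeding the generator makes the random 80/20 mask repeatable across runs.
rng = np.random.default_rng(42)
msk = rng.random(len(all_historic_data)) < 0.8
train = all_historic_data[msk]
test = all_historic_data[~msk]

print(len(train), len(test))
```

The resulting frames could also be written out with `to_csv` so later experiments reuse exactly the same split.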

    import numpy as np

    # create train and test sets
    # this methodology will randomly select roughly 80% of our data
    msk = np.random.rand(len(all_historic_data)) < 0.8
    train = all_historic_data[msk]
    test = all_historic_data[~msk]

Model and Outputs: Once we have our train and test sets created, we can go ahead and train our model and then apply the trained model to our test set.

To do this, we utilize the GaussianMixture class from the sklearn.mixture module.

We specify the n_components=3 because we are looking to model 3 discrete hidden states-low volatility, neutral volatility, and high volatility.

Mixture models implement a closely related, unsupervised form of density estimation that uses the expectation-maximization algorithm to estimate the means and covariances of our hidden states.

This doesn’t require us to initialize transition and emission probabilities ourselves. Instead, the algorithm estimates the weight, mean, and covariance of each state and assigns each day to its most likely state. Unlike a full HMM, a Gaussian mixture treats each day independently and does not estimate transition probabilities between states.

Note, the IEX code contains all necessary imports for this step:

[Embedded code]

The model is trained on return, range, and close for each day.

The output of this code shows the mean and variance for each feature during each regime.

Specifically, the mean and variance are listed in the order the variables are passed, i.e. for return, range, and close.
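A minimal sketch of this step with `GaussianMixture` is shown below. The features are synthetic (three made-up clusters in the order return, range, close), and `covariance_type="full"` and `random_state=0` are assumptions, so the printed numbers say nothing about real SPY data.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic daily features in the order: return, range, close.
rng = np.random.default_rng(0)
n = 300


def fake_days(ret_sd, range_mean, close_mean):
    """Generate n made-up days for one volatility level."""
    return np.column_stack([
        rng.normal(0.0005, ret_sd, n),
        np.abs(rng.normal(range_mean, 0.002, n)),
        rng.normal(close_mean, 5.0, n),
    ])


X = np.vstack([
    fake_days(0.004, 0.006, 290.0),
    fake_days(0.009, 0.012, 270.0),
    fake_days(0.020, 0.025, 250.0),
])

# Three components for low, neutral, and high volatility.
model = GaussianMixture(n_components=3, covariance_type="full", random_state=0)
model.fit(X)

feature_names = ["return", "range", "close"]
for k in range(model.n_components):
    # The diagonal of each covariance matrix holds the per-feature variances.
    variances = np.diag(model.covariances_[k])
    print(f"state {k}")
    for name, mean, var in zip(feature_names, model.means_[k], variances):
        print(f"  {name}: mean={mean:.5f}, var={var:.7f}")

# Most likely component for each day.
hidden_states = model.predict(X)
```

The component numbers are arbitrary, which is why the means and variances have to be read before deciding which number is the low, neutral, or high volatility regime.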

Our goal is to firstly minimize variance of return and then maximize mean return.

Given this, our hidden states are described as follows (again, your results will probably differ):

1: low vol regime – lowest variance for return. Return is positive so signal is long.

2: high vol regime – highest variance for return. Return is positive so long conservatively because of high variance.

0: neutral vol regime – 2nd lowest variance. Return is positive so long conservatively or stay out of stocks.

To get a visual understanding of the hidden states, we can plot our hidden states with respect to the underlying time series in our test set.

The following code will put that together:

[Embedded code]

We then combine our model output with the underlying test set so we can make a proper plot of the regimes against the market close:

[Embedded code and plot]

This code can be significantly improved by adding a function that renames the numerical states with our chosen state names by examining the HMM output mean and variance.
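A rough sketch of what such a renaming helper might look like is given below. It is hypothetical, not the code behind the results here: the name `name_regimes` is made up, it assumes exactly three states with full covariance matrices, and it ranks states only by the variance of return (the first feature).

```python
import numpy as np


def name_regimes(covariances, return_index=0):
    """Map each state number to a name, ordered by variance of return."""
    return_var = np.array([cov[return_index, return_index] for cov in covariances])
    order = np.argsort(return_var)  # lowest variance first
    names = ["low vol", "neutral vol", "high vol"]
    return {int(state): name for state, name in zip(order, names)}


# Hypothetical covariance matrices for three states (return, range, close).
covariances = np.array([
    np.diag([4e-5, 1e-5, 30.0]),  # state 0
    np.diag([1e-5, 1e-5, 25.0]),  # state 1
    np.diag([3e-4, 5e-5, 60.0]),  # state 2
])

labels = name_regimes(covariances)
print(labels)  # {1: 'low vol', 0: 'neutral vol', 2: 'high vol'}
```

With a fitted scikit-learn mixture model, the `covariances_` attribute (for the default full covariance type) could be passed in directly, and the resulting dictionary applied to the predicted states with pandas `map`.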

Since that renaming isn’t applied here, as a reminder, let’s restate our volatility regimes:

1: low vol regime – lowest variance for return. Return is positive so signal is long.

2: high vol regime – highest variance for return. Return is positive so long conservatively because of high variance.

0: neutral vol regime – 2nd lowest variance. Return is positive so long conservatively or stay out of stocks.

Conclusion: Looking at the regimes compared to the chart, we see that the model is not so bad.

It was able to catch a major, multi-year, uptrend.

Some of this may also be the selection bias of only using the past 5 years of data but sometimes using smaller time scales can make a model useful for short term trading.

The real concern around these results would be the rapid regime switching (between 0 and 1) in the earlier part of our data.

The best approach to remedy this would be to look for better features to train the model on and to use this as a sub-signal to enhance or detract from more typical momentum trading signals.
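As one possible illustration of the sub-signal idea, the sketch below scales a simple moving average crossover signal by a per-regime weight. Everything in it is an assumption: the prices and regime labels are synthetic, and the window lengths (20 and 50 days) and sizing weights are arbitrary placeholders rather than recommendations.

```python
import numpy as np
import pandas as pd

# Synthetic closes and regime labels; both are placeholders.
rng = np.random.default_rng(1)
close = pd.Series(280 * np.exp(np.cumsum(rng.normal(0.0004, 0.01, 300))))
regime = pd.Series(rng.integers(0, 3, len(close)))

# Momentum signal: 1 when the fast moving average is above the slow one, else 0.
fast = close.rolling(20).mean()
slow = close.rolling(50).mean()
momentum = (fast > slow).astype(float)

# Hypothetical sizing weights per regime (1: low vol, 0: neutral vol, 2: high vol).
weights = {1: 1.0, 0: 0.5, 2: 0.25}
position = momentum * regime.map(weights)

print(position.tail())
```

The regime never creates a trade on its own here; it only sizes up or down a position that the momentum signal already calls for.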

Another interesting modification to investigate is using this on company fundamental data to model balance sheet items.
