Markov Chain Monte Carlo

What happens when the probabilities look more like this?Rbb [CC BY-SA 3.

0 (https://creativecommons.

0)], from Wikimedia CommonsIn this situation, the likelihood doesn't have a normal distribution and so we end up with a right-skewed posterior distribution.

Since we can not express this with a formula we must use a Markov Chain Monte Carlo.

The three parts of Markov Chain Monte CarloOne: Monte CarloMonte Carlo simulations model complex systems by generating random numbers.

In the situation of the gif below, the Monte Carlo generates a random point with the parameters of (0–1, 0–1), by identifying the number of points that end up under the curve we are able to approximate the area of the full circle and from that π.

nicoguaro [CC BY 3.

0 (https://creativecommons.

0)], from Wikimedia CommonsTwo: Markov ChainsMarkov Chains are essentially a representation of how a variable “walks” around a graph, or how a random variable changes from one state to another over time.

image source http://www.

mathcs.

emory.

edu/~cheung/The above image is a representation of a Markov Chain of emotional states.

In this chain, if you are Cheerful, there is a 20% chance you will change emotional states to So-so, 20% chance you will become Sad and 60% chance you will stay Cheerful.

Markov Chains are governed by the Markov PropertyF(Xt+1|Xt) = f(Xt+1|Xt,Xt-1,Xt-2,….

,X1)In English: If I know what is happening right now, knowing what happened to get us to this step or the step before, etc.

Example of this are:Mendelian Genetics.

In the example below the children’s bean color is entirely influenced by the bean color of the parent beans.

The 1st generation’s bean color was influenced by the color of the generation before but that does not need to be taken into account when determining the second generation’s color.

Pbroks13 [CC BY-SA 3.

0 (https://creativecommons.

0)]Board games: When playing Monopoly and trying to determine the probability a player will go to a certain space the only information you need is where is the player currently.

Where the player was the turn before doesn’t influence where it is going next except in that it determined where it is this turn.

Three: Acceptance-Rejection samplingThe third part of MCMC is acceptance-rejection sampling.

When we sample new observations we decide if it’s in the correct direction and then decide if we are keeping it or discarding it.

Two common acceptance-rejection algorithms are the Metropolis-Hasting Algorithm and the No-U-Turn Sampler.

The No-U-Turn math is more complicated than I can explain in this medium post but check out this article if you want to dive in.

Here is my high-level explanation of Metropolis-HastingWe are at point x.

We make a guess for the next step.

We will call this x*We then compute the ratio of the probability of x*/x.

This is calculated using the product of the likelihood and prior distributions.

If the ratio of p(x*)/p(x) (also called the acceptance probability) is greater than 1 we accept x* as the new position.

Even if the acceptance probability is less than 1, we don’t automatically reject x*.

We flip a coin by selecting a random number, from a Uniform(0,1) distribution.

If the number is smaller than the acceptance probability we accept x* if it is higher we reject x* and start the process over again.

Putting it all togetherWe randomly generate numbers: This is the Monte Carlo partWe allow the numbers we generated to influence the next generated number: This is the Markov chainWe then decide if the new numbers generated are “moving in the right direction”: The Acceptance-rejection algorithmWe then check for convergence: We determine when our data has converged to a reasonable distribution.

The randomly generated values after the convergence point become our posterior distributionI hope that this helped you understand MCMC at an intermediate level.

1.

Borrowed (with permission) directly from Matt Brems’ lecture.

2.

A great but very mathy MCMC explantation: https://blog.

stata.

com/2016/11/15/introduction-to-bayesian-statistics-part-2-mcmc-and-the-metropolis-hastings-algorithm/.. More details