We are ready for exploratory data analysis!

Exploratory Data Analysis (EDA)

Let’s see what the closing price looks like.

Figure: Closing price of the New Germany Fund (GF)

Clearly, this is not a stationary process, and it is hard to tell whether there is any seasonality.
Moving average

Let’s use the moving average model to smooth our time series.
For that, we will use a helper function that runs the moving average model over a specified time window and plots the resulting smoothed curve.

Using a time window of 5 days, we get:

Figure: Smoothed curve by the previous trading week

As you can see, it is hard to spot a trend, because the smoothed curve is too close to the actual curve.
Let’s see the result of smoothing by the previous month, and previous quarter.
Figure: Smoothed by the previous month (30 days)

Figure: Smoothed by the previous quarter (90 days)

Trends are easier to spot now.
Notice how the 30-day and 90-day trends both curve downward at the end.
This might mean that the stock is likely to go down in the following days.
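
As a rough illustration of what the smoothing step computes, here is a minimal, dependency-free sketch of a trailing moving average. It is not the helper used for the figures above (that one also plots the result); the function name `moving_average` and the sample prices are made up for this example.

```python
def moving_average(values, window):
    """Trailing moving average over `window` points.

    Positions that do not yet have a full window are filled with None.
    Hypothetical helper for illustration only.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    smoothed = []
    running_sum = 0.0
    for i, value in enumerate(values):
        running_sum += value
        if i >= window:
            # Drop the point that just left the window.
            running_sum -= values[i - window]
        if i >= window - 1:
            smoothed.append(running_sum / window)
        else:
            smoothed.append(None)
    return smoothed


# Made-up prices, not actual GF data.
prices = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
print(moving_average(prices, 3))  # [None, None, 11.0, 12.0, 13.0, 14.0]
```

A larger window averages over more days, which is why the 30-day and 90-day curves are smoother than the 5-day one.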
Exponential smoothing

Now, let’s use exponential smoothing to see if it can pick up a better trend.
Here, we use 0.05 and 0.3 as values for the smoothing factor.
Feel free to try other values and see what the result is.
Figure: Exponential smoothing

As you can see, an alpha value of 0.05 smoothed the curve while picking up most of the upward and downward trends.
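
To make the role of the smoothing factor concrete, here is a minimal sketch of simple exponential smoothing in plain Python. The function name and the sample values are assumptions for this example, and plotting is left out.

```python
def exponential_smoothing(values, alpha):
    """Simple exponential smoothing.

    Each smoothed point is alpha * current value
    + (1 - alpha) * previous smoothed point.
    Hypothetical helper for illustration only.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    if not values:
        return []
    result = [values[0]]  # the first smoothed point is the first observation
    for value in values[1:]:
        result.append(alpha * value + (1 - alpha) * result[-1])
    return result


# Made-up values, not actual GF data.
print(exponential_smoothing([1.0, 2.0, 3.0], 0.5))  # [1.0, 1.5, 2.25]
```

A small alpha such as 0.05 gives most of the weight to the previous smoothed value, so the curve reacts slowly. A larger alpha such as 0.3 follows the raw series more closely.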
Now, let’s use double exponential smoothing.

Figure: Double exponential smoothing

Again, experiment with different alpha and beta combinations to get better-looking curves.
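
For readers who want to see how alpha and beta interact, the sketch below shows one common formulation of double exponential smoothing (Holt's linear trend method). It is an assumption that this matches the variant used for the figure above. The function name, the trend initialisation (difference of the first two points), and the sample values are all chosen for this example.

```python
def double_exponential_smoothing(values, alpha, beta):
    """Holt's linear trend smoothing (one common variant).

    alpha smooths the level, beta smooths the trend.
    Hypothetical helper for illustration only.
    """
    if len(values) < 2:
        raise ValueError("need at least two points to initialise the trend")
    level = values[0]
    trend = values[1] - values[0]  # assumed initialisation
    result = [values[0]]
    for value in values[1:]:
        previous_level = level
        # New level: blend the observation with the previous level plus trend.
        level = alpha * value + (1 - alpha) * (previous_level + trend)
        # New trend: blend the latest change in level with the previous trend.
        trend = beta * (level - previous_level) + (1 - beta) * trend
        result.append(level)
    return result


# A perfectly linear series is reproduced unchanged.
print(double_exponential_smoothing([1.0, 2.0, 3.0, 4.0], 0.5, 0.5))
```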
Modelling

Let’s move on to modelling the time series to make predictions.
First, as outlined in the previous post, we must turn our series into a stationary process in order to model it.
Therefore, let’s apply the Dickey-Fuller test to see if it is a stationary process.

By the Dickey-Fuller test, the time series is unsurprisingly non-stationary.
Also, looking at the autocorrelation plot, we see that the autocorrelation is very high, and there seems to be no clear seasonality.
Therefore, to get rid of the high autocorrelation and to make the process stationary, let’s take the first difference.
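
As a minimal sketch of what taking the first difference means, the hypothetical helper below subtracts from each value the value `lag` steps before it. The name `first_difference` and the sample numbers are made up for this example.

```python
def first_difference(values, lag=1):
    """Return values[i] - values[i - lag] for every i with a valid lag.

    The result is `lag` points shorter than the input.
    Hypothetical helper for illustration only.
    """
    if lag <= 0:
        raise ValueError("lag must be positive")
    return [values[i] - values[i - lag] for i in range(lag, len(values))]


# Made-up values, not actual GF data.
print(first_difference([10, 12, 11, 15]))  # [2, -1, 4]
```

The differenced series describes day-to-day changes rather than price levels, which is what removes the slowly varying trend.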
We simply subtract the time series from itself with a lag of one day.

Awesome! Our series is now stationary and we can start modelling!

SARIMA

Now, for SARIMA, we will define a few parameters and a range of values for other parameters to generate a list of all possible combinations of p, q, d, P, Q, D, s.

From this code cell, you should see that we have 625 different combinations! Now, we will try each combination and train SARIMA with each one to find the best-performing model.
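
The sketch below shows one way to build a parameter grid that yields 625 combinations. The specific ranges are an assumption chosen to be consistent with that count (five candidate values each for p, q, P and Q), and the fixed values of d, D and s are likewise assumptions for illustration.

```python
from itertools import product

# Assumed search ranges: five candidate values each (0 through 4).
ps = range(0, 5)
qs = range(0, 5)
Ps = range(0, 5)
Qs = range(0, 5)

# Assumed fixed values: one round of differencing, and a season
# length of 5 (one trading week). Adjust to your own data.
d = 1
D = 1
s = 5

# Every (p, q, P, Q) tuple to try: 5 * 5 * 5 * 5 combinations.
parameters_list = list(product(ps, qs, Ps, Qs))
print(len(parameters_list))  # 625
```

Each tuple would then be combined with the fixed d, D and s when fitting a candidate model, keeping the one with the best score on a criterion such as AIC.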
This might take a while depending on your computer’s processing power. Once this is done, you can get a summary of the best model.

Awesome! Now, we can predict the closing price of the next five trading days and evaluate the MAPE (mean absolute percentage error) of the model. In this case, we have a MAPE of 0.79%, which is very good!

Now, to compare our prediction with actual data, I took financial data from Yahoo Finance and created a dataframe.

Figure: New dataframe created for comparison

Perfect! Now, we can plot to see how far we were from the actual closing prices.

Figure: Comparison of predicted and actual closing prices

It seems that we were a bit off in our predictions.
In fact, we missed an opportunity to make money, since our predictions result in a net loss, whereas the actual closing prices show a net gain.
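
For reference, the MAPE reported above is the mean of the absolute percentage errors. Here is a minimal sketch in plain Python. The function name and the sample numbers are made up, and the helper assumes no actual value is zero.

```python
def mean_absolute_percentage_error(actual, predicted):
    """MAPE in percent: mean of |actual - predicted| / |actual| * 100.

    Hypothetical helper for illustration only. Assumes no actual
    value is zero.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100 * sum(errors) / len(errors)


# Made-up numbers: errors of 10% and 5% average to 7.5%.
print(mean_absolute_percentage_error([100.0, 200.0], [110.0, 190.0]))
```

Note that a low MAPE does not guarantee that the predicted direction of the price is right, which is what happened with the net loss versus net gain above.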
That’s it for this project! You saw how to apply the most popular techniques for time series forecasting.
A good improvement for this project would be to consider other variables, such as trading volume, to help predict the closing price of the stock.
I hope you learned a lot during this tutorial.
In the next one, we will use Prophet from Facebook for time series forecasting! Cheers!