Time Series Forecasting with Prophet
Learn how to use Facebook’s Prophet to predict air quality

Marco Peixeiro
Feb 18

Photo by Frédéric Paulussen on Unsplash

Producing high quality forecasts is hard for many machine learning engineers.
It requires a substantial amount of experience and very specific skills.
Also, other forecasting tools were too inflexible to incorporate useful assumptions.
For those reasons, Facebook open sourced Prophet, a forecasting tool available in both Python and R.
This tool allows both experts and non-experts to produce high quality forecasts with minimal effort.
Here, we will use Prophet to help us predict air quality!

The full notebook and dataset can be found here.
Let’s make some predictions!

[Image: We won’t go back to the future, unfortunately]

Import and clean the data

As always, we start by importing some useful libraries. Then we import the dataset and preview it. You should see the following:

[Figure: First five entries of the dataset]

As you can see, the dataset contains information about the concentrations of different gases.
They were recorded every hour of each day.
You can find a description of all features here.
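The import, load, and preview steps described above might look like the sketch below. It assumes the file is semicolon-separated and writes decimals with commas, and it reads a few made-up rows from memory so that it runs on its own. The column names are assumptions too, so swap in the real file path and names from the notebook.

```python
import io

import pandas as pd

# Made-up sample in the assumed layout: ';' separators and decimal commas.
# Replace `raw` with the path to the real CSV file.
raw = io.StringIO(
    "Date;Time;CO(GT);NOx(GT)\n"
    "10/03/2004;18.00.00;2,6;166\n"
    "10/03/2004;19.00.00;2,0;103\n"
    "10/03/2004;20.00.00;-200;131\n"
)

df = pd.read_csv(raw, sep=";")

# Preview the first rows and the inferred column types.
print(df.head())
print(df.dtypes)
```

Columns that contain decimal commas come back as text at this point, which is why the measurements are converted to floats later on.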
If you explore the dataset a bit more, you will notice that there are many instances of the value -200.
Of course, it does not make sense to have a negative concentration, so we will need to clean the data before modelling.
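To check how widespread the -200 values are, something like the following could work. It assumes that -200 is a placeholder for a missing reading rather than a real measurement, and it uses a small made-up frame with assumed column names.

```python
import numpy as np
import pandas as pd

# Made-up hourly readings; -200 is assumed to mark a missing measurement.
df = pd.DataFrame({
    "CO(GT)": [2.6, -200.0, 2.2, 1.6],
    "NOx(GT)": [166.0, 103.0, -200.0, -200.0],
})

# Count the placeholder values in each column.
sentinel_counts = (df == -200).sum()
print(sentinel_counts)

# Turn them into NaN so that averages simply skip them.
cleaned = df.replace(-200, np.nan)
print(cleaned.mean())
```

Replacing the placeholder with NaN is one option; filtering it out inside the aggregation step works just as well.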
First, we get rid of all instances where there is an empty value. After that, we parse the date column as a date and turn all measurements into floats. Then, we aggregate the data by day, by taking the average of each measurement.

At this point, the data holds one averaged row per day, but we still have some NaN that we need to get rid of.
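The cleaning steps just described could be sketched as follows. This is not the original notebook code: the raw layout (semicolons, decimal commas, day-first dates, a trailing empty column and row), the column names, and the choice to exclude -200 from the daily averages are all assumptions.

```python
import io

import numpy as np
import pandas as pd

# Made-up rows in the assumed raw layout, including an empty trailing
# column and an empty trailing row.
raw = io.StringIO(
    "Date;Time;CO(GT);NOx(GT);\n"
    "10/03/2004;18.00.00;2,6;166;\n"
    "10/03/2004;19.00.00;2,0;-200;\n"
    "11/03/2004;18.00.00;-200;120;\n"
    "11/03/2004;19.00.00;1,2;80;\n"
    ";;;;\n"
)
df = pd.read_csv(raw, sep=";")

# 1. Drop rows and columns that are entirely empty.
df = df.dropna(how="all").dropna(axis=1, how="all").copy()

# 2. Parse the date column and convert every measurement to float.
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%Y")
measure_cols = [c for c in df.columns if c not in ("Date", "Time")]
for col in measure_cols:
    df[col] = df[col].astype(str).str.replace(",", ".", regex=False).astype(float)

# 3. Average per day, with -200 treated as missing so it does not skew the mean.
daily = (
    df[["Date"] + measure_cols]
    .replace(-200, np.nan)
    .groupby("Date")
    .mean()
)
print(daily)
```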
We can count how many NaN are present in each column, and then get rid of the columns that have more than 8 NaN.

Perfect! Now, we should aggregate the data by week, because it will give a smoother trend to analyze.
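Counting the NaN, dropping the sparse columns, and resampling to weekly averages can be sketched like this. The frame and its column names "A" and "B" are made up for illustration. Only the threshold of 8 comes from the text.

```python
import numpy as np
import pandas as pd

# Made-up daily averages over four full weeks; column "B" has many gaps.
dates = pd.date_range("2004-03-01", periods=28, freq="D")
daily = pd.DataFrame(
    {
        "A": np.arange(28, dtype=float),
        "B": [np.nan] * 10 + [1.0] * 18,
    },
    index=dates,
)
daily.loc[dates[3], "A"] = np.nan  # a single gap in column "A"

# Number of NaN in each column.
nan_counts = daily.isna().sum()
print(nan_counts)

# Keep only the columns with at most 8 NaN.
kept = daily.loc[:, nan_counts <= 8]

# Weekly mean; "W" bins end on Sundays, and mean() skips the remaining NaN.
weekly = kept.resample("W").mean()
print(weekly)
```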
Awesome! Now, we are ready to explore the data a bit more.
Exploratory data analysis (EDA)

Let’s plot each column of the dataset. Take the time to look at each plot and identify interesting trends.
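One way to get a plot per column is the pandas plotting wrapper around matplotlib, as in the sketch below. The weekly frame and its column names are made up here, and the original notebook may well have used a different plotting approach.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import numpy as np
import pandas as pd

# Made-up weekly data with two measurement columns.
idx = pd.date_range("2004-03-14", periods=10, freq="W")
weekly = pd.DataFrame(
    {
        "CO(GT)": np.linspace(2.0, 1.5, 10),
        "NOx(GT)": np.linspace(250.0, 180.0, 10),
    },
    index=idx,
)

# One subplot per column, all sharing the time axis.
axes = weekly.plot(subplots=True, figsize=(8, 5), sharex=True)
for ax in axes:
    ax.set_ylabel("weekly mean")
```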
For the sake of brevity, we will focus only on the concentration of NOx.
Oxides of nitrogen are very harmful, as they react to form smog and acid rain, as well as being responsible for the formation of fine particles and ground level ozone.
These have adverse health effects, so the concentration of NOx is a key feature of air quality.
Therefore, let’s remove all irrelevant columns before moving on to modelling.

Modelling

We start by importing Prophet. Prophet requires the date column to be named ds and the feature column to be named y, so after renaming, our data has just those two columns.

Then, we define a training set.
For that we will hold out the last 30 entries for prediction and validation.
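The column selection, the renaming to ds and y, and the 30-entry hold-out can be sketched with plain pandas. The weekly frame below is made up, and "NOx(GT)" and "Date" are assumed names for the NOx column and the date index.

```python
import numpy as np
import pandas as pd

# Made-up weekly frame indexed by date.
idx = pd.date_range("2004-03-14", periods=56, freq="W", name="Date")
weekly = pd.DataFrame(
    {
        "CO(GT)": np.linspace(2.0, 1.5, 56),
        "NOx(GT)": np.linspace(300.0, 150.0, 56),
    },
    index=idx,
)

# Keep only the NOx column and move the date index into a regular column.
df = weekly[["NOx(GT)"]].reset_index()

# Prophet expects the columns to be named `ds` (date) and `y` (value).
df = df.rename(columns={"Date": "ds", "NOx(GT)": "y"})

# Hold out the last 30 rows for validation and train on the rest.
horizon = 30
train = df.iloc[:-horizon]
test = df.iloc[-horizon:]
print(len(train), len(test))
```

Splitting by position rather than at random keeps the validation set strictly after the training set in time, which is what a forecast needs.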
Then, we simply initialize Prophet, fit the model to the data, and make predictions!

Great! In the resulting forecast, yhat represents the prediction, while yhat_lower and yhat_upper represent the lower and upper bounds of the prediction, respectively.
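A minimal sketch of the initialize, fit, and predict steps is shown below, wrapped in a hypothetical helper called fit_and_forecast. It assumes a training frame with ds and y columns and weekly data, hence freq="W". The package is imported as prophet, which is the name used since version 1.0. Older installations expose it as fbprophet.

```python
import pandas as pd


def fit_and_forecast(train: pd.DataFrame, periods: int, freq: str = "W") -> pd.DataFrame:
    """Fit Prophet with default settings and forecast `periods` steps ahead.

    `train` must have a `ds` (date) column and a `y` (value) column.
    """
    # Imported here so the helper can be defined without Prophet installed.
    from prophet import Prophet

    model = Prophet()
    model.fit(train)

    # Extend the training dates by `periods` steps at the given frequency.
    future = model.make_future_dataframe(periods=periods, freq=freq)
    forecast = model.predict(future)

    # The forecast holds many columns; these four are the ones discussed here.
    return forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]]
```

With a training set in the ds/y format, calling fit_and_forecast(train, periods=30) and then looking at the last 30 rows would give the predictions for the held-out entries.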
Prophet allows you to easily plot the forecast, and we get:

[Figure: NOx concentration forecast]

As you can see, Prophet simply used a straight downward line to predict the concentration of NOx in the future.
You can also use a command to see if the time series has any interesting features, such as seasonality. Here, Prophet only identified a downward trend with no seasonality.
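Both plots come from methods on the fitted model: plot for the forecast and plot_components for the trend and seasonality breakdown. The helper below is a hypothetical wrapper around those two calls. It assumes model is an already fitted Prophet instance and forecast is the frame returned by its predict method.

```python
def plot_forecast(model, forecast):
    """Return Prophet's forecast figure and its components figure."""
    # Observed points, the predicted line (yhat), and its uncertainty band.
    forecast_fig = model.plot(forecast)

    # One panel for the trend, plus one per seasonality the model has fitted.
    components_fig = model.plot_components(forecast)

    return forecast_fig, components_fig
```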
Now, let’s evaluate the performance of the model by calculating its mean absolute percentage error (MAPE) and mean absolute error (MAE). You should see that the MAPE is 13.86% and the MAE is 109.32, which is not that bad! Remember that we did not fine-tune the model at all.
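For reference, both metrics are short enough to write by hand. The sketch below uses the standard definitions (MAPE as the mean of |actual - predicted| / |actual|, in percent, and MAE as the mean of |actual - predicted|) with made-up numbers, so it does not reproduce the figures quoted above.

```python
import numpy as np


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must not contain 0)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100)


def mae(y_true, y_pred):
    """Mean absolute error, in the units of the data."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


# Made-up values, only to show the calls.
actual = [200.0, 100.0, 50.0]
predicted = [180.0, 110.0, 55.0]
print(f"MAPE: {mape(actual, predicted):.2f}%")
print(f"MAE: {mae(actual, predicted):.2f}")
```

In practice, actual would be the y column of the held-out entries and predicted the matching yhat values from the forecast.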
Finally, let’s just plot the forecast with its upper and lower bounds, and you get:

[Figure: Forecast of the average weekly NOx concentration]

Great! You learned how to use Prophet for time series forecasting.
Note that Prophet is not suitable for all situations, so make sure to read this before you use Prophet.
I will be more than happy to answer your questions!

Cheers!