and ????, joint strong stationarity is defined by the same condition of strong stationarity, but is simply imposed on the joint cumulative distribution function of the two processes.

Weak stationarity and N-th order stationarity can be extended in the same way (the latter to M-N-th order joint stationarity).

The intrinsic hypothesis

A weaker form of weak stationarity, prominent in the geostatistical literature (see [Myers 1989] and [Fischer et al. 1996], for example).

The intrinsic hypothesis holds for a stochastic process {Xᵢ} if:

1. The expected difference between values at any two places separated by distance r is zero: E[Xᵢ-Xᵢ₊ᵣ]=0

2. The variance of differences, given by Var[Xᵢ-Xᵢ₊ᵣ], exists (i.e. it’s finite) and depends only on the distance r.

This notion implies weak stationarity of the difference Xᵢ-Xᵢ₊ᵣ, and was extended with a definition of N-th order intrinsic hypothesis.
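As a rough numerical illustration, the sketch below checks both conditions on simulated Gaussian random walks, a process that is not weakly stationary but whose increments have zero mean and a variance that depends only on the separation r. The helper name `increment_stats` and the simulation sizes are illustrative choices made here, not something taken from the cited literature.

```python
import numpy as np

def increment_stats(paths, i, r):
    """Sample mean and variance of X[i] - X[i+r] across realizations (rows)."""
    diffs = paths[:, i] - paths[:, i + r]
    return diffs.mean(), diffs.var()

rng = np.random.default_rng(0)
# 10,000 independent realizations of a Gaussian random walk, 200 steps each
paths = np.cumsum(rng.normal(size=(10_000, 200)), axis=1)

for i in (10, 100):
    mean, var = increment_stats(paths, i, r=5)
    # The mean should be close to 0 and the variance close to r, whatever i is
    print(f"i={i}: mean={mean:.3f}, variance={var:.3f}")

# The variance of the values themselves, by contrast, grows with the index
print(paths[:, 10].var(), paths[:, 150].var())
```

The increments behave the same way at both starting points, while the variance of the process itself keeps growing, which is why the random walk satisfies the intrinsic hypothesis without being weakly stationary.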

The definitions of stationarity presented so far were non-parametric; i.e., they did not assume a model for the data generating process, and thus apply to any stochastic process.

The related concepts of difference stationarity and unit root processes, however, require a brief introduction to stochastic process modeling.

The topic of stochastic modeling is also relevant insofar as various simple models can be used to create stochastic processes (see figure 5).

Figure 5: Various non-stationary processes (the purple white noise process is an exception).

Stochastic process modeling

A common task in the study of time series data is forecasting future values.

To that end, some assumptions are made on the Data Generating Process (DGP), the mechanism generating the data.

These assumptions often take the form of an explicit model of the process, and are also often used when modeling stochastic processes for other tasks, such as anomaly detection or causal inference.

We will go over the three most common such models.

The autoregressive (AR) model: A time series modeled using an AR model is assumed to be generated as a linear function of its past values, plus a random noise/error:

$$x[t] = c + \sum_{i=1}^{p} \varphi_i \, x[t-i] + \varepsilon[t]$$

Equation 4: The autoregressive model.

This is a memory-based model, in the sense that each value is correlated with the p preceding values; an AR model with lag p is denoted with AR(p).

The coefficients φᵢ are weights measuring the influence of these preceding values on the value x[t], c is a constant intercept and ε[t] is a univariate white noise process (commonly assumed to be Gaussian).
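A minimal simulation sketch of this recursion may help. It assumes Gaussian noise and zero start-up values; the function name `simulate_ar` and its parameters are hypothetical, chosen only for this example.

```python
import numpy as np

def simulate_ar(phi, c=0.0, n=500, sigma=1.0, seed=0):
    """Simulate n values of an AR(p) process with weights phi and intercept c.

    Assumptions of this sketch: Gaussian white noise with standard
    deviation sigma, and p zero-valued start-up observations.
    """
    rng = np.random.default_rng(seed)
    phi = np.asarray(phi, dtype=float)
    p = len(phi)
    x = np.zeros(n + p)                     # first p entries are start-up zeros
    eps = rng.normal(scale=sigma, size=n + p)
    for t in range(p, n + p):
        # x[t-p:t][::-1] is (x[t-1], ..., x[t-p]), matching (phi_1, ..., phi_p)
        x[t] = c + np.dot(phi, x[t - p:t][::-1]) + eps[t]
    return x[p:]

series = simulate_ar([0.5], c=1.0, n=20_000)
# For a stationary AR(1), the long-run mean is c / (1 - phi_1)
print(series.mean())
```

With the noise switched off (`sigma=0.0`) the recursion becomes deterministic, which is a convenient way to check the indexing by hand.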

The vector autoregressive (VAR) model generalizes the univariate AR model to the multivariate case; now each element of the vector x[t] of length k can be modeled as a linear function of all the elements of the past p vectors:

$$x[t] = c + \sum_{i=1}^{p} A_i \, x[t-i] + e[t]$$

Equation 5: The vector autoregressive model.

where c is a vector of k constants (the intercepts), Aᵢ are time-invariant k×k matrices and e={e[t] ; t∈ℤ} is a multivariate white noise process of k variables.
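The multivariate recursion can be sketched in the same spirit. The function `simulate_var` below is a hypothetical helper written for this example; it assumes independent Gaussian noise components with a common standard deviation and zero start-up vectors.

```python
import numpy as np

def simulate_var(A, c, n=500, sigma=1.0, seed=0):
    """Simulate n vectors of a VAR(p) process.

    A is a sequence of p coefficient matrices (each k x k), c a length-k
    intercept vector. Assumptions of this sketch: independent Gaussian
    noise components and p zero-valued start-up vectors.
    """
    A = [np.asarray(a, dtype=float) for a in A]
    c = np.asarray(c, dtype=float)
    k, p = len(c), len(A)
    rng = np.random.default_rng(seed)
    x = np.zeros((n + p, k))
    for t in range(p, n + p):
        # A[i] multiplies the vector observed i + 1 steps in the past
        past = sum(A[i] @ x[t - 1 - i] for i in range(p))
        x[t] = c + past + rng.normal(scale=sigma, size=k)
    return x[p:]

# Two variables, one lag: the first variable also depends on the second
A1 = [[0.5, 0.2],
      [0.0, 0.3]]
sample = simulate_var([A1], c=[0.0, 1.0], n=1_000)
print(sample.shape)
```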

The moving average (MA) model: A time series modeled using a moving average model, denoted with MA(q), is assumed to be generated as a linear function of the last q+1 random shocks generated by ε[t], a univariate white noise process:

$$x[t] = c + \varepsilon[t] + \sum_{i=1}^{q} \theta_i \, \varepsilon[t-i]$$

Equation 6: The moving average model.

where c is a constant and θᵢ are the weights given to the past shocks.

As with autoregressive models, a vector generalization, VMA, exists.
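As a sketch, an MA(q) series can be generated by convolving a white noise sequence with the shock weights. The name `simulate_ma`, the symbol `theta` for the weights and the Gaussian noise are assumptions of this example.

```python
import numpy as np

def simulate_ma(theta, c=0.0, n=500, sigma=1.0, seed=0):
    """Simulate n values of an MA(q) process with shock weights theta."""
    rng = np.random.default_rng(seed)
    q = len(theta)
    eps = rng.normal(scale=sigma, size=n + q)   # q extra shocks for start-up
    weights = np.concatenate(([1.0], theta))    # weight 1 on the current shock
    # 'valid' convolution yields sum_j weights[j] * eps[t - j] for each full window
    return c + np.convolve(eps, weights, mode="valid")

y = simulate_ma([0.5], n=50_000)
centered = y - y.mean()
# In an MA(1) process the autocovariance vanishes beyond lag 1
print(np.mean(centered[1:] * centered[:-1]), np.mean(centered[2:] * centered[:-2]))
```

The vanishing autocovariance beyond lag q reflects the fact that each value only shares shocks with its q nearest neighbors in time.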

The autoregressive moving average (ARMA) model: A time series modeled using an ARMA(p,q) model is assumed to be generated as a linear function of the last p values and the last q+1 random “shocks” generated by ε[t], a univariate white noise process:

$$x[t] = c + \varepsilon[t] + \sum_{i=1}^{p} \varphi_i \, x[t-i] + \sum_{i=1}^{q} \theta_i \, \varepsilon[t-i]$$

Equation 7: The ARMA model.

The ARMA model can be generalized in a variety of ways, for example to deal with non-linearity or with exogenous variables, to the multivariate case (VARMA) or to deal with (a specific type of) non-stationary data (ARIMA).
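A combined sketch shows how the two ingredients fit together, with pure AR and pure MA processes falling out as special cases when one of the weight lists is empty. The function `simulate_arma` and its parameter names are hypothetical; Gaussian noise and zero start-up values are assumed.

```python
import numpy as np

def simulate_arma(phi, theta, c=0.0, n=500, sigma=1.0, seed=0):
    """Simulate n values of an ARMA(p, q) process.

    phi holds the p autoregressive weights, theta the q shock weights.
    """
    rng = np.random.default_rng(seed)
    phi = np.asarray(phi, dtype=float)
    theta = np.asarray(theta, dtype=float)
    p, q = len(phi), len(theta)
    m = max(p, q)                               # number of start-up steps
    eps = rng.normal(scale=sigma, size=n + m)
    x = np.zeros(n + m)
    for t in range(m, n + m):
        ar_part = np.dot(phi, x[t - p:t][::-1]) if p else 0.0
        ma_part = np.dot(theta, eps[t - q:t][::-1]) if q else 0.0
        x[t] = c + ar_part + ma_part + eps[t]
    return x[m:]

arma = simulate_arma([0.5], [0.4], n=1_000)     # ARMA(1, 1)
pure_ma = simulate_arma([], [0.5], n=20_000)    # reduces to MA(1)
print(len(arma), pure_ma.var())
```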

Difference stationary processes

With a basic understanding of common stochastic process models, we can now discuss the related concept of difference stationary processes and unit roots.

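This concept relies on the assumption that the stochastic process in question can be written as an autoregressive process of order p, denoted as AR(p):

$$x[t] = \varphi_1 x[t-1] + \varphi_2 x[t-2] + \dots + \varphi_p x[t-p] + \varepsilon[t]$$

Equation 8: An autoregressive process of order p, or AR(p).

where ε[t] is usually an uncorrelated white noise process (for all times t).

We can write the same process as:

$$(1 - \varphi_1 L - \varphi_2 L^2 - \dots - \varphi_p L^p) \, x[t] = \varepsilon[t]$$

Equation 9: An AR(p) model written using lag operators.

Here L is the lag operator, defined by L x[t] = x[t-1].

The part inside the parentheses on the left defines the characteristic equation of the process.

We can consider the roots of this equation:

$$m^p - \varphi_1 m^{p-1} - \varphi_2 m^{p-2} - \dots - \varphi_p = 0$$

Equation 10: The characteristic equation of an AR(p) model.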
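The roots can be computed numerically. The sketch below assumes the characteristic equation is written as m^p − φ₁m^(p−1) − … − φ_p = 0, where φ₁, …, φ_p are the autoregressive weights, and uses `numpy.roots`; the helper names `characteristic_roots` and `has_unit_root`, as well as the tolerance, are illustrative choices.

```python
import numpy as np

def characteristic_roots(phi):
    """Roots m of m^p - phi_1 m^(p-1) - ... - phi_p = 0."""
    # numpy.roots expects coefficients ordered from the highest power down
    coeffs = np.concatenate(([1.0], -np.asarray(phi, dtype=float)))
    return np.roots(coeffs)

def has_unit_root(phi, tol=1e-6):
    """True if m = 1 is (numerically) a root of the characteristic equation."""
    return bool(np.any(np.abs(characteristic_roots(phi) - 1.0) < tol))

print(has_unit_root([0.5]))        # AR(1) with weight 0.5: single root at m = 0.5
print(has_unit_root([1.0]))        # random walk: single root at m = 1
print(has_unit_root([1.5, -0.5]))  # m^2 - 1.5 m + 0.5 = (m - 1)(m - 0.5)
```

Note that this only inspects known coefficients; deciding whether observed data was generated by a process with a unit root is a statistical testing problem that this sketch does not address.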

If m=1 is a root of the equation then the stochastic process is said to be a difference stationary process, or integrated.

This means that the process can be transformed into a weakly-stationary process by applying a certain type of transformation to it, called differencing.

Difference stationary processes have an order of integration, which is the number of times the differencing operator must be applied to it in order to achieve weak stationarity.

A process that has to be differenced r times is said to be integrated of order r, denoted by I(r).

This coincides exactly with the multiplicity of the root m=1; meaning, if m=1 is a root of multiplicity r of the characteristic equation, then the process is integrated of order r.
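To illustrate the order of integration, the sketch below builds a series integrated of order 2 by cumulatively summing Gaussian white noise twice, then differences it with `numpy.diff`. Comparing the variance of the two halves of a single realization is only a crude heuristic used here for illustration, not a formal test, and the helper name `half_variance_ratio` is hypothetical.

```python
import numpy as np

def half_variance_ratio(series):
    """Variance of the second half of a series divided by that of the first half."""
    half = len(series) // 2
    return series[half:].var() / series[:half].var()

rng = np.random.default_rng(0)
noise = rng.normal(size=5_000)
# Summing white noise twice gives a process integrated of order 2
x = np.cumsum(np.cumsum(noise))

for r in range(3):
    differenced = np.diff(x, n=r)   # n=0 returns the series unchanged
    ratio = half_variance_ratio(differenced)
    print(f"differenced {r} times: variance ratio = {ratio:.2f}")

# Differencing twice recovers the original white noise (minus two start-up values)
print(np.allclose(np.diff(x, n=2), noise[2:]))
```

Only after the second differencing should the ratio settle near 1, in line with the series being I(2).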

Unit root processes

A common sub-type of difference stationary processes are processes integrated of order 1, also called unit root processes.

The simplest example of such a process is the following autoregressive model:

$$x[t] = x[t-1] + \varepsilon[t]$$

Unit root processes, and difference stationary processes generally, are interesting because they are non-stationary processes that can be easily transformed into weakly stationary processes.
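A short simulation can make the contrast concrete. The sketch below compares an AR(1) process with weight 1 (a random walk) against one with weight 0.9 by estimating the variance across many realizations at two points in time; the helper `ar1_paths` and the chosen sizes are assumptions of this example.

```python
import numpy as np

def ar1_paths(phi, n_paths=5_000, n_steps=400, seed=0):
    """Simulate independent AR(1) realizations (one per row) with Gaussian noise."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(size=(n_paths, n_steps))
    x = np.zeros((n_paths, n_steps))
    x[:, 0] = eps[:, 0]
    for t in range(1, n_steps):
        x[:, t] = phi * x[:, t - 1] + eps[:, t]
    return x

random_walk = ar1_paths(1.0)   # unit root
stationary = ar1_paths(0.9)    # root at m = 0.9, no unit root

for name, paths in (("phi=1.0", random_walk), ("phi=0.9", stationary)):
    # Variance across realizations at an early and a late time index
    print(name, paths[:, 99].var(), paths[:, 399].var())

# First differences of the random walk are just the white noise shocks
steps = np.diff(random_walk, axis=1)
print(steps[:, 98].var(), steps[:, 398].var())
```

The variance of the random walk keeps growing with time, while the variance of its first differences stays flat, which is the sense in which a single differencing step restores weak stationarity.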

As a result, while the term is not used interchangeably with non-stationarity, the questions regarding them sometimes are.

I thought it worth mentioning here, as sometimes tests and procedures to check whether a process has a unit root are mistakenly thought of as procedures for testing non-stationarity (as a future post will touch upon).

It is thus important to remember that these are distinct notions, and that while every process with a unit root is non-stationary, and so is every process integrated of order r>1, the converse is far from true.
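One simple way to see this is a process that is non-stationary for a reason unrelated to unit roots. The sketch below uses white noise whose standard deviation grows linearly over time, a construction chosen here purely for illustration; differencing does not remove this kind of non-stationarity. The helper name `half_variance_ratio` is hypothetical, and comparing half-sample variances is a crude heuristic rather than a formal test.

```python
import numpy as np

def half_variance_ratio(series):
    """Variance of the second half of a series divided by that of the first half."""
    half = len(series) // 2
    return series[half:].var() / series[:half].var()

rng = np.random.default_rng(0)
n = 10_000
scale = np.linspace(1.0, 5.0, n)      # standard deviation grows over time
x = scale * rng.normal(size=n)        # independent shocks with changing variance

print(half_variance_ratio(x))            # well above 1: variance is not constant
print(half_variance_ratio(np.diff(x)))   # still well above 1 after differencing
```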

The following typology figure, partial as it may be, can help understand the relations between the different types of stochastic process models and the stochastic model characteristics defined in this post:

Figure 6: Types of stochastic processes

If you are interested in the concept of stationarity, or have stumbled into the topic while working with time series data, then I hope you have found this post a good introduction to the subject.

Some references and useful links are found below.

As I have mentioned, future posts will aim to provide similarly concise overviews of methods for detecting and transforming non-stationary time series data.

Also, please feel free to get in touch with me regarding any comments and thoughts on the post or the topic.

References

Academic Literature

[Boshnakov, 2011] Boshnakov, G., 2011. On First and Second Order Stationarity of Random Coefficient Models. Linear Algebra Appl. 434, 415–423.

[Cox & Miller, 1965] Cox, D. R. and Miller, H. D., 1965. The Theory of Stochastic Processes. Methuen, London, 398 p.

[Dyrhovden, 2016] Dyrhovden, Sigve Brix, 2016. Stochastic unit-root processes. The University of Bergen.

[Fischer et al. 1996] Fischer, M., Scholten, H. J. and Unwin, D. (eds), 1996. Spatial analytical perspectives on GIS. Bristol, PA: Taylor & Francis. GISDATA; 4.

[Myers, 1989] Myers, D. E., 1989. To be or not to be... stationary? That is the question. Math. Geol. 21, 347–362.

[Nason, 2006] Nason, G. P., 2006. Stationary and non-stationary time series. In H. Mader & S. C. Coles (eds), Statistics in Volcanology. The Geological Society, pp. 129–142.

Online References

A Gentle Introduction to Handling a Non-Stationary Time Series in Python at Analytics Vidhya

Unit Root at Wikipedia

Lesson 4: Stationary stochastic processes from Umberto Triacca’s course on stochastic processes

Roots of characteristic equation reciprocal to roots of its inverse

Stochastic Process Characteristics

Trend-Stationary vs. Difference-Stationary Processes

Footnotes

1. The phrasing here is erroneous, since, as we will soon see, time series cannot be stationary themselves; rather, only the processes generating them can. I have used it, however, so as not to assume any knowledge for the opening paragraphs.

2. The common synonym of weak-sense stationarity as second order stationarity is probably related to (but should not be confused with) second order stochastic process, which is defined as a stochastic process that has a finite second moment (i.e. variance).