Let’s move ahead to understand and explore this data further.

Exploratory Data Analysis on Stock Pricing Data

With the data in our hands, the first thing we should do is understand what it represents and what kind of information it encapsulates.

Printing the DataFrame’s info with msft_data.info() shows everything it contains.

As the output shows, the DataFrame has a DatetimeIndex, which means we’re dealing with time-series data.

An index can be thought of as a data structure which helps to modify or reference the data.

Time-series data is a sequence of snapshots of prices taken at consecutive, equally spaced intervals of time.

In trading, EOD (end-of-day) stock pricing data captures the movement of certain parameters of a stock, such as the stock price, over a specified period of time, with data points recorded at regular intervals.
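To make the idea of a DatetimeIndex concrete, here is a minimal sketch that builds a tiny time-series DataFrame by hand. The prices and dates are made up for illustration and are not real MSFT data; only the column name Adj_Close is borrowed from the tutorial.

```python
import pandas as pd

# Hypothetical prices for illustration only; not real market data.
dates = pd.bdate_range("2021-01-04", periods=5)  # five consecutive business days
sample = pd.DataFrame(
    {"Adj_Close": [100.0, 101.5, 99.8, 102.2, 103.0]},
    index=dates,
)

# The index is a DatetimeIndex, so rows can be looked up by date.
print(type(sample.index).__name__)
print(sample.loc[pd.Timestamp("2021-01-06"), "Adj_Close"])
```

Because the rows are labelled by date rather than by position, date-based operations such as resampling and rolling windows become available.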

Important Terminology

Looking at the other columns, let’s try to understand what each column represents:

Open/Close: Captures the opening/closing price of the stock.

Adj_Open/Adj_Close: An adjusted opening/closing price is a stock’s price on any given day of trading that has been revised to include any dividend distributions, stock splits, and other corporate actions that occurred at any time before the next day’s open.

Volume: Records the number of shares traded on any given day of trading.

High/Low: Tracks the highest and the lowest price of the stock during a particular day of trading.

These are the important columns that we will focus on at this point in time.

We can learn about the summary statistics of the data, which show us the number of rows, mean, max, standard deviation, and so on.

Try running the following line of code in an IPython cell:

    msft_data.describe()

resample()

Pandas’ resample() method provides control and flexibility over the frequency conversion of time-series data.

We can specify the time intervals to resample the data to monthly, quarterly, or yearly, and perform the required operation over it.

    # 'M' is the month-end frequency (spelled 'ME' in pandas 2.2 and later)
    msft_data.resample('M').mean()

This is an interesting way to analyze the stock performance in different timeframes.
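To see what monthly resampling actually computes, here is a minimal sketch on a few made-up prices. The sample values are hypothetical, and the sketch uses the 'MS' (month-start) alias instead of 'M' only because 'MS' is spelled the same across pandas versions; rows are still grouped by calendar month, but each group is labelled with the first day of the month.

```python
import pandas as pd

# Hypothetical daily prices spanning two calendar months.
idx = pd.to_datetime(["2021-01-28", "2021-01-29", "2021-02-01", "2021-02-02"])
prices = pd.DataFrame({"Adj_Close": [10.0, 12.0, 20.0, 30.0]}, index=idx)

# Group the rows by calendar month and average each group.
monthly_mean = prices.resample("MS").mean()
print(monthly_mean)
```

The two January rows collapse into one averaged row and the two February rows into another, which is the same reduction the tutorial applies to the full DataFrame.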

Calculating returns

A financial return is simply the money made or lost on an investment.

A return can be expressed nominally as the change in the amount of an investment over time.

A return can be calculated as the percentage derived from the ratio of profit to investment.

Pandas provides the pct_change() method for this purpose.

Here is how you can calculate returns:

    # Import numpy package
    import numpy as np

    # assign `Adj_Close` to `daily_close`
    daily_close = msft_data[['Adj_Close']]

    # returns as fractional change
    daily_return = daily_close.pct_change()

    # replacing NA values with 0
    daily_return.fillna(0, inplace=True)

    print(daily_return)

This will print the returns that the stock has been generating on a daily basis.

Multiplying the number by 100 will give you the percentage change.
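As a small worked illustration, the sketch below runs pct_change() on three made-up prices and compares it with the same quantity computed by hand: today’s price minus yesterday’s price, divided by yesterday’s price. The price values are hypothetical.

```python
import pandas as pd

# Hypothetical closing prices for three consecutive days.
prices = pd.Series([100.0, 110.0, 99.0])

# Fractional change from the previous row; the first row has no predecessor.
returns = prices.pct_change()

# The same calculation written out by hand with shift().
manual = (prices - prices.shift(1)) / prices.shift(1)

# Percentage view: replace the leading NaN with 0 and multiply by 100.
print(returns.fillna(0) * 100)
```

The first entry is NaN because there is no earlier price to compare against, which is why the tutorial fills missing values with 0.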

The formula used in pct_change() is:

Return = (Price at t - Price at t-1) / (Price at t-1)

Now, to calculate monthly returns, all you need to do is:

    # take the last available row in each month, then compute the change
    mdata = msft_data.resample('M').apply(lambda x: x.iloc[-1])
    monthly_return = mdata.pct_change()

After resampling the data by month, we can get the last day of trading in each month using the apply() function.

apply() takes in a function and applies it to each resampled group, which here is each month’s block of rows.
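The monthly calculation can be sketched on a handful of made-up prices. The values are hypothetical, and the 'MS' (month-start) alias is an assumption chosen because it is spelled the same across pandas versions; it groups by calendar month just as 'M' does but labels each group with the month’s first day.

```python
import pandas as pd

# Hypothetical prices: two trading days in each of three months.
idx = pd.to_datetime([
    "2021-01-28", "2021-01-29",
    "2021-02-25", "2021-02-26",
    "2021-03-30", "2021-03-31",
])
prices = pd.Series([98.0, 100.0, 105.0, 110.0, 120.0, 99.0], index=idx, name="Adj_Close")

# Keep the last observation of each month, then compute month-over-month change.
month_end = prices.resample("MS").apply(lambda x: x.iloc[-1])
monthly_return = month_end.pct_change()
print(monthly_return)
```

Using x.iloc[-1] selects the last row by position, which avoids ambiguity between positions and date labels.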

A lambda function is an anonymous function in Python: it is defined without a name and its body is a single expression, in the following format:

lambda arguments: expression

For example, lambda x: x * 2 is a lambda function.

Here, x is the argument and x * 2 is the expression that gets evaluated and returned.
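A short sketch of how a lambda compares with a regular function; the names double and double_named are made up for this illustration.

```python
# A lambda bound to a name so it can be called more than once.
double = lambda x: x * 2

# The equivalent function written with def.
def double_named(x):
    return x * 2

# Lambdas are most often passed inline to another function.
doubled = list(map(lambda x: x * 2, [1, 2, 3]))

print(double(21), double_named(21), doubled)
```

In the monthly-returns code, the lambda plays the same role as the inline function passed to map() here: it is handed to apply() and called once per group.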

Moving Averages in Trading

The concept of moving averages is going to build the base for our momentum-based trading strategy.

In finance, analysts often have to evaluate statistical metrics continually over a sliding window of time; these are called moving-window calculations.

Let’s see how we can calculate the rolling mean over a window of 50 days, and slide the window by 1 day.

rolling()

This is the function that does the trick for us:

    # assigning adjusted closing prices to adj_price
    adj_price = msft_data['Adj_Close']

    # calculate the moving average
    mav = adj_price.rolling(window=50).mean()

    # print the result
    print(mav[-10:])

You’ll see the rolling mean over a window of 50 days (approx. 2 months).

Moving averages help smooth out fluctuations and spikes in the data, giving you a smoother curve for the performance of the company.
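A minimal sketch of the sliding window on five made-up values, using a window of 3 instead of 50 so the arithmetic is easy to follow by hand:

```python
import pandas as pd

# Hypothetical prices; a window of 3 keeps the example small.
prices = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# Each value is the mean of the current row and the two rows before it.
mav3 = prices.rolling(window=3).mean()
print(mav3)
```

The first two entries are NaN because a full window of three values is not yet available; the later strategy code passes min_periods=1 to rolling() so that partial windows are averaged instead.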

We can plot and see the difference:

    # import the matplotlib package to see the plot
    import matplotlib.pyplot as plt

    adj_price.plot()

You can now plot the rolling mean:

    mav.plot()

And you can see the difference for yourself: the spikes in the data are smoothed out to give a general sense of the stock’s performance.

Formulating a Trading Strategy

Here comes the final and most interesting part: designing and building the trading strategy.

This will be a step-by-step guide to developing a momentum-based Simple Moving Average Crossover (SMAC) strategy.

Momentum-based strategies are based on a technical indicator which capitalizes on the continuance of the market trend.

We purchase securities that show an upwards trend and short-sell securities which show a downward trend.

The SMAC strategy is a well-known schematic momentum strategy.

It is a long-only strategy.

Momentum, here, is the total return of stock including the dividends over the last n months.

This period of n months is called the lookback period.

There are 3 main types of lookback periods: short term, intermediate term, and long term.

We need to define 2 different lookback periods of a particular time series.

A buy signal is generated when the shorter lookback rolling mean (or moving average) crosses above the longer lookback moving average.

A sell signal occurs when the shorter lookback moving average dips below the longer moving average.

Now, let’s see how the code for this strategy will look:

    # step1: initialize the short and long lookback periods
    short_lb = 50
    long_lb = 120

    # step2: initialize a new DataFrame called signal_df with the signal column
    signal_df = pd.DataFrame(index=msft_data.index)
    signal_df['signal'] = 0.0

    # step3: create a short simple moving average over the short lookback period
    signal_df['short_mav'] = msft_data['Adj_Close'].rolling(window=short_lb, min_periods=1, center=False).mean()

    # step4: create a long simple moving average over the long lookback period
    signal_df['long_mav'] = msft_data['Adj_Close'].rolling(window=long_lb, min_periods=1, center=False).mean()

    # step5: generate the signals based on the conditional statement
    # (.loc is used because chained assignment may not update signal_df)
    signal_df.loc[signal_df.index[short_lb:], 'signal'] = np.where(
        signal_df['short_mav'].iloc[short_lb:] > signal_df['long_mav'].iloc[short_lb:], 1.0, 0.0)

    # step6: create the trading orders based on the positions column
    signal_df['positions'] = signal_df['signal'].diff()

    # show the rows where a sell signal was generated
    signal_df[signal_df['positions'] == -1.0]

Let’s see what’s happening here.

We have defined 2 lookback periods: short_lb, 50 days for the short moving average, and long_lb, 120 days for the long moving average.

We have created a new DataFrame designed to capture the signals generated whenever the short moving average crosses the long moving average, using np.where to assign 1.0 when the condition is true and 0.0 when it is false.

The positions column in the DataFrame tells us whether there is a buy signal (1.0), a sell signal (-1.0), or no change (0.0).

We're basically calculating the difference in the signals column from the previous row using diff.
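To see how the signal and its difference behave, here is a minimal sketch of the same crossover logic on eight made-up prices. The prices are hypothetical, and the windows of 2 and 4 are assumptions chosen only to keep the example short; they are not the tutorial’s 50 and 120.

```python
import numpy as np
import pandas as pd

# Hypothetical prices that dip, rally, then fall again.
prices = pd.Series([10.0, 9.0, 8.0, 12.0, 14.0, 13.0, 9.0, 7.0])

# Short and long simple moving averages (tiny windows for illustration).
short_mav = prices.rolling(window=2, min_periods=1).mean()
long_mav = prices.rolling(window=4, min_periods=1).mean()

# 1.0 while the short average is above the long average, else 0.0.
signal = pd.Series(np.where(short_mav > long_mav, 1.0, 0.0), index=prices.index)

# diff() is +1.0 where the signal switches on and -1.0 where it switches off.
positions = signal.diff()
buys = positions[positions == 1.0].index.tolist()
sells = positions[positions == -1.0].index.tolist()
print(buys, sells)
```

The signal column describes a state (in or out of the market), while the positions column marks only the rows where that state changes, which is what the plot markers are built from.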

And there we have our strategy implemented in just 6 steps using Pandas.

Easy, isn’t it?

Now, let’s try to visualize this using Matplotlib.

All we need to do is initialize a plot figure, add the adjusted closing prices and the short and long moving averages to the plot, and then plot the buy and sell signals using the positions column in the signal_df above:

    # initialize the plot using plt
    fig = plt.figure()

    # Add a subplot and label for y-axis
    plt1 = fig.add_subplot(111, ylabel='Price in $')

    msft_data['Adj_Close'].plot(ax=plt1, color='r', lw=2.)

    # plot the short and long lookback moving averages
    signal_df[['short_mav', 'long_mav']].plot(ax=plt1, lw=2., figsize=(12,8))

    # plotting the sell signals
    plt1.plot(signal_df.loc[signal_df.positions == -1.0].index,
              signal_df.short_mav[signal_df.positions == -1.0],
              'v', markersize=10, color='k')

    # plotting the buy signals
    plt1.plot(signal_df.loc[signal_df.positions == 1.0].index,
              signal_df.short_mav[signal_df.positions == 1.0],
              '^', markersize=10, color='m')

    # Show the plot
    plt.show()

Running the above cell in the Jupyter notebook would yield a plot like the one below:

Now, you can clearly see that whenever the blue line (short moving average) rises above the orange line (long moving average), there is a pink upward marker indicating a buy signal.

A sell signal is denoted by a black downward marker where short_mav falls below long_mav.

Visualize the Performance of the Strategy on Quantopian

Quantopian is a Zipline-powered platform with many use cases.

You can write your own algorithms, access free data, backtest your strategy, contribute to the community, and collaborate with Quantopian if you need capital.

We have written an algorithm to backtest our SMA strategy, and here are the results:

Here is an explanation of the above metrics:

Total return: The total percentage return of the portfolio from the start to the end of the backtest.

Specific return: The difference between the portfolio’s total returns and common returns.

Common return: Returns that are attributable to common risk factors. There are 11 sector and 5 style risk factors that make up these returns. The Sector Exposure and Style Exposure charts in the Risk section provide more detail on these factors.

Sharpe: The 6-month rolling Sharpe ratio, a measure of risk-adjusted return. It is calculated by dividing the portfolio’s excess return over the risk-free rate by the standard deviation of the portfolio’s returns.

Max Drawdown: The largest of all the peak-to-trough drops in the portfolio’s history.

Volatility: Standard deviation of the portfolio’s returns.
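Several of these metrics can be approximated directly in pandas. The sketch below is a simplified illustration on made-up portfolio values, not Quantopian’s implementation: it uses per-period (not annualized or rolling) figures, and the risk-free rate of 0 is an assumption.

```python
import pandas as pd

# Hypothetical portfolio values over five periods.
equity = pd.Series([100.0, 110.0, 99.0, 108.9, 120.0])
returns = equity.pct_change().dropna()

# Total return from the first to the last value.
total_return = equity.iloc[-1] / equity.iloc[0] - 1

# Volatility as the sample standard deviation of per-period returns.
volatility = returns.std()

# Sharpe ratio: mean excess return divided by the standard deviation.
risk_free = 0.0  # assumed per-period risk-free rate
sharpe = (returns.mean() - risk_free) / returns.std()

# Max drawdown: the deepest fall from any running peak.
drawdown = equity / equity.cummax() - 1
max_drawdown = drawdown.min()

print(total_return, volatility, sharpe, max_drawdown)
```

In this toy series the portfolio peaks at 110 and falls to 99 before recovering, so the maximum drawdown is the 10% drop between those two points.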

Pat yourself on the back, as you have successfully implemented your quantitative trading strategy!

Where to go From Here?

Now that your algorithm is ready, you’ll need to backtest the results and assess the metrics mapping the risk involved in the strategy and the stock.

Again, you can use Quantopian to learn more about backtesting and trading strategies.
