5  Moving Averages in Time-series Data

We can leverage the concept of shift-based methods to calculate our own trends and moving averages in time-series data.

5.1 Simple vs Exponential Weighted Moving Average

We will implement two different kinds of moving average:

Let’s consider the following time series dataset of stock prices:

from pandas import read_csv

request_url = "https://raw.githubusercontent.com/prof-rossetti/python-for-finance/main/docs/data/daily_adjusted_NFLX.csv"
prices_df = read_csv(request_url)
prices_df.head()
timestamp open high low close adjusted_close volume dividend_amount split_coefficient
0 2024-06-27 679.35 688.570 678.14 684.34 684.34 2404849 0.0 1.0
1 2024-06-26 672.41 682.960 670.02 677.69 677.69 3545815 0.0 1.0
2 2024-06-25 673.44 673.665 662.30 672.41 672.41 2278768 0.0 1.0
3 2024-06-24 685.00 685.500 664.50 669.02 669.02 3916335 0.0 1.0
4 2024-06-21 683.26 686.900 678.10 686.12 686.12 4557086 0.0 1.0

Because we’ll be using shift-based methods, we must first sort the data, as usual, by date in ascending order:

prices_df.sort_values(by=["timestamp"], inplace=True) #
prices_df.head()
timestamp open high low close adjusted_close volume dividend_amount split_coefficient
99 2024-02-05 562.47 575.7500 557.195 562.06 562.06 4143099 0.0 1.0
98 2024-02-06 564.12 566.3399 554.370 555.88 555.88 2840300 0.0 1.0
97 2024-02-07 558.16 567.8000 554.980 559.30 559.30 4373591 0.0 1.0
96 2024-02-08 560.55 563.7000 555.740 558.53 558.53 3175468 0.0 1.0
95 2024-02-09 565.00 566.0000 558.100 561.32 561.32 3055698 0.0 1.0
prices_df["ma_50"] = prices_df["adjusted_close"].rolling(window=50).mean()

prices_df["ema_50"] = prices_df["adjusted_close"].ewm(span=50, min_periods=0, adjust=False, ignore_na=False).mean()

prices_df[["timestamp", "adjusted_close", "ma_50", "ema_50"]]
timestamp adjusted_close ma_50 ema_50
99 2024-02-05 562.06 NaN 562.060000
98 2024-02-06 555.88 NaN 561.817647
97 2024-02-07 559.30 NaN 561.718916
96 2024-02-08 558.53 NaN 561.593860
95 2024-02-09 561.32 NaN 561.583121
... ... ... ... ...
4 2024-06-21 686.12 619.8296 632.294057
3 2024-06-24 669.02 620.6344 633.734290
2 2024-06-25 672.41 621.6260 635.250985
1 2024-06-26 677.69 623.0368 636.915260
0 2024-06-27 684.34 624.3732 638.775054

100 rows × 4 columns

import plotly.express as px

px.line(prices_df, x="timestamp", y=["close", "ma_50", "ema_50"],
        title=f"Adjusted Closing Prices",
        color_discrete_map={
            "close": "royalblue",
            "ma_50": "orange",
            "ema_50":"yellow"
        }
)

You’ll notice there are no values for the first N number of periods in our rolling window average (where N is the size of the window). This is because there aren’t enough values to complete the average. It’s OK!

If you would like to change this behavior to be less methodologically strict, we can apply the min_periods parameter, setting minimum number of periods to zero, in which case as many values will be used until we get to fifty, at which point the true rolling average takes over:

# SETTING MIN PERIODS = ZERO:
prices_df["ma_50_min_0"] = prices_df["adjusted_close"].rolling(window=50, min_periods=0).mean()

px.line(prices_df, x="timestamp",
        y=["close", "ma_50_min_0", "ma_50", "ema_50"],
        title=f"Adjusted Closing Prices",
        color_discrete_map={
            "close": "royalblue",
            "ma_50_min_0": "pink",
            "ma_50": "orange",
            "ema_50":"yellow"
        }
)

5.2 Golden Cross

Reference: https://www.investopedia.com/terms/g/goldencross.asp

“A Golden Cross is a chart pattern in which a relatively short-term moving average crosses above a long-term moving average. It is a bullish breakout pattern that forms when a security’s short-term moving average (such as the 50-day moving average) crosses above its long-term moving average (such as the 200-day moving average) or resistance level.”

For this “golden cross” analysis we will use dataset of Bitcoin prices:

from pandas import read_csv, to_datetime

def fetch_crypto_data():
    request_url = (
        "https://raw.githubusercontent.com/"
        "prof-rossetti/applied-data-science-python-book/"
        "main/docs/data/currency-daily-BTC-USD-20141123.csv"
    )
    df = read_csv(request_url)
    df.index = to_datetime(df["timestamp"])
    df.drop(columns=["timestamp"], inplace=True)
    return df

df = fetch_crypto_data()
df.head()
open high low close volume
timestamp
2024-11-23 99024.03 99044.11 98450.00 98472.47 1067.160666
2024-11-22 98439.10 99860.00 97209.52 99024.03 29936.622150
2024-11-21 94384.15 99073.53 94127.20 98439.09 31188.093948
2024-11-20 92445.04 94989.99 91600.00 94393.15 20329.039987
2024-11-19 90521.32 94105.84 90403.24 92435.89 18267.161112
Data Source

Recent historical prices for Bitcoin, from the AlphaVantage API, accessed on November 23, 2024.

Source: https://www.alphavantage.co/documentation/#currency-daily

Calculating the short-term and long-term moving averages, using 50 and 200 days respectively:

def add_moving_avg(df, window=50):
    df = df.sort_index() # assumes index is datetime-aware
    df[f"MA_{window}"] = df["close"].rolling(window=window).mean()
    return df

df = add_moving_avg(df, window=50)
df = add_moving_avg(df, window=200)

df = df.dropna()
df = df.round(2)
df.head()
open high low close volume MA_50 MA_200
timestamp
2024-06-26 61794.47 62470.00 60656.80 60816.68 11392.80 66417.97 57733.65
2024-06-27 60818.86 62346.16 60546.94 61615.39 10530.36 66426.89 57822.75
2024-06-28 61611.43 62170.62 59868.00 60313.35 11381.03 66371.68 57918.17
2024-06-29 60312.36 61122.66 60273.80 60885.67 3199.04 66373.64 58015.21
2024-06-30 60884.44 62942.67 60611.01 62668.26 4263.23 66410.72 58114.13

Identifying “golden cross” points where the 50-day average surpasses the 200-day average:

has_crossed = (df['MA_50'] > df['MA_200'])
not_previously_crossed = (df['MA_50'].shift(1) <= df['MA_200'].shift(1))
df['Golden_Cross'] = (has_crossed & not_previously_crossed)

golden_cross_points = df[df['Golden_Cross'] == True]
golden_cross_points[["close", "MA_50", "MA_200"]].head()
close MA_50 MA_200
timestamp
2024-10-28 69924.5 63549.13 63260.62

Visualizing the golden cross points:

title = "Bitcoin Prices with Moving Average and Golden Cross"
colors_map = {"close": "steelblue", "MA_50": "yellow", "MA_200": "orange"}

fig = px.line(df, y=["close", "MA_50", "MA_200"], height=450, title=title,
    color_discrete_map=colors_map
)

fig.add_scatter(
    x=golden_cross_points.index,
    y=golden_cross_points['MA_50'],
    mode='markers',
    marker=dict(size=12, color='yellow', symbol='x'),
    name='Golden Cross'
)
fig.show()

5.3 Bollinger Bands

Reference: https://www.investopedia.com/terms/b/bollingerbands.asp

“Bollinger Bands is a technical analysis tool developed by John Bollinger in the 1980s to help investors and traders gauge market volatility and identify when securities are poised to rise or fall.”

“The three lines that make up Bollinger Bands are based on a security’s price moves. The center line is the intermediate-term trend and is typically a 20-day SMA of the closing prices. The upper and lower bands are plotted a distance from the SMA set by a certain number of standard deviations, usually two, above and below the center line.”

Loading the data and sorting time series by date in ascending order, as necessary:

df = fetch_crypto_data()
df = df.sort_index() # assumes index is datetime-aware

Calculating moving average, moving standard deviation, and upper and lower bands:

window = 50
ma_col = f"MA_{window}"
std_col = f"STD_{window}"
df[ma_col] = df["close"].rolling(window=window).mean()
df[std_col] = df["close"].rolling(window=window).std()

band_std = 2
lower_col = f"LOWER_{window}"
upper_col = f"UPPER_{window}"
df[upper_col] = df[ma_col] + (band_std * df[std_col])
df[lower_col] = df[ma_col] - (band_std * df[std_col])

df = df.dropna()
df = df.round(2)

df[["close", ma_col, lower_col, upper_col]].head()
close MA_50 LOWER_50 UPPER_50
timestamp
2024-01-28 42034.59 42830.64 39576.58 46084.70
2024-01-29 43306.18 42820.89 39575.70 46066.08
2024-01-30 42942.78 42855.16 39642.53 46067.78
2024-01-31 42548.08 42876.57 39687.24 46065.90
2024-02-01 43078.81 42880.44 39690.59 46070.28

Plotting the bands:

import plotly.express as px

title = "Bitcoin Price with Moving Average and Bollinger Bands"
colors_map = {"close": "steelblue", ma_col: "orange",
                upper_col: "yellow", lower_col: "yellow",
}
fig = px.line(df, y=["close", ma_col, upper_col, lower_col],
                title=title, color_discrete_map=colors_map,
                labels={"value": "Price (USD)", "timestamp": "Date"},
)
fig.update_layout(legend_title_text="Legend")
fig.show()

Plotting bands (alternative fill variant):

from pandas import concat, Series

# fill regions:
fig.add_traces([
    dict(
        x=concat([Series(df.index), Series(df.index)[::-1]]),
        y=concat([df[upper_col], df[lower_col][::-1]]),
        fill="toself",
        #fillcolor="rgba(128, 0, 128, 0.2)",  # Light purple fill
        fillcolor="rgba(255, 255, 102, 0.2)",  # Light yellow fill
        line=dict(color="rgba(255,255,255,0)"),
        hoverinfo="skip",
        showlegend=False
    )
])
fig.show()

How can we interpret the chart? Are the bands getting narrower or wider?