5 Moving Averages in Time-series Data

We can leverage the concept of shift-based methods to calculate our own trends and moving averages in time-series data.

5.1 Simple vs Exponential Weighted Moving Average

We will implement two different kinds of moving average:

Rolling Window Averages, using the rolling method
Exponential Weighted Moving Averages, using the ewm method

Let’s consider the following time series dataset of stock prices:

from pandas import read_csv

request_url = "https://raw.githubusercontent.com/prof-rossetti/python-for-finance/main/docs/data/daily_adjusted_NFLX.csv"
prices_df = read_csv(request_url)
prices_df.head()

	timestamp	open	high	low	close	adjusted_close	volume	split_coefficient
0	2024-06-27	679.35	688.570	678.14	684.34	684.34	2404849	1.0
1	2024-06-26	672.41	682.960	670.02	677.69	677.69	3545815	1.0
2	2024-06-25	673.44	673.665	662.30	672.41	672.41	2278768	1.0
3	2024-06-24	685.00	685.500	664.50	669.02	669.02	3916335	1.0
4	2024-06-21	683.26	686.900	678.10	686.12	686.12	4557086	1.0

Because we’ll be using shift-based methods, we must first sort the data, as usual, by date in ascending order:

prices_df.sort_values(by=["timestamp"], inplace=True) #
prices_df.head()

	timestamp	open	high	low	close	adjusted_close	volume	split_coefficient
99	2024-02-05	562.47	575.7500	557.195	562.06	562.06	4143099	1.0
98	2024-02-06	564.12	566.3399	554.370	555.88	555.88	2840300	1.0
97	2024-02-07	558.16	567.8000	554.980	559.30	559.30	4373591	1.0
96	2024-02-08	560.55	563.7000	555.740	558.53	558.53	3175468	1.0
95	2024-02-09	565.00	566.0000	558.100	561.32	561.32	3055698	1.0

prices_df["ma_50"] = prices_df["adjusted_close"].rolling(window=50).mean()

prices_df["ema_50"] = prices_df["adjusted_close"].ewm(span=50, min_periods=0, adjust=False, ignore_na=False).mean()

prices_df[["timestamp", "adjusted_close", "ma_50", "ema_50"]]

	timestamp	adjusted_close	ma_50	ema_50
99	2024-02-05	562.06	NaN	562.060000
98	2024-02-06	555.88	NaN	561.817647
97	2024-02-07	559.30	NaN	561.718916
96	2024-02-08	558.53	NaN	561.593860
95	2024-02-09	561.32	NaN	561.583121
...	...	...	...	...
4	2024-06-21	686.12	619.8296	632.294057
3	2024-06-24	669.02	620.6344	633.734290
2	2024-06-25	672.41	621.6260	635.250985
1	2024-06-26	677.69	623.0368	636.915260
0	2024-06-27	684.34	624.3732	638.775054

100 rows × 4 columns

import plotly.express as px

px.line(prices_df, x="timestamp", y=["close", "ma_50", "ema_50"],
        title=f"Adjusted Closing Prices",
        color_discrete_map={
            "close": "royalblue",
            "ma_50": "orange",
            "ema_50":"yellow"
        }
)

You’ll notice there are no values for the first N number of periods in our rolling window average (where N is the size of the window). This is because there aren’t enough values to complete the average. It’s OK!

If you would like to change this behavior to be less methodologically strict, we can apply the min_periods parameter, setting minimum number of periods to zero, in which case as many values will be used until we get to fifty, at which point the true rolling average takes over:

# SETTING MIN PERIODS = ZERO:
prices_df["ma_50_min_0"] = prices_df["adjusted_close"].rolling(window=50, min_periods=0).mean()

px.line(prices_df, x="timestamp",
        y=["close", "ma_50_min_0", "ma_50", "ema_50"],
        title=f"Adjusted Closing Prices",
        color_discrete_map={
            "close": "royalblue",
            "ma_50_min_0": "pink",
            "ma_50": "orange",
            "ema_50":"yellow"
        }
)

5.2 Golden Cross

Reference: https://www.investopedia.com/terms/g/goldencross.asp

“A Golden Cross is a chart pattern in which a relatively short-term moving average crosses above a long-term moving average. It is a bullish breakout pattern that forms when a security’s short-term moving average (such as the 50-day moving average) crosses above its long-term moving average (such as the 200-day moving average) or resistance level.”

For this “golden cross” analysis we will use dataset of Bitcoin prices:

from pandas import read_csv, to_datetime

def fetch_crypto_data():
    request_url = (
        "https://raw.githubusercontent.com/"
        "prof-rossetti/applied-data-science-python-book/"
        "main/docs/data/currency-daily-BTC-USD-20141123.csv"
    )
    df = read_csv(request_url)
    df.index = to_datetime(df["timestamp"])
    df.drop(columns=["timestamp"], inplace=True)
    return df

df = fetch_crypto_data()
df.head()

	open	high	low	close	volume
timestamp
2024-11-23	99024.03	99044.11	98450.00	98472.47	1067.160666
2024-11-22	98439.10	99860.00	97209.52	99024.03	29936.622150
2024-11-21	94384.15	99073.53	94127.20	98439.09	31188.093948
2024-11-20	92445.04	94989.99	91600.00	94393.15	20329.039987
2024-11-19	90521.32	94105.84	90403.24	92435.89	18267.161112

Data Source

Recent historical prices for Bitcoin, from the AlphaVantage API, accessed on November 23, 2024.

Source: https://www.alphavantage.co/documentation/#currency-daily

Calculating the short-term and long-term moving averages, using 50 and 200 days respectively:

def add_moving_avg(df, window=50):
    df = df.sort_index() # assumes index is datetime-aware
    df[f"MA_{window}"] = df["close"].rolling(window=window).mean()
    return df

df = add_moving_avg(df, window=50)
df = add_moving_avg(df, window=200)

df = df.dropna()
df = df.round(2)
df.head()

	open	high	low	close	volume	MA_50	MA_200
timestamp
2024-06-26	61794.47	62470.00	60656.80	60816.68	11392.80	66417.97	57733.65
2024-06-27	60818.86	62346.16	60546.94	61615.39	10530.36	66426.89	57822.75
2024-06-28	61611.43	62170.62	59868.00	60313.35	11381.03	66371.68	57918.17
2024-06-29	60312.36	61122.66	60273.80	60885.67	3199.04	66373.64	58015.21
2024-06-30	60884.44	62942.67	60611.01	62668.26	4263.23	66410.72	58114.13

Identifying “golden cross” points where the 50-day average surpasses the 200-day average:

has_crossed = (df['MA_50'] > df['MA_200'])
not_previously_crossed = (df['MA_50'].shift(1) <= df['MA_200'].shift(1))
df['Golden_Cross'] = (has_crossed & not_previously_crossed)

golden_cross_points = df[df['Golden_Cross'] == True]
golden_cross_points[["close", "MA_50", "MA_200"]].head()

	close	MA_50	MA_200
timestamp
2024-10-28	69924.5	63549.13	63260.62

Visualizing the golden cross points:

title = "Bitcoin Prices with Moving Average and Golden Cross"
colors_map = {"close": "steelblue", "MA_50": "yellow", "MA_200": "orange"}

fig = px.line(df, y=["close", "MA_50", "MA_200"], height=450, title=title,
    color_discrete_map=colors_map
)

fig.add_scatter(
    x=golden_cross_points.index,
    y=golden_cross_points['MA_50'],
    mode='markers',
    marker=dict(size=12, color='yellow', symbol='x'),
    name='Golden Cross'
)
fig.show()

5.3 Bollinger Bands

Reference: https://www.investopedia.com/terms/b/bollingerbands.asp

“Bollinger Bands is a technical analysis tool developed by John Bollinger in the 1980s to help investors and traders gauge market volatility and identify when securities are poised to rise or fall.”

“The three lines that make up Bollinger Bands are based on a security’s price moves. The center line is the intermediate-term trend and is typically a 20-day SMA of the closing prices. The upper and lower bands are plotted a distance from the SMA set by a certain number of standard deviations, usually two, above and below the center line.”

Loading the data and sorting time series by date in ascending order, as necessary:

df = fetch_crypto_data()
df = df.sort_index() # assumes index is datetime-aware

Calculating moving average, moving standard deviation, and upper and lower bands:

window = 50
ma_col = f"MA_{window}"
std_col = f"STD_{window}"
df[ma_col] = df["close"].rolling(window=window).mean()
df[std_col] = df["close"].rolling(window=window).std()

band_std = 2
lower_col = f"LOWER_{window}"
upper_col = f"UPPER_{window}"
df[upper_col] = df[ma_col] + (band_std * df[std_col])
df[lower_col] = df[ma_col] - (band_std * df[std_col])

df = df.dropna()
df = df.round(2)

df[["close", ma_col, lower_col, upper_col]].head()

	close	MA_50	LOWER_50	UPPER_50
timestamp
2024-01-28	42034.59	42830.64	39576.58	46084.70
2024-01-29	43306.18	42820.89	39575.70	46066.08
2024-01-30	42942.78	42855.16	39642.53	46067.78
2024-01-31	42548.08	42876.57	39687.24	46065.90
2024-02-01	43078.81	42880.44	39690.59	46070.28

Plotting the bands:

import plotly.express as px

title = "Bitcoin Price with Moving Average and Bollinger Bands"
colors_map = {"close": "steelblue", ma_col: "orange",
                upper_col: "yellow", lower_col: "yellow",
}
fig = px.line(df, y=["close", ma_col, upper_col, lower_col],
                title=title, color_discrete_map=colors_map,
                labels={"value": "Price (USD)", "timestamp": "Date"},
)
fig.update_layout(legend_title_text="Legend")
fig.show()

Plotting bands (alternative fill variant):

from pandas import concat, Series

# fill regions:
fig.add_traces([
    dict(
        x=concat([Series(df.index), Series(df.index)[::-1]]),
        y=concat([df[upper_col], df[lower_col][::-1]]),
        fill="toself",
        #fillcolor="rgba(128, 0, 128, 0.2)",  # Light purple fill
        fillcolor="rgba(255, 255, 102, 0.2)",  # Light yellow fill
        line=dict(color="rgba(255,255,255,0)"),
        hoverinfo="skip",
        showlegend=False
    )
])
fig.show()

How can we interpret the chart? Are the bands getting narrower or wider?