We can leverage the concept of shift-based methods to calculate our own trends and moving averages in time-series data.
Simple vs Exponential Weighted Moving Average
We will implement two different kinds of moving average:
Let’s consider the following time series dataset of stock prices:
from pandas import read_csv
request_url = "https://raw.githubusercontent.com/prof-rossetti/python-for-finance/main/docs/data/daily_adjusted_NFLX.csv"
prices_df = read_csv(request_url)
prices_df.head()
0
2024-06-27
679.35
688.570
678.14
684.34
684.34
2404849
0.0
1.0
1
2024-06-26
672.41
682.960
670.02
677.69
677.69
3545815
0.0
1.0
2
2024-06-25
673.44
673.665
662.30
672.41
672.41
2278768
0.0
1.0
3
2024-06-24
685.00
685.500
664.50
669.02
669.02
3916335
0.0
1.0
4
2024-06-21
683.26
686.900
678.10
686.12
686.12
4557086
0.0
1.0
Because we’ll be using shift-based methods, we must first sort the data, as usual, by date in ascending order:
prices_df.sort_values(by= ["timestamp" ], inplace= True ) #
prices_df.head()
99
2024-02-05
562.47
575.7500
557.195
562.06
562.06
4143099
0.0
1.0
98
2024-02-06
564.12
566.3399
554.370
555.88
555.88
2840300
0.0
1.0
97
2024-02-07
558.16
567.8000
554.980
559.30
559.30
4373591
0.0
1.0
96
2024-02-08
560.55
563.7000
555.740
558.53
558.53
3175468
0.0
1.0
95
2024-02-09
565.00
566.0000
558.100
561.32
561.32
3055698
0.0
1.0
prices_df["ma_50" ] = prices_df["adjusted_close" ].rolling(window= 50 ).mean()
prices_df["ema_50" ] = prices_df["adjusted_close" ].ewm(span= 50 , min_periods= 0 , adjust= False , ignore_na= False ).mean()
prices_df[["timestamp" , "adjusted_close" , "ma_50" , "ema_50" ]]
99
2024-02-05
562.06
NaN
562.060000
98
2024-02-06
555.88
NaN
561.817647
97
2024-02-07
559.30
NaN
561.718916
96
2024-02-08
558.53
NaN
561.593860
95
2024-02-09
561.32
NaN
561.583121
...
...
...
...
...
4
2024-06-21
686.12
619.8296
632.294057
3
2024-06-24
669.02
620.6344
633.734290
2
2024-06-25
672.41
621.6260
635.250985
1
2024-06-26
677.69
623.0368
636.915260
0
2024-06-27
684.34
624.3732
638.775054
100 rows × 4 columns
import plotly.express as px
px.line(prices_df, x= "timestamp" , y= ["close" , "ma_50" , "ema_50" ],
title= f"Adjusted Closing Prices" ,
color_discrete_map= {
"close" : "royalblue" ,
"ma_50" : "orange" ,
"ema_50" :"yellow"
}
)
You’ll notice there are no values for the first N number of periods in our rolling window average (where N is the size of the window). This is because there aren’t enough values to complete the average. It’s OK!
If you would like to change this behavior to be less methodologically strict, we can apply the min_periods
parameter, setting minimum number of periods to zero, in which case as many values will be used until we get to fifty, at which point the true rolling average takes over:
# SETTING MIN PERIODS = ZERO:
prices_df["ma_50_min_0" ] = prices_df["adjusted_close" ].rolling(window= 50 , min_periods= 0 ).mean()
px.line(prices_df, x= "timestamp" ,
y= ["close" , "ma_50_min_0" , "ma_50" , "ema_50" ],
title= f"Adjusted Closing Prices" ,
color_discrete_map= {
"close" : "royalblue" ,
"ma_50_min_0" : "pink" ,
"ma_50" : "orange" ,
"ema_50" :"yellow"
}
)
Golden Cross
Reference: https://www.investopedia.com/terms/g/goldencross.asp
“A Golden Cross is a chart pattern in which a relatively short-term moving average crosses above a long-term moving average. It is a bullish breakout pattern that forms when a security’s short-term moving average (such as the 50-day moving average) crosses above its long-term moving average (such as the 200-day moving average) or resistance level.”
For this “golden cross” analysis we will use dataset of Bitcoin prices:
from pandas import read_csv, to_datetime
def fetch_crypto_data():
request_url = (
"https://raw.githubusercontent.com/"
"prof-rossetti/applied-data-science-python-book/"
"main/docs/data/currency-daily-BTC-USD-20141123.csv"
)
df = read_csv(request_url)
df.index = to_datetime(df["timestamp" ])
df.drop(columns= ["timestamp" ], inplace= True )
return df
df = fetch_crypto_data()
df.head()
timestamp
2024-11-23
99024.03
99044.11
98450.00
98472.47
1067.160666
2024-11-22
98439.10
99860.00
97209.52
99024.03
29936.622150
2024-11-21
94384.15
99073.53
94127.20
98439.09
31188.093948
2024-11-20
92445.04
94989.99
91600.00
94393.15
20329.039987
2024-11-19
90521.32
94105.84
90403.24
92435.89
18267.161112
Calculating the short-term and long-term moving averages, using 50 and 200 days respectively:
def add_moving_avg(df, window= 50 ):
df = df.sort_index() # assumes index is datetime-aware
df[f"MA_ { window} " ] = df["close" ].rolling(window= window).mean()
return df
df = add_moving_avg(df, window= 50 )
df = add_moving_avg(df, window= 200 )
df = df.dropna()
df = df.round (2 )
df.head()
timestamp
2024-06-26
61794.47
62470.00
60656.80
60816.68
11392.80
66417.97
57733.65
2024-06-27
60818.86
62346.16
60546.94
61615.39
10530.36
66426.89
57822.75
2024-06-28
61611.43
62170.62
59868.00
60313.35
11381.03
66371.68
57918.17
2024-06-29
60312.36
61122.66
60273.80
60885.67
3199.04
66373.64
58015.21
2024-06-30
60884.44
62942.67
60611.01
62668.26
4263.23
66410.72
58114.13
Identifying “golden cross” points where the 50-day average surpasses the 200-day average:
has_crossed = (df['MA_50' ] > df['MA_200' ])
not_previously_crossed = (df['MA_50' ].shift(1 ) <= df['MA_200' ].shift(1 ))
df['Golden_Cross' ] = (has_crossed & not_previously_crossed)
golden_cross_points = df[df['Golden_Cross' ] == True ]
golden_cross_points[["close" , "MA_50" , "MA_200" ]].head()
timestamp
2024-10-28
69924.5
63549.13
63260.62
Visualizing the golden cross points:
title = "Bitcoin Prices with Moving Average and Golden Cross"
colors_map = {"close" : "steelblue" , "MA_50" : "yellow" , "MA_200" : "orange" }
fig = px.line(df, y= ["close" , "MA_50" , "MA_200" ], height= 450 , title= title,
color_discrete_map= colors_map
)
fig.add_scatter(
x= golden_cross_points.index,
y= golden_cross_points['MA_50' ],
mode= 'markers' ,
marker= dict (size= 12 , color= 'yellow' , symbol= 'x' ),
name= 'Golden Cross'
)
fig.show()
Bollinger Bands
Reference: https://www.investopedia.com/terms/b/bollingerbands.asp
“Bollinger Bands is a technical analysis tool developed by John Bollinger in the 1980s to help investors and traders gauge market volatility and identify when securities are poised to rise or fall.”
“The three lines that make up Bollinger Bands are based on a security’s price moves. The center line is the intermediate-term trend and is typically a 20-day SMA of the closing prices. The upper and lower bands are plotted a distance from the SMA set by a certain number of standard deviations, usually two, above and below the center line.”
Loading the data and sorting time series by date in ascending order, as necessary:
df = fetch_crypto_data()
df = df.sort_index() # assumes index is datetime-aware
Calculating moving average, moving standard deviation, and upper and lower bands:
window = 50
ma_col = f"MA_ { window} "
std_col = f"STD_ { window} "
df[ma_col] = df["close" ].rolling(window= window).mean()
df[std_col] = df["close" ].rolling(window= window).std()
band_std = 2
lower_col = f"LOWER_ { window} "
upper_col = f"UPPER_ { window} "
df[upper_col] = df[ma_col] + (band_std * df[std_col])
df[lower_col] = df[ma_col] - (band_std * df[std_col])
df = df.dropna()
df = df.round (2 )
df[["close" , ma_col, lower_col, upper_col]].head()
timestamp
2024-01-28
42034.59
42830.64
39576.58
46084.70
2024-01-29
43306.18
42820.89
39575.70
46066.08
2024-01-30
42942.78
42855.16
39642.53
46067.78
2024-01-31
42548.08
42876.57
39687.24
46065.90
2024-02-01
43078.81
42880.44
39690.59
46070.28
Plotting the bands:
import plotly.express as px
title = "Bitcoin Price with Moving Average and Bollinger Bands"
colors_map = {"close" : "steelblue" , ma_col: "orange" ,
upper_col: "yellow" , lower_col: "yellow" ,
}
fig = px.line(df, y= ["close" , ma_col, upper_col, lower_col],
title= title, color_discrete_map= colors_map,
labels= {"value" : "Price (USD)" , "timestamp" : "Date" },
)
fig.update_layout(legend_title_text= "Legend" )
fig.show()
Plotting bands (alternative fill variant):
from pandas import concat, Series
# fill regions:
fig.add_traces([
dict (
x= concat([Series(df.index), Series(df.index)[::- 1 ]]),
y= concat([df[upper_col], df[lower_col][::- 1 ]]),
fill= "toself" ,
#fillcolor="rgba(128, 0, 128, 0.2)", # Light purple fill
fillcolor= "rgba(255, 255, 102, 0.2)" , # Light yellow fill
line= dict (color= "rgba(255,255,255,0)" ),
hoverinfo= "skip" ,
showlegend= False
)
])
fig.show()
How can we interpret the chart? Are the bands getting narrower or wider?