Moving Averages
We can leverage the concept of shift-based methods to calculate our own trends and moving averages.
We will implement two different kinds of moving average:
Let’s consider the following time series dataset of stock prices:
from pandas import read_csv
request_url = "https://raw.githubusercontent.com/prof-rossetti/python-for-finance/main/docs/data/daily_adjusted_NFLX.csv"
prices_df = read_csv(request_url)
prices_df.head()
0
2024-06-27
679.35
688.570
678.14
684.34
684.34
2404849
0.0
1.0
1
2024-06-26
672.41
682.960
670.02
677.69
677.69
3545815
0.0
1.0
2
2024-06-25
673.44
673.665
662.30
672.41
672.41
2278768
0.0
1.0
3
2024-06-24
685.00
685.500
664.50
669.02
669.02
3916335
0.0
1.0
4
2024-06-21
683.26
686.900
678.10
686.12
686.12
4557086
0.0
1.0
Because we’ll be using shift-based methods, we must first sort the data, as usual, by date in ascending order:
prices_df.sort_values(by= ["timestamp" ], inplace= True ) #
prices_df.head()
99
2024-02-05
562.47
575.7500
557.195
562.06
562.06
4143099
0.0
1.0
98
2024-02-06
564.12
566.3399
554.370
555.88
555.88
2840300
0.0
1.0
97
2024-02-07
558.16
567.8000
554.980
559.30
559.30
4373591
0.0
1.0
96
2024-02-08
560.55
563.7000
555.740
558.53
558.53
3175468
0.0
1.0
95
2024-02-09
565.00
566.0000
558.100
561.32
561.32
3055698
0.0
1.0
prices_df["ma_50" ] = prices_df["adjusted_close" ].rolling(window= 50 ).mean()
prices_df["ema_50" ] = prices_df["adjusted_close" ].ewm(span= 50 , min_periods= 0 , adjust= False , ignore_na= False ).mean()
prices_df[["timestamp" , "adjusted_close" , "ma_50" , "ema_50" ]]
99
2024-02-05
562.06
NaN
562.060000
98
2024-02-06
555.88
NaN
561.817647
97
2024-02-07
559.30
NaN
561.718916
96
2024-02-08
558.53
NaN
561.593860
95
2024-02-09
561.32
NaN
561.583121
...
...
...
...
...
4
2024-06-21
686.12
619.8296
632.294057
3
2024-06-24
669.02
620.6344
633.734290
2
2024-06-25
672.41
621.6260
635.250985
1
2024-06-26
677.69
623.0368
636.915260
0
2024-06-27
684.34
624.3732
638.775054
100 rows × 4 columns
import plotly.express as px
px.line(prices_df, x= "timestamp" , y= ["close" , "ma_50" , "ema_50" ],
title= f"Adjusted Closing Prices" ,
color_discrete_map= {
"close" : "royalblue" ,
"ma_50" : "orange" ,
"ema_50" :"yellow"
}
)
You’ll notice there are no values for the first N number of periods in our rolling window average (where N is the size of the window). This is because there aren’t enough values to complete the average. It’s OK!
If you would like to change this behavior to be less methodologically strict, we can apply the min_periods
parameter, setting minimum number of periods to zero, in which case as many values will be used until we get to fifty, at which point the true rolling average takes over:
# SETTING MIN PERIODS = ZERO:
prices_df["ma_50_min_0" ] = prices_df["adjusted_close" ].rolling(window= 50 , min_periods= 0 ).mean()
px.line(prices_df, x= "timestamp" ,
y= ["close" , "ma_50_min_0" , "ma_50" , "ema_50" ],
title= f"Adjusted Closing Prices" ,
color_discrete_map= {
"close" : "royalblue" ,
"ma_50_min_0" : "pink" ,
"ma_50" : "orange" ,
"ema_50" :"yellow"
}
)