Fama (1970) introduced efficient market hypothesis (EMH), stating the prices of securities fully reflect available information. Therefore, investors buying securities in an efficient market should expect to obtain an equilibrium rate of return. Later he introduced three different form of market efficiency: weak, semi-strong, and strong. Anomalies are empirical results that seem to be inconsistent with maintained theories of asset-pricing behavior. One class of market anomalies is seasonality and changes in market trend during different seasons. Seasonality in stock returns is a closely related to week-form of market efficiency. There are multiple seasonality effects (aka calendar effects):
Here, we attempt to explore monthly change in stock market returns. The following table demonstrates market return at each month for the past 300 plus years!(Source: SeekingAlpha)
Average Return % | 316 years | 82 years | 50 years | 14 years |
---|---|---|---|---|
1693-2009 | 1929-2011 | 1961-2011 | 1999-2012 | |
January | 0.69 | 1.0 | 1.2 | -1.6 |
February | 0.09 | 0.0 | 0.0 | -1.3 |
March | -0.03 | 0.4 | 1.1 | 2.7 |
April | 0.49 | 1.4 | 2 | 2.7 |
May | 0.02 | -0.2 | -0.1 | -0.8 |
June | -0.12 | 0.5 | -0.6 | -1.22 |
July | -0.31 | 1.5 | 0.9 | 0.34 |
August | 0.44 | 0.8 | 0.2 | 0.39 |
September | -0.49 | -1.3 | -0.8 | -1.28 |
October | -0.5 | 0.0 | 0.5 | 1.32 |
November | 0.35 | 0.8 | 1.2 | 1.2 |
December | 0.81 | 1.5 | 1.5 | 2.6 |
Winter | 2.42 | 5.20 | 7.20 | 6.36 |
Summer | -0.96 | 1.28 | 0.09 | -1.27 |
Whole Year | 1.44 | 6.55 | 7.29 | 5.01 |
Difference Winter-Summer | 3.38 | 3.92 | 7.11 | 7.63 |
First, we are going to look at SPY calendar anomaly and observe possible trend changes in every month. Data available for this analysis starts from 2012 so we have 16 years of history to study. We are going to calculate monthly return of each year and plot with box plot.
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
start = '2002-01-01'
end = '2018-01-01'
Symbol='SPY'
SPY = get_pricing(Symbol, fields='price', start_date=start, end_date=end,frequency='daily')
r = SPY.pct_change()
Monthly_Returns = r.groupby((r.index.year,r.index.month)).sum()
Monthly_Returns_List=[]
for i in range(len(Monthly_Returns)):
Monthly_Returns_List.append({'Year':Monthly_Returns.index[i][0],'Month':Monthly_Returns.index[i][1],
'Monthly_Return': Monthly_Returns[i] })
Monthly_Returns_List=pd.DataFrame(Monthly_Returns_List,
columns=('Year','Month','Monthly_Return'))
Monthly_Returns_List.boxplot(column='Monthly_Return', by='Month')
ax = plt.gca()
labels = [item.get_text() for item in ax.get_xticklabels()]
labels=['JAN','FEB','MAR','APR','MAY','JUN','JUL','AUG','SEP','OCT','NOV','DEC']
ax.set_xticklabels(labels)
plt.tick_params(axis='both', which='major', labelsize=15)
plt.show()
Observations:
Now we will utilize the Augmented Dickey-Fuller Test for stationarity. The null hypothesis states that large p values indicate non-stationarity and smaller p values indicate stationarity. (We will be using 0.05 as selected alpha value.)
from statsmodels.tsa.stattools import adfuller
result = adfuller(SPY)
print('ADF Statistic: %f' % result[0])
print('p-value: %f' % result[1])
print('Critical Values:')
for key, value in result[4].items():
print('\t%s: %.3f' % (key, value))
As suspected, ADF test revealed the time series data (SPY stock data) are non stationary meaning time series is a stochastic process whose unconditional joint probability distribution changes when shifted in time. Now we know our time series is non staitionary, we attempt to exploit seasonality trends using time series decomposition.
We shall think of the time series $y_t$ as comprising three components: a seasonal component, a trend-cycle component (containing both trend and cycle), and a remainder component (containing anything else in the time series). For example, if we assume an additive model, then we can write: $$y_t=S_t+T_t+E_t$$ where ytyt is the data at period $t$, $S_t$ is the seasonal component at period $t$, $T_t$ is the trend-cycle component at period $t$ and $E_t$ is the remainder (or irregular or error) component at period $t$. Alternatively, a multiplicative model would be written as $$y_t=S_t×T_t×E_t$$
For more detailed information visit here
In this notebook I am using additive decompositional model to extract seasonal model. In order to claculate seasonal effect, I used frequency decomposition of 252 meanining, trend is repeated every 252 days (5 trading days and 9 holidays results in 252 trading day).
import statsmodels.api
import statsmodels as sm
res = sm.tsa.seasonal.seasonal_decompose(SPY,model='additive', freq=252)
resplot = res.plot()
As it can be seen from the above chart, overall trend of the market was increasing with the downward trend in 2008-2009. The seasonal trend repeating every year indicate vally and peak. Now, we attempt to isolate this trend and inspect it further:
SPY_Seasonality=res.seasonal[res.seasonal.index.year == 2017]
MA1=SPY_Seasonality.rolling(window=20).mean()
plt.plot(SPY_Seasonality, label='SPY Seasonality')
plt.plot(MA1, label='20D MA')
plt.title("Seasonality of SPY", fontsize=18, fontweight='bold')
plt.legend()
plt.show()
Observations:
Now we isolated seasonality trend we can, adjust market trend in regard to seasonality. Since we used additive model to identify seasonality, we can simply deduct seasonality from stock data (SPY) and adjust for seasonality. In periods where market has strong bull or bear (upward or downward) trends, seasonality effects might be too weak to observe. However, if market exhibits range bound behavior it, such effect can be more evident. Here, we will look at 2017 data (strong bull market) and 2015 data (sideways behavior)
plt.plot(SPY[SPY.index.year == 2017]+SPY_Seasonality,color='red',label='Seasonality Adjusted SPY');
plt.plot(SPY[SPY.index.year == 2017],label='SPY')
plt.legend(loc=4)
plt.show()
plt.plot(SPY[SPY.index.year == 2015]+res.seasonal[res.seasonal.index.year == 2015],color='red',label='Seasonality Adjusted SPY');
plt.plot(SPY[SPY.index.year == 2015],label='SPY')
plt.legend(loc=4)
plt.show()
Now we can simply test the return of a simple strategy based on seasonality. We are going to buy SPY in October and sell the entire holdings in May or simply comparing return of SPY from October of each year to May of next year.
import datetime
Min_Month=SPY_Seasonality.idxmin().month
Max_Month=SPY_Seasonality.idxmax().month
Index=SPY.index
Years = list(set(Index.year))
Years.sort()
Returns = d = {'Market Return' : pd.Series(index=Years),'Strategy Return' : pd.Series( index=Years)}
Returns = pd.DataFrame(Returns)
for index, row in Returns.iterrows():
Data=SPY[SPY.index.year == index]
row['Market Return']=((Data.iloc[-1]-Data.iloc[0])*100/Data.iloc[0])
if index<Years[-1]:
# purchase at first day of the min month (this year)
Purchase=SPY.loc[(SPY.index.year==index) & (SPY.index.month==Min_Month)][0]
# sell at the first day of max month (next year)
Sell=SPY.loc[(SPY.index.year==index+1) & (SPY.index.month==Max_Month)][0]
row['Strategy Return']=(Sell-Purchase)*100/Purchase
ax = Returns[['Market Return','Strategy Return']].plot(kind='bar', title ="Seasonality Strategy vs Market Return", figsize=(15, 10), legend=True, fontsize=14)
ax.set_xlabel("Year", fontsize=12)
ax.set_ylabel("% Returns", fontsize=12)
plt.show()
Returns.hist();
Looking at returns of our strategy we can see; this strategy is not as profitable as simply buying and holding for entire year. During market crash of 2008, this strategy lost less money than average market return. However, this might be due to start month of market crash (further analysis is required). One should note that this strategy is extremely simple without considering proper entry and exit point within each month. Therefore, in order to fully understand the profitability and risk of this strategy, in depth analysis is required which is outside the scope of this article.
Now we will analyze different market sectors and their respective seasonality and measure the correlation to SPY seasonality:
Sectors={'XLB','XLY','XLF','IYR','XLP','XLV','XLU','IYZ','XLE','XLI','XLK'}
Sectors_Data = get_pricing(Sectors, fields='price', start_date=start, end_date=end,frequency='daily')
df_ = pd.DataFrame(index=SPY_Seasonality.index, columns=Sectors)
for i in range(len(Sectors)):
Sectors_res = sm.tsa.seasonal.seasonal_decompose( Sectors_Data[[i]],model='additive', freq=252)
Seasonality=Sectors_res.seasonal[Sectors_res.seasonal.index.year == 2017]
df_[list(Sectors)[i]]=Seasonality
import scipy.stats as ss
Summary=[]
Legend={'XLB','XLY','XLF','IYR','XLP','XLV','XLU','IYZ','XLE','XLI','XLK'}
NUM_COLORS = len(Sectors)
cm = plt.get_cmap('gist_ncar')
fig = plt.figure()
ax = fig.add_subplot(111)
ax.set_color_cycle([cm(1.*i/NUM_COLORS) for i in range(NUM_COLORS)])
for i in range(NUM_COLORS):
ax.plot(df_[[i]].rolling(window=20).mean())
Summary.append({'Sector': list(df_[[i]].columns)[0], 'Min': np.min(df_[[i]])[0], 'Max': np.max(df_[[i]])[0],\
'Range':np.max(df_[[i]])[0]-np.min(df_[[i]])[0],'Std':np.std(df_[[i]])[0], 'Corr to SPY Seasonality': ss.spearmanr(SPY_Seasonality, df_[[i]])[0] })
ax.plot(SPY_Seasonality.rolling(window=20).mean(), color='black', linestyle='-',marker='*')
plt.legend(Legend)
plt.show()
Summary=pd.DataFrame(Summary, columns=('Sector','Min','Max','Range','Std','Corr to SPY Seasonality'))
SPY seasonality in the above picture overlaied with black line.
I used Spearman Rank Correlation to identify correlation of seasonality of different market sectors to seasonality of market itself. As it can be seen from the tables below, consumer discretionary has highest correlation to market seasonality and real estate sector has the least seasonality correlation to market.
Summary.sort_values('Corr to SPY Seasonality', inplace=True)
Summary
The cumulative histogram of seasonality of selected market sectors are listed below.
from scipy import stats
plt.close(fig)
fig = plt.figure()
ax = fig.add_subplot(111)
stats.probplot(SPY_Seasonality, plot=ax);
stats.probplot(df_['XLE'], plot=ax);
stats.probplot(df_['XLK'], plot=ax);
stats.probplot(df_['XLY'], plot=ax);
stats.probplot(df_['IYR'], plot=ax);
ax.get_lines()[0].set_markerfacecolor('red')
ax.get_lines()[2].set_markerfacecolor('green')
ax.get_lines()[4].set_markerfacecolor('yellow')
ax.get_lines()[6].set_markerfacecolor('black')
ax.get_lines()[8].set_markerfacecolor('darkcyan')
ax.get_lines()[1].remove()
ax.get_lines()[2].remove()
ax.get_lines()[3].remove()
ax.get_lines()[4].remove()
ax.get_lines()[5].remove()
ax.legend({'SPY','XLE','XLK','XLY','IYR'},loc=4);
We can use empirical time series model to identify the effect of each month. The model is defined as below: $$ R=\beta+\alpha*{jan} \times r*{jan}+\alpha*{feb} \times r*{feb}+...\alpha*{nov} \times r*{nov}$$
All we need to do is to find returns of each month then use regression analysis to find the coefficients. $\beta$ in this case can be considered as December return
start = '2002-01-01'
end = '2018-01-01'
Symbol='SPY'
SPY = get_pricing(Symbol, fields='price', start_date=start, end_date=end,frequency='daily')
r = SPY.pct_change()
g=r.groupby((r.index.year,r.index.month)).sum()
Monthly_Data=g.unstack()
Yearly_Data=r.groupby((r.index.year)).sum()
labels=['JAN','FEB','MAR','APR','MAY','JUN','JUL','AUG','SEP','OCT','NOV','DEC']
Monthly_Data.columns=labels
np.shape(Yearly_Data)
np.shape(Monthly_Data)
import statsmodels.api as sm
X = Monthly_Data[['JAN','FEB','MAR','APR','MAY','JUN','JUL','AUG','SEP','OCT','NOV']]
X = sm.add_constant(X)
model11 = sm.OLS(Yearly_Data, X).fit()
model11.summary()
model11.params.plot(kind='bar', title ="Coeffcients of Regressio Analysis", figsize=(15, 10), legend=True, fontsize=14);
Looking at the model parameters we can see, month of April/November has the highest coefficient while July has the least coefficient. This shows possibility of higher return in months with higher coefficient and vice versa. This outcome is consistent with our analysis on average monthly return of the market bar chart presented at the beginning of this article.
It is important to note during this study; the remainder component of seasonality has not been studied because it requires more in-depth statistical analysis which is outside the scope of this study. Now let's see how market reacted in 2018 so far:
start = '2018-01-01'
end = '2018-07-01'
Symbol='SPY'
SPY_2018 = get_pricing(Symbol, fields='price', start_date=start, end_date=end,frequency='daily')
r_2018 = SPY_2018.pct_change()
Monthly_Returns_2018 = r_2018.groupby((r_2018.index.year,r_2018.index.month)).sum()
Monthly_Returns_2018.plot(kind='bar', title ="Seasonality Strategy vs Market Return", figsize=(15, 10), legend=True, fontsize=14)