Stochastic Volatility model

In [2]:
import numpy as np
import pymc3 as pm
from pymc3.distributions.timeseries import GaussianRandomWalk

from scipy import optimize

%pylab inline
Populating the interactive namespace from numpy and matplotlib

Asset prices have time-varying volatility (variance of day over day returns). In some periods, returns are highly variable, while in others very stable. Stochastic volatility models model this with a latent volatility variable, modeled as a stochastic process. The following model is similar to the one described in the No-U-Turn Sampler paper, Hoffman (2011) p21.

\[\sigma \sim Exponential(50)\]
\[\nu \sim Exponential(.1)\]
\[s_i \sim Normal(s_{i-1}, \sigma^{-2})\]
\[log(\frac{y_i}{y_{i-1}}) \sim t(\nu, 0, exp(-2 s_i))\]

Here, \(y\) is the daily return series and \(s\) is the latent log volatility process.

Build Model

First we load some daily returns of the S&P 500.

In [3]:
n = 400
returns = np.genfromtxt("../data/SP500.csv")[-n:]
array([-0.00637 , -0.004045, -0.02547 ,  0.005102, -0.047733])
In [4]:
[<matplotlib.lines.Line2D at 0x7f64eab559b0>]

Specifying the model in pymc3 mirrors its statistical specification.

In [5]:
model = pm.Model()
with model:
    sigma = pm.Exponential('sigma', 1./.02, testval=.1)

    nu = pm.Exponential('nu', 1./10)
    s = GaussianRandomWalk('s', sigma**-2, shape=n)

    r = pm.StudentT('r', nu, lam=pm.math.exp(-2*s), observed=returns)

Fit Model

For this model, the full maximum a posteriori (MAP) point is degenerate and has infinite density. To get good convergence with NUTS we use ADVI (autodiff variational inference) for initialization. This is done under the hood by the sample_init() function.

In [7]:
with model:
    trace = pm.sample(2000)
Auto-assigning NUTS sampler...
Initializing NUTS using advi...
INFO (theano.gof.compilelock): Waiting for existing lock by process '19349' (I am process '19302')
INFO (theano.gof.compilelock): To manually release the lock, delete /home/wiecki/.theano/compiledir_Linux-4.4--generic-x86_64-with-debian-stretch-sid-x86_64-3.5.2-64/lock_dir
Average ELBO = 1,138.51: 100%|██████████| 500000/500000 [04:55<00:00, 1693.17it/s]
Finished [100%]: Average ELBO = 1,138.62
100%|██████████| 1/1 [00:00<00:00, 1837.99it/s]
100%|██████████| 2000/2000 [04:27<00:00, 11.80it/s]
In [8]:
pm.traceplot(trace, model.vars[:-1]);
In [9]:
plot(trace[s][::10].T,'b', alpha=.03);
ylabel('log volatility')
<matplotlib.text.Text at 0x7f64c41f4048>

Looking at the returns over time and overlaying the estimated standard deviation we can see how the model tracks the volatility over time.

In [9]:
plot(np.exp(trace[s][::10].T), 'r', alpha=.03);
sd = np.exp(trace[s].T)
ylabel('absolute returns')
<matplotlib.text.Text at 0x7f8e7a93d160>