Tutorial on Parameter defaults

As of version 0.9.8, HDDM no longer requires you to fit the v, a and t parameters explicitly. You can fix any of these parameters to a default value of your choice. In this tutorial we show how to fit any given subset of a model's parameters while supplying user-chosen default values for the remaining ones.

Load Modules


# warning settings
import warnings

warnings.simplefilter(action="ignore", category=FutureWarning)

# Data management
import pandas as pd
import numpy as np
import pickle

# Plotting
import matplotlib.pyplot as plt
import matplotlib
import seaborn as sns

# Stats functionality
from statsmodels.distributions.empirical_distribution import ECDF

import hddm
from hddm.simulators.hddm_dataset_generators import simulator_h_c

Example Models


Simulate Data

#from hddm.simulators.hddm_dataset_generators import simulator_h_c
from hddm.simulators.basic_simulator import simulator
from hddm.simulators.hddm_dataset_generators import hddm_preprocess

model = 'ddm_hddm_base'

data = simulator(theta = [1., 1., 0.5, 0.5],
                 model = model,
                 n_samples = 500)

data = hddm_preprocess(data)

Model and Sample

Let’s first fit all parameters.

hddm_model = hddm.HDDM(data,
                        include = ['v', 'a', 't', 'z'],
                        informative = False,
                        is_group_model = False)
No model attribute --> setting up standard HDDM
Set model to ddm
hddm_model.sample(1000, burn = 500)
[-----------------100%-----------------] 1000 of 1000 complete in 8.5 sec
<pymc.MCMC.MCMC at 0x7ff3014fc210>
       mean       std      2.5q       25q       50q       75q     97.5q    mc err
a  0.996413  0.022212  0.949557  0.983287  0.995682  1.011092  1.040408  0.001349
v  1.150145  0.122256  0.925338  1.072607   1.14161  1.225006  1.435445  0.007297
t  0.501954  0.003282  0.495503  0.500019  0.502353  0.504147  0.507753  0.000225
z  0.488272  0.015967   0.45511  0.477788  0.489069  0.498717  0.519293  0.001011

Now we fix ``a`` to its default as specified in the HDDM-supplied model_config dictionary. As shown below, this sets a = 2., which is well above the ground truth of 1. Having fixed a at this value, we expect v to be overestimated in compensation (and the overall fit to be worse).

{'doc': 'Model used internally for simulation purposes. Do NOT use with the LAN extension.',
 'params': ['v', 'a', 'z', 't'],
 'params_trans': [0, 0, 1, 0],
 'params_std_upper': [1.5, 1.0, None, 1.0],
 'param_bounds': [[-5.0, 0.1, 0.05, 0], [5.0, 5.0, 0.95, 3.0]],
 'boundary': <function ssms.basic_simulators.boundary_functions.constant(t=0)>,
 'params_default': [0.0, 2.0, 0.5, 0],
 'hddm_include': ['v', 'a', 't', 'z'],
 'choices': [0, 1],
 'slice_widths': {'v': 1.5,
  'v_std': 1,
  'a': 1,
  'a_std': 1,
  'z': 0.1,
  'z_trans': 0.2,
  't': 0.01,
  't_std': 0.15}}
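
The position of a parameter in params determines which entry of params_default applies to it. A minimal sketch of looking up a default by name, using a plain dictionary that copies only the two relevant keys from the printout above (the helper default_for is hypothetical and not part of HDDM):

```python
# Stand-in for the relevant part of the model_config entry printed above.
config = {
    "params": ["v", "a", "z", "t"],
    "params_default": [0.0, 2.0, 0.5, 0],
}


def default_for(config, name):
    """Return the default value stored for the parameter called `name`."""
    # Hypothetical helper: find the parameter's position, then read its default.
    index = config["params"].index(name)
    return config["params_default"][index]


a_default = default_for(config, "a")
print("default for a:", a_default)
```
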
hddm_model_no_a = hddm.HDDM(data,
                        include = ['v', 't', 'z'],
                        informative = False,
                        is_group_model = False)
No model attribute --> setting up standard HDDM
Set model to ddm
/Users/afengler/OneDrive/project_hddm_extension/hddm/hddm/models/base.py:1316: UserWarning:
 Your include statement misses either the v, a or t parameters.
Parameters not explicitly included will be set to the defaults,
which you can find in the model_config dictionary!
  "Parameters not explicitly included will be set to the defaults, n" + 
hddm_model_no_a.sample(1000, burn = 500)
[-----------------100%-----------------] 1000 of 1000 complete in 5.6 sec
<pymc.MCMC.MCMC at 0x7ff301546b50>
       mean       std      2.5q       25q       50q       75q     97.5q    mc err
v  2.077952  0.146506  1.806508  1.977741  2.075691  2.164686  2.385388  0.011489
t  0.335338  0.009595  0.315019  0.330216  0.335275  0.341575  0.353736  0.000553
z   0.36132  0.022235  0.319808  0.346506  0.360754  0.376082  0.406029  0.001815

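As predicted, v is now overestimated (about 2.08 against a ground truth of 1). The estimates of t and z are pulled away from their ground truth of 0.5 as well.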

Let’s now try to set a to a default of our liking. We will set it to the ground-truth and again not include it in the parameters to estimate. To do so, we supply our own model_config to the HDDM() class.

from copy import deepcopy
# copy model_config dictionary so we can change it
my_model_config = deepcopy(hddm.model_config.model_config['ddm_hddm_base'])

# setting 'a' to 1.
my_model_config['params_default'][1] = 1.

hddm_model_no_a_2 = hddm.HDDM(data,
                        include = ['v', 't', 'z'],
                        informative = False,
                        is_group_model = False,
                        model_config = my_model_config)
Custom model config supplied as:

{'doc': 'Model used internally for simulation purposes. Do NOT use with the LAN extension.', 'params': ['v', 'a', 'z', 't'], 'params_trans': [0, 0, 1, 0], 'params_std_upper': [1.5, 1.0, None, 1.0], 'param_bounds': [[-5.0, 0.1, 0.05, 0], [5.0, 5.0, 0.95, 3.0]], 'boundary': <function constant at 0x7ff31c1fab90>, 'params_default': [0.0, 1.0, 0.5, 0], 'hddm_include': ['v', 'a', 't', 'z'], 'choices': [0, 1], 'slice_widths': {'v': 1.5, 'v_std': 1, 'a': 1, 'a_std': 1, 'z': 0.1, 'z_trans': 0.2, 't': 0.01, 't_std': 0.15}}
No model attribute --> setting up standard HDDM
Set model to ddm
/Users/afengler/OneDrive/project_hddm_extension/hddm/hddm/models/base.py:1316: UserWarning:
 Your include statement misses either the v, a or t parameters.
Parameters not explicitly included will be set to the defaults,
which you can find in the model_config dictionary!
  "Parameters not explicitly included will be set to the defaults, n" + 
hddm_model_no_a_2.sample(1000, burn = 500)
[-----------------100%-----------------] 1000 of 1000 complete in 5.2 sec
<pymc.MCMC.MCMC at 0x7ff30157b150>
       mean       std      2.5q       25q       50q       75q     97.5q    mc err
v  1.171748  0.118515  0.935492  1.094611  1.170236  1.251087  1.425975  0.006572
t  0.501641  0.002593  0.496291  0.499915   0.50178  0.503453  0.506346  0.000121
z  0.486188  0.016581  0.450828  0.476066  0.486156  0.497147  0.518931  0.000963

As we see, in this case v is estimated appropriately again.
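
The hard-coded index 1 used above is the position of a in the params list. The sketch below performs the same edit by name on a stand-in dictionary rather than on the real HDDM config; with_default is a hypothetical helper, not an HDDM function. It also illustrates why the copy should be deep: a shallow copy would share the params_default list with the original, so the edit would leak into it.

```python
from copy import deepcopy

# Stand-in for hddm.model_config.model_config['ddm_hddm_base'] (relevant keys only).
base_config = {
    "params": ["v", "a", "z", "t"],
    "params_default": [0.0, 2.0, 0.5, 0],
}


def with_default(config, name, value):
    """Return a deep copy of `config` with the default of `name` set to `value`."""
    new_config = deepcopy(config)  # nested lists are copied, not shared
    index = new_config["params"].index(name)
    new_config["params_default"][index] = value
    return new_config


my_config = with_default(base_config, "a", 1.0)
print(my_config["params_default"])    # modified copy
print(base_config["params_default"])  # original stays as it was
```
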

Let’s compare DICs
print('Standard: ', hddm_model.dic)
print('No a with HDDM default: ', hddm_model_no_a.dic)
print('No a with a set to ground truth: ', hddm_model_no_a_2.dic)
Standard:  -7.05123814817064
No a with HDDM default:  562.273161208081
No a with a set to ground truth:  -9.028954442474097
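
Lower DIC values indicate a better model, so it can help to express each value as a difference from the best one. A short sketch using the three values printed above (the dictionary keys are just labels chosen here):

```python
# DIC values copied from the output above.
dics = {
    "standard": -7.05123814817064,
    "no a, HDDM default": 562.273161208081,
    "no a, ground truth": -9.028954442474097,
}

best = min(dics, key=dics.get)  # label of the model with the lowest DIC
for name, dic in sorted(dics.items(), key=lambda item: item[1]):
    diff = dic - dics[best]
    print(f"{name:>20}: DIC = {dic:9.3f}, difference to best = {diff:8.3f}")
```

With these numbers the model with a fixed to the ground truth ranks first, about 2 DIC points ahead of the full model, while fixing a to 2 is worse by more than 570 points.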


Let’s repeat this with another model via the HDDMnn() class. We will pick the HDDM-supplied angle model.

Simulate Data

model = 'angle'
theta = [1., 1.5, .5, .5, 0.2] # v, a, z, t, theta
data_angle = simulator(theta = theta,
                       model = 'angle',
                       n_samples = 500)
data_angle = hddm_preprocess(data_angle,
                             keep_negative_responses = True)

Model and Sample

model_angle = hddm.HDDMnn(data_angle,
                          model = 'angle',
                          include = ['v', 'a', 't', 'z', 'theta'])
Using default priors: Uninformative
Supplied model_config specifies params_std_upper for  z as  None.
Changed to 10
model_angle.sample(1000, burn = 500)
[-----------------100%-----------------] 1000 of 1000 complete in 52.0 sec
<pymc.MCMC.MCMC at 0x7ff301575390>
           mean       std      2.5q       25q       50q       75q     97.5q    mc err
v      1.020563  0.083431  0.852936  0.964464  1.020123  1.074769  1.178448  0.006605
a      1.584719  0.098083  1.419463  1.511524  1.576579  1.651892  1.787859  0.009077
z      0.526568  0.025441  0.475032  0.510146  0.527315  0.543314  0.577972  0.002245
t      0.494598  0.037365  0.420589  0.470531  0.495294  0.521575  0.561596  0.003361
theta  0.270865  0.050656  0.177934  0.236454  0.269616  0.304024  0.377811  0.004407

Again we will now leave out one parameter (let’s pick theta this time). According to the model_config entry for the angle model, the default for this parameter is 0.

model_angle_no_theta = hddm.HDDMnn(data_angle,
                                   model = 'angle',
                                   include = ['v', 'a', 't', 'z'])
Using default priors: Uninformative
Supplied model_config specifies params_std_upper for  z as  None.
Changed to 10
model_angle_no_theta.sample(1000, burn = 500)
[-----------------100%-----------------] 1000 of 1000 complete in 47.4 sec
<pymc.MCMC.MCMC at 0x7ff30160bed0>
       mean       std      2.5q       25q       50q       75q     97.5q    mc err
v  1.124278   0.09761  0.928012  1.052883  1.128692  1.191886  1.310813  0.008418
a  1.363875  0.053492  1.259736  1.327325  1.364278  1.396537  1.468327  0.004211
z   0.49972  0.030672  0.436555  0.477589  0.498887  0.522161  0.558534  0.002693
t  0.536048  0.033127  0.464361  0.515404   0.53657  0.558055  0.595229  0.002959

Again we observe how the parameter estimates are affected by the wrong choice of ``theta``. The model tries to compensate for the parallel bounds (no collapse) implied by the ``theta`` default by decreasing ``a`` and slightly increasing ``v``. Let’s now try again, but this time we fix ``theta`` to the actual ground truth.
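
To see what a theta of 0 implies, the sketch below evaluates a linearly collapsing bound of the form a - t * tan(theta). This functional form is an assumption made here for illustration, and angle_bound is a hypothetical helper, not a function taken from HDDM.

```python
import math


def angle_bound(t, a, theta):
    """Upper bound at time t for a linearly collapsing boundary (assumed form)."""
    return a - t * math.tan(theta)


a = 1.5  # ground-truth a used in the simulation above
for theta in (0.0, 0.2):
    bounds = [round(angle_bound(t, a, theta), 3) for t in (0.0, 0.5, 1.0, 2.0)]
    print(f"theta = {theta}: {bounds}")
```

Under this form, theta = 0 keeps the bound at a for all t, which is the parallel-bounds case, whereas theta = 0.2 makes the bound shrink over time.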

# copy out the model_config dictionary for the angle model
my_model_config_angle = deepcopy(hddm.model_config.model_config['angle'])
# set theta default to the ground truth defined above
my_model_config_angle['params_default'][4] = 0.2

model_angle_no_theta_2 = hddm.HDDMnn(data_angle,
                                     model = 'angle',
                                     include = ['v', 'a', 't', 'z'],
                                     model_config = my_model_config_angle)
Using default priors: Uninformative
Supplied model_config specifies params_std_upper for  z as  None.
Changed to 10
model_angle_no_theta_2.sample(1000, burn = 500)
[-----------------100%-----------------] 1000 of 1000 complete in 53.4 sec
<pymc.MCMC.MCMC at 0x7ff301652c90>
       mean       std      2.5q       25q       50q       75q     97.5q    mc err
v  1.020949  0.089617  0.838443  0.959232  1.019338  1.087935  1.188067  0.007229
a   1.46397  0.049143  1.363087  1.429878  1.465694   1.49972  1.554566  0.003623
z  0.527111  0.027281  0.471521  0.508124  0.527366  0.547015  0.581392  0.002391
t  0.528561  0.029786  0.470851   0.50755  0.527586  0.547936  0.588767  0.002522

As we see, fixing theta to the actual ground truth brings the estimates of the remaining parameters much closer to their true values again.
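
One simple way to make this concrete is to check whether the ground truth falls inside the 2.5q to 97.5q interval of each fit. A sketch for a, with the interval bounds copied from the two summary tables above (the covers helper is hypothetical):

```python
# Ground-truth value of a used to simulate data_angle.
a_true = 1.5

# (2.5q, 97.5q) for a, copied from the summary tables above.
intervals = {
    "theta fixed to 0": (1.259736, 1.468327),
    "theta fixed to 0.2": (1.363087, 1.554566),
}


def covers(interval, value):
    """Return True if `value` lies inside the closed interval."""
    low, high = interval
    return low <= value <= high


for label, interval in intervals.items():
    inside = covers(interval, a_true)
    print(f"{label}: interval {interval} covers a = {a_true}? {inside}")
```

For these samples the interval excludes 1.5 when theta is fixed to 0 and includes it when theta is fixed to 0.2.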

Let’s compare DICs
print('Standard: ', model_angle.dic)
print('theta set to model_config default: ', model_angle_no_theta.dic)
print('theta set to ground truth: ', model_angle_no_theta_2.dic)
Standard:  1059.453694824219
theta set to model_config default:  1066.945202636719
theta set to ground truth:  1058.248090332031

We observe that, in this case, fixing theta to 0 instead of 0.2 did not do much damage as far as the DICs are concerned. Nevertheless, the misspecified model performs worst by this metric.


Hopefully this was helpful.