Parameter recovery of the DDM with starting point bias

[1]:
import rlssm
import pandas as pd

Simulate individual data

[2]:
from rlssm.random import simulate_ddm
[3]:
data = simulate_ddm(
    n_trials=400,
    gen_drift=.8,
    gen_threshold=1.3,
    gen_ndt=.23,
    gen_rel_sp=.6)
[4]:
data.describe()[['rt', 'accuracy']]
[4]:
rt accuracy
count 400.000000 400.00000
mean 0.583140 0.81750
std 0.296395 0.38674
min 0.257000 0.00000
25% 0.366000 1.00000
50% 0.493500 1.00000
75% 0.706500 1.00000
max 1.916000 1.00000

Initialize the model

[5]:
model = rlssm.DDModel(hierarchical_levels = 1, starting_point_bias=True)
Using cached StanModel

Fit

[6]:
# sampling parameters
n_iter = 3000
n_chains = 2
n_thin = 1

# bayesian model, change default priors:
drift_priors = {'mu':1, 'sd':3}
threshold_priors = {'mu':-1, 'sd':3}
ndt_priors = {'mu':-1, 'sd':1}
[7]:
model_fit = model.fit(
    data,
    drift_priors=drift_priors,
    threshold_priors=threshold_priors,
    ndt_priors=ndt_priors,
    thin = n_thin,
    iter = n_iter,
    chains = n_chains,
    verbose = False)
Fitting the model using the priors:
drift_priors {'mu': 1, 'sd': 3}
threshold_priors {'mu': -1, 'sd': 3}
ndt_priors {'mu': -1, 'sd': 1}
rel_sp_priors {'mu': 0, 'sd': 0.8}
WARNING:pystan:Maximum (flat) parameter count (1000) exceeded: skipping diagnostic tests for n_eff and Rhat.
To run all diagnostics call pystan.check_hmc_diagnostics(fit)
Checks MCMC diagnostics:
n_eff / iter looks reasonable for all parameters
0.0 of 3000 iterations ended with a divergence (0.0%)
0 of 3000 iterations saturated the maximum tree depth of 10 (0.0%)
E-BFMI indicated no pathological behavior

get Rhat

[8]:
model_fit.rhat
[8]:
rhat variable
0 1.002813 drift
1 1.000353 threshold
2 0.999921 ndt
3 1.003156 rel_sp

calculate wAIC

[9]:
model_fit.waic
[9]:
{'lppd': -122.26126045695726,
 'p_waic': 3.682425753566376,
 'waic': 251.88737242104727,
 'waic_se': 47.269086540763105}

Posteriors

[10]:
model_fit.samples.describe()
[10]:
chain draw transf_drift transf_threshold transf_ndt transf_rel_sp
count 3000.000000 3000.000000 3000.000000 3000.000000 3000.000000 3000.000000
mean 0.500000 749.500000 0.792928 1.306639 0.233079 0.604580
std 0.500083 433.084792 0.103006 0.033952 0.004736 0.019013
min 0.000000 0.000000 0.338697 1.204520 0.214251 0.537698
25% 0.000000 374.750000 0.721867 1.283811 0.230113 0.591341
50% 0.500000 749.500000 0.793180 1.306182 0.233424 0.604716
75% 1.000000 1124.250000 0.863523 1.329765 0.236436 0.617648
max 1.000000 1499.000000 1.242643 1.431171 0.247481 0.668216
[11]:
import seaborn as sns
sns.set(context = "talk",
        style = "white",
        palette = "husl",
        rc={'figure.figsize':(15, 8)})

Here we plot the estimated posterior distributions against the generating parameters, to see whether the model parameters are recovering well:

[12]:
g = model_fit.plot_posteriors(height=5, show_intervals='HDI')

for i, ax in enumerate(g.axes.flatten()):
    ax.axvline(data[['drift', 'threshold', 'ndt', 'rel_sp']].mean().values[i], color='grey', linestyle='--')
../_images/notebooks_DDM_starting-point-bias_fitting_19_0.png