Outliers on Time-frequency Transform
http://www.mattcraddock.com/tags/outliers/index.xml
© 2016-2017. All rights reserved.

Analysing reaction times - Revisiting Ratcliff (1993) (pt 1)
http://www.mattcraddock.com/blog/2017/12/04/analysingreactiontimesrevisitingratcliff1993/
Mon, 04 Dec 2017 00:00:00 +0000
<p>Reaction times are a very common outcome measure in psychological science. Frequently, people use the mean to summarise reaction time distributions and compare means across conditions using ANOVAs. For example, in a typical experiment, researchers might record reaction times to familiar and unfamiliar faces, and look for differences in mean reaction time across these two types of stimuli.</p>
<p>An issue with this is that reaction time distributions are skewed: there are many more short values than long values, so their distribution has a long right tail. The mean of such distributions is drawn out towards the tail. Unlike in normally distributed data, it becomes quite different from the median, which is the midpoint of the distribution. Thus the mean is influenced by the skewness of the distribution in a way that the median is not. In addition, the median is robust to outliers.</p>
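<p>A minimal sketch of that point, using made-up numbers rather than real RTs: in a right-skewed sample the mean sits above the median, and a single slow response shifts the mean far more than the median. The seed, sample sizes and the toy vector <em>rt</em> are all arbitrary choices for illustration.</p>
<pre class="r"><code>set.seed(1)  # arbitrary seed so the sketch is reproducible

# Right-skewed sample: exponential with a mean of 200
skewed <- rexp(10000, rate = 1 / 200)
# Symmetric sample: normal with a mean of 200
symmetric <- rnorm(10000, mean = 200, sd = 40)

c(mean = mean(skewed), median = median(skewed))        # mean well above median
c(mean = mean(symmetric), median = median(symmetric))  # nearly identical

# A single extreme value moves the mean far more than the median
rt <- c(400, 450, 500, 550, 600)
rt_out <- c(rt, 3000)
c(mean = mean(rt), median = median(rt))          # both 500
c(mean = mean(rt_out), median = median(rt_out))  # mean about 917, median 525</code></pre>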
<p>A classic article from 1993 by Ratcliff <a href="#fn1" class="footnoteRef" id="fnref1"><sup>1</sup></a> examined how outliers in reaction time distributions impacted statistical power, and how a variety of different methods can be used to mitigate their influence. I’m going to replicate some of his simulations here.</p>
<p>RTs can be modelled as coming from an ex-Gaussian distribution. An ex-Gaussian <a href="#fn2" class="footnoteRef" id="fnref2"><sup>2</sup></a> is the sum of a Gaussian and an exponential distribution, and has three parameters: the mean of the Gaussian (mu), the standard deviation of the Gaussian (sigma), and the mean of the exponential (tau). Between them, sigma and tau effectively control the rise of the left tail of the distribution and the fall of the right tail. Let’s simulate an ex-Gaussian by generating a normal distribution with a mean of 400 and SD of 40 and an exponential distribution with a mean of 200, and adding them together. (In R’s <em>rexp()</em> command the mean is set through the rate parameter: divide 1 by your desired mean.)</p>
<pre class="r"><code># ggplot2, dplyr, tidyr, purrr and forcats are used throughout this post
library(tidyverse)

exGauss <- data.frame(RT = rnorm(1000, 400, 40) + rexp(1000, 1 / 200))
p <- ggplot(exGauss, aes(x = RT)) +
  geom_histogram() +
  theme_classic()
p</code></pre>
<p><img src="../../post/20171201analysingreactiontimesrevisitingratcliff1993_files/figurehtml/exGauss1.png" width="672" /></p>
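<p>As a quick sanity check on those parameters: because the two parts are independent, the mean of the sum is mu + tau and its variance is sigma^2 + tau^2. The sketch below compares a large simulated sample against those values; the seed and sample size are arbitrary choices of mine.</p>
<pre class="r"><code>set.seed(2)  # arbitrary seed
mu <- 400
sigma <- 40
tau <- 200

rt <- rnorm(1e5, mu, sigma) + rexp(1e5, 1 / tau)

# Theoretical moments of the sum of the two independent components
theo_mean <- mu + tau                # 600
theo_sd <- sqrt(sigma^2 + tau^2)     # about 204

round(c(sample_mean = mean(rt), theo_mean = theo_mean,
        sample_sd = sd(rt), theo_sd = theo_sd), 1)</code></pre>
<p>With these parameters the standard deviation comes out at roughly 200 ms, so shifting mu by 200 ms moves the distribution by about one standard deviation.</p>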
<p>As always, other things come into play during psychology experiments. People get bored or press buttons too quickly. Our distributions become contaminated with outliers. Ratcliff simulated this by replacing values from the sample distribution with values from a second distribution. Note that these are outliers in the sense of data points drawn from a distribution other than that related to the process of interest. In fact, such outliers can easily be embedded unnoticed within the distribution generated by the genuine process, and are as such impossible to detect. Thus, in general, it’s only the really short or really long outliers that can be identified as being outliers.</p>
<p>R has built-in functions to generate data from a lot of distributions, but not ex-Gaussians. So first of all I create a helper function to produce ex-Gaussian distributions with specified parameters: mu, sd, and tau. Here’s a bunch of distributions produced by combining two distributions: a reference ex-Gaussian distribution which exemplifies the process of interest, and an outlier ex-Gaussian distribution with a mean 1 or 2 standard deviations away from the mean of the reference distribution, as in Ratcliff (1993).</p>
<pre class="r"><code># define a helper function to generate an ex-Gaussian
exGausDist <- function(nObs = 1000, mu = 400, sd = 40, tau = 200) {
  round(rnorm(nObs, mu, sd) + rexp(nObs, 1 / tau))
}

# Generate the various distributions + outliers:
# 800 genuine trials plus 200 trials from a shifted distribution
exampleDists <- data.frame(
  "Reference" = exGausDist(),
  "Minus 1 Sd" = c(exGausDist(nObs = 800), exGausDist(nObs = 200, mu = 200)),
  "Plus 1 Sd" = c(exGausDist(nObs = 800), exGausDist(nObs = 200, mu = 600)),
  "Plus 2 Sd" = c(exGausDist(nObs = 800), exGausDist(nObs = 200, mu = 800)))

gg2 <- exampleDists %>%
  gather(distribution, RT) %>%
  ggplot(aes(x = RT)) +
  geom_histogram(binwidth = 50) +
  facet_wrap(~distribution) +
  theme_bw()
gg2</code></pre>
<p><img src="../../post/20171201analysingreactiontimesrevisitingratcliff1993_files/figurehtml/exg_fun1.png" width="672" /></p>
<p>Note how with the “Plus 1 SD” distribution, the outliers are pretty much impossible to tell apart from the genuine process. There’s a more noticeable but still not exceedingly obvious bump in the tail of the “Plus 2 SD” plot, and an early bump in the “Minus 1 SD” plot.</p>
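<p>One rough way to put a number on how well hidden these outliers are: count how many of them would slip under a simple cutoff. The sketch below assumes a detection rule of my own choosing (flag anything slower than the 95th percentile of the genuine distribution), which is not something Ratcliff used. The helper <em>exg()</em> is a hypothetical, unrounded copy of the generator above so that the example runs on its own.</p>
<pre class="r"><code>set.seed(3)  # arbitrary seed

# Hypothetical stand-alone ex-Gaussian generator (same defaults as above)
exg <- function(n, mu = 400, sd = 40, tau = 200) {
  rnorm(n, mu, sd) + rexp(n, 1 / tau)
}

genuine <- exg(1e5)
# Illustrative rule (an assumption): flag RTs above the 95th percentile
# of the genuine distribution
cutoff <- quantile(genuine, 0.95)

plus1 <- exg(1e5, mu = 600)  # outlier distribution shifted by about 1 SD
plus2 <- exg(1e5, mu = 800)  # outlier distribution shifted by about 2 SD

# Proportion of outliers that fall below the cutoff and so go unflagged
hidden <- c(plus1 = mean(plus1 <= cutoff), plus2 = mean(plus2 <= cutoff))
round(hidden, 2)</code></pre>
<p>Under this rule, more than half of the outliers in both shifted distributions go unflagged, and the share is larger for the smaller shift.</p>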
<div id="ratcliffstyleexperimentalsimulations" class="section level1">
<h1>Ratcliff-style experimental simulations</h1>
<p>Ratcliff (1993) ran simulations testing how various methods of dealing with outliers stood up. To generate RT distributions, Ratcliff created distributions in which 90% of the values were drawn directly from a genuine ex-Gaussian distribution, and 10% were drawn from the same distribution but with a random number between 0 and 2000, drawn from a uniform distribution, added to them. Ratcliff also introduced inter-subject variability by varying individual subject means by a random number between -50 and 50, again drawn from a uniform distribution.</p>
<p>Let’s start by building a helper function to generate data distributions with a specified proportion of outliers. I then create a second function to generate a full set of experimental data as per Ratcliff. This generates data for a 2 X 2 design with within-subjects factors A and B. Ratcliff generated data for 32 subjects, with 7 trials in each condition (more on that later), an ex-Gaussian distribution with parameters mu = 400, sigma = 40, and tau = 200, and a main effect in factor A, so let’s make sure we can vary all those things at will.</p>
<pre class="r"><code>exGausDist <- function(nObs = 1000, mu = 400, sd = 40, tau = 200, outProb = .1) {
  # generate vector of trials
  tmp <- round(rnorm(nObs, mu, sd) + rexp(nObs, 1 / tau))
  # pick the lucky ones
  rollD <- as.logical(rbinom(nObs, 1, prob = outProb))
  outliers <- round(runif(sum(rollD), 0, 2000))
  # add the outliers to the genuine distribution
  tmp[rollD] <- tmp[rollD] + outliers
  tmp
}

## Custom function to generate a Ratcliff-style dataset.
## Main effect is always in A.
## Defaults reflect first set of simulations in Ratcliff (1993)
ratcliff <- function(nSubs = 32, mainEffect = 0, mu = 400, subjVar = 50,
                     nObs = 7, outProb = 0, tau = 200, sigma = 40) {
  # generate a vector of individual subject means, each shifted by a draw
  # from a uniform distribution on [-subjVar, subjVar]
  subMeans <- rep(mu, nSubs) + runif(nSubs, min = -subjVar, max = subjVar)
  # generate nObs trials per subject for one cell of the design
  genCell <- function(effect = 0) {
    unlist(lapply(subMeans, function(x) {
      exGausDist(nObs = nObs, mu = x + effect, sd = sigma, tau = tau,
                 outProb = outProb)
    }))
  }
  # generate a full dataset; only the A2 cells receive the main effect
  ratcliff_data <- data.frame(
    A1.B1 = genCell(),
    A1.B2 = genCell(),
    A2.B1 = genCell(mainEffect),
    A2.B2 = genCell(mainEffect),
    Subject = factor(rep(1:nSubs, each = nObs)))
  # reshape to long format and split the condition label into factors A and B
  ratcliff_data <- ratcliff_data %>%
    gather(condition, RT, -Subject) %>%
    separate(condition, c("A", "B"))
  ratcliff_data
}

ratcliff_data <- ratcliff()
ggplot(ratcliff_data, aes(x = RT)) +
  geom_density(aes(fill = Subject), alpha = 0.5) +
  facet_grid(B ~ A) +
  theme_bw()</code></pre>
<p><img src="../../post/20171201analysingreactiontimesrevisitingratcliff1993_files/figurehtml/ratcliffData1.png" width="672" /></p>
<p>The above plot shows kernel density estimates of the generated distribution for each subject, for each condition.</p>
<p>So now we have a function that can simulate Ratcliff’s simulated datasets. Ratcliff generated 1000 simulated datasets and ran a 2X2X32(!) ANOVA for each one using a variety of methods for dealing with outliers. I’ll go with the normal 2x2 repeated measures ANOVA.</p>
<p>Let’s write yet another function: this one generates a dataset using the <em>ratcliff()</em> function I wrote above, then summarises it using all the cutoffs, transformations and statistics used by Ratcliff. Ratcliff uses:</p>
<ol style="list-style-type: decimal">
<li>mean RT with
<ul>
<li>no cutoff</li>
<li>cutoff at 1000ms, 1250ms, 1500ms, 2000ms, or 2500ms</li>
<li>cutoff at 1 or 1.5 standard deviations (SD) above the mean (calculated across all conditions, not separately)</li>
<li>mean RT after the maximum RT in each condition is removed</li>
<li>Winsorized mean: values more than 2 SD above the mean replaced with 2 * SD</li>
</ul></li>
<li>median RT</li>
<li>mean log-transformed RT</li>
<li>mean inverse-transformed RT (1/RT)</li>
</ol>
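<p>Before wrapping these in a function, here is a rough illustration of what a few of them do to a single cell of data. The seven RTs are invented for the example, with one obvious outlier at the end.</p>
<pre class="r"><code># Hypothetical RTs (ms) for one subject in one condition; the last is an outlier
rt <- c(480, 510, 550, 600, 640, 720, 2400)

summaries <- c(
  mean     = mean(rt),                               # no cutoff
  cut_1000 = mean(rt[rt < 1000]),                    # fixed cutoff at 1000 ms
  trim_max = (sum(rt) - max(rt)) / (length(rt) - 1), # slowest trial removed
  median   = median(rt),
  log_mean = mean(log(rt)),                          # mean of log(RT)
  inv_mean = mean(1 / rt)                            # mean of 1/RT
)
round(summaries, 4)</code></pre>
<p>In this toy case the plain mean is pulled up to about 843 ms, the median stays at 600 ms, and the 1000 ms cutoff and the trimmed mean coincide (about 583 ms) because both discard exactly the one slow trial. The log and inverse means are on different scales, so they can’t be compared directly with the others.</p>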
<pre class="r"><code>library(afex)  # provides aov_ez()

runSims <- function(mainEffect = 0, outProb = 0, nObs = 7) {
  # Generate the initial dataset - default
  tmp <- ratcliff(mainEffect = mainEffect, outProb = outProb, nObs = nObs)
  summary_data <- tmp %>%
    group_by(Subject) %>%
    # z-score RTs within each subject, across all conditions
    mutate(scaled_RT = scale(RT),
           windsor = ifelse(scaled_RT >= 2, sd(RT) * 2, RT)) %>%
    group_by(Subject, A, B) %>%
    summarise(meanRT = mean(RT),
              medianRT = median(RT),
              invTr = mean(1 / RT),
              logTr = mean(log(RT)),
              trim_Max = (sum(RT) - max(RT)) / (n() - 1),
              mean_1sd = mean(RT[scaled_RT <= 1]),
              mean_1.5sd = mean(RT[scaled_RT <= 1.5]),
              windsor = mean(windsor),
              cut_2500 = mean(RT[RT < 2500]),
              cut_2000 = mean(RT[RT < 2000]),
              cut_1500 = mean(RT[RT < 1500]),
              cut_1250 = mean(RT[RT < 1250]),
              cut_1000 = mean(RT[RT < 1000])
    )
  # Reshape to tidy format, split the DF based on the measurement type,
  # and run ANOVA for each type of measure.
  aov_results <- summary_data %>%
    gather(measure, RT, -Subject, -A, -B) %>%
    split(.$measure) %>%
    map(~aov_ez("Subject", "RT", data = ., within = c("A", "B")))
  # Pull out the p-values for A (1st term), B (2nd) and A x B (3rd)
  pA <- map_dbl(aov_results %>% modify_depth(1, "anova_table") %>% map("Pr(>F)"), 1)
  pB <- map_dbl(aov_results %>% modify_depth(1, "anova_table") %>% map("Pr(>F)"), 2)
  pAxB <- map_dbl(aov_results %>% modify_depth(1, "anova_table") %>% map("Pr(>F)"), 3)
  return(list("pA" = pA, "pB" = pB, "pAxB" = pAxB))
}</code></pre>
<p>The function returns a list of the p-values for each ANOVA term for each different type of measure. It’s not enough to run that just once, of course; we need to run it over and over again. We’ll replicate Ratcliff’s first set of simulations. He simulated data from 32 subjects, from a 2 X 2 factorial design, with 7 trials per condition. The ex-Gaussian was generated with the normal part having a mean of 400 ms and standard deviation of 40 ms, and the exponential part having a tau of 200 ms. Ratcliff added a main effect of 30 ms to one factor only, so that both the interaction and the other main effect were null, and ran the simulations 1000 times. To show the effect of outliers, he ran the simulations twice: once with 10% outliers, and once with no outliers.</p>
<pre class="r"><code>nSims <- 1000
no_outliers <- replicate(nSims, runSims(mainEffect = 30))
Apow_no <- do.call(rbind, no_outliers[1, ])
# p-values for the null main effect of B and the AxB interaction
Bpow_no <- do.call(rbind, no_outliers[2, ])
AxBpow_no <- do.call(rbind, no_outliers[3, ])

ten_perc <- replicate(nSims, runSims(mainEffect = 30, outProb = 0.1))
Apow_ten <- do.call(rbind, ten_perc[1, ])
Bpow_ten <- do.call(rbind, ten_perc[2, ])
AxBpow_ten <- do.call(rbind, ten_perc[3, ])

# combine output into single frame and convert to 1s and 0s (sig versus not sig)
all_ps <- data.frame(rbind(Apow_no, Apow_ten))
all_ps <- data.frame((all_ps <= .05) + 0)
all_ps$outliers <- rep(c("None", "10%"), each = nSims)</code></pre>
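<p>The conversion step at the end of that block is what turns p-values into a power estimate: each p-value becomes 1 if it is significant and 0 otherwise, so the mean of a column is the proportion of significant tests. A minimal sketch with a made-up matrix of p-values (4 simulations, 2 measures):</p>
<pre class="r"><code># Hypothetical p-values: 4 simulations (rows) x 2 measures (columns)
ps <- matrix(c(0.01, 0.20,
               0.04, 0.03,
               0.30, 0.06,
               0.02, 0.50),
             ncol = 2, byrow = TRUE,
             dimnames = list(NULL, c("meanRT", "medianRT")))

sig <- (ps <= .05) + 0  # logical matrix coerced to 1 (significant) or 0
prop_sig <- colMeans(sig)
prop_sig                # meanRT 0.75, medianRT 0.25</code></pre>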
<pre class="r"><code>library(plotly)  # provides ggplotly()

both_outliers <- all_ps %>%
  gather(measure, percent, -outliers) %>%
  mutate(measure = fct_relevel(measure, "meanRT", "cut_2500", "cut_2000",
                               "cut_1500", "cut_1250", "cut_1000", "logTr",
                               "invTr", "trim_Max", "medianRT", "mean_1.5sd",
                               "mean_1sd", "windsor")) %>%
  ggplot(aes(x = measure, y = percent, colour = outliers)) +
  stat_summary(fun.y = "mean", geom = "point", size = 4) +
  ggtitle("30ms main effect in mu") +
  labs(y = "Proportion of significant tests p < .05 ") +
  ylim(0, 1) +
  scale_color_brewer(palette = "Set1") +
  scale_x_discrete(labels = c("mean", "2.5s", "2s", "1.5s", "1.25s", "1s",
                              "log(RT)", "1/RT", "trim", "median", "1.5 SD",
                              "1 SD", "Wind")) +
  theme_minimal()
ggplotly(both_outliers)</code></pre>
<p>[Interactive plot: “30ms main effect in mu”. x-axis: measure; y-axis: proportion of significant tests at the .05 level; colour: outliers (None or 10%). The plotted values are listed below.]</p>
<table>
<thead>
<tr><th>Measure</th><th>No outliers</th><th>10% outliers</th></tr>
</thead>
<tbody>
<tr><td>mean</td><td>0.571</td><td>0.180</td></tr>
<tr><td>2.5s</td><td>0.578</td><td>0.191</td></tr>
<tr><td>2s</td><td>0.590</td><td>0.285</td></tr>
<tr><td>1.5s</td><td>0.603</td><td>0.448</td></tr>
<tr><td>1.25s</td><td>0.650</td><td>0.547</td></tr>
<tr><td>1s</td><td>0.704</td><td>0.655</td></tr>
<tr><td>log(RT)</td><td>0.753</td><td>0.413</td></tr>
<tr><td>1/RT</td><td>0.877</td><td>0.689</td></tr>
<tr><td>trim</td><td>0.739</td><td>0.349</td></tr>
<tr><td>median</td><td>0.550</td><td>0.342</td></tr>
<tr><td>1.5 SD</td><td>0.728</td><td>0.475</td></tr>
<tr><td>1 SD</td><td>0.780</td><td>0.569</td></tr>
<tr><td>Wind</td><td>0.597</td><td>0.378</td></tr>
</tbody>
</table>
<p>The simulation results match up nicely with Ratcliff’s. We can see clearly that all methods have worse power in the presence of 10% outliers. The effect of outliers shrinks as the fixed cutoffs become stricter: with the most restrictive cutoff, which discards anything over 1s, power is very similar with and without outliers. The inverse transform (1/RT) has the highest power overall, and still performs well with outliers. The mean, the 2.5s cutoff mean, and the trimmed mean (maximum RT deleted) suffer the most from outliers, with drops in power of almost 40 percentage points.</p>
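<p>To make the size of those drops concrete, the sketch below takes the proportions of significant tests from this run for five of the measures and computes the absolute loss in power. Only the subtraction and sorting are new; the numbers are the ones reported above.</p>
<pre class="r"><code># Proportions of significant tests from this run of the simulation
power <- data.frame(
  measure  = c("meanRT", "cut_2500", "trim_Max", "cut_1000", "invTr"),
  none     = c(0.571, 0.578, 0.739, 0.704, 0.877),
  ten_perc = c(0.180, 0.191, 0.349, 0.655, 0.689)
)

# Absolute loss in power when 10% outliers are present
power$drop <- power$none - power$ten_perc

# Largest drops first
power[order(-power$drop), ]</code></pre>
<p>The mean, the 2.5s cutoff and the trimmed mean each lose about .39, the inverse transform loses about .19, and the 1s cutoff loses only about .05.</p>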
<p>Any effect on Type 1 error for the other, null main effect?</p>
<pre class="r"><code>null_ps <- data.frame(rbind(Bpow_no, Bpow_ten))
null_ps <- data.frame((null_ps <= .05) + 0)
null_ps$outliers <- rep(c("None", "10%"), each = nSims)

null_plot <- null_ps %>%
  gather(measure, percent, -outliers) %>%
  mutate(measure = fct_relevel(measure, "meanRT", "cut_2500", "cut_2000",
                               "cut_1500", "cut_1250", "cut_1000", "logTr",
                               "invTr", "trim_Max", "medianRT", "mean_1.5sd",
                               "mean_1sd", "windsor")) %>%
  ggplot(aes(x = measure, y = percent, colour = outliers)) +
  stat_summary(fun.y = "mean", geom = "point", size = 4) +
  ggtitle("Null main effect") +
  labs(y = "Proportion of significant tests p < .05") +
  scale_color_brewer(palette = "Set1") +
  scale_x_discrete(labels = c("mean", "2.5s", "2s", "1.5s", "1.25s", "1s",
                              "log(RT)", "1/RT", "trim", "median", "1.5 SD",
                              "1 SD", "Wind")) +
  ylim(0, 1) +
  # dashed line at the nominal alpha, grey band from .025 to .075
  geom_hline(yintercept = 0.05, linetype = "dashed") +
  annotate("rect", xmin = 0, xmax = 14, ymin = 0.025, ymax = 0.075, alpha = 0.4) +
  theme_minimal()
ggplotly(null_plot)</code></pre>
<p>[Interactive plot: “Null main effect”. x-axis: measure; y-axis: proportion of significant tests at the .05 level; colour: outliers (None or 10%); dashed line at 0.05 with a grey band from 0.025 to 0.075. The plotted values are listed below.]</p>
<table>
<thead>
<tr><th>Measure</th><th>No outliers</th><th>10% outliers</th></tr>
</thead>
<tbody>
<tr><td>mean</td><td>0.053</td><td>0.036</td></tr>
<tr><td>2.5s</td><td>0.051</td><td>0.047</td></tr>
<tr><td>2s</td><td>0.055</td><td>0.040</td></tr>
<tr><td>1.5s</td><td>0.049</td><td>0.034</td></tr>
<tr><td>1.25s</td><td>0.054</td><td>0.041</td></tr>
<tr><td>1s</td><td>0.049</td><td>0.044</td></tr>
<tr><td>log(RT)</td><td>0.047</td><td>0.039</td></tr>
<tr><td>1/RT</td><td>0.054</td><td>0.041</td></tr>
<tr><td>trim</td><td>0.052</td><td>0.046</td></tr>
<tr><td>median</td><td>0.057</td><td>0.036</td></tr>
<tr><td>1.5 SD</td><td>0.042</td><td>0.037</td></tr>
<tr><td>1 SD</td><td>0.045</td><td>0.047</td></tr>
<tr><td>Wind</td><td>0.045</td><td>0.045</td></tr>
</tbody>
</table>
<p>For all measures, the results are close to the nominal alpha of .05; I’ve marked the region from .025 to .075 with a grey rectangle. The interaction term comes out much the same. So in other words, most of the procedures increase power without increasing the rate of false positives.</p>
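<p>As a rough guide to how much wobble around .05 to expect from simulation error alone, here is a quick sketch. It assumes the 1000 tests for a given measure are independent and that the true false-positive rate is exactly .05; the normal approximation is my choice, with the exact binomial version alongside it.</p>
<pre class="r"><code>nSims <- 1000
alpha <- 0.05

# Standard error of a proportion estimated from nSims independent tests
se <- sqrt(alpha * (1 - alpha) / nSims)

# Approximate 95% range for the observed false-positive rate
approx_range <- alpha + c(-1.96, 1.96) * se
round(approx_range, 3)

# Exact binomial quantiles, expressed as proportions
exact_range <- qbinom(c(0.025, 0.975), nSims, alpha) / nSims
exact_range</code></pre>
<p>Under those assumptions, observed rates between roughly .036 and .064 are what sampling error by itself would produce, which is a narrower band than the grey rectangle.</p>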
<p>A couple of things I always wondered about with these simulations. Data that come from within the same subject are normally correlated, and there’s no attempt to manipulate that here. The number of trials per condition is very low at 7, which seems like it may exacerbate the influence of outliers in general. The size of the main effect should probably follow a normal distribution rather than being a flat 30ms for everyone. And the between-subject variability in general might be better represented by a normal distribution too. I wouldn’t expect all of these to influence the main conclusions, but I’ll have a look at that at a later date.</p>
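<p>For what it’s worth, swapping the uniform draws for normal ones would only take a couple of lines. The sketch below is one possible way to do it, not something tested here: the two standard deviations (25 ms and 10 ms) are placeholder values I made up, not figures from Ratcliff (1993).</p>
<pre class="r"><code>set.seed(4)  # arbitrary seed
nSubs <- 32

# Hypothetical spreads: neither SD comes from Ratcliff (1993)
subMeans <- rnorm(nSubs, mean = 400, sd = 25)   # between-subject variability
subEffects <- rnorm(nSubs, mean = 30, sd = 10)  # per-subject effect of A

# Per-subject mu for the two levels of factor A
subject_params <- data.frame(Subject = factor(1:nSubs),
                             mu_A1 = subMeans,
                             mu_A2 = subMeans + subEffects)
head(subject_params)</code></pre>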
</div>
<div class="footnotes">
<hr />
<ol>
<li id="fn1"><p>Ratcliff, R. (1993). Methods for dealing with reaction time outliers. Psychological Bulletin, 114(3), 510-532. <a href="http://dx.doi.org/10.1037/0033-2909.114.3.510">doi:10.1037/0033-2909.114.3.510</a><a href="#fnref1">↩</a></p></li>
<li id="fn2"><p>Which isn’t pining for the fjords.<a href="#fnref2">↩</a></p></li>
</ol>
</div>