A Guide to Robust Statistical Methods in Neuroscience

Rand R. Wilcox1, Guillaume A. Rousselet2

1 Department of Psychology, University of Southern California, Los Angeles, California, 2 Institute of Neuroscience and Psychology, College of Medical, Veterinary, and Life Sciences, University of Glasgow, Glasgow
Publication Name:  Current Protocols in Neuroscience
Unit Number:  Unit 8.42
DOI:  10.1002/cpns.41
Online Posting Date:  January, 2018


There is a vast array of new and improved methods for comparing groups and studying associations that offer the potential for substantially increasing power, providing improved control over the probability of a Type I error, and yielding a deeper and more nuanced understanding of data. These techniques address four insights into when and why conventional methods can be unsatisfactory. For the non‐statistician, however, the sheer number of methods now available can seem daunting. This unit briefly reviews when and why conventional methods can have relatively low power and yield misleading results. The main goal is to suggest some general guidelines regarding when, how, and why certain modern techniques might be used. © 2018 by John Wiley & Sons, Inc.

Keywords: non‐normality; heteroscedasticity; skewed distributions; outliers; curvature


Table of Contents

  • Introduction
  • Insights Regarding Conventional Methods
  • Dealing with Violation of Assumptions
  • Comparing Groups and Measures of Association
  • Some Illustrations
  • A Suggested Guide
  • Concluding Remarks
  • Acknowledgments
  • Literature Cited
  • Figures




