BAYES FACTOR 235
Ashby, F. G., & Maddox, W. T. (1992). Complex decision rules in cat-
egorization: Contrasting novice and experienced performance. Jour-
nal of Experimental Psychology: Human Perception & Performance,
18, 50-71.
Augustin, T. (2008). Stevens’ power law and the problem of meaning-
fulness. Acta Psychologica, 128, 176.
Berger, J. O., & Berry, D. A. (1988). Analyzing data: Is objectivity
possible? American Scientist, 76, 159-165.
Bishop, Y. M. M., Fienberg, S. E., & Holland, P. W. (1975). Discrete
multivariate analysis: Theory and practice. Cambridge, MA: MIT
Press.
Clarke, F. R. (1957). Constant-ratio rule for confusion matrices in
speech communication. Journal of the Acoustical Society of America,
29, 715-720.
Cohen, J. (1994). The earth is round (p < .05). American Psychologist,
49, 997-1003.
Cumming, G., & Finch, S. (2001). A primer on the understanding, use,
and calculation of confidence intervals based on central and noncentral
distributions. Educational & Psychological Measurement, 61, 532-574.
Debner, J. A., & Jacoby, L. L. (1994). Unconscious perception: Atten-
tion, awareness, and control. Journal of Experimental Psychology:
Learning, Memory, & Cognition, 20, 304-317.
Dehaene, S., Naccache, L., Le Clec’H, G., Koechlin, E., Muel-
ler, M., Dehaene-Lambertz, G., et al. (1998). Imaging uncon-
scious semantic priming. Nature, 395, 597-600.
Edwards, W., Lindman, H., & Savage, L. J. (1963). Bayesian statisti-
cal inference for psychological research. Psychological Review, 70,
193-242.
Egan, J. P. (1975). Signal detection theory and ROC-analysis. New
York: Academic Press.
Fechner, G. T. (1966). Elements of psychophysics. New York: Holt,
Rinehart & Winston. (Original work published 1860)
García-Donato, G., & Sun, D. (2007). Objective priors for hypothesis
testing in one-way random effects models. Canadian Journal of Sta-
tistics, 35, 303-320.
Gelman, A., Carlin, J. B., Stern, H. S., & Rubin, D. B. (2004). Bayes-
ian data analysis (2nd ed.). Boca Raton, FL: Chapman & Hall.
Gillispie, C. C., Fox, R., & Grattan-Guinness, I. (1997). Pierre-
Simon Laplace, 1749–1827: A life in exact science. Princeton, NJ:
Princeton University Press.
Gönen, M., Johnson, W. O., Lu, Y., & Westfall, P. H. (2005). The
Bayesian two-sample t test. American Statistician, 59, 252-257.
Goodman, S. N. (1999). Toward evidence-based medical statistics:
I. The p value fallacy. Annals of Internal Medicine, 130, 995-1004.
Green, D. M., & Swets, J. A. (1966). Signal detection theory and
psychophysics. New York: Wiley.
Grider, R. C., & Malmberg, K. J. (2008). Discriminating between
changes in bias and changes in accuracy for recognition memory of
emotional stimuli. Memory & Cognition, 36, 933-946.
Hawking, S. (Ed.) (2002). On the shoulders of giants: The great works
of physics and astronomy. Philadelphia: Running Press.
Hays, W. L. (1994). Statistics (5th ed.). Fort Worth, TX: Harcourt
Brace.
Jacoby, L. L. (1991). A process dissociation framework: Separating
automatic from intentional uses of memory. Journal of Memory &
Language, 30, 513-541.
Jeffreys, H. (1961). Theory of probability (3rd ed.). Oxford: Oxford
University Press.
Kass, R. E., & Raftery, A. E. (1995). Bayes factors. Journal of the
American Statistical Association, 90, 773-795.
Kass, R. E., & Wasserman, L. (1995). A reference Bayesian test for
nested hypotheses with large samples. Journal of the American Statis-
tical Association, 90, 928-934.
Killeen, P. R. (2005). An alternative to null-hypothesis significance
tests. Psychological Science, 16, 345-353.
Killeen, P. R. (2006). Beyond statistical inference: A decision theory
for science. Psychonomic Bulletin & Review, 13, 549-562.
Kline, R. B. (2004). Beyond significance testing: Reforming data
analysis methods in behavioral research. Washington, DC: American
Psychological Association.
Lee, M. D., & Wagenmakers, E.-J. (2005). Bayesian statistical infer-
ence in psychology: Comment on Trafimow (2003). Psychological
Review, 112, 662-668.
invariances and, as a consequence, overstate the evidence
against them.
It is reasonable to ask whether hypothesis testing is always
necessary. In many ways, hypothesis testing has been em-
ployed in experimental psychology too often and too hast-
ily, without sufficient attention to what may be learned by
exploratory examination for structure in data (Tukey, 1977).
To observe structure, it is often sufficient to plot estimates
of appropriate quantities along with measures of estimation
error (Rouder & Morey, 2005). As a rule of thumb, hypoth-
esis testing should be reserved for those cases in which the
researcher will entertain the null as theoretically interesting
and plausible, at least approximately.
Researchers willing to perform hypothesis testing must
realize that the endeavor is inherently subjective (Berger
& Berry, 1988). For any data set, the null will be superior
to some alternatives and inferior to others. Therefore, it
is necessary to commit to specific alternatives, with the
resulting evidence dependent to some degree on this com-
mitment. This commitment is essential to and unavoid-
able for sound hypothesis testing in both frequentist and
Bayesian settings. We advocate Bayes factors because
their interpretation is straightforward and natural. More-
over, in Bayesian analysis, the elements of subjectivity
are transparent rather than hidden (Wagenmakers, Lee,
Lodewyckx, & Iverson, 2008).
This commitment to specify judicious and reasoned al-
ternatives places a burden on the analyst. We have provided
default settings appropriate to generic situations. Nonethe-
less, these recommendations are just that and should not be
used blindly. Moreover, analysts can and should consider
their goals and expectations when specifying priors. Sim-
ply put, principled inference is a thoughtful process that
cannot be performed by rigid adherence to defaults.
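To make the default setting concrete, the following is a minimal numerical sketch of the one-sample JZS Bayes factor in favor of the null, computed by quadrature over the prior on the scale parameter g. The function name and the use of scipy's quad routine are ours for illustration; the sketch assumes the unit-scale default prior and is not the authors' implementation.

```python
import numpy as np
from scipy import integrate

def jzs_bf01(t, N):
    """Sketch of the one-sample JZS Bayes factor BF01 (null over
    alternative) for observed t statistic t and sample size N,
    under the default unit-scale prior on effect size."""
    nu = N - 1
    # Marginal likelihood under the null (up to a common constant).
    null_like = (1 + t**2 / nu) ** (-(nu + 1) / 2)

    def integrand(g):
        # Likelihood averaged over effect size, times the
        # inverse-chi-square(1) prior density on g.
        return ((1 + N * g) ** -0.5
                * (1 + t**2 / ((1 + N * g) * nu)) ** (-(nu + 1) / 2)
                * (2 * np.pi) ** -0.5 * g ** -1.5 * np.exp(-1 / (2 * g)))

    # Marginal likelihood under the alternative: integrate g over (0, inf).
    alt_like, _ = integrate.quad(integrand, 0, np.inf)
    return null_like / alt_like
```

Values of BF01 above 1 favor the null; values below 1 favor the alternative. For example, a t of 0 yields a Bayes factor greater than 1 (support for the null), while a large t yields one well below 1.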
There is no loss in dispensing with the illusion of ob-
jectivity in hypothesis testing. Researchers are acclimated
to elements of social negotiation and subjectivity in sci-
entific endeavors. Negotiating the appropriateness of vari-
ous alternatives is no more troubling than negotiating the
appropriateness of other elements, including design, oper-
ationalization, and interpretation. As part of the everyday
practice of psychological science, we have the communal
infrastructure to evaluate and critique the specification of
alternatives. This view of negotiated alternatives is vastly
preferable to the current practice, in which significance
tests are mistakenly regarded as objective. Even though
inference is subjective, we can agree on the boundaries
of reasonable alternatives. The sooner we adopt inference
based on specifying alternatives, the better.
AUTHOR NOTE
We are grateful for valuable comments from Michael Pratte, E.-J.
Wagenmakers, and Peter Dixon. This research was supported by NSF
Grant SES-0720229 and NIMH Grant R01-MH071418. Please address
correspondence to J. N. Rouder, Department of Psychological Sciences,
210 McAlester Hall, University of Missouri, Columbia, MO 65211.
REFERENCES
Akaike, H. (1974). A new look at the statistical model identification.
IEEE Transactions on Automatic Control, 19, 716-723.