25 Matching Annotations

Mar 2026
ianarawjo.com ianarawjo.com

Fair Statistical Communication in HCI

10
1. ianarawjo 30 Mar 2026
  
  in Public
  
  As evidenced by numerous studies on statistical cognition (Kline, 2004; Beyth-Marom et al, 2008), even trained scientists have a hard time interpreting p-values, which frequently leads to misleading or incorrect conclusions.
  
  p-value is misinterpreted and confusing
  
  p-value is misleading ai-user-approved
2. ianarawjo 30 Mar 2026
  
  in Public
  
  few researchers can resist the temptation to conclude that there is no effect, a common fallacy called "accepting the null" which had frequently led to misleading or wrong scientific conclusions (Dienes, 2014, p.1).
  
  p-value is misinterpreted and confusing
  
  p-value is misleading ai-user-approved
3. ianarawjo 30 Mar 2026
  
  in Public
  
  Again, p is the probability of seeing results as extreme (or more extreme) as those actually observed if the null hypothesis were true. So p is computed under the assumption that the null hypothesis is true. Yet it is common for researchers, teachers and even textbooks to think of p as the probability of the null hypothesis being true (or equivalently, of the results being due to chance), an error called the "fallacy of the transposed conditional" (Haller and Krauss, 2002; Cohen, 1994, p.999).
  
  p-value is misinterpreted and confusing
  
  p-value is misleading ai-user-approved
4. ianarawjo 30 Mar 2026
  
  in Public
  
  Many researchers fail to appreciate that p-values are unreliable and vary widely across replications.
  
  p-value is misinterpreted and confusing
  
  p-value is misleading ai-user-approved
5. ianarawjo 30 Mar 2026
  
  in Public
  
  Providing non-misleading interpretations of figures with confidence intervals requires judgment, and no mechanical decision procedure can carry out this job better than a thoughtful investigator.
  
  Estimation is necessary but not sufficient
  
  Estimation is necessary but not sufficient ai-user-approved
6. ianarawjo 30 Mar 2026
  
  in Public
  
  NHST as it is carried out today consists of this incoherent mix of Fisher and Neyman–Pearson methods (Gigerenzer, 2004).
  
  p-value is misinterpreted and confusing
  
  ai-pending p-value is misleading
7. ianarawjo 30 Mar 2026
  
  in Public
  
  p-values give a seductive illusion of certainty and truth (Cumming, 2012, Chap. 1). The sacred α = .05 criterion amplifies this illusion, since results end up being either "significant" or "non-significant".
  
  p-value is misinterpreted and confusing
  
  ai-pending p-value is misleading
8. ianarawjo 30 Mar 2026
  
  in Public
  
  Estimation seems much more likely to promote clear statistical thinking.
  
  Need to change our way of thinking
  
  change thinking ai-user-approved
9. ianarawjo 30 Mar 2026
  
  in Public
  
  Decades spent educating researchers have had little or no influence on beliefs and practice (Schmidt and Hunter, 1997, pp.20–22).
  
  Calls for reform fall on deaf ears
  
  reform-failure ai-user-approved
10. ianarawjo 30 Mar 2026
  
  in Public
  
  NHST has been severely criticized for more than 50 years by end users to whom fair statistical communication matters.
  
  Calls for reform fall on deaf ears
  
  reform-failure ai-user-approved
Visit annotations in context

Tags

Estimation is necessary but not sufficient

ai-pending

reform-failure

ai-user-approved

p-value is misleading

change thinking

Annotators

ianarawjo

URL

ianarawjo.com/annotation-test/fairstats-last.pdf
ianarawjo.com ianarawjo.com

Why Hypothesis Tests Are Essential for Psychological Science

2
1. ianarawjo 30 Mar 2026
  
  in Public
  
  This assessment raises two issues. First, it is arbitrary. If 10 of the 15 CIs included the predicted values, would the results also support the theory, or instead refute it? If one instead used 99% CIs, would positive results for 12 of the 15 predictions be enough to support the theory? This arbitrariness arises because CIs offer no principled method for generating an inference regarding the theory.
  
  Estimation is too messy / complex and not clear enough
  
  Messiness of estimation as a solution ai-user-approved
2. ianarawjo 30 Mar 2026
  
  in Public
  
  two out of three necessary conditions for testing theory are missing.
  
  Estimation is too messy / complex and not clear enough
  
  Messiness of estimation as a solution ai-user-approved
Visit annotations in context

Tags

ai-user-approved

Messiness of estimation as a solution

Annotators

ianarawjo

URL

ianarawjo.com/annotation-test/morey-et-al-2014-why-hypothesis-tests-are-essential-for-psychological-science.pdf
ianarawjo.com ianarawjo.com

Rethinking statistical analysis methods for CHI

4
1. ianarawjo 30 Mar 2026
  
  in Public
  
  To illustrate this point Oakes posed a series of true/false questions regarding the interpretation of p-vales to seventy experienced researchers and discovered that only two had a sound understanding of the underlying concept of significance [25].
  
  Sentences where they say people don't really know the statistics, they just apply tests without thought because it's tradition
  
  ai-user-approved
2. ianarawjo 30 Mar 2026
  
  in Public
  
  failure to check assumptions about the data required by particular tests, over-testing and using inappropriate tests
  
  Sentences where they say people don't really know the statistics, they just apply tests without thought because it's tradition
  
  ai-user-approved
3. ianarawjo 30 Mar 2026
  
  in Public
  
  abusing statistical tests, making illogical arguments as a result of tests, deriving inappropriate conclusions from nonsignificant results, and confusing the size of p-values with effect sizes.
  
  Sentences where they say people don't really know the statistics, they just apply tests without thought because it's tradition
  
  ai-user-approved
4. ianarawjo 30 Mar 2026
  
  in Public
  
  This approach, fiercely promoted by Fisher in the 1930's [9], has become the gold standard in many disciplines including quantitative evaluations in HCI. However, the approach is rather counter-intuitive; many researchers misinterpret the meaning of the p-value.
  
  Sentences where they say people don't really know the statistics, they just apply tests without thought because it's tradition
  
  ai-user-approved
Visit annotations in context

Tags

ai-user-approved

Annotators

ianarawjo

URL

ianarawjo.com/annotation-test/2207676.2208557.pdf
sites.stat.columbia.edu sites.stat.columbia.edu

Untitled document

2
1. ianarawjo 29 Mar 2026
  
  in Public
  
  the solution is not to reform p-values or to replace them with some other statistical summary or threshold, but rather to move toward a greater acceptance of uncertainty and embracing of variation.
  
  Where it's mentioned how to address the problems with p-values
  
  ai-pending Solutions to p-value crisis
2. ianarawjo 29 Mar 2026
  
  in Public
  
  What, then, can and should be done? I agree with the ASA statement's final paragraph, which emphasizes the importance of design, understanding, and context—and I would also add measurement to that list.
  
  Where it's mentioned how to address the problems with p-values
  
  ai-pending Solutions to p-value crisis
Visit annotations in context

Tags

ai-pending

Solutions to p-value crisis

Annotators

ianarawjo

URL

sites.stat.columbia.edu/gelman/research/published/asa_pvalues.pdf
courses.media.mit.edu courses.media.mit.edu

Untitled document

7
1. ianarawjo 29 Mar 2026
  
  in Public
  
  the psychology research community has been strongly questioning the value of NHST in psychology for some years now [6] and calling for a more meaningful reporting of statistical inference based on effect sizes, confidence intervals and Bayesian reasoning [9].
  
  Mentioning the problems with p-values
  
  ai-pending p-values
2. ianarawjo 29 Mar 2026
  
  in Public
  
  Similarly, if the significance level is set at 0.05, then this is the probability of the data occurring by chance when there is no experimental effect, namely one in twenty times. The more tests that are done on a particular dataset, the more likely it is that some chance variation will be extreme enough to seem like significance.
  
  Mentioning the problems with p-values
  
  ai-pending p-values
3. ianarawjo 29 Mar 2026
  
  in Public
  
  Violation of the assumptions of any statistical test can produce p values that bear little relation to the actual probabilities of outcomes and hence comparison to the significance level of 0.05 is meaningless.
  
  Mentioning the problems with p-values
  
  ai-pending p-values
4. ianarawjo 29 Mar 2026
  
  in Public
  
  for an analysis to be sound, it is necessary that in the tests performed the probabilities of outcomes are accurately reflected in the p values produced by the tests. If this is not the case, then the NHST argument form is severely weakened.
  
  Mentioning the problems with p-values
  
  ai-pending p-values
5. ianarawjo 29 Mar 2026
  
  in Public
  
  NHST is the most commonly encountered form of statistical inference and is what is usually associated with producing a null hypothesis, then testing it to give some statistic such as a t value, and then turning the statistic into a p value.
  
  Mentioning the problems with p-values
  
  ai-pending p-values
6. ianarawjo 29 Mar 2026
  
  in Public
  
  properly reported non-significant results can help future researchers to provide estimates of effect sizes and associated confidence intervals.
  
  Effect sizes are mentioned
  
  ai-pending Effect sizes
7. ianarawjo 29 Mar 2026
  
  in Public
  
  calling for a more meaningful reporting of statistical inference based on effect sizes, confidence intervals and Bayesian reasoning [9].
  
  Effect sizes are mentioned
  
  ai-pending Effect sizes
Visit annotations in context

Tags

p-values

ai-pending

Effect sizes

Annotators

ianarawjo

URL

courses.media.mit.edu/2010spring/mas964/Week6/Cairns_2007.pdf

Ian Arawjo

Annotations: 25

Joined: March 29, 2026

Tags

Annotators

URL

Tags

Annotators

URL

Tags

Annotators

URL

Tags

Annotators

URL

Tags

Annotators

URL