SciELO - Scientific Electronic Library Online


Revista médica de Chile

Print version ISSN 0034-9887


DOMENECH, RAÚL J. The uncertainties of statistical “significance”. Rev. méd. Chile [online]. 2018, vol.146, n.10, pp.1184-1189. ISSN 0034-9887.

Statistical inference was introduced by Fisher and Neyman-Pearson more than 90 years ago to assess whether a difference in results between groups is due to randomness or is a real, “significant” difference. The usual procedure is to compute the probability (P) of obtaining a difference at least as large as the one observed if the null hypothesis were true, that is, if there were no real difference beyond the inevitable sampling variability. If this probability is high we accept the null hypothesis and infer that there is no real difference, but if P is low (P < 0.05) we reject the null hypothesis and infer that there is a “significant” difference. However, a large number of discoveries made with this method are not reproducible. Statisticians have described the deficiencies of the method and warned researchers that P is a very unreliable measure. Two uncertainties of the “significance” concept are described in this review: a) the inability of a P value to rule out the null hypothesis; b) the low probability of reproducing a P value after an exact replication of the experiment. Because “significance” has been discredited, the American Statistical Association recently stated that P values do not provide a good measure of evidence for a hypothesis. Statisticians recommend never using the word “significant” because it is misleading. Instead, the exact P value should be stated along with the effect size and confidence intervals. No P value greater than 0.001 should be taken as a demonstration that something was discovered. Currently, several alternatives are being studied to replace the classical concepts.
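
Point b) can be illustrated with a small simulation. The sketch below is not taken from the article; it repeats the same two-group experiment many times under assumed conditions (a true effect of 0.5 standard deviations, 32 subjects per group, a known standard deviation of 1 and a two-sided z-test) and records the P value of each replication. Under these assumptions roughly half of the exact replications should fall below 0.05, and the P values spread over several orders of magnitude.

```python
import math
import random
import statistics


def two_sided_p(z):
    """Two-sided P value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_p_values(effect=0.5, n=32, replications=2000, seed=1):
    """Replicate the same experiment and return one P value per run.

    All defaults are illustrative assumptions, not values from the article.
    """
    rng = random.Random(seed)
    # Standard error of the difference between two means when sd = 1 is known.
    se = math.sqrt(2.0 / n)
    p_values = []
    for _ in range(replications):
        control = [rng.gauss(0.0, 1.0) for _ in range(n)]
        treated = [rng.gauss(effect, 1.0) for _ in range(n)]
        z = (statistics.fmean(treated) - statistics.fmean(control)) / se
        p_values.append(two_sided_p(z))
    return p_values


p_values = simulate_p_values()
share_below_005 = sum(p < 0.05 for p in p_values) / len(p_values)
deciles = statistics.quantiles(p_values, n=10)

print(f"share of replications with P < 0.05: {share_below_005:.2f}")
print(f"10th and 90th percentile of P: {deciles[0]:.4f}, {deciles[-1]:.4f}")
```

Changing `effect` or `n` shows how strongly the spread of P values depends on the power of the design, which is one reason the abstract asks for effect sizes and confidence intervals to be reported alongside the exact P value.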

Keywords: Biostatistics; Confidence Intervals; Reproducibility of Results.
