Editorial Revealing an Exchange Between Authors and Reviewers About Statistical Significance
Donna E. Alvermann
David Reinking
As editors, we are privy to exchanges between authors and reviewers during the review process. To preserve confidentiality, these exchanges are not revealed beyond that process, even though they are often enlightening and often produce interesting and useful clarifications of issues in the field. Recently, however, we were led to make an exception, which prompts this editorial. During the review of the article in this issue authored by Barbara M. Taylor, P. David Pearson, Debra S. Peterson, and Michael C. Rodriguez, there was an exchange of views between the authors and three members of RRQ's editorial review board, who were asked to respond to a concern raised by one of the original reviewers. At our request, the authors and the reviewers have graciously agreed to share their exchange. Their respective contributions appear after this editorial, although each has been slightly revised and edited for publication.
To introduce and explain this exchange, we wish to provide readers with background about how it emerged in the context of the review process. During the initial review of the manuscript, one of the reviewers questioned accepting a p value of .07 as statistically significant. When submitting a subsequent revision, the authors argued that the circumstances of their research, and particularly their statistical procedures, justified an alpha above the customary value of .05. Realizing that this might be an important issue given new views about statistical analyses (see New Directions for Research, Reading Research Quarterly, Volume 39, No. 1) and sensing that we did not possess sufficient statistical expertise to make a final decision, we sought advice from three statistical experts who serve on RRQ's editorial review board. We asked them specifically to address this issue in the context of the authors' manuscript. They did, and we shared their anonymous responses with the authors.
After receiving the experts' reviews, Taylor et al. submitted a revised manuscript in which they responded to the reviewers' comments. The authors also submitted a version of their research paper that used traditional statistical conventions (i.e., it did not report as significant any finding whose p value exceeded .05). After carefully considering the views of the three board members and of the authors, and following much discussion in our weekly editorial meetings, we decided to accept the authors' rationale and notified them that their version of the manuscript reporting results based on a p value of .07 would be published. However, realizing that the issue and the views expressed about it in this exchange might be useful for the field, we requested permission from the authors and from the three board members to publish the exchange.
We hope that this exchange will help clarify some of the issues surrounding levels of statistical significance. We hope, too, that it will provide a glimpse of the review process, which typically operates confidentially among authors, reviewers, and editors. Further, we wish to emphasize that publishing this exchange is not meant to detract from or lessen the potential impact of the article around which the exchange occurred. Our decision to publish the article was not contingent on publishing this exchange, and we stand by our decision to add this article to the archival literature of the field. Finally, as always, we invite readers who wish to comment on anything published in RRQ to submit a letter to the editor or a commentary, as outlined in the guidelines for authors.
Richard Lomax's review
I was asked to respond to the editors' specific queries about the statistical analysis of the manuscript. Two issues come to mind.
First, over the past decade, there has been a debate among statisticians in education and the behavioral sciences about whether to totally eliminate significance testing in favor of reporting effect sizes. One camp says just use significance testing, one camp says just use effect sizes, and a middle-of-the-road opinion is to report both. Some journals now require the reporting of effect sizes. The authors of the RRQ manuscript do not make any such argument. Also, this debate has never mentioned the use of significance levels above .05. Second, here is the rationale used by the authors at the end of the methods section:
Because of the improved estimation enabled by HLM [Hierarchical Linear Models], including the use of maximum likelihood and empirical Bayes estimates, interpretation of statistical results can be broadened to include a larger p value associated with statistical tests. Furthermore, statistical results with p values at or near .10 should be included in interpretation and explored in further studies with smaller numbers of cases (e.g., with fewer teachers or schools) because such results indicate that there are relationships that merit further exploration.