Technical Rigour, Exaggeration, and Peer Reviewing in the Publishing of Medical Research: Dangerous Tides and a Case Study
Ognjen Arandjelović*
School of Computer Science, University of St Andrews, UK
Submission: September 13, 2017; Published: November 10, 2017
*Corresponding author: Ognjen Arandjelović, School of Computer Science, University of St Andrews, St Andrews KY16 9SX, Fife, Scotland, UK, Tel: +44 (0)133-446-2824; Email: [email protected]
How to cite this article: Ognjen A. Technical Rigour, Exaggeration, and Peer Reviewing in the Publishing of Medical Research: Dangerous Tides and a Case Study. Curre Res Diabetes & Obes J. 2017; 4(4): 555644. DOI: 10.19080/CRDOJ.2017.04.555644
Abstract
To say that accuracy is of paramount importance in academic publishing is little short of a platitude. Nevertheless, the question of how accuracy can be ensured is not a trivial one in the real world, given the landscape of competing interests and a plethora of practical constraints and limitations. Though the primary responsibility lies with authors themselves, the framework of peer review, managed and overseen by editors, was developed in part to ensure robustness and resilience of the system as a whole. Analysing how often and why the system fails is inherently difficult, as in most instances the specific processes are not visible to parties not directly involved in the handling of a prospective paper. For this reason it is important to encourage the reporting of case studies which should inform avenues for potential improvement. The present manuscript describes an example which illustrates well a number of flaws of the peer review framework as it exists today. In particular, I detail a number of serious errors in an article recently published in a leading journal, including conclusions not supported by evidence, methodological flaws, and ill-conceived statistical analysis; the alarmist and exaggerated manner in which the findings were communicated in the media; and the poor handling of these issues by the journal’s editorial board. In conclusion, the research community should exert a concerted effort to report, discuss, and document instances of criticisms of published scientific work being silenced.
Keywords: Accuracy; Responsibility; Exaggeration; Media; Peer review
Introduction
We live in an age of rapid development of science which places increasing demands for specialization even for researchers. For laymen, the nuances of cutting edge research are all but impenetrable. This environment places the scientific community in a position of power which demands responsibility. It is imperative that findings are communicated accurately, using appropriate language and terminology, in a manner which is clear but does not misrepresent, mislead, or strip away the relevant substance. While there can be little doubt that the foremost responsibility lies with the authors of a piece of research, the modern framework for the reporting and dissemination of scientific research and findings has evolved to include a robust and multilayered system of checks in order to ensure that the aforementioned ethical and scientific standards are met. The process of peer review is a key part of this framework. The broad principles this process relies on are those of expertise, independence, and complementarity. In particular, an editor in charge of processing a candidate manuscript solicits advice from a number of reviewers. These are selected on the basis of their expertise in specific areas, which may be complementary in their nature, so as to cover the entirety of the technical content of the manuscript. For example, an oncologist and a biomedical statistician may be invited to review a work on cancer epidemiology. Moreover, as much as possible in practice, the reviewers should be independent from one another and have no conflicts of interest with the authors. The reviewers’ opinions and recommendations are finally considered and weighed by the editor, who makes the final decision on the acceptance of the manuscript. Though there are good theoretical and practical reasons to have confidence that the peer reviewing process generally performs well, several weaknesses and structural problems have been highlighted by researchers.
However, the process is notoriously difficult to study rigorously as most of it lacks transparency and accountability. Rejected submissions and the corresponding reviewer and editor reports are inaccessible to the public. Some of these submissions never end up published; others are submitted to alternative venues without their prior submission history becoming known or subject to scrutiny and corrective feedback. Arguably, an even more pressing problem is that of work which does get published but, due to errors or omissions, fails to meet the aforementioned quality standards. In principle, the spirit of scientific inquiry and academia in general should facilitate an easy remedy in the form of follow-up debate and openness to criticism. Yet, as a number of high profile cases have illustrated poignantly, such a level of intellectual integrity is not always found even in journals which are generally highly regarded, with concerns and criticisms being ignored and shut down [1]. My aim with the present article is to illustrate, with a specific case study, some of the cultural problems which undermine the credibility of peer review and the academic community. I start on a technical note, by detailing a number of methodological errors in an article recently published in a leading journal, follow with an analysis of how the findings of the research were communicated to the community and the general public, and finish by describing how serious concerns about the content of the article were dealt with by the editorial board of the journal in question.
Background
Recently the European Journal of Endocrinology published an article by Löfvenborg et al. [2] entitled “Sweetened beverage intake and risk of latent autoimmune diabetes in adults (LADA) and type 2 diabetes”. I first learnt of this article through conventional, mainstream media sources, and was struck by the reports of findings which I found surprising, particularly considering the claimed magnitude of the observed effects. Used to poorly informed and misleading reporting of science in the popular media, I expected the message from the authors to be different, but after listening to an interview with the lead investigator, what I got to hear was in substance the same story. Hence I decided to examine the article in some detail, expecting to find either that the authors misrepresented their work (but that the scientific claims in the actual paper are sound) or a truly interesting new discovery. Though in my opinion the former of the two options is highly objectionable morally, in either case the content which was peer reviewed and published in a well-known and reputable journal would at least rest on solid science. Sadly, my expectations turned out to be incorrect.
Case Presentation
Key criticisms
As the title of the article itself suggests, the study described by Löfvenborg et al. [2] aims to identify risk factors associated with the development of LADA and type 2 diabetes in adults, with a particular focus on the intake of sweetened beverages. The link between beverages sweetened with sugar and type 2 diabetes is well supported by the existing literature, which is further strengthened by convincing metabolic mechanisms which explain it [3,4]. The case of LADA, a much rarer condition, is different and, as I explain next, the present study contributes little in terms of empirical evidence and offers only physiologically unconvincing speculation (predicated on misconstrued conclusions from empirical observations) regarding potential explanatory mechanisms.
Methodology: The aspect of the study by Löfvenborg et al. [2] which should immediately attract the attention of a reader of their article, and especially the reviewers, concerns its design. In particular, the reported study follows the well-understood case-control setup. Case-control studies are notoriously weak in their ability to provide evidence on causality. Indeed they are, quite literally, used as textbook examples to illustrate the dangers of inferring causality from association [5]. Without nuanced statistical analysis and convincing hypotheses about potential mechanisms which could explain causality (the former not having been done by Löfvenborg et al. [2] and the latter being more debatable, with my view being that their hypotheses are overly speculative and insufficiently convincing) [5,6], the presented results do not warrant claims such as:
“In conclusion, these findings add support to the accumulating evidence suggesting that high intake of sweetened beverages, both sugar-sweetened and artificially sweetened, is a potential risk factor for type 2 diabetes. Importantly, these findings indicate that the adverse health effects seen with high sweetened beverage intake also encompass autoimmune forms of diabetes.” Still less do they warrant the even stronger statements made by the authors in the mainstream media: “In this study we were surprised by the increased risk in developing autoimmune diabetes by drinking soft drinks.” Similar claims suggesting causality are repeated throughout the article. Such claims are all the more worrying given the authors’ own finding of an association between a higher intake of sweetened drinks and generally “unhealthier” lifestyles, their acknowledgement that these confounding factors could not be confidently adjusted for, and the systematic biases introduced by the self-reporting nature of data collection. Considering the magnitude of the effect reported in the study, a hypothesis centred around these associations seems to me a much more reasonable explanation of the observed effect than the speculative hypothesis of causality. Indeed, by the authors’ own admission, even in 24h recall interviews the accuracy of self-reporting is very poor (see the section entitled “Dietary assessment” on page 607), and there is ample evidence of systematic bias, with overweight and obese individuals being more likely to underreport their food intake [7].
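The concern about confounding can be made concrete with a small simulation. The sketch below is purely illustrative: the function names and every probability in it are hypothetical and are not taken from the study by Löfvenborg et al. It constructs a synthetic population in which an "unhealthy lifestyle" raises both the chance of high sweetened beverage intake and the chance of disease, while beverage intake itself has no causal effect, and then computes the odds ratio, the quantity a case-control design estimates.

```python
import random


def simulate_population(n_people, seed=0):
    """Simulate (unhealthy, exposed, diseased) triples for n_people individuals.

    All probabilities are hypothetical. By construction the disease depends
    only on lifestyle, never on beverage intake ("exposed").
    """
    rng = random.Random(seed)
    people = []
    for _ in range(n_people):
        unhealthy = rng.random() < 0.3
        # The confounder makes high beverage intake more likely...
        exposed = rng.random() < (0.6 if unhealthy else 0.2)
        # ...and, independently of intake, makes disease more likely.
        diseased = rng.random() < (0.10 if unhealthy else 0.02)
        people.append((unhealthy, exposed, diseased))
    return people


def odds_ratio(people):
    """Odds ratio of exposure among the diseased versus the non-diseased."""
    case_exp = case_unexp = ctrl_exp = ctrl_unexp = 0
    for _, exposed, diseased in people:
        if diseased and exposed:
            case_exp += 1
        elif diseased:
            case_unexp += 1
        elif exposed:
            ctrl_exp += 1
        else:
            ctrl_unexp += 1
    return (case_exp * ctrl_unexp) / (case_unexp * ctrl_exp)


population = simulate_population(200_000)
crude = odds_ratio(population)
within_unhealthy = odds_ratio([p for p in population if p[0]])
within_healthy = odds_ratio([p for p in population if not p[0]])

print(f"crude odds ratio:             {crude:.2f}")
print(f"odds ratio, unhealthy stratum: {within_unhealthy:.2f}")
print(f"odds ratio, healthy stratum:   {within_healthy:.2f}")
```

Under these assumptions the crude odds ratio comes out well above 1 even though intake has no effect on disease, whereas the odds ratios computed within each lifestyle stratum stay close to 1. In a real study the confounder is of course not cleanly observed, which is precisely why an inability to adjust for it confidently matters.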
Statistical analysis: In addition to the problems related to the study design, I also found a number of aspects of the authors’ statistical analysis wanting. For example, the restriction of the sensitivity analysis only to individuals whose diagnosis preceded the data collection by at most six months can easily be seen to risk increasing rather than decreasing the scope for so-called reverse causation. The manner in which various statistics were reported in the paper is also rather uninformative and not up to the standards of the state of the art. Most notably, the quantification of the statistical significance of the observed differentials in specific variables across relevant strata using the p-value is fundamentally unprincipled [8,9], in that it does not answer a relevant question (e.g. “What is the probability of the difference being greater than x?”), and obfuscating, in that it collapses a potentially nuanced probability density function (e.g. see [10] for an example of how such data should be reported) to a single (and, to repeat, all but entirely irrelevant) number. Clear explanations of various problems inherent in the use of the p-value have already been communicated by others (particularly well by MacKay [9]), with promising signs of the message beginning to resonate with the wider research community [11,12], so I will not dwell on the issue and will instead refer the reader to the aforementioned sources. There are a number of other dubious choices made by the authors in their data analysis. For the sake of brevity I shall not attempt to cover all of them, but to give an example, consider the choice to adjust for the intake of various food groups by consumption in grams per day. Considering the higher body weight of the LADA group compared with the control group, it seems more reasonable to me to adjust by body-weight-normalized consumption in grams per day instead.
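To illustrate the kind of reporting advocated above in place of a p-value, the following is a minimal sketch, not the analysis of any of the cited works. It assumes a simple setting in which the proportion of exposed individuals is compared between cases and controls, places independent uniform Beta(1, 1) priors on the two proportions, and draws Monte Carlo samples from the posterior of their difference. The function names and the counts in the usage example are hypothetical.

```python
import random


def posterior_difference_samples(cases_exposed, cases_total,
                                 controls_exposed, controls_total,
                                 n_samples=20_000, seed=0):
    """Sample the posterior of (p_cases - p_controls).

    Assumes independent Beta(1, 1) priors, so each proportion has a
    Beta(1 + exposed, 1 + unexposed) posterior.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        p_cases = rng.betavariate(1 + cases_exposed,
                                  1 + cases_total - cases_exposed)
        p_controls = rng.betavariate(1 + controls_exposed,
                                     1 + controls_total - controls_exposed)
        samples.append(p_cases - p_controls)
    return samples


def prob_difference_exceeds(samples, x):
    """Estimate P(difference > x) as the fraction of samples above x."""
    return sum(1 for d in samples if d > x) / len(samples)


def credible_interval(samples, mass=0.95):
    """Central credible interval read off the sorted posterior samples."""
    ordered = sorted(samples)
    lower = ordered[int((1 - mass) / 2 * len(ordered))]
    upper = ordered[int((1 + mass) / 2 * len(ordered)) - 1]
    return lower, upper


# Hypothetical counts: 30 of 100 cases and 20 of 100 controls are exposed.
samples = posterior_difference_samples(30, 100, 20, 100)
for x in (0.0, 0.05, 0.10):
    print(f"P(difference > {x:.2f}) = {prob_difference_exceeds(samples, x):.3f}")
print("95% credible interval:", credible_interval(samples))
```

Unlike a single p-value, the samples describe the whole distribution of the difference, so a reader can ask directly how probable it is that the difference exceeds any threshold x of practical relevance.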
In short, the paper is riddled with questionable decisions which cast additional doubt over the validity of the results. It was in no small part the spirit of the editors’ reply to my concerns, described in the following section and so very much at odds with what ought to be the quintessential spirit of scientific discourse and inquiry, that crystallized in my mind the dire need for change in how the entirety of aims and processes for the dissemination of scientific findings is understood. One step towards this goal certainly should include the duty for cases like this to be reported and put into the public domain, so that their prevalence and potential correlates can be quantified and studied at least to some degree.
Dealing with issues: editors’ and journal’s response
That errors and omissions should occur in scientific articles is entirely to be expected. In such cases it should be a duty of those who observe them to explain and correct, and of journals and editors to facilitate robust and open discussion (which of course may include disagreement). Following this sentiment I promptly contacted the European Journal of Endocrinology with a letter summarizing the objections laid out in the previous section. A reply came swiftly. The editors addressed none of the substance of the letter and rather stated that the issues raised do not have a ‘high priority’ in the context of the European Journal of Endocrinology. This struck me as rather remarkable: for a series of serious methodological and statistical errors, and a misinterpretation of findings in a study which has been highly popularized and which stands to affect health care recommendations to a vulnerable patient population, not to be considered of utmost priority by a medical journal is entirely incomprehensible and, I would suggest, intolerable and inexcusable.
Conclusion
The article by Löfvenborg et al. [2] describing their study of the relationship between the intake of sweetened beverages and the risk of latent autoimmune diabetes in adults and type 2 diabetes exemplifies several important problems. The first of these regards scientific methodology and the soundness of analyses used to interpret observational data, which in this case leave much to be desired. The second concern that emerges from my criticism regards the peer review process and, in particular, its robustness and quality. As I noted, none of the methodological problems of the study in question are particularly subtle. As such it is difficult to understand how the conclusions drawn by the authors did not get corrected by any of the reviewers or the handling editor, with whom the ultimate decision on the acceptance of submitted manuscripts lies. This is especially troubling in this case, given the nature of the study, the easily predictable attention it would receive in the media, and its potential to affect dietary recommendations. The last concern, no less important than the previous two, pertains to communication of science and scientific findings to the general public. We live in an age of rapid development of science which places increasing demands for specialization even for researchers. For laymen, the nuances of cutting edge research are all but impenetrable. This environment places the scientific community in a position of power which demands responsibility. It is imperative that findings are communicated accurately, using appropriate language and terminology, in a manner which is clear but does not misrepresent, mislead, or strip away the relevant substance.
The need for action
The behaviour of the journal’s editors described above is understandable when the change in the spirit of academic publishing is considered. It would not be unfair to describe the current trends as being dominated by market forces. From the point of view of authors there has been a gradual but cumulatively remarkable shift away from the aim of disseminating novel findings and observations, and informing others, towards a means of securing research positions, advancing one’s career, securing funding, and remaining ‘relevant’ [13-16]. The greater intellectual liberty enjoyed by authors in the past is reflected in a freer and more personal style of writing which is all but unrecognizable today [17,18]. On the other side are editors and reviewers. In the context of the described ecosystem, they stand to gain little by diverting their time from the aforementioned efforts as authors themselves, and by spending it on thorough, intellectually principled, and constructive consideration of manuscripts of others [19]. However, the fact that the behaviour is understandable in this mechanistic sense does not suggest that it is excusable [15]. To say that the academic and research community must make efforts to fight this tide of pressures which threatens the very nature of research is little more than a truism, for it must not be forgotten that the behaviour of a community can only come through the actions of the individuals who comprise it. To quote Tolstoy from ‘War and Peace’: “The movement of nations is caused by the activity of all the people who participate in the events, and who always combine in such a way that those taking the largest direct share in the event take on themselves the least responsibility”.
Funding
This research did not receive any specific grant from any funding agency in the public, commercial or not-for-profit sector.
References
- Campanario JM (1998) Peer review for journals as it stands today: Part 1. Science Communication 19(3): 181-211.
- Bacchetti P (2002) Peer review of statistics in medical research: the other problem. BMJ 324(7348): 1271-1273.
- Cicchetti DV (1991) The reliability of peer review for manuscript and grant submissions: A cross-disciplinary investigation. Behavioral and Brain Sciences 14(1): 119-135.
- Godlee F, Gale CR, Martyn CN (1998) Effect on the quality of peer review of blinding reviewers and asking them to sign their reports: a randomized controlled trial. JAMA 280(3): 237-240.
- Jefferson T, Alderson P, Wager E, Davidoff F (2002) Effects of editorial peer review: a systematic review. JAMA 287(21): 2784-2786.
- Peters DP, Ceci SJ (1982) Peer-review practices of psychological journals: The fate of published articles, submitted again. Behavioral and Brain Sciences 5(2): 187-195.
- Kassirer JP, Campion EW (1994) Peer review: crude and understudied, but indispensable. JAMA 272(2): 96-97.
- Carlisle JB (2017) Data fabrication and other reasons for non-random sampling in 5087 randomised, controlled trials in anaesthetic and general medical journals. Anaesthesia 72(8): 944-952.
- Welsh LC, Hansson GK (2016) Tracheobronchial transplantation: The Royal Swedish Academy of Sciences’ concerns. The Lancet 387(10022): 942.
- Hvistendahl M (2013) Corruption and research fraud send big chill through Big Pharma in China. Science 341(6145): 445-446.
- Woodhead M (2016) 80% of China’s clinical trial data are fraudulent, investigation finds. BMJ 355: i5396.
- Wakefield AJ, Murch SH, Anthony A, Linnell J, Casson DM, et al. (1998) RETRACTED: Ileal-lymphoid-nodular hyperplasia, non-specific colitis, and pervasive developmental disorder in children. The Lancet 351(9103): 637-641.
- Löfvenborg JE, Andersson T, Carlsson PO, Dorkhan M, Groop L, et al. (2016) Sweetened beverage intake and risk of latent autoimmune diabetes in adults (LADA) and type 2 diabetes. European Journal of Endocrinology 175(6): 605-614.
- Ludwig DS, Peterson KE, Gortmaker SL (2001) Relation between consumption of sugar-sweetened drinks and childhood obesity: a prospective, observational analysis. Lancet 357(9222): 505-508.
- Malik VS, Popkin BM, Bray GA, Despres JP, Hu FB (2010) Sugar-sweetened beverages, obesity, type 2 diabetes mellitus, and cardiovascular disease risk. Circulation 121(11): 1356-1364.
- Schlesselman JJ (1982) Case-control studies: design, conduct, analysis. Oxford University Press, New York, USA.
- Flather MD (1999) Causality: the Achilles’ heel of observational studies. British Medical Journal 319(7208): 488-489.
- Poppitt SD, Swann D, Black AE, Prentice AM (1998) Assessment of selective under-reporting of food intake by both obese and non-obese women in a metabolic facility. Int J Obes Relat Metab Disord 22(4): 303-311.
- Anderson DR, Burnham KP, Thompson WL (2000) Null hypothesis testing: problems, prevalence, and an alternative. The Journal of Wildlife Management 64(4): 912-923.
- MacKay DJC (2003) Information theory, inference and learning algorithms. Cambridge University Press, Cambridge, UK.
- Arandjelović O (2012) A new framework for interpreting the outcomes of imperfectly blinded controlled clinical trials. PLOS ONE 7(12): e48984.
- Trafimow D, Marks M (2015) Editorial. Basic and Applied Social Psychology 37(1): 1-2.
- Baker M (2016) Statisticians issue warning over misuse of P values. Nature 531(7593): 151.
- Braine G (2005) The challenge of academic publishing: A Hong Kong perspective. Tesol Quarterly 39(4): 707-716.
- Barnett C (1998) The cultural turn: fashion or progress in human geography? Antipode 30(4): 379-394.
- Frey BS (2003) Publishing as prostitution? Choosing between one’s own ideas and academic success. Public Choice 116(1-2): 205-223.
- Gundersen DE, Capozzoli EA, Rajamma RK (2008) Learned ethical behavior: An academic perspective. Journal of Education for Business 83(6): 315-324.
- Beatson GT (1896) On the treatment of inoperable cases of carcinoma of the mamma: suggestions for a new method of treatment, with illustrative cases. The Lancet 148(3803): 162-165.
- Ellison G (2002) Evolving standards for academic publishing: a q-r theory. Journal of Political Economy 110(5): 994-1034.
- Paget S (1889) The distribution of secondary growths in cancer of the breast. The Lancet 133(3421): 571-573.