
Carlisle’s statistics bombshell names and shames rigged clinical trials

John Carlisle is a British anaesthesiologist who works at Torbay Hospital, a seaside hospital near Exeter on the English Channel. Despite not being a professor, or in academia at all, he is a legend in medical research, because his remarkable statistical skills and his fearlessness in using them exposed scientific fraud by several of his esteemed anaesthesiologist colleagues and professors: the retraction record holder Yoshitaka Fujii and his partner Yuhji Saitoh, as well as Scott Reuben and Joachim Boldt. His method needs no access to the original data: the numbers presented in the published paper suffice to check whether they are actually real. Carlisle was also fortunate to have the support of his journal, Anaesthesia, when his methodology found evidence of data manipulation in their clinical trials. Now Carlisle, an editor at that journal, has dropped a major bomb by exposing many likely rigged clinical trial publications not only in his own Anaesthesia, but also in five more anaesthesiology journals and two “general” ones, the stellar medical research outlets NEJM and JAMA. The clinical trials flagged in the latter two for their unrealistic statistics therefore come from various fields of medicine, not just anaesthesiology. The medical publishing scandal Carlisle has caused is now complete, and the elite journals had no choice but to announce investigations, which they even intend to coordinate. Time will tell how serious their effort is.

Carlisle’s bombshell paper “Data fabrication and other reasons for non-random sampling in 5087 randomised, controlled trials in anaesthetic and general medical journals” was published today in Anaesthesia, Carlisle 2017, DOI: 10.1111/anae.13962. It is accompanied by an explanatory editorial, Loadsman & McCulloch 2017, doi: 10.1111/anae.13938. A Guardian article written by Stephen Buranyi provides the details. There is also another, earlier editorial in Anaesthesia, which explains Carlisle’s methodology rather well (Pandit, 2012).

What Carlisle tested was how realistic the reported variability is in the baseline characteristics of the patient cohorts used in a study. Such baselines are categorical parameters like sex or the presence vs absence of a certain illness, or continuous parameters like patients’ body weight, blood pressure or other measurable physiological values. When patients are distributed into separate cohorts for the purpose of a clinical trial (say, one to receive an intervention and the other to serve as control), the distribution done by an objective triallist should be random. This means the mean values in each cohort should be similar. These numbers are of course easy to fake. The statistical trap for cheaters lies elsewhere: in faking a realistic variance (V), which measures how far individual values spread around the mean (or the standard deviation, SD, which is sqrt(V)). The standard error of the mean gets smaller the bigger the sample gets; this is also why clinical trials involving only a handful of patients (say, 10 or 30 participants) are to be taken with the greatest caution: one single “outlier” or wrongly measured patient can skew the entire analysis, and the error is simply too big for the therapeutic effect to appear significant. Of course, you might obtain no effect despite a very large cohort of trial participants simply because your clinical intervention (your pills or therapy) does not work at all. In that case, once you have decided to fake your baseline values to procure some significance, you will have a very tough time faking their standard deviations anywhere near realistically. Of course, hardly any peer reviewer or statistics editor would check those during peer review anyway. But Carlisle has now checked, and for the first time he did so not just for his own journal Anaesthesia, but for a bunch of other journals, which certainly did not ask for (and probably did not welcome) this kind of post-publication peer review.
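To make the relation between sample size and standard error concrete, here is a minimal Python sketch. The SD of 15 kg for body weight and the cohort sizes are made-up illustration values, not data from any trial, and the function name is hypothetical.

```python
import math

def standard_error(sd: float, n: int) -> float:
    """Standard error of the mean: SEM = SD / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sd / math.sqrt(n)

# Hypothetical baseline: body weight with an SD of 15 kg in every cohort.
ASSUMED_SD_KG = 15.0

for n in (10, 30, 100, 1000):
    print(f"n={n:5d}  SEM={standard_error(ASSUMED_SD_KG, n):.2f} kg")
```

The SD describes the spread among patients and stays roughly the same whatever the cohort size; only the SEM, the precision of the estimated mean, shrinks as the cohort grows.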

In a clinical trial paper, you must declare the actual number of your trial participants (100 people, 1000, 2000) in total and for each intervention group. Once you have done this, the variability to be expected by chance between the patient cohorts can be calculated and compared with the values you provided in your publication. The editorial published in parallel explains:

“Essentially, Carlisle’s method identifies papers in which the baseline characteristics (e.g. age, weight) exhibit either too narrow or too wide a distribution than expected by chance, resulting in an excess of p values close to either one or zero”.

Carlisle has now applied this statistical screen to over 5000 clinical trials in different journals, and noticed that in around 1-2% of them the paired SD baseline values made absolutely no sense at all, even with the maximum of good will. He even accounted for all kinds of possible author mistakes and typos, to allow for the possibility of honest error. He also set his threshold extremely strictly, at p < 0.0001, to make sure that anything caught there could never, under any realistic circumstances, reflect the proclaimed random distribution of patients. The authors either rigged the distribution of their control and intervention cohorts, or they just faked the numbers retrospectively, all to pretend a significant effect of their clinical intervention. As already mentioned, one needs rather advanced statistical skills to fake SD values so that they seem realistic. Here is Carlisle’s description of his method:

“I extracted baseline summary data for continuous variables, reported as mean (SD) or mean (SEM). I did not study trials for which participant allocation was not described as random, or trials that did not report baseline continuous variables, or those that reported a different summary measure, such as median (IQR or range). I defined ‘baseline’ as a variable measured before groups were exposed to the allocated intervention, variables such as age, height, ‘baseline’ blood pressure or serum sodium concentration. I excluded variables that had been stratified. I recorded whether the allocation sequence had been generated in blocks, permuted or otherwise, which could reduce the distribution of means for time-varying measurements”.
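The idea behind such a check can be sketched in a few lines of Python. This is a deliberately simplified illustration and not Carlisle's actual procedure: it assumes only two groups, uses a normal approximation for the difference between the two reported means, and all names and example numbers are made up.

```python
import math

def baseline_p_value(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Two-sided p value for the difference between two reported group
    means, using a normal (z) approximation. Simplified illustration only."""
    # Standard error of the difference between two independent means.
    se_diff = math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    z = (mean_a - mean_b) / se_diff
    # P(|Z| >= |z|) for a standard normal variable.
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical baseline ages, reported as mean (SD), 100 patients per group.
p_plausible = baseline_p_value(54.1, 10.0, 100, 55.0, 10.5, 100)
p_suspicious = baseline_p_value(54.1, 10.0, 100, 61.0, 10.5, 100)
print(p_plausible, p_suspicious)
```

A single p value near zero means the groups differ more at baseline than random allocation would plausibly produce. Across many variables and trials, p values piling up near one would mean the groups are more alike than chance allows. Both patterns are what the editorial quoted above describes.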

Carlisle also used a very peculiar control sample: retracted clinical trials, like those of Fujii, where data manipulation was either admitted or was to be expected. Carlisle took them as a gold standard to calibrate his analysis. Some of these retracted papers in fact missed the threshold of p < 0.0001 and were undetectable by this generously set gate, which indicates how badly manipulated the papers that did pass the threshold must be. As Carlisle put it:

“Some p values were so extreme that the baseline data could not be correct: for instance, for 43/5015 unretracted trials the probability was less than 1 in 10^15 (equivalent to one drop of water in 20,000 Olympic-sized swimming pools)”.
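A quick back-of-the-envelope calculation shows why such numbers cannot be shrugged off as bad luck. The sketch assumes that, under genuine random allocation, a trial's p value is uniformly distributed between 0 and 1 (a simplification), so the expected number of trials below a threshold is simply the number of trials times the threshold. The trial counts, the p < 0.0001 threshold and the quoted probability of 1 in 10^15 come from the text; the function name is hypothetical.

```python
def expected_by_chance(n_trials: int, threshold: float) -> float:
    """Expected number of trials with p below `threshold`, assuming
    p values are uniform on [0, 1] under genuine randomisation."""
    return n_trials * threshold

# 5087 trials screened at the threshold of p < 0.0001.
print(expected_by_chance(5087, 1e-4))   # about half a trial
# 5015 unretracted trials at p < 1 in 10^15; 43 were observed.
print(expected_by_chance(5015, 1e-15))
```

Under this assumption, chance alone would be expected to push roughly half a trial below p < 0.0001 in the whole sample, and essentially none below 1 in 10^15.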

What about those which escaped, then? Carlisle offered another way to catch potential cheaters: once suspicious but not yet conclusive values emerge for the same author in several independent papers, a further statistical analysis can be applied to prove fraud.
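One textbook way to pool evidence across several papers is Fisher's method for combining independent p values. The text above does not say which analysis Carlisle has in mind, so the following Python sketch is only an illustration of the general idea; the function name and the p values are invented.

```python
import math

def fisher_combined_p(p_values):
    """Combine independent p values with Fisher's method.

    Under the null hypothesis, X = -2 * sum(ln p_i) follows a chi-squared
    distribution with 2k degrees of freedom; for even degrees of freedom
    the upper tail has the closed form exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    """
    k = len(p_values)
    if k == 0 or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("need at least one p value in (0, 1]")
    half_x = -sum(math.log(p) for p in p_values)
    tail = sum(half_x**i / math.factorial(i) for i in range(k))
    return math.exp(-half_x) * tail

# Four hypothetical papers by one author, each only mildly suspicious.
individual = [0.05, 0.05, 0.05, 0.05]
print(fisher_combined_p(individual))
```

Four results that are each only borderline combine into a much smaller p value, which is why a pattern across one author's papers can be more telling than any single paper.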

These are the papers and journals which Carlisle analysed:

“I wanted to determine whether data distributions in trials published in specialist anaesthetic journals have been different to distributions in non-specialist medical journals. I analysed the distribution of 72,261 means of 29,789 variables in 5087 randomised, controlled trials published in eight journals between January 2000 and December 2015: Anaesthesia (399); Anesthesia and Analgesia (1288); Anesthesiology (541); British Journal of Anaesthesia (618); Canadian Journal of Anesthesia (384); European Journal of Anaesthesiology (404); Journal of the American Medical Association (518) and New England Journal of Medicine (935). I chose these journals as I had electronic access to the full text”.

Obviously, a subscription paywall is a great tool to hide fraud behind. It apparently did prevent Carlisle from accessing a number of publications, given his hospital’s limited access to medical literature. This might provide one reason why medical publishing is so reluctant to move to open access. Carlisle concludes:

“Fraud, unintentional error, correlation, stratified allocation and poor methodology might have contributed to the excess of randomised, controlled trials with similar or dissimilar means, a pattern that was common to all the surveyed journals. It is likely that this work will lead to the identification, correction and retraction of hitherto unretracted randomised, controlled trials”.

Table from Carlisle 2017. Copyright: Wiley.

The papers are in fact all easily identifiable using Supplementary Table 1 of the Carlisle 2017 paper. The ones at the top are seen as the most problematic, namely those with the most ridiculously unrealistic distribution of baseline values. Together with some explanations about the analysed values and the authors’ potential sources of error, Carlisle provides the year, issue and page number of the analysed publications for each journal. Entering these into an internet search immediately gives you the exact publication in question. There is no escape; the information about the phony data in a number of clinical trials is out there. As Loadsman & McCulloch wrote in their editorial, inviting journal editors to correct and retract unreliable papers:

“Each editor only has to work his/her way down the list. We cannot say at what point in the list editors should desist, and the journals will need to exercise their own discretion”.

I list below, as examples, some papers straight off the top of the list for JAMA and NEJM each, which I could unmask in this way. The JAMA list even includes one paper retracted for fraud, while another paper had a correction of baseline values issued. Maybe someone can even make an automatic conversion of Carlisle’s Excel document, with hyperlinks to the papers? In any case, the journals affected cannot easily ignore this, and I am updating this text below with their responses.

Update 7.06.2017: in this added comment, I now also list the anaesthesiology journal papers which Carlisle found most problematic (p<0.00001).

Cartoon originally commissioned by Fergal Grace.


  1. Effect of Metformin and Rosiglitazone Combination Therapy in Patients With Type 2 Diabetes Mellitus A Randomized Controlled Trial

Vivian Fonseca, MD; Julio Rosenstock, MD; Rita Patwardhan, PhD; Alan Salzman, MD, PhD

JAMA. 2000;283(13):1695-1702. doi:10.1001/jama.283.13.1695

  2. Ketoconazole for Early Treatment of Acute Lung Injury and Acute Respiratory Distress Syndrome A Randomized Controlled Trial

The ARDS Network Authors for the ARDS Network

JAMA. 2000;283(15):1995-2002. doi: 10.1001/jama.283.15.1995

  3. Management of Chronic Tension-Type Headache With Tricyclic Antidepressant Medication, Stress Management Therapy, and Their Combination A Randomized Controlled Trial

Kenneth A. Holroyd, PhD; Francis J. O’Donnell, DO; Michael Stensland, MS; Gay L. Lipchik, PhD; Gary E. Cordingley, MD, PhD; Bruce W. Carlson, PhD

JAMA. 2001;285(17):2208-2215. doi:10.1001/jama.285.17.2208

  4. Effect of Blood Pressure Lowering and Antihypertensive Drug Class on Progression of Hypertensive Kidney Disease Results From the AASK Trial

Jackson T. Wright, Jr, MD, PhD; George Bakris, MD; Tom Greene, PhD; Larry Y. Agodoa, MD; Lawrence J. Appel, MD, MPH; Jeanne Charleston, RN; DeAnna Cheek, MD; Janice G. Douglas-Baltimore, MD; Jennifer Gassman, PhD; Richard Glassock, MD; Lee Hebert, MD; Kenneth Jamerson, MD; Julia Lewis, MD; Robert A. Phillips, MD, PhD; Robert D. Toto, MD; John P. Middleton, MD; Stephen G. Rostand, MD; for the African American Study of Kidney Disease and Hypertension Study Group

JAMA. 2002;288(19):2421-2431. doi:10.1001/jama.288.19.2421

  5. Impact of Electron Beam Tomography, With or Without Case Management, on Motivation, Behavioral Change, and Cardiovascular Risk Profile A Randomized Controlled Trial

Patrick G. O’Malley, MD, MPH; Irwin M. Feuerstein, MD; Allen J. Taylor, MD

JAMA. 2003;289(17):2215-2223. doi: 10.1001/jama.289.17.2215

  6. Effect of Testosterone Supplementation on Functional Mobility, Cognition, and Other Parameters in Older Men A Randomized Controlled Trial

Marielle H. Emmelot-Vonk, MD; Harald J. J. Verhaar, MD, PhD; Hamid R. Nakhai Pour, MD, PhD; André Aleman, PhD; Tycho M. T. W. Lock, MD; J. L. H. Ruud Bosch, MD, PhD; Diederick E. Grobbee, MD, PhD; Yvonne T. van der Schouw, PhD

JAMA. 2008;299(1):39-52. doi:10.1001/jama.2007.51

 7.  Effect of Physical Activity on Cognitive Function in Older Adults at Risk for Alzheimer Disease A Randomized Trial

Nicola T. Lautenschlager, MD; Kay L. Cox, PhD; Leon Flicker, MBBS, PhD; Jonathan K. Foster, DPhil; Frank M. van Bockxmeer, PhD; Jianguo Xiao, MD, PhD; Kathryn R. Greenop, PhD; Osvaldo P. Almeida, MD, PhD

JAMA. 2008;300(9):1027-1037. doi:10.1001/jama.300.9.1027

Incorrect Data (JAMA, January 21, 2009, Vol 301, No. 3): In the Original Contribution entitled “Effect of Physical Activity on Cognitive Function in Older Adults at Risk for Alzheimer Disease: A Randomized Trial” published in the September 3, 2008, issue of JAMA (2008;300[9]:1027-1037), incorrect data were reported in Table 7, which appears on page 1036. In the row “Total ADAS-Cog [Alzheimer Disease Assessment Scale–Cognitive Subscale]” and in the “Control Group” columns, the geometric mean (SD) for the completers should have been “6.4 (1.8)” and “10.6 (1.4)” for the dropouts.

  8. Laparoscopic Adjustable Gastric Banding in Severely Obese Adolescents A Randomized Trial

Paul E. O’Brien, MD, FRACS; Susan M. Sawyer, MBBS, MD, FRACP; Cheryl Laurie, RN, BHSc; Wendy A. Brown, MBBS, PhD, FRACS; Stewart Skinner, MBBS, PhD, FRACS; Friederike Veit, MBBS, MD, FRACP; Eldho Paul, MSc; Paul R. Burton, MBBS, FRACS; Melanie McGrice, BSc, M Nutr Diet; Margaret Anderson, BHIM, Grad Dip HA; John B. Dixon, MBBS, PhD, FRACGP

JAMA. 2010;303(6):519-526. doi:10.1001/jama.2010.81

  9. Cognitive Behavioral Therapy for Treatment of Chronic Primary Insomnia A Randomized Controlled Trial

Jack D. Edinger, PhD; William K. Wohlgemuth, PhD; Rodney A. Radtke, MD; Gail R. Marsh, PhD; Ruth E. Quillian, PhD

JAMA. 2001;285(14):1856-1864. doi:10.1001/jama.285.14.1856

  10. Chemoembolization Combined With Radiofrequency Ablation for Patients With Hepatocellular Carcinoma Larger Than 3 cm A Randomized Controlled Trial

Bao-Quan Cheng, MD, PhD; Chong-Qi Jia, PhD; Chun-Tao Liu, MD; Wei Fan, MD; Qing-Liang Wang, MD; Zong-Li Zhang, MD, PhD; Cui-Hua Yi, MD, PhD

JAMA. 2008;299(14):1669-1677. doi:10.1001/jama.299.14.1669

Retraction:  JAMA. 2009;301(18):1931. doi:10.1001/jama.2009.640


1.   High-Dose Atorvastatin after Stroke or Transient Ischemic Attack

The Stroke Prevention by Aggressive Reduction in Cholesterol Levels (SPARCL) Investigators

N Engl J Med 2006; 355:549-559 August 10, 2006 DOI: 10.1056/NEJMoa061894

2.   Horse versus Rabbit Antithymocyte Globulin in Acquired Aplastic Anemia

Phillip Scheinberg, M.D., Olga Nunez, R.N., B.S.N., Barbara Weinstein, R.N., Priscila Scheinberg, M.S., Angélique Biancotto, Ph.D., Colin O. Wu, Ph.D., and Neal S. Young, M.D.

N Engl J Med 2011; 365:430-438 August 4, 2011 DOI: 10.1056/NEJMoa1103975

3.   Primary Prevention of Cardiovascular Disease with a Mediterranean Diet

Ramón Estruch, M.D., Ph.D., Emilio Ros, M.D., Ph.D., Jordi Salas-Salvadó, M.D., Ph.D., Maria-Isabel Covas, D.Pharm., Ph.D., Dolores Corella, D.Pharm., Ph.D., Fernando Arós, M.D., Ph.D., Enrique Gómez-Gracia, M.D., Ph.D., Valentina Ruiz-Gutiérrez, Ph.D., Miquel Fiol, M.D., Ph.D., José Lapetra, M.D., Ph.D., Rosa Maria Lamuela-Raventos, D.Pharm., Ph.D., Lluís Serra-Majem, M.D., Ph.D., Xavier Pintó, M.D., Ph.D., Josep Basora, M.D., Ph.D., Miguel Angel Muñoz, M.D., Ph.D., José V. Sorlí, M.D., Ph.D., José Alfredo Martínez, D.Pharm, M.D., Ph.D., and Miguel Angel Martínez-González, M.D., Ph.D., for the PREDIMED Study Investigators*

N Engl J Med 2013; 368:1279-1290 April 4, 2013 DOI: 10.1056/NEJMoa1200303

4.   Extended Antiretroviral Prophylaxis to Reduce Breast-Milk HIV-1 Transmission

Newton I. Kumwenda, Ph.D., Donald R. Hoover, Ph.D., Lynne M. Mofenson, M.D., Michael C. Thigpen, M.D., George Kafulafula, M.B., B.S., Qing Li, M.Sc., Linda Mipando, M.Sc., Kondwani Nkanaunena, M.Sc., Tsedal Mebrahtu, Sc.M., Marc Bulterys, M.D., Ph.D., Mary Glenn Fowler, M.D., M.P.H., and Taha E. Taha, M.D., Ph.D.

N Engl J Med 2008; 359:119-129 July 10, 2008 DOI: 10.1056/NEJMoa0801941

5.   Treatment of Periodontitis and Endothelial Function

Maurizio S. Tonetti, D.M.D., Ph.D., Francesco D’Aiuto, D.M.D., Ph.D., Luigi Nibali, D.M.D., Ph.D., Ann Donald, Clare Storry, B.Sc., Mohamed Parkar, M.Phil., Jean Suvan, M.Sc., Aroon D. Hingorani, Ph.D., Patrick Vallance, M.D., and John Deanfield, M.B., B.Chir.

N Engl J Med 2007; 356:911-920 March 1, 2007 DOI: 10.1056/NEJMoa063186

6.   Effect of Bronchoconstriction on Airway Remodeling in Asthma

Christopher L. Grainge, Ph.D., Laurie C.K. Lau, Ph.D., Jonathon A. Ward, B.Sc., Valdeep Dulay, B.Sc., Gemma Lahiff, B.Sc., Susan Wilson, Ph.D., Stephen Holgate, D.M., Donna E. Davies, Ph.D., and Peter H. Howarth, D.M.

N Engl J Med 2011; 364:2006-2015 May 26, 2011 DOI: 10.1056/NEJMoa1014350

Updates about journal replies


Reply from Howard Bauchner, Editor in Chief, JAMA and The JAMA Network

“We receive numerous allegations about various issues related to the articles we publish.
This allegation will be treated in a similar manner. We will assess the validity [of] the allegation, and potentially contact the individual making the allegation for more information or the author of the article.
Authors are always offered the option to respond to allegations that are deemed valid”

Reply from Jeffrey Drazen, Editor-in-Chief New England Journal of Medicine (NEJM):

“We are in the process of studying Dr. Carlisle’s methods and examining the points raised by him”.

Reply from Martin Tramèr, Editor-in-Chief European Journal of Anaesthesiology (EJA):

“It is about possible fraud, not about accusations or evidence.
We will look into this”.

Update 6.06.2017

Reply from Hugh Hemmings, EiC of British Journal of Anaesthesia:

“We are currently reviewing the study by Carlisle and the data in question to determine our course of action. While concerning, there are no allegations of fraud, so we will have to carefully review the data before proceeding”.

Reply from Hilary Grocott, Editor-in-chief Canadian Journal of Anesthesia

“We take this matter very seriously and intend to investigate”.

Reply from Andrew Klein, editor in chief of Anaesthesia:

“Our journal, Anaesthesia, intends to contact the authors of the trials identified by this study, so that we can discuss with them any errors or issues with the published data in line with COPE guidance.

The six Editors-in-Chief of the anaesthetic journals all met together yesterday (05 June 2017) to discuss the Carlisle paper and their approach following its publication, and we will be following up with them individually and as a group over the next few weeks and months.

I have received further emails from the NEJM and JAMA following the publication of the article and will also keep in touch with them both. I cannot comment as to what each Editor-in-Chief chooses to do and how exactly they will proceed as I do not know – I have explained to you our intentions at Anaesthesia. However, be assured I will be following this up, as above. I would point out the accompanying editorial to the Carlisle paper in our journal (Widening the search for suspect data – is the flood of retractions about to become a tsunami?) which discusses how journals may proceed further.”


Reply from Evan Kharasch, Editor-in-Chief Anesthesiology:

“Upon my return we will be evaluating and analyzing the article by Dr. Carlisle, and those articles published by Anesthesiology cited therein, to determine if there were any issues and whether any actions are indicated”.

Reply (after a reminder) from Jean-Francois Pittet, Editor-in-Chief Anesthesia and Analgesia:

“We will be evaluating and analyzing the methodology and findings of Dr. Carlisle, and specifically, those articles that were published by Anesthesia & Analgesia and cited therein, to determine if there are any issues and whether any actions are indicated”.


20 comments on “Carlisle’s statistics bombshell names and shames rigged clinical trials”

  1. Leonid, there’s a mistake above. Variance is a population property, just like the mean. We estimate these parameters using a (random) sample and the larger the sample is, the more precise the estimate is going to be. However, sample size should not have any effect on the population mean or variance itself, hence “The variance gets smaller the bigger the sample gets” is incorrect. What does get smaller with increasing sample size is the standard error (SEM = SD/square root of n), which is in fact an indication of how precisely we have estimated the mean (the SEM gets smaller with increasing sample size and decreasing variance; the larger the variance in a population, the larger a sample size we need to measure a mean with the same precision). With relatively small samples (like the 30 you mentioned), you will only be able to detect very large effects with any confidence and a single atypical individual can indeed throw your whole result off kilter.


  2. Ana Pedro

    As I previously mentioned here:
    It is not necessary to be a statistics expert to detect many statistical issues in many papers


  3. Pingback: Boletim de Notícias, 6/jun: Exoplaneta muito quente e o Dia Mundial do Ambiente | Direto da Ciência

  4. typo – year for the paper, 2007, should be 2017


  5. Markus Skrifvars

    This paper is interesting. However I have some concerns:

    Where was this method validated? It is obvious that if all your published papers have problems like this then it is likely there is something going on. But, what if it is one paper out of several published? Is it then likely to be an error rather than fraud? The author could have started by checking/asking for data on some of these papers rather than publishing them as such.
    This seems to be a diagnostic method for detecting suspicious data or fraud. Then what are the positive predictive and negative predictive values of the test?
    The author has performed multiple tests on multiple papers. What is then the risk of a Type 1 error, i.e. finding a significant p-value that is due to chance? Given that this is a statistical paper this would merit further discussion.


  6. Pingback: Using Data to Fight Data Fraud: the Carlisle Method | graph paper diaries

  7. Anaesthesiology papers flagged by Carlisle as p<0.00001 in Table 2.
    Journal: Anaesthesia
    1. The effect of cannula material on the incidence of peripheral venous thrombophlebitis
    A. Gupta, Y. Mehta, R. Juneja, N. Trehan
    First published: 8 October 2007
    DOI: 10.1111/j.1365-2044.2007.05180.x
    2. Evaluation of pelvic wedge for gynaecological laparoscopy
    P. Kundra, V. Kanna, A. Bupathi, K. Sudeep
    First published: 5 September 2008
    DOI: 10.1111/j.1365-2044.2008.05578.x
    3. A comparison of the Airway Scope® and McCoy laryngoscope in patients with simulated restricted neck mobility
    R. Komatsu, K. Kamata, D. I. Sessler, M. Ozaki
    First published: 12 April 2010
    DOI: 10.1111/j.1365-2044.2010.06334.x
    4. Effect-site concentration of propofol for reduction of remifentanil-induced cough
    J. Y. Kim, S. Y. Lee, D. H. Kim, S. K. Park, S. K. Min
    First published: 11 May 2010
    DOI: 10.1111/j.1365-2044.2010.06347.x
    5. Comparison of tracheal intubation with the Airway Scope or Clarus Video System in patients with cervical collars
    J. K. Kim, J. A. Kim, C. S. Kim, H. J. Ahn, M. K. Yang, S. J. Choi
    First published: 13 May 2011
    DOI: 10.1111/j.1365-2044.2011.06762.x
    6. A comparison of the Truview® blade with the Macintosh blade in adult patients
    M. Barak, P. Philipchuck, P. Abecassis, Y. Katz
    First published: 13 July 2007
    DOI: 10.1111/j.1365-2044.2007.05143.x
    7. A comparison of anaesthetic techniques for shock wave lithotripsy: the use of a remifentanil infusion alone compared to intermittent fentanyl boluses combined with a low dose propofol infusion
    M. A. Burmeister, P. Brauer, M. Wintruff, M. Graefen, I. Blanc, T. G. Standl
    First published: 20 August 2002
    DOI: 10.1046/j.1365-2044.2002.02820.x

    Journal: Anesthesia and Analgesia
    1. The Effect of Patient Positioning on Intraabdominal Pressure and Blood Loss in Spinal Surgery
    Park, Chang Kil MD
    Anesthesia & Analgesia: September 2000 – Volume 91 – Issue 3 – p 552–557
    doi: 10.1213/00000539-200009000-00009
    2. The Effects of Lactated Ringers Solution Infusion on Cardiac Output Changes After Spinal Anesthesia
    Kamenik, Mirt MSc, MD; Paver-Eržen, Vesna PhD, MD
    Anesthesia & Analgesia: March 2001 – Volume 92 – Issue 3 – p 710–714
    doi: 10.1213/00000539-200103000-00030
    3. The Placement of the Epidural Catheter at the Predicted Site by Electrical Stimulation Test
    Hayatsu, Keiko MD; Tomita, Misao MD, PhD; Fujihara, Hideyoshi MD, PhD; Baba, Hiroshi MD, PhD; Yamakura, Tomohiro MD, PhD; Taga, Kiichiro MD, PhD; Shimoji, Koki MD, PhD, FRCA
    Anesthesia & Analgesia: October 2001 – Volume 93 – Issue 4 – p 1035–1039
    doi: 10.1097/00000539-200110000-00048
    4. Intrathecal Morphine for Postoperative Analgesia: A Randomized, Controlled, Dose-Ranging Study After Hip and Knee Arthroplasty
    Rathmell, James P. MD; Pino, Carlos A. MD; Taylor, Richard MD; Patrin, Terri RN; Viani, Bruce A. MD
    Anesthesia & Analgesia: November 2003 – Volume 97 – Issue 5 – pp 1452-1457
    doi: 10.1213/01.ANE.0000083374.44039.9E
    5. The Effect of Intraoperative Use of Esmolol and Nicardipine on Recovery After Ambulatory Surgery
    White, Paul F. PhD, MD, FANZCA; Wang, Baoguo MD; Tang, Jun MD; Wender, Ronald H. MD; Naruse, Robert MD; Sloninsky, Alexander MD
    Anesthesia & Analgesia: December 2003 – Volume 97 – Issue 6 – pp 1633-1638
    doi: 10.1213/01.ANE.0000085296.07006.BA
    6. Intrathecal Sufentanil and Fetal Heart Rate Abnormalities: A Double-Blind, Double Placebo-Controlled Trial Comparing Two Forms of Combined Spinal Epidural Analgesia with Epidural Analgesia in Labor
    Van de Velde, M. MD, PhD; Teunkens, A. MD; Hanssens, M. MD, PhD, FRCOG; Vandermeersch, E. MD, PhD; Verhaeghe, J. MD, PhD
    Anesthesia & Analgesia: April 2004 – Volume 98 – Issue 4 – pp 1153-1159
    doi: 10.1213/01.ANE.0000101980.34587.66
    7. Tramadol Added to 1.5% Mepivacaine for Axillary Brachial Plexus Block Improves Postoperative Analgesia Dose-Dependently
    Robaux, Sébastien MD; Blunt, Cornelia FRCA; Viel, Eric MD; Cuvillon, Philippe MD; Nouguier, Philippe MD; Dautel, Gilles MD; Boileau, Sylvie MD; Girard, Florence MD; Bouaziz, Hervé MD, PhD
    Anesthesia & Analgesia: April 2004 – Volume 98 – Issue 4 – pp 1172-1177
    doi: 10.1213/01.ANE.0000108966.84797.72
    8. Metoprolol and Coronary Artery Bypass Grafting Surgery: Does Intraoperative Metoprolol Attenuate Acute β-Adrenergic Receptor Desensitization During Cardiac Surgery?
    Booth, John V. MBChB, FRCA; Ward, Erin E. BS; Colgan, Kelly C. BS; Funk, Bonita L. BS; El-Moalem, Habib PhD; Smith, Michael P. BS; Milano, Carmelo MD; Smith, Peter K. MD; Newman, Mark F. MD; Schwinn, Debra A. MD
    Anesthesia & Analgesia: May 2004 – Volume 98 – Issue 5 – pp 1224-1231
    doi: 10.1213/01.ANE.0000112325.66981.03
    9. A Multicenter Randomized Controlled Trial Comparing Patient-Controlled Epidural with Intravenous Analgesia for Pain Relief in Labor
    Halpern, Stephen H. MD, MSc, FRCPC; Muir, Holly MD, FRCPC; Breen, Terrance W. MD, FRCPC; Campbell, David C. MD, MSc, FRCPC; Barrett, Jon MBBch, MD, MRCOG, FRCSC; Liston, Robert MB, ChB, FRCSC; Blanchard, J Wade MSc
    Anesthesia & Analgesia: November 2004 – Volume 99 – Issue 5 – pp 1532-1538
    doi: 10.1213/01.ANE.0000136850.08972.07
    10. Ketamine Sedation During Spinal Anesthesia for Arthroscopic Knee Surgery Reduced the Ischemia-Reperfusion Injury Markers
    Saricaoglu, Fatma MD; Dal, Didem MD; Salman, Akgün Ebru MD; Doral, Mahmut Nedim MD; Kılınç, Kamer MD; Aypar, Ülkü MD
    Anesthesia & Analgesia: September 2005 – Volume 101 – Issue 3 – pp 904-909
    doi: 10.1213/01.ANE.0000159377.15687.87
    11. The Dose of Succinylcholine Required for Excellent Endotracheal Intubating Conditions
    Naguib, Mohamed MB, BCh, MSc, MD; Samarkandi, Abdulhamid H. MB, BS, KSUF, FFARCSI; El-Din, Mansour Emad MD; Abdullah, Khaled MB, BCh, MSc, AB, MD; Khaled, Mazen MD; Alharby, Saleh W. MB, BS, FRCS (Glas)
    Anesthesia & Analgesia: January 2006 – Volume 102 – Issue 1 – pp 151-155
    doi: 10.1213/01.ANE.0000181320.88283.BE
    12. A Dose-Ranging Study of Intraarticular Midazolam for Pain Relief After Knee Arthroscopy
    Batra, Yatindra Kumar MD, MNAMS, FAMS; Mahajan, Rajesh MD; Kumar, Sushil MD; Rajeev, Subramanyam MD, DNB; Singh Dhillon, Mandeep MS
    Anesthesia & Analgesia: August 2008 – Volume 107 – Issue 2 – pp 669-672
    doi: 10.1213/ane.0b013e3181770f95
    13. Ropivacaine Continuous Wound Infusion Versus Epidural Morphine for Postoperative Analgesia After Cesarean Delivery: A Randomized Controlled Trial
    O’Neill, Patricia MD; Duarte, Filipa MD; Ribeiro, Isabel MD; Centeno, Maria João MD; Moreira, João MD
    Anesthesia & Analgesia: January 2012 – Volume 114 – Issue 1 – p 179–185
    doi: 10.1213/ANE.0b013e3182368e87
    14. Nocebo-Induced Hyperalgesia During Local Anesthetic Injection
    Varelmann, Dirk MD, DESA; Pancaro, Carlo MD; Cappiello, Eric C. MD; Camann, William R. MD
    Anesthesia & Analgesia: March 2010 – Volume 110 – Issue 3 – pp 868-870
    doi: 10.1213/ANE.0b013e3181cc5727
    15. A Multicenter, Randomized, Controlled Study Evaluating Preventive Etanercept on Postoperative Pain After Inguinal Hernia Repair
    Cohen, Steven P. MD; Galvagno, Samuel M. DO, PhD; Plunkett, Anthony MD; Harris, Diamond MD; Kurihara, Connie RN; Turabi, Ali MD; Rehrig, Scott MD; Buckenmaier, Chester C. III MD; Chelly, Jacques E. MD
    Anesthesia & Analgesia: February 2013 – Volume 116 – Issue 2 – p 455–462
    doi: 10.1213/ANE.0b013e318273f71c
    16. Heparin-Level-Based Anticoagulation Management During Cardiopulmonary Bypass: A Pilot Investigation on the Effects of a Half-Dose Aprotinin Protocol on Postoperative Blood Loss and Hemostatic Activation and Inflammatory Response
    Koster, Andreas MD; Huebler, Sabine MD; Merkle, Frank EBCP; Hentschel, Thomas MD; Gründel, Marcus MD; Krabatsch, Thomas MD; Tambeur, Luc MD; Praus, Michael MD; Habazettl, Helmut MD; Kuebler, Wolfgang M. MD; Kuppe, Hermann MD
    Anesthesia & Analgesia: February 2004 – Volume 98 – Issue 2 – pp 285-290
    doi: 10.1213/01.ANE.0000096260.35340.C5
    17. RETRACTED: The Dose-Range Effects of Propofol on the Contractility of Fatigued Diaphragm in Dogs: Retracted.
    Fujii, Yoshitaka MD; Uemura, Aki MD; Toyooka, Hidenori MD
    Anesthesia & Analgesia: November 2001 – Volume 93 – Issue 5 – pp 1194-1198
    doi: 10.1097/00000539-200111000-00029

    Journal: Anesthesiology
    1. Role of the Atrial Natriuretic Factor in Obstetric Spinal Hypotension
    Michael A. Frölich, M.D., D.E.A.A.
    Anesthesiology 8 2001, Vol.95, 371-376.
    2. Epidural Analgesia in the Latent Phase of Labor and the Risk of Cesarean Delivery: A Five-year Randomized Controlled Trial
    FuZhou Wang, Ph.D., M.Sc.; XiaoFeng Shen, M.Sc., M.P.H.; XiRong Guo, M.D.; YuZhu Peng, M.D., M.P.H.; XiaoQi Gu, M.D.
    Anesthesiology 10 2009, Vol.111, 871-880. doi:10.1097/ALN.0b013e3181b55e65
    3. Kidney Protection by Hypothermic Total Liquid Ventilation after Cardiac Arrest in Rabbits
    Renaud Tissier, D.V.M., Ph.D.; Sebastien Giraud, Ph.D.; Nathalie Quellard, Ph.D.; Béatrice Fernandez, Ph.D.; Fanny Lidouren, B.Sc.; Lys Darbera; Matthias Kohlhauer, D.V.M., M.Sc.; Sandrine Pons, Pharm.D., Ph.D.; Mourad Chenoune, D.V.M., Ph.D.; Patrick Bruneval, M.D.; Jean-Michel Goujon, M.D., Ph.D.; Bijan Ghaleh, M.D., Ph.D.; Alain Berdeaux, M.D., Ph.D.; Thierry Hauet, M.D., Ph.D.
    Anesthesiology 04 2014, Vol.120, 861-869. doi:10.1097/ALN.0000000000000048
    4. Limb Remote Ischemic Preconditioning Attenuates Lung Injury after Pulmonary Resection under Propofol–Remifentanil Anesthesia: A Randomized Controlled Study
    Cai Li, M.D.; Miao Xu, M.D.; Yan Wu, M.D.; Yun-Sheng Li, M.D.; Wen-Qi Huang, M.D.; Ke-Xuan Liu, M.D., Ph.D.
    Anesthesiology 08 2014, Vol.121, 249-259. doi:10.1097/ALN.0000000000000266
    5. Protamine-induced Cardiotoxicity Is Prevented by Anti-TNF-α Antibodies and Heparin
    Dmitry Pevni, M.D.; Inna Frolkis, M.D., Ph.D.; Adrian Iaina, M.D.; Yoram Wollman, Ph.D.; Tamara Chernichovski, Ms.C.; Izhak Shapira, M.D.; Josef Paz, M.D.; Amir Kramer, M.D., Ph.D.; Chaim Loker, M.D.; Rephael Mohr, M.D.
    Anesthesiology 12 2001, Vol.95, 1389-1395.
    6. Hemostatic Activation and Inflammatory Response during Cardiopulmonary Bypass: Impact of Heparin Management
    Andreas Koster, M.D.; Thomas Fischer, M.D.; Michael Praus, M.D.; Helmut Haberzettl, M.D.; Wolfgang M. Kuebler, M.D.; Roland Hetzer, M.D.; Herman Kuppe, M.D.
    Anesthesiology 10 2002, Vol.97, 837-841.
    7. Effect of the α2-Agonist Dexmedetomidine on Cerebral Neurotransmitter Concentrations during Cerebral Ischemia in Rats
    Kristin Engelhard, M.D.; Christian Werner, M.D.; Susanne Kaspar, B.S.; Oliver Möllenberg, M.D.; Manfred Blobner, M.D.; Monika Bachl, Cand.Med.; Eberhard Kochs, M.D.
    Anesthesiology 2 2002, Vol.96, 450-457.
    8. Attenuation of Responses to Endotoxin by the Triggering Receptor Expressed on Myeloid Cells-1 Inhibitor LR12 in Nonhuman Primate
    Marc Derive, Ph.D.; Amir Boufenzer, Ph.D.; Sébastien Gibot, M.D., Ph.D.
    Anesthesiology 04 2014, Vol.120, 935-942. doi:10.1097/ALN.0000000000000078
    9. δ Opioid Receptor Antagonist, ICI 174,864, Is Suitable for the Early Treatment of Uncontrolled Hemorrhagic Shock in Rats
    Liangming Liu, M.D., Ph.D.; Kunlun Tian, M.S.; Yu Zhu, M.S.; Xiaoli Ding, M.S.; Tao Li, Ph.D.
    Anesthesiology 08 2013, Vol.119, 379-388. doi:10.1097/ALN.0b013e31829b3804

    Journal: British Journal of Anaesthesia
    1. Effects of valproic acid and magnesium sulphate on rocuronium requirement in patients undergoing craniotomy for cerebrovascular surgery
    M.-H. Kim, J.-W. Hwang, Y.-T. Jeon and S.-H. Do*
    British Journal of Anaesthesia 109 (3): 407–12 (2012) Advance Access publication 5 July 2012. doi:10.1093/bja/aes218

    2. Caudal clonidine for postoperative analgesia in adults.
    A. C. Van Elstraete, F. Pastureau, T. Lebrun, H. Mehdaoui
    Br J Anaesth (2000) 84 (3): 401-402.
    3. Premedication with controlled-release oxycodone does not improve management of postoperative pain after day-case gynaecological laparoscopic surgery
    R. Jokela1*, J. Ahonen1, M. Valjus1, T. Seppälä2 and K. Korttila1
    British Journal of Anaesthesia 98 (2): 255–60 (2007) doi:10.1093/bja/ael342

    4. RETRACTED: Midazolam versus propofol for reducing contractility of fatigued canine diaphragm
    Y. Fujii, H. Toyooka
    Br J Anaesth (2001) 86 (6): 879-881.
    5. Influence of sepsis on minimum alveolar concentration of desflurane in a porcine model
    B. Allaouchiche, F. Duflo, R. Debon, J.‐P. Tournadre, D. Chassard
    Br J Anaesth (2001) 87 (2): 280-283.

    Journal: Canadian Journal of Anesthesia
    1. Additive anti-emetic efficacy of prophylactic ondansetron with droperidol in out-patient gynecological laparoscopy
    Wu, O., Belo, S.E. & Koutsoukos, G.
    Can J Anesth (2000) 47: 529. doi:10.1007/BF03018944

    2. Flumazenil improves cognitive and neuromotor emergence and attenuates shivering after halothane-, enflurane- and isoflurane-based anesthesia
      Weinbroum, A.A. & Geller, E.
      Can J Anaesth (2001) 48: 963. doi:10.1007/BF03016585
    3. Routine handling of propofol prevents contamination as effectively as does strict adherence to the manufacturer’s recommendations
      Ingo H. Lorenz, Christian Kolbitsch, Cornelia Lass-Flörl, Irene Gritznig, Burkard Vollert, Werner Lingnau, Patrizia L. Moser, Arnulf Benzer
      Can J Anesth (2002) 49: 347. doi:10.1007/BF03017321
    4. Un mélange de bupivacaïne et de kétamine est supérieur à la kétamine seule pour l’analgésie intra-articulaire à la suite d’une arthroscopie du genou
      Yatindra Kumar Batra, Rajesh Mahajan, Sushil Kumar Bangalia, Onkar Nath Nagi, Mandeep Singh Dhillon
      Can J Anesth (2005) 52: 832. doi:10.1007/BF03021778
    5. Effects of nabilone, a synthetic cannabinoid, on postoperative pain
      Beaulieu, P.
      Can J Anesth (2006) 53: 769. doi:10.1007/BF03022793

    6. RETRACTED ARTICLE: Different effects of olprinone on contractility in nonfatigued and fatigued diaphragm in dogs
      Fujii, Y. & Toyooka, H.
      Can J Anesth/J Can Anesth (2000) 47: 1243. doi:10.1007/BF03019875
    7. Acupressure wristbands do not prevent postoperative nausea and vomiting after urological endoscopic surgery
      Agarwal, A., Pathak, A. & Gaur, A.
      Can J Anesth (2000) 47: 319. doi:10.1007/BF03020945
    8. Dextromethorphan attenuation of postoperative pain and primary and secondary thermal hyperalgesia
      Avi A. Weinbroum, Alexander Gorodezky, David Niv, Ron Ben-Abraham, Valery Rudick, Amir Szold
      Can J Anaesth (2001) 48: 167. doi:10.1007/BF03019730

    Journal: European Journal of Anaesthesiology
    1. Reflex activity caused by laryngoscopy and intubation is obtunded differently by meptazinol, nalbuphine and fentanyl
    Freye, E.1; Levy, J. V.2
    European Journal of Anaesthesiology: January 2007 – Volume 24 – Issue 1 – p 53–58
    2. Pressure support ventilation with the ProSeal® laryngeal mask airway. A comparison of sevoflurane, isoflurane and propofol
    Keller, C.*; Brimacombe, J.†; Hoermann, C.*; Loeckinger, A.*; Kleinsasser, A.*
    European Journal of Anaesthesiology: August 2005 – Volume 22 – Issue 8 – p 630–633
    doi: 10.1017/S0265021505001055
    3. Mood change after anaesthesia with remifentanil or alfentanil
    Crozier, T. A.*; Kietzmann, D.†; Döbereiner, B.¶
    European Journal of Anaesthesiology: January 2004 – Volume 21 – Issue 1 – pp 20-24
    4. A randomised trial of the analgesic efficacy of ultrasound-guided transversus abdominis plane block after caesarean delivery under general anaesthesia
    Tan, Terry T.; Teoh, Wendy H.L.; Woo, David C.M.; Ocampo, Cecilia E.; Shah, Mukesh K.; Sia, Alex T.H.
    European Journal of Anaesthesiology: February 2012 – Volume 29 – Issue 2 – p 88–94
    doi: 10.1097/EJA.0b013e32834f015f
    5. Laryngeal tube S II, ProSeal laryngeal mask, and EasyTube during elective surgery: a randomized controlled comparison with the endotracheal tube in nontrained professionals
    Cavus, Erol; Deitmer, Wiebke; Francksen, Helga; Serocki, Goetz; Bein, Berthold; Scholz, Jens; Doerges, Volker
    European Journal of Anaesthesiology: September 2009 – Volume 26 – Issue 9 – p 730–735
    doi: 10.1097/EJA.0b013e32832a9932


  8. Worth reading Nick Brown on this. He cautions against taking Carlisle’s analysis as proof of fraud.


  9. Indeed, Carlisle makes two assumptions that are wrong (how many of his findings are explained by these wrong assumptions remains to be determined):
    1) His method of combining P-values only works if covariates are independent. They are not. Sex and height are a simple example.
    2) He analyses each trial as if simple randomisation were used. This is almost never the case: permuted blocks are common in pharma and minimisation outside pharma.

    Falsely assuming 1) will produce significant imbalance too easily and falsely assuming 2) will produce suspicious balance too easily.
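
Point 1) can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a reproduction of Carlisle's analysis: the per-covariate test statistics are drawn directly as standard normal values with a common pairwise correlation `rho`, the p-values are combined with a sum-of-z (Stouffer-type) rule that treats them as independent, and all function names and parameter values are hypothetical.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and SD 1


def combined_p(z_scores):
    """Combine per-covariate two-sided p-values with a sum-of-z rule
    that treats the covariates as independent."""
    total = 0.0
    for z in z_scores:
        p = 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))  # two-sided p for one covariate
        p = min(max(p, 1e-12), 1.0 - 1e-12)       # keep inv_cdf inside (0, 1)
        total += STD_NORMAL.inv_cdf(1.0 - p)      # small p -> large positive score
    return 1.0 - STD_NORMAL.cdf(total / len(z_scores) ** 0.5)


def flag_rate(rho, n_covariates=5, n_trials=10000, alpha=0.05, seed=1):
    """Share of honest, simply randomised trials whose combined p-value
    falls below alpha when covariates share a pairwise correlation rho."""
    rng = random.Random(seed)
    flagged = 0
    for _ in range(n_trials):
        shared = rng.gauss(0.0, 1.0)  # component common to all covariates
        zs = [rho ** 0.5 * shared + (1.0 - rho) ** 0.5 * rng.gauss(0.0, 1.0)
              for _ in range(n_covariates)]
        if combined_p(zs) < alpha:
            flagged += 1
    return flagged / n_trials


if __name__ == "__main__":
    print("independent covariates:", flag_rate(0.0))
    print("correlated covariates: ", flag_rate(0.8))
```

With independent covariates the flag rate should sit near the nominal 5%. With strongly correlated covariates the same honest trials are flagged as imbalanced noticeably more often, which is the direction of error described in the comment.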

    Interestingly, Carlisle refers to both of these problems, but I don’t think he really takes them into account. I think he does not at all appreciate the extent to which 2) is a problem.

    This does not change the fact that he has undertaken an interesting study. It does, however, indicate that one should be very cautious in drawing conclusions.
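
Point 2) can be sketched in the same spirit. The simulation below is a simplified illustration, not the procedure of any real trial: it uses a deterministic minimisation on a single binary covariate (sex), whereas real minimisation schemes usually balance several factors and include a random element. The function names, the 200-patient trial size and the 0.8 cutoff are all assumptions chosen for the example.

```python
import random
from statistics import NormalDist


def balance_p(women_a, n_a, women_b, n_b):
    """Two-sided p-value for a difference in the proportion of women between
    arms, computed as if simple randomisation had been used."""
    if n_a == 0 or n_b == 0:
        return 1.0
    pooled = (women_a + women_b) / (n_a + n_b)
    se = (pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b)) ** 0.5
    if se == 0.0:
        return 1.0
    z = (women_a / n_a - women_b / n_b) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def simulate_trial(rng, n_patients, minimise):
    counts = {"A": [0, 0], "B": [0, 0]}  # per arm: [men, women]
    for _ in range(n_patients):
        sex = rng.randrange(2)  # 0 = man, 1 = woman
        diff = counts["A"][sex] - counts["B"][sex]
        if minimise and diff != 0:
            arm = "B" if diff > 0 else "A"  # top up the arm with fewer of this sex
        else:
            arm = rng.choice("AB")          # simple coin flip
        counts[arm][sex] += 1
    n_a, n_b = sum(counts["A"]), sum(counts["B"])
    return balance_p(counts["A"][1], n_a, counts["B"][1], n_b)


def share_very_balanced(minimise, n_trials=2000, n_patients=200, cutoff=0.8, seed=1):
    """Share of honest trials whose baseline p-value exceeds the cutoff."""
    rng = random.Random(seed)
    hits = sum(simulate_trial(rng, n_patients, minimise) > cutoff
               for _ in range(n_trials))
    return hits / n_trials


if __name__ == "__main__":
    print("simple randomisation:", share_very_balanced(False))
    print("minimisation:        ", share_very_balanced(True))
```

Under simple randomisation, p-values above 0.8 should turn up in roughly a fifth of trials. Under this toy minimisation almost every trial looks that well balanced, so a test that assumes simple randomisation would read honest trials as suspiciously similar.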


  11. John Carlisle

    Thanks for the comments. The analysis I used is fairly ‘dumb’. As Stephen Senn and others have noted, the test makes a number of assumptions that are bound to be wrong on many occasions. I listed these assumptions in the paper and on other blogs, for instance on Nick Brown’s site.

    The results of the test don’t identify fraud. ‘Unexpected’ results will be the consequence of: 1) my errors; 2) the various assumptions in the test being incorrect (to varying degrees with different trials); 3) errors by authors / editors / typesetters (most of which will be innocent, a few the result of data fabrication); 4) chance.

    I strongly support the statement that conclusions should be drawn cautiously. I think that no trial should be viewed as fraudulent without further work and ultimately after investigation, if deemed appropriate after analyses of multiple trials by an author group.

    I think that some of the tweets have been sensationalist and unwarranted. My main aim is to get the science right – I’m not on a ‘crusade’ to identify fraudsters, although I think it worthwhile to do so.

  12. I’ve read the discussions, and John’s paper is highly interesting regardless of the criticisms that have been made. As a tool, it is valuable as a screening mechanism, not as a detection mechanism (as John has repeatedly stated, although he is continually misinterpreted). The 1.7% rate in the paper is the first prevalence estimate of data anomalies in clinical trials, which surprisingly corroborates self-reports of data fabrication in surveys (Fanelli, 2009).

    The criticisms across all the blogs seem to follow three main points:

    1) Correlated measures
    2) Non-random assignment
    3) Interchanging of SD and SEM

    For 1, sensitivity analyses would be possible with methods that take into account correlations between the various results, repeated while varying the assumed degree of correlation.
    For 2, I find it odd that these permuted block designs are used as justification if a trial is reported to be an RCT. If it uses a permuted block design, it is no longer a true RCT, and this should be noted in the paper. As such, the test then flags a deviation from the reported protocol, which is an anomaly after all.
    For 3, sensitivity analyses are easy: check whether swapping SD for SEM (or vice versa) would remove the flag. In those cases you could give the benefit of the doubt. I agree with John that the ‘dumb’ approach works best, because it avoids arbitrary decisions about whether authors misreported SEM as SD.
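
The SD/SEM check in point 3 could look something like the sketch below. It is a rough illustration only: it uses a normal approximation for a single baseline variable rather than the method in Carlisle's paper, and the function names, the 0.001 threshold and the example numbers are all hypothetical.

```python
from statistics import NormalDist


def baseline_p(mean_a, spread_a, n_a, mean_b, spread_b, n_b, spread_is="sd"):
    """Two-sided p-value for a baseline difference between two arms,
    reading the reported spread either as an SD or as an SEM."""
    if spread_is == "sd":
        se_a, se_b = spread_a / n_a ** 0.5, spread_b / n_b ** 0.5
    elif spread_is == "sem":
        se_a, se_b = spread_a, spread_b
    else:
        raise ValueError("spread_is must be 'sd' or 'sem'")
    z = (mean_a - mean_b) / (se_a ** 2 + se_b ** 2) ** 0.5
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def is_flagged(p, threshold=0.001):
    """Flag arms that look far too different or far too similar."""
    return p < threshold or p > 1.0 - threshold


def flag_survives_relabelling(row, threshold=0.001):
    """row = (mean_a, spread_a, n_a, mean_b, spread_b, n_b).
    True only if the row is flagged under both readings of the spread."""
    p_sd = baseline_p(*row, spread_is="sd")
    p_sem = baseline_p(*row, spread_is="sem")
    return is_flagged(p_sd, threshold) and is_flagged(p_sem, threshold)


if __name__ == "__main__":
    # Hypothetical body-weight row: mean (spread) of 70.0 (1.2) vs 71.5 (1.3), 50 per arm.
    row = (70.0, 1.2, 50, 71.5, 1.3, 50)
    print("p if spread is SD: ", baseline_p(*row, spread_is="sd"))
    print("p if spread is SEM:", baseline_p(*row, spread_is="sem"))
    print("flag survives relabelling:", flag_survives_relabelling(row))
```

In this made-up row the arms look wildly different if the spread is taken as an SD, and unremarkable if it is taken as an SEM. The flag therefore does not survive relabelling, and the authors would get the benefit of the doubt.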

    I’ll try to address these issues in a similar project I am doing, where I analyze RCTs in a database with these methods. If anyone is interested in following this project, please view
    and feel free to get in touch if you’d like to join the project or have comments.


  13. Hi Leonard, I found your post a terrific summary of John’s impressive work. I bookmarked it for my next week reading list and, with your permission (which I am asking), plan to write a sort of short digest piece in Chinese. Is that OK with you?

