Five years ago, a materials science paper by German researchers announced exciting progress in graphene technology. The authors “received partial funding from the European Union Seventh Framework Programme under grant agreement no. 604391 Graphene Flagship”, a one-billion-euro investment by the EU and its member states, running from 2014 till 2024.
Two years after a brief criticism of its figures on PubPeer (and the promise of a quick correction), this paper is about to be retracted by its authors, who had so far insisted that the conclusions were unaffected. All because two of my readers, one anonymous and one named, investigated the corrected data the authors provided and found that it shows something quite different from what the study originally claimed. In addition, the University of Erlangen is investigating the case.
One of these experts, the Dutch physicist Maarten van Kampen, has now written this guest post to explain the affair.
Anatomy of a Retraction
By Maarten van Kampen
In January 2020, the PubPeer user Pseudopaludicola llanera found something amiss with a Nature Communications paper: successive spectra looked very similar, even in their noise. This is very unusual for measurement data and often indicates a problem.
Philipp Vecera, Julio C. Chacón-Torres, Thomas Pichler, Stephanie Reich, Himadri R. Soni, Andreas Görling, Konstantin Edelthalhammer, Herwig Peterlik, Frank Hauke, Andreas Hirsch Precise determination of graphene functionalization by in situ Raman spectroscopy Nature Communications (2017) doi: 10.1038/ncomms15192
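The tell-tale sign described above, noise that repeats from one spectrum to the next, can also be checked numerically. The following is only a minimal sketch on synthetic data, not the method used by the PubPeer commenter: the function names, the first-difference trick to suppress the smooth baseline, and the toy spectra are all assumptions made for illustration.

```python
import math
import random

def first_differences(ys):
    """Point-to-point differences: a crude high-pass that removes a smooth baseline."""
    return [b - a for a, b in zip(ys, ys[1:])]

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equally long sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def noise_correlation(spec_a, spec_b):
    """Correlation of the noise-dominated part of two spectra (hypothetical helper)."""
    return pearson(first_differences(spec_a), first_differences(spec_b))

# Synthetic example: two independent measurements on the same smooth baseline.
rng = random.Random(0)
n = 500
baseline = [0.01 * i for i in range(n)]
measured_a = [b + rng.gauss(0, 1) for b in baseline]
measured_b = [b + rng.gauss(0, 1) for b in baseline]

# A "cumulative" second curve contains the first one, noise included.
summed_b = [a + b for a, b in zip(measured_a, measured_b)]

independent = noise_correlation(measured_a, measured_b)
accumulated = noise_correlation(measured_a, summed_b)
```

For genuinely independent measurements the correlation should hover around zero, whereas a curve that contains its predecessor shares that predecessor's noise and correlates strongly with it.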
Within 10 days the authors responded and acknowledged the issue: the successively measured curves were not merely offset, but had inadvertently been accumulated because the wrong box was checked in the Origin plotting software. As a result only one spectrum per panel was correct: the first, black one. Every other spectrum shown was the sum of itself and all preceding ones, and thus very different from the actual measurement.
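To make the difference between the two plotting modes concrete, here is a toy model of what the authors describe. How Origin implements its option internally is not covered in this post; the function names, the `step` parameter and the three-point "spectra" are hypothetical and serve only as illustration.

```python
from itertools import accumulate

def waterfall_offset(spectra, step):
    """Intended plot: spectrum i is shifted upwards by i * step, nothing else."""
    return [[y + i * step for y in s] for i, s in enumerate(spectra)]

def waterfall_cumulative(spectra, step):
    """Erroneous plot: curve i is the running sum of spectra 0..i, plus the shift."""
    running = accumulate(spectra, lambda acc, s: [a + b for a, b in zip(acc, s)])
    return [[y + i * step for y in total] for i, total in enumerate(running)]

# Three made-up spectra with three data points each.
spectra = [[1.0, 5.0, 1.0], [2.0, 1.0, 1.0], [0.0, 1.0, 4.0]]

offset = waterfall_offset(spectra, step=10.0)
cumulative = waterfall_cumulative(spectra, step=10.0)
```

In this sketch only the first curve is identical in both modes. From the second curve onwards the cumulative version still carries the peaks of every earlier spectrum, which is why features can appear to persist in the plot even though they are gone from the measurement.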
There was also good news: author Stephanie Reich, physics professor at Freie Universität (FU) Berlin, assured Pseudopaludicola on PubPeer that “The analysis and conclusion of our paper remain valid”, and the last and corresponding author Andreas Hirsch, chemistry professor at the Friedrich Alexander University (FAU) Erlangen-Nürnberg explained that
In another bit of good news, the authors announced in January 2020 they already had contacted Nature Communications and the correction could come fast:
Weeks became months, and months became years. After more than two years our Leonid sent a friendly reminder to the authors, the Nature Communications editor, the university ombudspersons, and the funding agencies DFG and Graphene Flagship about the lack of progress with the correction. He may also have mentioned things like ‘data manipulation’ and ‘falsified’. This resulted in a quick reply: a correction had been submitted to Nature Communications in November 2021, leaving the conclusions unaffected. And to relieve your suspense: from here it took 16 days before the authors requested a retraction of their paper.
In this paper, ‘Precise determination of graphene functionalization by in situ Raman spectroscopy’, a potassium graphite intercalation compound (GIC) was studied. To create the GIC, graphite is ‘cooked’ with potassium (K) in an inert environment. The potassium atoms move in between the graphene sheets that make up the graphite and eventually form a regularly ordered GIC. This way the boring black graphite can be turned into a nice blue (KC24) or bronze-colored (KC8) GIC that, as an added bonus, will burn when exposed to air.
To make this science one can slowly expose GICs to air or water and watch what happens using Raman spectroscopy. A Raman spectrum provides some sort of a fingerprint, allowing one to track the changes in the material. Typical ‘fingerprints’ are shown in Fig. 2b above, with e.g. the ‘Cz mode’ and ‘Fano shape’ characteristic for KC8, and the G peak with a small D (disorder) peak typical for graphite. When looking in more detail at the peaks and humps one can even make the “precise determination of the graphene functionalization” from the paper’s title. When looking at the correct spectra, that is.
In their reply the authors included a link to an online repository containing a comprehensive overview of the raw measurement data and the correction notice itself:
“(…) The Manuscript and Supplementary Information associated with this article contain errors of presentation in the main manuscript Fig. 1c and in the Supplementary Fig. S2 (see Ref. 11). The waterfall plots in the figures were stacked as “cumulative” plots, which adds them up consecutively in addition to displacing them vertically (Origin software). We only intended to displace the spectra vertically; the adding was inadvertently performed. In Fig.1 we present the correct spectra that have only a vertical shift. The main findings of our paper (the development of D bands and their assignment) remain valid and were not affected by the mistake. (…)”
In the years since the PubPeer notification the authors found that only Fig. 1c and Fig. S2 were affected by the Origin summation issue and that this did not change conclusions.
Take for example the original and corrected Fig. S2a below:
To a casual observer the left and right figure may look quite different. However, the correction states that:
The figure shows the evolution of Raman spectra when exposing KC8 (black curve) to hydrogen gas. In the article’s main text the authors write “We expected that KC8 should not give rise to covalent hydrogenation with H2 gas under these conditions”. In other words, the material should not change too much. They saw this expectation corroborated in their Raman spectra:
“Indeed, as can be seen in Fig. 2b, H2 exposure does not yield any covalent binding to the graphene lattice since the Fano-shaped signature of stage I GICs is largely preserved (Fig. 2b, blue). The corresponding evolution of the Raman spectra (Supplementary Fig. 2a) rather indicates H2 intercalation, leading to (H2)@nK+C8n . The intercalation of H2 in between the graphene sheets is clearly corroborated by an increasing intensity of the CZ mode.”.
This analysis may be ‘clear’ for the original figure, but definitely not for the corrected one. The handy vertical ‘Cz’, ‘D’, and ‘G’ lines provided in the correction show that the Cz mode and Fano shape disappear and that graphite-like D and G peaks appear: the material turns into something disordered graphitic and certainly does not stay a KC8 GIC…
How about the corrected Fig. 1c in the main text then? Here the KC8 GIC is exposed to water vapour:
In the original figure the ‘Cz mode’ persists during the oxidation process. Based on this the authors conclude:
“Both the presence of a CZ mode and the absence of any second-order mode in the final spectrum show that the GIC oxidation of the bulk crystal is not complete. Obviously, the oxidation potential of H2O and the limited mobility of K+ in the inner part of the crystal are not sufficient enough to allow for a complete bulk reoxidation, but can be used for a surface or thin film functionalization.”.
In the corrected figure this is all far less ‘obvious’. In fact, in the correction’s text the authors implicitly walk back their main text conclusions and now state that:
“The Cz mode around 550 cm-1 disappears when the sample is exposed to H2O. This effect is clear evidence that the GIC is reacting within the water molecule and the intrinsic KC8 tends to de-intercalate through the presence of H2O.”.
In other words, the GIC is not partially oxidized but actually completely falls apart. It is also noteworthy that in the correction the final water-exposed Raman spectrum (09) is very similar to the final H2 exposed spectrum (05). After correction it does not matter whether one exposes with H2O or H2, see the comparison below.
The issues above already invalidate quite a bit of the main text on the in-situ results, including many conclusions and ‘infographics’ depicting the assumed degradation processes:
In crossing out the bottom-right schematic I am running slightly ahead, as we have not yet discussed Fig. 2b of the paper. The figure, reproduced below, turns out to be a mix of ‘summation affected’ (blue and red) and ‘correct’ Raman spectra. It thus was in dire need of correction, but was forgotten in the document that the authors sent to the journal editor. That the summation issue also strikes here is very surprising, as none of the curves that should have been summed are shown. Additionally, two post-exposure curves are correctly displayed.
Apart from the incorrect representation there are also things amiss with the interpretation. As explained above, the authors originally concluded that after water exposure the oxidation of the KC8 GIC was not complete. With that in mind they further exposed the material to O2, as per the main text:
“Subsequently, the sample was exposed to oxygen in the presence of water. Under these conditions the material should be reoxidized and simultaneously a hydroxylation of the carbon scaffold can be expected. (…) The in situ Raman analysis in Fig. 2b clearly revealed that in this case further covalent binding is promoted. (…) In the Raman spectra this is reflected by an additional defect site-related interband appearing at 1,460 cm-1. (…) These findings prove that the covalent hydroxylation of graphenide requires the presence of both oxygen and water that are omnipresent under ambient conditions. (…)”.
In short: the authors believe that additional exposure to O2 results in graphene-OH bonds, something that is ‘clearly revealed’ by changes in the Raman spectrum around 1460 cm-1. This would in turn prove that O2 is needed for hydroxylation. In the figure below the gray H2O->O2 exposed spectrum is compared to the green H2O-only spectrum, corresponding to curve (09) in the correction. In this direct comparison, panel (a), the differences are minimal. When using the (08) spectrum of the correction instead, the curves are as good as identical, panel (b). Together with the summation issues it seems that most conclusions drawn from Fig. 2b are void.
The source of the gray H2O->O2 exposed spectrum is a bit of a mystery. In these in-situ experiments it is usual to follow the same material throughout the exposure sequence. In this case that would mean continuously recording Raman spectra whilst first exposing the material to H2O and subsequently to O2. The text, caption, and infographics in Fig. 2a strongly suggest this sequence of events. However, the authors state that the data in the Fig. 1c correction “correspond to the complete experiment starting from a fully intercalated KC8 sample up to a fully exposed sample to ambient conditions after a H2O vapour”. This dataset represents a H2O-only exposure and does not contain the gray curve. Also the final air (O2) exposed curve (10) is different, and does not show a 1460 cm-1 feature.
When the authors checked their data more thoroughly, the above issue turned out to be worse. The same gray ‘H2O->O2’ spectrum also featured in Fig. 1d and Fig. 4b. But in these latter two cases it represents different stages of a H2O-only exposure:
The suboptimal bookkeeping is also evident when looking in more detail at the original data and the correction. The summation-affected curves from the original paper can be corrected by taking the difference between successive curves. In the figure below the original Fig. S2a data, corrected in this way, is shown in the middle panel. It can be seen that these spectra have a one-to-one correspondence to the curves shown in the authors’ corrected Fig. S2a. However, their order has been changed: the originally 4th spectrum becomes the 6th and last spectrum in the corrected H2 exposure series. And this is exactly the spectrum that is very similar to the last spectrum in the H2O exposure series, see (our) Fig. 5b.
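The differencing step mentioned above is simple enough to sketch. This is a minimal illustration on made-up numbers, assuming the published curves are plain running sums of the measured spectra; the function name and the data are hypothetical, and real digitized curves would additionally carry the constant vertical plotting offset, which differencing leaves behind as a harmless constant shift.

```python
def undo_cumulative(curves):
    """Recover individual spectra from running sums by differencing successive curves."""
    recovered = [list(curves[0])]  # the first curve was never summed with anything
    for prev, curr in zip(curves, curves[1:]):
        recovered.append([c - p for p, c in zip(prev, curr)])
    return recovered

# Made-up cumulative curves: each one is the sum of all spectra up to that point.
summed = [[1.0, 5.0, 1.0], [3.0, 6.0, 2.0], [3.0, 7.0, 6.0]]

recovered = undo_cumulative(summed)
```

In this toy case the strong middle peak of the first spectrum vanishes from the recovered second and third spectra, although it is still clearly present in the summed curves.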
The authors were first contacted by Leonid on February 1st (2022). The above observations were sent in a small report to the authors, Nature Communications editor, ombudsperson, … on February 7th.
On February 17th the editor got somewhat worried and sent out an e-mail. Apparently one of the authors had contacted her on February 3rd to notify her that “some minor adjustments must be taken into consideration regarding the conclusion of the manuscript in order to remain valid with respect to the corrections”. She had just read the February 7th report and suggested that a ‘Matters Arising’ publication in Nature Communications, followed by either a correction or retraction, would be a better route. And half an hour later Prof Hirsch sent on behalf of the authors an (already prepared) retraction request to the editor:
“(…) following the conversation with Dr. Julio C. Chacon-Torres from 3rd February, concerning the corrections related to our article “Precise Determination of Graphene Functionalization by in situ Raman Spectroscopy” (2017, 8, 15192 – DOI:10.1038/ncomms15192), in my function as the corresponding author I would like to inform you that the authors retract this publication. During the last week, we had several meetings and intense discussions among the co-authors and re-evaluated the data and their presentation in the paper in detail. We discovered mistakes and inconsistencies in the data presentation that invalidate and/or question several conclusions of our work as we outline in the following. (…)”
From a distance it is obviously hard to see what went wrong here. The data handling seems to have been ‘reckless’. Especially for Fig. 2b, showing a mix of summation-affected and ‘correct’ spectra, (intentional) manipulation cannot be excluded as this figure cannot be the result of a single inadvertently checked option in Origin. It is also surprising that it took the authors only 10 days from the first PubPeer report to the ‘conclusions not affected’ statement. And then some 22 months to send out a correction that actually missed most of the issues.
In a subsequent e-mail exchange with an author it was revealed that the University of Erlangen has opened an investigation into the case. The cross-disciplinary collaboration between many groups was mentioned as a contributing factor to the data integrity issues, together with ill-defined responsibilities of the principal investigators and unclear lines of reporting.
I would add to the above that the usual way of writing corrections is also not helpful. The submitted correction is quite typical for the genre, taking pains not to retract any of the previous conclusions. The description of the corrected Fig. 1c is exemplary, implicitly walking back a number of conclusions in the main text and explicitly stressing that the ‘main findings’ remain valid. It would be much more helpful to readers if corrections explicitly stated which (sub-)conclusions changed after correction. I believe that this may even have prevented the current situation. A correction of the form ‘Original conclusion: Cz mode remains, KC8 partially oxidized by H2O. Updated conclusion: Cz mode disappears, KC8 fully oxidized by H2O’ may for example have alarmed a few authors about the results of the subsequent O2 exposure experiment.
The long delay between notification and correction was attributed to COVID and the large collaboration. This seems, again from a distance and without knowing the details, a strange excuse. The authors recognized the issue some 6 weeks before the COVID lockdowns in Europe. During the COVID years the authors continued publishing at their usual rate, suggesting that COVID was at least not affecting their ability to produce new results. Despite the pandemic restrictions and lockdowns, Prof Hirsch has published between 70 and 80 new research papers since January 2020, but not the announced correction.
The above is also true for the second author Chacón-Torres, living in hard-hit Ecuador. In the relevant period he even submitted two papers together with Prof Hirsch: one article that was submitted three months after the issues were acknowledged on PubPeer, and another article that was submitted three months before the submission of the correction. Both papers appear to be large, international, and cross-disciplinary collaborations.
To end on a positive note: the authors feel the responsibility to not just retract, but also to correct the paper. They plan to re-analyze the data and perform new experiments and additional controls to validate or reject their hypotheses. Which is good, because the work is actually quite cool.
Notes by LS:
- Only the first author of the study, Philipp Vecera, could not be reached by email. I found out that he changed his last name to Eckerlein and has worked since 2017 at the chemical company Clariant. I reached out to Dr Eckerlein via LinkedIn but he did not react.
- Which is a pity, because Vecera/Eckerlein’s PhD thesis, supervised by Hirsch and defended at FAU in 2017, contains a number of figures of the Nature Communications paper which are now proven to be problematic (see an example above). I wonder if the current FAU investigation will include this dissertation and draw consequences? By the way: Dr Eckerlein used to work for many years as a football referee, making sure everyone plays fair and doesn’t foul or cheat.
- I never received any replies from the university or its ombudspeople, but they always received the expert opinions in parallel with Hirsch and other co-authors. I hope when they use these experts’ analyses in their investigative report they will give due credit!
- The analysis of the other expert is available below. This expert also commented about the peer review process: “no way reviewers could evaluate the content when nearly all spectra were digital product but not real data.”
I thank all my donors for supporting my journalism. You can be one of them!