Research “parasitism” and authorship rights

Two seemingly opposing medical editorials on the subject of data sharing have recently been published. One, by the International Committee of Medical Journal Editors (ICMJE), appeared in all of its member journals; a non-paywalled version can be read freely at PLOS Medicine. Its lead author is ICMJE Secretary Darren Taichman, professor of medicine at the University of Pennsylvania. The message goes:

“(ICMJE) believes that there is an ethical obligation to responsibly share data generated by interventional clinical trials because participants have put themselves at risk […]

As a condition of consideration for publication of a clinical trial report in our member journals, the ICMJE proposes to require authors to share with others the deidentified individual-patient data (IPD) underlying the results presented in the article (including tables, figures, and appendices or supplementary material) no later than 6 months after publication”

The other editorial, in the New England Journal of Medicine (NEJM), authored by the journal’s editors Dan Longo and Jeffrey Drazen, both professors of medicine at the Harvard Medical School, takes a seemingly diametrically opposed stand, accusing those interested in accessing published data of nothing less than parasitism:

“…people who had nothing to do with the design and execution of the study but use another group’s data for their own ends, possibly stealing from the research productivity planned by the data gatherers, or even use the data to try to disprove what the original investigators had posited. There is concern among some front-line researchers that the system will be taken over by what some researchers have characterized as ‘research parasites’.”

The NEJM editors obviously think it wrong when other scientists attempt to re-analyse published data to assess its true validity. They are worried about the reputation of original authors who have published faulty interpretations of data and erroneous conclusions. This approach puts the individual scientist above science.

A major storm broke out on social media, and I joined in as well. It seemed that the good, transparent and modern ICMJE stood on one side and the evil, secretive and backwards NEJM on the other. In fact I, like many others, got it wrong: the reality was more complicated. After contacting the NEJM editors, I understood that the two editorials are not that different after all.

The solution to the riddle lies with another message in the NEJM editorial which drew lots of controversy:

“How would data sharing work best? We think it should happen symbiotically, not parasitically. […] report the new findings with relevant coauthorship to acknowledge both the group that proposed the new idea and the investigative group that accrued the data that allowed it to be tested”.

This means only one thing: scientists who wish to re-evaluate or use the published data generated by someone else must grant co-authorships to the latter, as opposed to the usual practice of acknowledging the source with citations. This expectation goes completely against the ICMJE recommendations on “Defining the Role of Authors and Contributors”, which are:

“The ICMJE recommends that authorship be based on the following 4 criteria:

  • Substantial contributions to the conception or design of the work; or the acquisition, analysis, or interpretation of data for the work; AND

  • Drafting the work or revising it critically for important intellectual content; AND

  • Final approval of the version to be published; AND

  • Agreement to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved”.

Otherwise, ICMJE recommends:

“Contributors who meet fewer than all 4 of the above criteria for authorship should not be listed as authors, but they should be acknowledged”.

Obviously, simply allowing others access to data you have already published does not satisfy the authorship criteria for their future paper. In such cases, appropriate literature citations and a specific mention in the acknowledgements should fully suffice. However, authorships on research papers are highly prized assets, and scientists often ruthlessly jump at every opportunity to co-author another paper. The problem of undeserved authorships is well known to anyone who has ever worked in academia. In fact, in the obsessively publication-counting field of medicine it is likely to be much more widespread than elsewhere. Many papers feature authors who did nothing beyond sharing a published or even a commercial reagent. Some were simply powerful or important enough to enforce or to be granted an authorship. Therefore, undeserved academic authorships are nothing but bribes, given with the sole purpose of advancing one’s own career. And this kind of corruption is rife, if not epidemic, in medicine and clinical research.

The ICMJE editorial also advises on how to give due credit to the original source when using shared published data:

“authors of secondary analyses […] must reference the source of the data using a unique identifier of a clinical trial’s data set to provide appropriate credit to those who generated it and allow searching for the studies it has supported”.

However, this kind of credit seems to be what counts as parasitism for Drazen and Longo, whom I have contacted by email. I received this reply from the former:

 “We said that some people described data users as parasites, but we were describing a better way to share data as symbiosites. The NEJM is a member of the ICMJE and strongly endorses its open data proposals-we published the same editorial that you cited in PLOS and you will see I am an author on it.  There is no controversy here”. 

Indeed, Drazen is a co-author of the ICMJE editorial. But why then does he insist on granting authorships, obviously undeserved in light of ICMJE recommendations?

When asked to clarify his demand for authorship for the original owners of the data, Drazen added:

“What we need is a method for people who generate data in clinical trials to get appropriate credit for their work- this needs to go beyond a citation.  We are asking the community to help us solve this vexing problem”.

A closer look at the ICMJE editorial shows that a very similar thing is being proposed there:

“In addition, those who generate and then share clinical trial data sets deserve substantial credit for their efforts. Those using data collected by others should seek collaboration with those who collected the data. However, because collaboration will not always be possible, practical, or desired, an alternative means of providing appropriate credit needs to be developed and recognized in the academic community. We welcome ideas about how to provide such credit”.

In a nutshell, it seems all medical editors of ICMJE, including those at NEJM, support the call for data sharing. Members of ICMJE are apparently not generally opposed to independent analysis of published data, even without the involvement of the original authors. Maybe less so the NEJM editor Drazen. When specifically asked to elaborate on his concern about the parasites using “the data to try to disprove what the original investigators had posited”, he wrote to me:

“Open data is one way to address reproducibility but that crisis is not rampant in the realm of controlled clinical trials-the realm where open data is under consideration.  As we said in the ICMJE editorial, the major reason for open data is to honor the sacrifice made by research participants.  They are the real heroes in this enterprise!”

This sounds like a half-hearted acceptance of the changing times, in the face of the increasingly common stipulations about open data sharing coming from funders, journals and even the occasional research institution.

Under these circumstances, it is less the data itself than the authorship credit for it that the clinical scientists are so afraid to lose. These doctors seem to be grudgingly coming to terms with the new times, where authorships cannot be automatically given and taken without any scientific contribution to the research project at hand. But they also are not prepared to be simply thanked in acknowledgements, with their paper cited in the list of references. And they are certainly not keen on seeing their high-profile published research (NEJM has a colossal journal impact factor of 56!) being picked apart by their critical peers.

So what exactly are Drazen and his ICMJE colleagues proposing? To introduce conditions under which original data from clinical trials can be released, along with demands for collaborations (meaning co-authorships) with its “owners”? Or will ICMJE soon change its authorship criteria to include the ownership of original published data?

19 thoughts on “Research “parasitism” and authorship rights”

  1. See the so-called ‘data paper’ published on 20 January 2016 (Stienen EWM, Desmet P, Aelterman B, Courtens W, Feys S, Vanermen N, Verstraete H, Van de walle M, Deneudt K, Hernandez F, Houthoofdt R, Vanhoorne B, Bouten W, Buijs RJ, Kavelaars MM, Müller W, Herman D, Matheve H, Sotillo A, Lens L (2016) GPS tracking data of Lesser Black-backed Gulls and Herring Gulls breeding at the southern North Sea coast. ZooKeys 555: 115-124. doi: 10.3897/zookeys.555.6173).
    Copy/pasted from this paper: “Usage norms. To allow anyone to use this dataset, we have released the data to the public domain under a Creative Commons Zero waiver. We would appreciate however, if you read and follow these norms for data use and provide a link to the original dataset whenever possible. If you use these data for a scientific paper, please cite the dataset following the applicable citation norms and/or consider us for co-authorship. We are always interested to know how you have used or visualized the data, or to provide more information, so please contact us via the contact information provided in the metadata, or […]”


  2. Readers will be surprised at how many COPE member journals, which claim to follow ICMJE authorship guidelines, are still showing (and using/imposing) the old, 3-clause definition on their web-pages, years after the ICMJE definition of authorship changed from 3 clauses to 4 in 2012. The corrosion of science ethics, in my opinion, lies right before our eyes: the main proponents of the powerful publishing elite impose one set of rules on authorship, but then fail to follow their most basic self-imposed rules and guidelines. It makes one wonder who are the real “parasites”…

    The article by Dr. Schneider provides some nice new insight into data sharing. But scientists must read the fine print in the definitions to appreciate how much irregularity there is.

    Read further in two recent papers of mine:

    Teixeira da Silva, J.A., Dobránszki, J. (2016) How authorship is defined by multiple publishing organizations and STM publishers. Accountability in Research: Policies and Quality Assurance 16(2): 97-122.
    DOI: 10.1080/08989621.2015.1047927

    Teixeira da Silva, J.A., Dobránszki, J. (2016) Multiple authorship in scientific manuscripts: ethical challenges, ghost and guest/gift authorship, and the cultural/disciplinary perspective. Science and Engineering Ethics (in press).
    DOI: 10.1007/s11948-015-9716-3


  3. Perhaps we need a second tier of citations, “supercitations” if you will, that can only be used in reference to a data paper which forms the basis for a new paper.

    These supercitations would be counted separately and individuals could build up a high “super h-index” by publishing useful datasets.


    1. But isn’t a big clinical study and its number of citations a quality argument in itself?
      It would be more reasonable to adopt ethics guidelines on data sharing. Interested researchers should always invite original authors to collaborate, but this mustn’t ever be a prerequisite for data sharing. And a collaboration should be an actual research contribution, in agreement with the ICMJE authorship definition; otherwise acknowledgements and citation must suffice.


  4. Who actually collects these data? What even makes it possible?

    I think there should be a new system of credit for super research assistants who did all the hard work.

    And what about a credit system for the patients who make it possible to gather information?

    Come to think of it, what about a credit system for the taxpayers who mostly pay for research to be possible in the first place?

    Who are the real “parasites” in this case?


  5. I think part of the problem lies in the concept of ‘ownership’ of data. I’ve seen many references to authors not wanting to grant access to ‘their data’ to others who did not collect the data, etc. However just because a person collected the data does not mean that they ‘own’ the data, it merely means that they were the ones who collected it.

    In my view, the person who ‘owns’ the data is the person or group who paid for the data to be collected, not the people who did the collecting. Since the data collectors presumably were paid their normal salary + expenses during the data collection process, this means that they performed a service for which they were recompensed and that’s that. Imagine if a big mining company paid its workers to dig precious metals and then the miners simply walked away with the metals and said that ‘hey, we were the ones who collected it so it’s ours’. This is precisely what is occurring with the present system of research data ‘ownership’.

    I’ve seen other commentary like “until and unless we come to grips with the gritty, human, competitive reasons why data aren’t shared, we’re headed to a future of righteous proclamations–and little data movement”, but that’s a bunch of needless bullshit. We don’t need to understand why people are reluctant to change, why people are greedy, stubborn, bullheaded, territorial, etc, we only need to understand that such behaviors do in fact occur and then to immediately implement the most expedient way(s) to circumvent such behaviors.

    In this case it seems like what is needed is for funding bodies to require that data be made freely available within a pre-specified timeframe as a requirement for investigators to receive future funding and also for journals to require datasets be uploaded as a pre-requisite for publication. Done, no endless hand-wringing required.


  6. Jeffrey Drazen wrote: “What we need is a method for people who generate data in clinical trials to get appropriate credit for their work- this needs to go beyond a citation. We are asking the community to help us solve this vexing problem”.

    Calling people “data parasites” is a strange way to ask for their help.


  7. I am afraid I can’t interpret Longo and Drazen’s editorial as anything but a full-guns attack against responsible science in the modern age, regardless of whether they support open data done their “right” way.

    Collecting data is important and sometimes difficult work. So is doing a good analysis of data. There is already a way to credit people whose work you’re building on: citations. It seems as though Longo and Drazen are giving voice to an arrogance that considers data collection real work, but computational analysis not of real value. Both sides are essential when addressing complex phenomena. No-one’s contribution should be belittled.

    The idea of parasites using authors’ data to disprove the authors’ own points is even more wrong-headed. Authors often do muck up their own data analysis so badly that it only serves to highlight their own biases. The “parasites” who would carefully re-analyze the data and show why the result is likely wrong ought instead to be treated as the same kind of hero as those who painstakingly replicate experiments. This is essential to building accurate knowledge, and we need more of this rather than less given all the concerns about published results being reproducible in only a modest fraction of cases.

    The idea that collaboration is the best way to go needn’t arise in contrast to parasitism or woeful misunderstanding. Sometimes it’s the best way to drive science forwards. But sometimes the more independent or even adversarial approach is necessary to avoid having science driven backwards by uncooperative researchers whose attachment to their hypotheses goes beyond what their data can support.


  8. While everybody’s squabbling about authorship rights, an issue that will continue to be debated until the end of time, at least while there’s still science around, far fewer are discussing what authors’ rights are. This is where scientists should be focusing their efforts to make the system more just.


  9. Do medical researchers care about medicine or about themselves? Since when are data not published before the first paper based on them?

    E.g. in astronomy:
    “[. . .]
    [. . .] data will immediately become public.
    [. . .]

    [. . .]

    5.4 Archive and data analysis
    All DUAL data (~ 25 TB/year, of all processing levels), will be archived [. . .] and
    made publicly available on the internet.
    [. . .]
    [. . .] the raw data and the reconstructed events will be made publicly available continuously
    within a month of data acquisition. When a significant improvement is achieved in the understanding of the
    mission, a new version of the reconstructed event list will be generated and made publicly available.
    [. . .]”

    The term “parasitism” was coined many years ago by Carolyn Phinney. I quote from an email of hers of April 2001 on SciFraud with the Subject field “Re: What is “fraud”???”:
    “I’m jumping in on this discussion midway…
    I just wanted to mention that legally fraud is:
    when you 1) lie or make false promises or promises with a reckless
    unconcern for their truth or falsity to 2) induce someone to do something
    that they wouldn’t have done if they knew the truth and 3) therefore they
    are harmed.

    It would be interesting to try to map this onto the broader way in which
    we use the term.

    At least some of the cases we have discussed fit this legal definition.
    Phinney v. Perlmutter was a case of fraud. She lied to me to induce me
    to give her privileged access to my research and grant applications and
    data and methods and I wouldn’t have given her this access had I known
    she was lying and then she took these and claimed they were her own and
    deprived me of the products thereof, etc.

    Now, I think Antonia Demas’s case would fall under the legal definition
    of fraud in most of our minds. Likewise, Marianne Zorza’s case and Madi
    Gupta’s case and Lorie Ellis’s case and many of the other cases where a
    scientist is deprived of their research and grants by virtue of lies and

    I call this kind of fraud “parasitism.”

    There is another whole category that we discuss which I call “cheating”
    which includes vita fraud, data fakery, results fudging, etc.

    I have found CHEATING & PARASITISM to be useful conceptual categories in
    my thinking about science fraud. However, I’m not sure how clearly the
    cheating type of “fraud” fits into legal categories. One needs an
    injured party for most legal categories and here that party is ambiguous.
    And there are other problems…

    I hope some of these reflections are useful. Please don’t send me hate
    mail, if they are not.

    [. . .]”


