
A rule-based structure of three pigs

Smut Clyde came to check how the Elsevier journal Microprocessors & Microsystems has so far handled its "problems caused by dishonest guest editors and reviewers".

Smut Clyde follows up on a study by Cabanac, Labbé & Magazinov about fake papers and how to spot them, to remind the Elsevier journal Microprocessors & Microsystems that it is a steaming pile of fraudulent papermill garbage.

In 2021, a trio of sleuths published this preprint, which focussed on the business relationship this Elsevier journal developed with the papermills in the last couple of years:

Guillaume Cabanac, Cyril Labbé, Alexander Magazinov. Tortured phrases: A dubious writing style emerging in science. Evidence of critical issues affecting established journals. arXiv (2021), doi: 10.48550/arXiv.2107.06751

The authors also informed the journal’s Editor-in-Chief Lech Jozwiak, Professor Emeritus at the Eindhoven University of Technology in the Netherlands, who in August 2021 replied with a promise “to analyse and solve the problems caused by dishonest guest editors and reviewers”. Which he obviously didn’t do, as Smut Clyde will demonstrate to you in a moment. No wonder that neither Jozwiak nor his chief editor colleague Francesco Leporati of University of Pavia, Italy, replied to my email.

The Poland-born Editor-in-Chief is himself no stranger to tortured phrases; his own CV proclaims:

“I am proude of my methodology of quality-driven design of embedded and cyber-physical systems, my theory of information relationships and measures, and my information-driven approach to circuit synthesis.”

Well, we have no doubt he is just as “proude” of his journal Microprocessors & Microsystems. Now over to Smut Clyde.

“Finally, the goal has introduced the use of a rule-based structure of three pigs”

By Smut Clyde

He was put to sit on a bucket in front of a banner
To answer stupid questions in a profound manner

The Tiger Lillies (2003)

Educated pigs! Not just a long tradition of sideshow entertainment for English yokels, but also the inspiration for a collaboration between Edward Gorey and the Tiger Lillies. They also come to mind when someone uses mistranslation or synonym-swapping to disguise their plagiarism, transforming ‘HOG’ (‘Histogram of Oriented Gradients’) into ‘pig’.

Not to forget “Design of intelligent medical IoT platform and overall nursing management of nasal endoscopic surgery” (Feng, Chu & Wu 2021)… I was naturally concerned to learn that “A poisonous contraption enters the atmosphere and accesses the aggressor labourer”, but to be honest the allusions to “Rhinoceros severe nasal congestion” had an even greater claim upon my attention.

All this is an excuse to write about the spectacle of a journal with a papermill attached, or perhaps vice versa: Microprocessors & Microsystems (μ&μ for short). The symbiotic collaboration is notable for its productivity, for its eclectic mix of customers, and for the aleatoric poetry that emerges from its strategies of text generation, frequently entering the territory of ‘The Policeman’s Beard is Half-Constructed’ (1983).

μ&μ features in “Tortured Phrases” (Cabanac, Labbé & Magazinov 2021), for although the style of academic malfeasance explored in that essay is not limited to the pages of μ&μ, it is conveniently encapsulated there. The present post is basically a commentary or a series of footnotes to “Tortured Phrases”, so if you haven’t read it yet, do that now and come back.

It is hard to overstate the amount of baldestdash extruded by the papermill in question.* Yes, there is a spreadsheet!

It is hard to imagine anyone looking at the journal’s recent Tables of Contents — tedious screeds of combinatorial monotony — and thinking “Yes, this is a prestigious high-standards publication where our work will be showcased in the finest company and find the readership it deserves”. But I am sure that out of roughly 1000 papers published in μ&μ in 2020/2021, some must be genuine studies that tripped and fell into the editors’ in-tray by mistake (I can’t be fecked looking for them). So let’s be conservative and start with an estimate of 600 outrageous fakes.

These Tables of Contents are all variations on the theme of FPGA and 5G, but to be fair, the monotony is compensated by the variety of contributors. A common observation about the fake-research industry is that papermills exist to assist medical clinicians when they find themselves in need of a publication. But here we meet all manner of lecturers in architecture, business administration, media studies, management, physical education (especially basketball)… it is an admirably egalitarian journal. Just don’t ask which faculty is graced by Professor Polari, the protagonist of “Remote case teaching mode based on computer FPGA platform and data mining” (Rong, 2021). I cannot help wondering if he or she begins all lectures with “Bona to vada your dolly old eke”.

Fig 2, Cabanac et al 2021

Cabanac et al commented on the journal’s expedited publication process, with its accelerated or streamlined cycle of ‘peer review’; and also on the coincidental submission dates of the papermill’s products… no editorial eyebrows were raised when entire shipments of wordwooze arrived in the same delivery truck. Their focus, though, was on the expedited process of paper production. The assembly line resorted to several strategies of text generation in order to churn out enough jibber-jabber to satisfy the demand from the clients. The closeness of collaboration and collegiality between papermill and journal in this case relaxed the usual requirement that manuscripts have to meet some minimal modicum of coherence and comprehensible meaning. A stream of words is all that’s required, pinched off at intervals like links of sausage, divided into “Introduction, Related work, Materials and methods, and Results and discussion” by following a template.
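For readers who want to count the delivery trucks themselves, here is a minimal sketch of how coinciding submission dates could be tallied. It is an illustration only, not the procedure used by Cabanac et al: the function name `same_day_batches`, the `min_batch` threshold and the sample records are all made up for the example, and real "Received" dates would have to be collected from the journal's article pages.

```python
from collections import Counter
from datetime import date

def same_day_batches(submissions, min_batch=3):
    """Map each received-date to its paper count, keeping only dates
    on which at least `min_batch` manuscripts arrived together."""
    counts = Counter(received for _paper_id, received in submissions)
    return {day: n for day, n in counts.items() if n >= min_batch}

# Made-up records for illustration only: (paper id, date received).
submissions = [
    ("paper-A", date(2020, 11, 2)),
    ("paper-B", date(2020, 11, 2)),
    ("paper-C", date(2020, 11, 2)),
    ("paper-D", date(2020, 11, 9)),
    ("paper-E", date(2020, 12, 1)),
]

# Print every date on which a suspiciously large batch arrived.
for day, n in sorted(same_day_batches(submissions).items()):
    print(day.isoformat(), n)
```

A cluster of same-day arrivals proves nothing by itself (conference deadlines produce them too), so a tally like this is best treated as a prompt for a closer look rather than a verdict.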

Fig 3, Cabanac et al 2021

“English vocabulary retrieval and recognition based on FPGA and machine learning” (Lu 2020)

Texts are punctuated (‘illustrated’ is too strong a term) with meaningless images that would not survive scrutiny for more than two seconds if anyone had ever looked. Many images are circuit schematics pillaged from Xilinx discussion boards or IEEE papers, signifying nothing but fulfilling the expectation that “Figures go about here”, the visual equivalent of Lorem Ipsum. Others are homebaked bar-graphs and line graphs that would be content-sparse chart-junk even if the numbers they plot were genuine and related in some way to the text (as it is, the papermillers don’t even bother to randomise new numbers each time).

Going back to the strategies of text generation, we find a continuum of granularity. At one end, the chunks of old-fashioned copy-paste plagiarism are large enough to be detected with a simple Interweb search. The continuum ranges through copy-paste with Rogetified text-spinning, where words in the pirated text are switched for synonyms to make the source less recognisable (producing Tortured Phrases in the process)… through to the fine-grained extreme where text is deconstructed into unrelated fragments and reassembled by simple AI. The results are safe from recognition by a G**gle search but vulnerable to detection on stylistic grounds (e.g. by the GPT-2 scanner).
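The Rogetified middle of that continuum is the easiest part to screen for mechanically, because the synonym-swapper keeps making the same substitutions. What follows is a minimal sketch of a fingerprint lookup, assuming a hand-curated list of known tortured phrases; it is not the GPT-2 detector mentioned above, and the names `TORTURED_PHRASES` and `find_tortured_phrases` are invented for the example. The first three entries are pairs reported in the Tortured Phrases preprint, and the last comes from the Lekashri & Sakthivel paper footnoted at the end of this post.

```python
import re

# Illustrative fingerprint list (an assumption, not a complete catalogue):
# tortured phrase -> the established term it probably replaced.
TORTURED_PHRASES = {
    "counterfeit consciousness": "artificial intelligence",
    "profound neural organization": "deep neural network",
    "colossal information": "big data",
    "headsail chips": "SpiNNaker chips",
}

def find_tortured_phrases(text, fingerprints=TORTURED_PHRASES):
    """Return (phrase, expected term, character offset) for each fingerprint hit."""
    hits = []
    for phrase, expected in fingerprints.items():
        # Case-insensitive match that tolerates line breaks between words.
        pattern = r"\b" + r"\s+".join(re.escape(w) for w in phrase.split()) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((phrase, expected, match.start()))
    # Report hits in the order they appear in the text.
    return sorted(hits, key=lambda hit: hit[2])

sample = "We apply counterfeit consciousness to colossal\ninformation on headsail chips."
for phrase, expected, offset in find_tortured_phrases(sample):
    print(f"{offset:4d}  {phrase!r} -> probably {expected!r}")
```

A lookup like this only catches substitutions somebody has already spotted and listed, which is why the finer-grained, AI-reassembled text at the far end of the continuum needs a stylistic detector instead.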

Hilarity ensues when the raw meat fed into the hopper of the sausage-machine grinder is sufficiently inapt. What is all this about squid? “Strategy of library information resource construction based on FPGA and embedded system” (Wang & Liu 2021):

“To tackle this issue, planned a converse intermediary innovation dependent on the joint development and sharing arrangement of the squid library’s data assets in this investigation”
“The upside of squid, the substance of the association demand, is that it is conceivable to check the authorization rundown to guarantee the neighborhood security, access control records, and illicit or is certifiably not a protected association demand has been created.”

Stepping back to look at the nominal content of these confabulations, some do have clinical or therapeutic messages, and were supposedly written by doctors and nurses, in accordance with the usual papermill pattern. But as noted, these alternate with FPGA + 5G + basketball, urban design, and whatever, from lecturers and students at polytechnics, vocational schools, agricultural colleges and provincial universities from all across China, all willing to embellish their CVs by purchasing fake papers.

I singled out China but that over-simplifies the picture. A significant minority of problematic papers flowed from Indian institutions, especially at the beginning of μ&μ’s descent into a parasitical niche. They hint at a different quality of malfeasance, and a different source, and the PubPeer critiques of these subcontinental papers dwell on citation concerns rather than on any unabashed, in-your-face fabrication.** That is, they display clusters of irrelevant references shoehorned into the text, added at the last minute to assuage the demands of an importunate editor or peer-reviewer. Certain names recur in these citational insertions: Illavarason, Vijayakumar. These papers came to attention because of all the retractions (I’m coming to those). The pattern is not an accusation of false authorship, but it’s symptomatic of decay.

Citation extortion, in the wider world of journal-level shenanigans, is often associated with Special Issues. High-flying, highly-cited researchers, prominent in Fuzzy Logic or similarly science-adjacent fields of ‘higher obscurities’, can elevate their citation scores even further by convincing journals to host Special Issues (to be guest-edited by themselves or by pliable meatpuppets). They thereby publish remixes of their own papers, and extort citations from other would-be contributors. In the case of μ&μ, roughly two-thirds of the torrent of wordspume was hoovered up into a series of Special Issues, with titles formed by grabbing Worship Words at random out of a bag.

Then the Tortured Phrases preprint triggered an avalanche of depublications, though confined to these SIs. These were billed as ‘Withdrawals’ in preference to ‘Retractions’, as is the Elsevier custom when papers have been accepted but not yet assigned to paginated issues: in effect, they were wished into the cornfield. The editors blamed general jiggery-pokery and a bypass of Peer-Review for betraying the journal’s high standards. We learned of a ‘configuration error’ that allowed these trust-abusing Guest Editors to accept spurious or unscrutinised manuscripts without clearance from the Editor-in-Chief.

“This article has been withdrawn at the request of the author(s) and/or editor. The Publisher apologizes for any inconvenience this may cause.
Subsequent to acceptance of this special issue paper by the responsible Guest Editor XXX, the integrity and rigor of the peer-review process was investigated and confirmed to fall beneath the high standards expected by Microprocessors & Microsystems. There are also indications that much of the Special Issue includes unoriginal and heavily paraphrased content. Due to a configuration error in the editorial system, unfortunately the Editor in Chief did not receive these papers for approval as per the journal’s standard workflow.”

Sadly, the depublication was only partial; after throwing a few examples to the wolves under the bus, the editors settled for giving each SI its global Expression of Concern to warn readers and subscribers against relying on any of the contents. All those individual Withdrawals were taking up time that they needed for accepting proposals for future Special Issues to fill with software persiflage. Anyway, this attempt to shift blame for all the phrase-torturing to those regrettable, repeated choices of Guest Editors only intensifies attention on all the unmitigated garbage that wasn’t published through the SI pathway, and presumably did cross the desk of the Editor-in-Chief.

Cabanac et al. compared μ&μ to the phenomenon of hijacked, identity-stolen journals. This is where organised fraudsters set up a website that pretends to represent an existing journal, and exploit that journal’s reputation to solicit submissions and $$$ from gullible authors. Remember, though, that μ&μ still operates under the aegis of Elsevier. If (purely hypothetically) scammers infiltrated and captured a scholarly journal, replacing its panels of Editors and peer-reviewers so they could publish anything while genuine academics fled from any association with Tables of Contents full of uninterrupted flim-flam, then the hollowed-out skin of the journal would retain whatever legitimacy it had (not hijacked!), and libraries would still be obliged to subscribe to it as part of their subscription bundles.

Despite the absence of content and comprehensibility alike, these paper-shaped word-dumps are often cited anyway. Authors of ‘narrative reviews’ may find them to be useful building blocks in their own hand-wavy intellectual edifices (actually bothering to read them is not essential for this purpose). To be fair, most of these citations come from later paper-shaped word-dumps from the same papermill, published through the same journal.

I like to think that the many accomplishments of Toby (the Educated Pig) included an encyclopedic knowledge of the works of Shakespeare, if only so that I can make a joke about “Toby or not Toby”.

Hat-tip to ‘Parabagrotis sulinaris’ for finding “Analysis of language features of English corpus based on Java Web” (Su 2021). Also known as the Cover Penis paper.

“A wide range of sources in order to create a balanced text of the representative and the balance of the cover penis. Construction of a new corpus called Bangladesh. The collection is made more than 270,000 words. The corpus is to provide online and offline data structure from the Bengali text includes six theme types of articles.”

* * * * * * * * * * * * * * * * * * * *

* Like “balderdash”, but in the superlative form rather than comparative.

** Though readers may enjoy the clumsy engarblage displayed in “Design and evaluation of dynamic partial reconfiguration using fault tolerance in asynchronous FPGA” (Lekashri & Sakthivel 2019), where “SpiNNaker chips” became “headsail chips”.



17 comments on “A rule-based structure of three pigs”

  1. Alexander Magazinov

    “But I am sure that out of roughly 1000 papers published in μ&μ in 2020/2021, some must be genuine studies that tripped and fell into the editors’ in-tray by mistake (I can’t be fecked looking for them).”

    I remember one paper (maybe from Brazil, maybe not) submitted in 2017 and accepted in early 2021. Probably, the authors were (rightfully!) thinking that their manuscript is going to a semi-reasonable journal. Instead, it appeared in a great company!


  2. Klaas van Dijk

    hi Leonid, any idea about the opinions of Eindhoven University of Technology about these side-activities of one of their affiliates?

    “Robert-Jan Smits is President of the Executive Board since May 2019. As President he believes that universities must seek a stronger connection with and relevance to society and its challenges. Educating young talents and carrying out excellent research is essential.”


  3. On the Elsevier page, it says this journal is “Affiliated with Euromicro”, linking to its website. From a cursory look, this does seem like a legitimate organisation, involved with organising conferences on topics relating to microprocessors. Or at least it was, maybe the organisation was hijacked? Lech Jozwiak is listed as chair of one of the technical committees…

    I’m not too familiar with this field, so I do not know this entity or any of the people apparently involved with it. But according to the Internet Archive, the web-page has existed since at least 1999 (largely with unchanged design, but with regular updates to the content). That is highly indicative of a legitimate organisation.

    The Euromicro website states: “Euromicro together with Elsevier Science Elsevier runs two reputable scientific journals: (JSA): Embedded Software Design and the (MICPRO): Embedded Hardware Design.” So, they do not claim µ&µ as affiliated to them? Maybe the journal is falsely claiming this affiliation, in an attempt to gain some legitimacy?


  4. What follows is well known

    Its MoRPHoLoGY and its SiNTaX are designated as GRaMMaR of a language,
    but no language exists without its SeMaNTiCS.
    Let us think of some lab measurements or a constructed model of some observed phenomena.
    Either quantities and magnitudes obtained in laboratory or maths or symbolic formulation should be enough to communicate a finding to scientific community, but it is not the case

    Neither first ones nor second ones “speak for themselves” and language is used with its additions:
    I teach you this but I hide that from you, style, composition, format, etc.
    And this is due to three issues: communication consists of the use of language and scientific communication
    seeks to make what is claimed acceptable and to convince readers.
    The third: communication is written at the end of an arduous and sustained work
    throughout which the researchers have endured uncertainty, frustration, experiences and emotions.
    For these reasons language must produce an effect and this is ReTHoRiC.
    How many physicists would accept an article without a formula?
    and without a graph?

    This brings us to the problem of the unintelligible and the unreadable.
    All this is exploited as you can see in upper article that I comment:
    To escape anti-plagiarism checks -Plagiarism Checker X, Turnitin,…-
    How many nonsense words or metaphors of this newspeak?
    What synonyms and criteria can we use in a search and replace?
    How many permutations of phrases in a sentence?
    How many paragraphs can be transposed?
    without the text turning unreadable
    and not unintelligible because if text is understandable or not,
    in these cases matters nothing.

    I have spoken of rhetoric and that makes scientific communication a type of discourse.
    A communication should contain a series of resources that make the text a unit.
    A communication cannot be a succession of statements,
    partial points of view or anecdotes.
    When there is no such unity, what is described in this article happens:
    the pig speaks and public thinks that they have spent their money well to be able to see such a prodigy.
    Let’s go back to the beginning of my comment:
    the pig speaks but it does not know what it is saying.

    There is another element in the language that is pragmatics
    and that we can see in something like TED Talks.
    It will possibly acquire more relevance in the coming years.
    To explain pragmatics I always quote a poem by Vicente Aleixandre
    who won Nobel Prize in Literature.
    I would link it but I have always thought that written poems are treasures
    that must be discovered by oneself.


  5. Smut Clyde

    Perhaps I was looking at all these junk ‘papers’ from the wrong angle. Perhaps they were composed not by software, but by Sulphur-crested Cockatoos.

