Scientific Literature Text Mining and the Case for Open Access



Editorial decision for TJOE
Devin Berg
Dr. Sarma, After a review of your revised document submitted in response to the reviewers' comments, we have decided to accept your article for publication in the Journal of Open Engineering. Thank you for your contribution! Devin Berg, Editor, TJOE
Author's Response to Reviewers
Gopal Sarma
A sincere thanks to the reviewers Titus Brown and Shreejoy Tripathy for their valuable feedback. I have made the following changes to address their concerns: 1) Dr. Brown has asked for additional citations quantifying the fraction of journals that release articles in proprietary or locked formats. After thinking about this question, I decided that it would be better to reframe the core argument to focus on the benefits and opportunities created by bulk access to the scientific literature. I have removed most references to the specific problem of locked articles, as it should be implicitly clear that articles need to be released in open formats for them to be used as part of a data science pipeline. 2) I agree with Dr. Tripathy that the phrase “scientific data science” is problematic and have replaced it with “scientific literature text mining” as he suggested. I have also added several references to ongoing research in the field.
Editorial comments for TJOE
Devin Berg
Dr. Sarma, Thank you for your submission to the Journal of Open Engineering. Please consider the two reviews that have been submitted. The reviewers ask that you address the literature of the field more thoroughly and consider revising the phrase "scientific data science". Thank you again for your submission, and please address the reviewer comments in your revision.
Shreejoy Tripathy
The article presents some nice and interesting ideas on the potential benefits of enabling widespread text-mining access to the scientific literature. Though there are many existing successful applications of literature text-mining (NeuroSynth and NeuroElectro are two neuroscience-specific examples), restricted access to scientific articles for text-mining greatly hinders these and other efforts. I have two major criticisms of the article in its current form. First, I strongly dislike the author’s use of the key phrase “scientific data science”. I feel this phrase needlessly mixes metaphors with the term “data science”. Instead, it seems that a better term for what the author is advocating is “scientific literature text-mining”. Second, the article inadequately cites the considerable existing literature on scientific literature text-mining. The work of Sophia Ananiadou and Peter Murray-Rust offers just two examples of people whose efforts have pushed forward the field of scientific text-mining.
TJOE Review - C. Titus Brown
C. Titus Brown
In Scientific Data Science and the Case for Open Access, Dr. Sarma makes the case for open access as a fundamental enabler of what he calls "scientific data science" - the ingestion and study of the scientific (academic?) literature for later information extraction. I should note that I am on the periphery of the "information extraction" business - I am more interested in enabling it than in doing it myself at the moment - so I am not the best person to comment on the specific references as to the goals and practice of scientific data science. But I do think it's incredibly important! Is there any citable evidence, or any reviews, discussing the magnitude and impact of the "loophole" at the core of this article (that articles can be released in a locked format)? It's asserted multiple times and I believe it from personal experience, but some references and numbers would be great! Typos: The Public Access _of_ Policy <- remove of
Note that the typo remains.