Difference between revisions of "User:Shawndouglas/Sandbox"

From LIMSWiki
Jump to navigationJump to search
(Creating next week's journal article in sandbox)
(Saving and adding more.)
Line 39: Line 39:


==Introduction==
==Introduction==
Scientific research and technological development are inherently combinatorial practices (Arthur, 2009). Researchers draw from, and build on, existing work in advancing the state of the art. Increasing the ability of researchers to review and understand previous research can stimulate and accelerate scientific progress. However, the number of scientific publications grows exponentially every year both on the aggregate level and in an individual field (National Science Board, 2016). It is impossible for any single researcher or organization to keep up with the vastness of new scientific publications. The ability to use text analytics to map the current state of the art to detect progress would enable more efficient analyses of data.
Scientific research and technological development are inherently combinatorial practices.<ref name="ArthurTheNature09">{{cite book |title=The Nature of Technology: What It Is and How It Evolves |author=Arthur, W.B. |publisher=Simon and Schuster |page=256 |year=2009 |isbn=9781439165782}}</ref> Researchers draw from, and build on, existing work in advancing the state of the art. Increasing the ability of researchers to review and understand previous research can stimulate and accelerate scientific progress. However, the number of scientific publications grows exponentially every year both on the aggregate level and in an individual field.<ref name="NSBScience16">{{cite web |url=https://www.nsf.gov/statistics/2016/nsb20161/uploads/1/nsb20161.pdf |format=PDF |title=Science and Engineering Indicators 2016 |author=National Science Board |publisher=National Science Foundation |pages=899 |date=11 January 2016}}</ref> It is impossible for any single researcher or organization to keep up with the vastness of new scientific publications. The ability to use text analytics to map the current state of the art to detect progress would enable more efficient analyses of data.
 
The Intelligence Advanced Research Projects Activity recognized the scale problem in 2011, creating the research program Foresight and Understanding from Scientific Exposition. Under this program, SRI and other performers processed “the massive, multi-discipline, growing, noisy, and multilingual body of scientific and patent literature from around the world and automatically generated and prioritized technical terms within emerging technical areas, nominated those that exhibit technical emergence, and provided compelling evidence for the emergence.”<ref name="DNIIARPA11">{{cite web |url=https://www.dni.gov/index.php/newsroom/press-releases/press-releases-2011/item/327-iarpa-launches-new-program-to-enable-the-rapid-discovery-of-emerging-technical-capabilities |title=IARPA Launches New Program to Enable the Rapid Discovery of Emerging Technical Capabilities |publisher=Office of the Director of National Intelligence |date=27 September 2011}}</ref> The work presented here applies and extends that platform to efficiently identify and describe the past and present evolution of research on a given set of materials. This work applies text analytics to demonstrate how these computational tools can be used by analysts to analyze much larger sets of data and develop more iterative and adaptive material assessments to better inform and shape government and industry research strategy and resource allocation.


==References==
==References==

Revision as of 01:19, 22 May 2018

Sandbox begins below

Full article title Application of text analytics to extract and analyze material–application pairs from a large scientific corpus
Journal Frontiers in Research Metrics and Analytics
Author(s) Kalathil, Nikhil; Byrnes, John J.; Randazzese, Lucien; Hartnett, Daragh P.; Freyman, Christina A.
Author affiliation(s) Center for Innovation Strategy and Policy and the Artificial Intelligence Center, SRI International
Primary contact Email: christina dot freyman at sri dot com
Year published 2018
Volume and issue 2
Page(s) 15
DOI 10.3389/frma.2017.00015
ISSN 2504-0537
Distribution license Creative Commons Attribution 4.0 International
Website https://www.frontiersin.org/articles/10.3389/frma.2017.00015/full
Download https://www.frontiersin.org/articles/10.3389/frma.2017.00015/pdf (PDF)

Abstract

When assessing the importance of materials (or other components) to a given set of applications, machine analysis of a very large corpus of scientific abstracts can provide an analyst a base of insights to develop further. The use of text analytics reduces the time required to conduct an evaluation, while allowing analysts to experiment with a multitude of different hypotheses. Because the scope and quantity of metadata analyzed can, and should, be large, any divergence from what a human analyst determines and what the text analysis shows provides a prompt for the human analyst to reassess any preliminary findings. In this work, we have successfully extracted material–application pairs and ranked them on their importance. This method provides a novel way to map scientific advances in a particular material to the application for which it is used. Approximately 438,000 titles and abstracts of scientific papers published from 1992 to 2011 were used to examine 16 materials. This analysis used coclustering text analysis to associate individual materials with specific clean energy applications, evaluate the importance of materials to specific applications, and assess their importance to clean energy overall. Our analysis reproduced the judgments of experts in assigning material importance to applications. The validated methods were then used to map the replacement of one material with another material in a specific application (batteries).

Keywords: machine learning classification, science policy, coclustering, text analytics, critical materials, big data

Introduction

Scientific research and technological development are inherently combinatorial practices.[1] Researchers draw from, and build on, existing work in advancing the state of the art. Increasing the ability of researchers to review and understand previous research can stimulate and accelerate scientific progress. However, the number of scientific publications grows exponentially every year both on the aggregate level and in an individual field.[2] It is impossible for any single researcher or organization to keep up with the vastness of new scientific publications. The ability to use text analytics to map the current state of the art to detect progress would enable more efficient analyses of data.

The Intelligence Advanced Research Projects Activity recognized the scale problem in 2011, creating the research program Foresight and Understanding from Scientific Exposition. Under this program, SRI and other performers processed “the massive, multi-discipline, growing, noisy, and multilingual body of scientific and patent literature from around the world and automatically generated and prioritized technical terms within emerging technical areas, nominated those that exhibit technical emergence, and provided compelling evidence for the emergence.”[3] The work presented here applies and extends that platform to efficiently identify and describe the past and present evolution of research on a given set of materials. This work applies text analytics to demonstrate how these computational tools can be used by analysts to analyze much larger sets of data and develop more iterative and adaptive material assessments to better inform and shape government and industry research strategy and resource allocation.

References

  1. Arthur, W.B. (2009). The Nature of Technology: What It Is and How It Evolves. Simon and Schuster. p. 256. ISBN 9781439165782. 
  2. National Science Board (11 January 2016). "Science and Engineering Indicators 2016" (PDF). National Science Foundation. pp. 899. https://www.nsf.gov/statistics/2016/nsb20161/uploads/1/nsb20161.pdf. 
  3. "IARPA Launches New Program to Enable the Rapid Discovery of Emerging Technical Capabilities". Office of the Director of National Intelligence. 27 September 2011. https://www.dni.gov/index.php/newsroom/press-releases/press-releases-2011/item/327-iarpa-launches-new-program-to-enable-the-rapid-discovery-of-emerging-technical-capabilities. 

Notes

This presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added. The original article lists references alphabetically, but this version — by design — lists them in order of appearance.