Abstract
1. Introduction
2. Method and data sources
3. Results
4. Discussion
5. Conclusions
Acknowledgments
References
Abstract
In this study, we design, develop, implement and test an analytical framework and measurement model to detect scientific discoveries with ‘breakthrough’ characteristics. To do so, we have developed a series of computerised search algorithms that data mine large quantities of research publications. These algorithms facilitate early-stage detection of ‘breakout’ papers that emerge as highly cited and distinctive and are considered to be potential breakthroughs. Combining computer-aided data mining with decision heuristics enabled us to assess structural changes within citation patterns in the international scientific literature. In our case studies, we applied a citation impact time window of 24–36 months after publication of each research paper. In this paper, we report on our test results, in which five algorithms were applied to the entire Web of Science database. We analysed the citation impact patterns of all research articles from the period 1990–1994. We succeeded in detecting many papers with distinctive impact profiles (breakouts). A small subset of these breakouts is classified as ‘breakthroughs’: Nobel Prize research papers; papers occurring in Nature’s Top-100 Most Cited Papers Ever; papers still (highly) cited by review papers or patents; or those frequently mentioned in today’s social media. We also compare the outcomes of our algorithms with the results of a ‘baseline’ detection algorithm developed by Redner in 2005, which selects the world’s most highly cited ‘hot papers’. The detection rates of the algorithms vary, but overall, they present a powerful tool for tracing breakout papers in science. The wider applicability of these algorithms, across all science fields, has not yet been ascertained.
Whether or not our early-stage breakout papers represent a ‘breakthrough’ remains a matter of opinion, and input from subject experts is needed for verification and confirmation. Our detection approach nonetheless certainly helps to limit the search domain when tracing and tracking important emerging topics in science.
Introduction
Scientific and scholarly research may result in a new discovery. The nature and impact of such a discovery on the cognitive structure and evolution of science may vary considerably. Some discoveries, each showing a major impact on future scientific research, are considered to signal possible breaches, focus shifts, or even turning points in science. The term breakthrough is usually reserved for discoveries that have such a major impact on science. The impact of discoveries may also extend beyond the domain of science: they may be crucial steps towards technological applications, innovations and products. Several well-known studies searched for the impact of scientific discoveries on the development of technology: the Hindsight study (Isenson, 1969), the studies conducted by Jewkes et al. (1958), studies performed by the IIT Research Institute (1968, 1969) investigating the research and development process leading to innovation, the Battelle study (Globe et al., 1973), the Retrosight project (Wooding, 2007) and the TRACES study (Walsh, 1973). Common to these and other studies is the conclusion that it can take many years before a scientific discovery finds its way into new or adapted technology. The interaction between science and technology is discussed in Grupp (1992) and Schmoch (1993). Scientific discoveries and their incorporation in technology are often interlinked in complex ways within research and development (R&D) systems, and this process may span several years, decades, or even centuries.