Journal:How big data, comparative effectiveness research, and rapid-learning health care systems can transform patient care in radiation oncology

From LIMSWiki
Full article title How big data, comparative effectiveness research, and rapid-learning health care systems can transform patient care in radiation oncology
Journal Frontiers in Oncology
Author(s) Sanders, Jason C.; Showalter, Timothy N.
Author affiliation(s) University of Virginia School of Medicine
Primary contact Email:
Editors Deng, Jun
Year published 2018
Volume and issue 8
Page(s) 155
DOI 10.3389/fonc.2018.00155
ISSN 2234-943X
Distribution license Creative Commons Attribution 4.0 International


Big data and comparative effectiveness research methodologies can be applied within the framework of a rapid-learning health care system (RLHCS) to accelerate discovery and to help turn the dream of fully personalized medicine into a reality. We synthesize recent advances in genomics with trends in big data to provide a forward-looking perspective on the potential of new advances to usher in an era of personalized radiation therapy, with emphases on the power of RLHCS to accelerate discovery and the future of individualized radiation treatment planning.

Keywords: big data, radiation oncology, comparative effectiveness research, rapid-learning health care system, personalized radiation therapy

Comparative effectiveness research (CER) and big data

The Committee on CER Prioritization was created by the Institute of Medicine in 2009. It defined CER as “a strategy that focuses on the practical comparison of two or more health interventions to discern what works best for which patients and populations.”[1] In essence, the goal of CER is to identify “which treatment will work best, in which patient, under what circumstances.”[2] Big data refers to data sets that are so large that they cannot be analyzed directly by individuals or by traditional processing software. Big data analytics (BDA) is a growing field encompassing a multitude of methods, and it is being utilized in sectors ranging from business to medicine.[3] The advent of the electronic medical record (EMR) has resulted in the digitization of massive data sets of medical information, including clinic encounters, laboratory values, imaging data sets and reports, pathology reports, patient outcomes, and family history, as well as genomic and biological data.

To help with the analysis of big data, the National Institutes of Health (NIH) created the Big Data to Knowledge (BD2K) program, which has invested over $200 million in grant awards to foster the development of methods and tools to analyze big data in biomedical research.[4] Additionally, the BD2K program aims to ensure that biomedical big data is “findable, accessible, interoperable, and reusable” (FAIR).[4] Over the past decade, CER methodologies have become increasingly prevalent in radiation oncology research, and there is much enthusiasm surrounding BDA.

Rapid-learning health care system (RLHCS) and personalized medicine

The number of articles on big data in health care has increased exponentially from under 500 articles in 2005 to over 2500 articles in 2015.[5] As the amount of biomedical big data and our ability to analyze these data continue to advance, so will the implications and use of the information we are able to extract. One of the most important steps toward advancing our ability to analyze these big data for biomedical discovery is the creation of RLHCSs, which will allow for the sharing of patient data between EMRs, ideally in real time.[6] An ideal RLHCS would take patient data that is routinely generated as part of standard patient care and compile that data into a large data system.[6][7][8] This aggregate data would then be available both for BDA, to accelerate identification of new hypotheses, and for CER, to rapidly generate evidence through hypothesis-testing studies. Clinical data from patient records can be used readily to identify novel relationships among clinical factors and patient outcomes, or to evaluate treatment effectiveness in specific subgroups that cannot be studied adequately in randomized, controlled trials. The power of the RLHCS, though, is even more exciting when one considers the possibility of adding biospecimens to accelerate discovery in genomics and proteomics. As RLHCSs are created and their data sets are expanded, we will continue to identify specific genomic and proteomic data to help define cohorts and stratify patients into risk groups and treatment response groups, and potentially to help design highly tailored therapy regimens.[9] In this sense, the RLHCS would usher in a more fertile era for biomedical research than ever before. BDA and CER provide the research methodologies needed to rapidly generate evidence from the RLHCS. It should be noted, however, that there are substantial practical obstacles that must be addressed to achieve the vision of the RLHCS. These include patient concerns regarding privacy and security of sensitive information, interconnectivity among different health records, and regulatory barriers to the exchange of health information.
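
The idea of pooling routinely generated records from several EMRs and then evaluating outcomes in specific subgroups can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record fields (`subgroup`, `treatment`, `responded`), the two sources, and all values are hypothetical, and a real RLHCS would also have to address the privacy, interoperability, and regulatory obstacles noted above.

```python
from collections import defaultdict

# Hypothetical, de-identified records from two EMR sources (illustrative values only).
site_a = [
    {"subgroup": "age>=70", "treatment": "A", "responded": True},
    {"subgroup": "age>=70", "treatment": "A", "responded": False},
    {"subgroup": "age>=70", "treatment": "B", "responded": True},
    {"subgroup": "age<70", "treatment": "A", "responded": True},
]
site_b = [
    {"subgroup": "age>=70", "treatment": "A", "responded": True},
    {"subgroup": "age>=70", "treatment": "B", "responded": False},
    {"subgroup": "age<70", "treatment": "A", "responded": True},
    {"subgroup": "age<70", "treatment": "B", "responded": False},
]


def pooled_response_rates(*sources):
    """Pool records from all sources; return response rate per (subgroup, treatment)."""
    counts = defaultdict(lambda: [0, 0])  # key -> [responders, total patients]
    for source in sources:
        for record in source:
            key = (record["subgroup"], record["treatment"])
            counts[key][0] += int(record["responded"])
            counts[key][1] += 1
    return {key: responders / total for key, (responders, total) in counts.items()}


if __name__ == "__main__":
    for key, rate in sorted(pooled_response_rates(site_a, site_b).items()):
        print(key, round(rate, 2))
```

Crude pooled rates like these can suggest hypotheses, but a comparative effectiveness study would still need to account for confounding between treatment groups.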

Integrating an RLHCS with oncology

The integration of CER, big data, and BDA is especially important in the field of oncology, where multiple groups are investing significant time and resources in efforts to expand the availability of data and advance the methods used to extract meaningful information from that data.[4][10][11][12][13][14] The American Society of Clinical Oncology started its own RLHCS, CancerLinQ, to overcome the lack of interoperability between EMRs and accomplish its goal of being able to “analyze and share data on every patient with cancer.”[15] While the vision of the RLHCS has not yet been fully achieved, the potential impact on society has stimulated enthusiasm toward this effort.

Implications for radiation oncology

Patient-reported outcomes (PROs)

Patient-reported outcomes and quality of life (QoL) have become a major area of focus in health care overall, particularly in oncology. The availability of PROs within EMRs provides the foundation for an RLHCS that can be leveraged to expand insights into how cancer treatments impact patient QoL. By incorporating the PROs of massive numbers of patients, an RLHCS will be able to identify small variations and subgroups of patients that might be missed in the smaller number of patients included in traditional randomized controlled trials. These PROs and QoL domains can then be incorporated into clinical decision-making to help guide both providers and patients.[16] In doing this, PROs can act as a link between the objective clinical data and the subjective patient outcomes and experiences to help improve the overall care of the patient.[17] One may also conceive of potential genomics-based determinants of QoL that could be identified using BDA if RLHCSs include biospecimens linked to clinical data and PROs. Finally, surveillance of an RLHCS may also be performed to identify temporal trends in PROs to estimate outcomes after implementation of new technologies.
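
As a rough illustration of surveillance for temporal trends in PROs, the sketch below averages a PRO score by calendar year. The `(year, score)` record layout, the 0 to 100 score scale, and every value are hypothetical assumptions made for the example, not data from any study or registry.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (year, PRO score) pairs; higher scores mean better reported QoL.
pro_records = [
    (2015, 60), (2015, 70),
    (2016, 68), (2016, 72),
    (2017, 75), (2017, 77),
]


def mean_pro_by_year(records):
    """Group PRO scores by year and return the mean score per year, in year order."""
    by_year = defaultdict(list)
    for year, score in records:
        by_year[year].append(score)
    return {year: mean(scores) for year, scores in sorted(by_year.items())}


if __name__ == "__main__":
    for year, average in mean_pro_by_year(pro_records).items():
        print(year, average)
```

A change in the yearly mean after a new technology is introduced would only be a signal for further study, since case mix and reporting practices can also shift over time.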

Dose selection and radiosensitivity

The use of tumor-specific genes and radiosensitivity to guide treatment decisions has already been established in human papillomavirus-associated squamous-cell carcinoma of the oropharynx.[18] Numerous studies have sought to identify genes that may have implications for tumor radiosensitivity or patient toxicity.[19][20][21][22] The identification of these genes and their potential implications has led to the creation of the fields of radiogenetics and radiogenomics. Efforts are currently underway to generate meaningful gene assays that will help predict tumor response to radiation. Eschrich et al. created a 10-gene model to calculate a radiosensitivity index and applied it to patients with head-and-neck, rectal, and esophageal cancer to help stratify patients into either responders or non-responders, with 80% sensitivity and 82% specificity.[22] Similarly, Zhao et al. retrospectively created a 24-gene assay and applied it to risk-matched patients who received either postoperative radiation or no radiation following prostatectomy. Patients with a high score on the gene index who received postoperative radiation were less likely to have distant metastasis at 10 years.[23]
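
For readers less familiar with the two performance measures quoted above, the following sketch shows how sensitivity and specificity are computed from a classifier's confusion-matrix counts. The counts are hypothetical and were chosen only so that the arithmetic lands on 80% and 82%; they are not the patient numbers from the cited study.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of true responders that the assay labels as responders."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of true non-responders that the assay labels as non-responders."""
    return true_neg / (true_neg + false_pos)


if __name__ == "__main__":
    # Hypothetical counts: 50 true responders and 50 true non-responders.
    print(f"sensitivity = {sensitivity(true_pos=40, false_neg=10):.0%}")
    print(f"specificity = {specificity(true_neg=41, false_pos=9):.0%}")
```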

As efforts to identify genes and gene assays that may be predictors of radiosensitivity continue to be validated, we will potentially be able to integrate these findings into dose selection and toxicity prediction for individual patients based on their native and tumor genetics. Scott and colleagues have recently described a genomics-based strategy for personalizing radiation therapy dose, which would support dose de-escalation for radiosensitive tumors.[24] While the clinical implications of radiosensitivity assays are still developing, big data will be key to developing future assays rapidly, as well as to incorporating the genomics tools into clinical decision-making. Big data provides an opportunity to refine molecular signatures based upon real-world data and to merge genomic assay results with other clinical data elements to optimize predictive analytics. An RLHCS would provide the ideal substrate for leveraging big data and CER to accelerate genomics-based discovery and make precision radiation oncology a reality.
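
To make the idea of a genomics-adjusted dose concrete, the sketch below follows the linear-quadratic form that, to our understanding, underlies the genome-based model in reference 24: the radiosensitivity index (RSI) is treated as the surviving fraction after a single 2 Gy fraction, α is solved from it with β held constant at 0.05 Gy⁻², and the score is n·d·(α + β·d). None of these details is stated in this article, so the formula, the constant, and the function names should all be read as assumptions to be checked against the original publication; the RSI values in the example are invented.

```python
import math

BETA = 0.05  # Gy^-2; assumed constant, not taken from this article


def alpha_from_rsi(rsi, beta=BETA, ref_dose=2.0):
    """Solve RSI = exp(-ref_dose * (alpha + beta * ref_dose)) for alpha (one fraction)."""
    return -math.log(rsi) / ref_dose - beta * ref_dose


def genomic_adjusted_dose(rsi, n_fractions, dose_per_fraction, beta=BETA):
    """Linear-quadratic effect n*d*(alpha + beta*d) using an RSI-derived alpha."""
    alpha = alpha_from_rsi(rsi, beta=beta)
    return n_fractions * dose_per_fraction * (alpha + beta * dose_per_fraction)


if __name__ == "__main__":
    # Same physical schedule (30 x 2 Gy) for two hypothetical tumors.
    for rsi in (0.3, 0.5):  # lower RSI = more radiosensitive
        score = genomic_adjusted_dose(rsi, n_fractions=30, dose_per_fraction=2.0)
        print(f"RSI={rsi}: score={score:.1f}")
```

Under these assumptions, the more radiosensitive tumor receives a higher predicted effect from the same schedule, which is the reasoning behind considering dose de-escalation for such tumors.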

Personalized treatment recommendations

Radiation oncology is unique in that treatment plans for patients are often already technically and physically personalized due to patient-specific variations in anatomy, tumor characteristics, and stage. Since a patient’s treatment plan is usually based upon a CT scan in treatment position, radiation can be considered an inherently personalized form of medicine. However, treatment planning approaches and radiation doses are generally selected based upon class solutions, with technical details such as beam arrangements and dose–volume constraints adhering to generalized rules. Multiple studies have already begun to look at how BDA methods such as machine learning and neural networks can be used to aid in dose optimization and toxicity prediction modeling in radiation oncology,[17][25][26][27] which could provide better treatment plan alternatives for individual patients. As the data and technology behind RLHCSs continue to progress, we will likely be able to utilize a full spectrum of patient-specific clinical factors, PROs, genomics, patient preferences and priorities, and a menu of treatment plan alternatives in order to optimize an individual patient’s radiation therapy. In order to deliver high-quality, high-impact insights into radiation oncology, it is important that large datasets include detailed technical information.

Conclusion

Much of the excitement regarding big data has centered on the potential for genomic discovery, high-level radiation treatment planning, and leveraging EMRs to identify associations among factors that may provide new insights into potential causal relationships, which can then be studied further to accelerate progress in cancer care. Although these are certainly promising areas for discovery, we most eagerly anticipate the power of big data to connect a broad range of characteristics to accelerate evidence generation and inform personalized decision-making. We envision the use of big data and CER methods to inform the individual decisions of patients and providers by synthesizing clinical and genomic data and querying an RLHCS for the latest data on the effectiveness of treatment options in relevant subgroups of patients.


Author contributions

Both authors contributed to the development and editing of the manuscript and approved the final submitted version.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Institute of Medicine of the National Academies (2009). Initial National Priorities for Comparative Effectiveness Research. National Academies Press. ISBN 9780309138369.
  2. Greenfield, S.; Rich, E. (2012). "Welcome to the Journal of Comparative Effectiveness Research". Journal of Comparative Effectiveness Research 1 (1): 1–3. doi:10.2217/cer.11.13. PMID 24237290.
  3. Sivarajah, U.; Kamal, M.M.; Irani, Z.; Weerakkody, V. (2017). "Critical analysis of Big Data challenges and analytical methods". Journal of Business Research 70: 263–86. doi:10.1016/j.jbusres.2016.08.001.
  4. Margolis, R.; Derr, L.; Dunn, M. et al. (2014). "The National Institutes of Health's Big Data to Knowledge (BD2K) initiative: Capitalizing on biomedical big data". JAMIA 21 (6): 957–8. doi:10.1136/amiajnl-2014-002974. PMCID PMC4215061. PMID 25008006.
  5. de la Torre Díez, I.; Cosgaya, H.M.; Garcia-Zapirain, B.; López-Coronado, M. (2016). "Big Data in Health: a Literature Review from the Year 2005". Journal of Medical Systems 40 (9): 209. doi:10.1007/s10916-016-0565-7. PMID 27520614.
  6. Ginsburg, G.S.; Kuderer, N.M. (2012). "Comparative effectiveness research, genomics-enabled personalized medicine, and rapid learning health care: A common bond". Journal of Clinical Oncology 30 (34): 4233–42. doi:10.1200/JCO.2012.42.6114. PMCID PMC3504328. PMID 23071236.
  7. Ginsburg, G.S.; Staples, J.; Abernethy, A.P. (2011). "Academic medical centers: Ripe for rapid-learning personalized health care". Science Translational Medicine 3 (101): 101cm27. doi:10.1126/scitranslmed.3002386. PMID 21937754.
  8. Abernethy, A.P.; Etheredge, L.M.; Ganz, P.A. et al. (2010). "Rapid-learning system for cancer care". Journal of Clinical Oncology 28 (27): 4268–74. doi:10.1200/JCO.2010.28.5478. PMCID PMC2953977. PMID 20585094.
  9. Ramsey, S.D.; Veenstra, D.; Tunis, S.R. et al. (2011). "How comparative effectiveness research can help advance 'personalized medicine' in cancer treatment". Health Affairs 30 (12): 2259–68. doi:10.1377/hlthaff.2010.0637. PMCID PMC3477796. PMID 22147853.
  10. Helft, M. (2014). "Can big data cure cancer?". Fortune 170 (2): 70–4, 76, 78. PMID 25318238.
  11. Williams, A.M.; Liu, Y.; Regner, K.R. et al. (2018). "Artificial intelligence, physiological genomics, and precision medicine". Physiological Genomics 50 (4): 237–43. doi:10.1152/physiolgenomics.00119.2017. PMCID PMC5966805. PMID 29373082.
  12. Savage, N. (2014). "Big data versus the big C". Scientific American 311 (1): S20–1. PMID 24974705.
  13. Shah, A.; Stewart, A.K.; Kolacevski, A. et al. (2016). "Building a rapid learning health care system for oncology: Why CancerLinQ collects identifiable health information to achieve its vision". Journal of Clinical Oncology 34 (7): 756–63. doi:10.1200/JCO.2015.65.0598. PMID 26755519.
  14. Trifiletti, D.M.; Showalter, T.N. (2015). "Big Data and Comparative Effectiveness Research in Radiation Oncology: Synergy and Accelerated Discovery". Frontiers in Oncology 5: 274. doi:10.3389/fonc.2015.00274. PMCID PMC4672039. PMID 26697409.
  15. "Shaping the Future of Oncology: Envisioning Cancer Care in 2030: Outcomes of the ASCO Board of Directors Strategic Planning and Visioning Process, 2011-2012". American Society of Clinical Oncology. 2011.
  16. Sarin, R. (2014). "Big Data V4 for integrating patient reported outcomes and quality-of-life indices in clinical practice". Journal of Cancer Research and Therapeutics 10 (3): 453–5. doi:10.4103/0973-1482.142741. PMID 25313720.
  17. Kim, K.H.; Lee, S.; Shim, J.B. et al. (2017). "Predictive modelling analysis for development of a radiotherapy decision support system in prostate cancer: A preliminary study". Journal of Radiotherapy in Practice 16 (2): 161–70. doi:10.1017/S1460396916000583.
  18. Chen, A.M.; Felix, C.; Wang, P.C. et al. (2017). "Reduced-dose radiotherapy for human papillomavirus-associated squamous-cell carcinoma of the oropharynx: A single-arm, phase 2 study". The Lancet Oncology 18 (6): 803–11. doi:10.1016/S1470-2045(17)30246-2. PMID 28434660.
  19. West, C.M.; Barnett, G.C. (2011). "Genetics and genomics of radiotherapy toxicity: Towards prediction". Genome Medicine 3 (8): 52. doi:10.1186/gm268. PMCID PMC3238178. PMID 21861849.
  20. Torres-Roca, J.F.; Eschrich, S.; Zhao, H. et al. (2005). "Prediction of radiation sensitivity using a gene expression classifier". Cancer Research 65 (16): 7169–76. doi:10.1158/0008-5472.CAN-05-0656. PMID 16103067.
  21. Chistiakov, D.A.; Voronova, N.V.; Chistiakov, P.A. (2008). "Genetic variations in DNA repair genes, radiosensitivity to cancer and susceptibility to acute tissue reactions in radiotherapy-treated cancer patients". Acta Oncologica 47 (5): 809–24. doi:10.1080/02841860801885969. PMID 18568480.
  22. Eschrich, S.A.; Pramana, J.; Zhang, H. et al. (2009). "A gene expression model of intrinsic tumor radiosensitivity: Prediction of response and prognosis after chemoradiation". International Journal of Radiation Oncology, Biology, and Physics 75 (2): 489–96. doi:10.1016/j.ijrobp.2009.06.014. PMCID PMC3038688. PMID 19735873.
  23. Zhao, S.G.; Chang, S.L.; Spratt, D.E. et al. (2016). "Development and validation of a 24-gene predictor of response to postoperative radiotherapy in prostate cancer: A matched, retrospective analysis". The Lancet Oncology 17 (11): 1612–20. doi:10.1016/S1470-2045(16)30491-0. PMID 27743920.
  24. Scott, J.G.; Berglund, A.; Schell, M.J. et al. (2017). "A genome-based model for adjusting radiotherapy dose (GARD): A retrospective, cohort-based study". The Lancet Oncology 18 (2): 202–11. doi:10.1016/S1470-2045(16)30648-9. PMID 27993569.
  25. Kim, K.H.; Lee, S.; Shim, J.B. et al. (2017). "A text-based data mining and toxicity prediction modeling system for a clinical decision support in radiation oncology: A preliminary study". Journal of the Korean Physical Society 71 (4): 231–7. doi:10.3938/jkps.71.231.
  26. Arimura, H.; Nakamoto, T. (2016). "Applications of Machine Learning for Radiation Therapy". Igaku Butsuri 36 (1): 35–8. doi:10.11323/jjmp.36.1_35. PMID 28428495.
  27. Nicolae, A.; Morton, G.; Chung, H. et al. (2017). "Evaluation of a Machine-Learning Algorithm for Treatment Planning in Prostate Low-Dose-Rate Brachytherapy". International Journal of Radiation Oncology, Biology, and Physics 97 (4): 822–9. doi:10.1016/j.ijrobp.2016.11.036. PMID 28244419.


This presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added.