DOI: 10.1055/s-0042-1749119
Diversity in Machine Learning: A Systematic Review of Text-Based Diagnostic Applications
Abstract
Objective As the storage of clinical data has transitioned into electronic formats, medical informatics has become increasingly relevant in providing diagnostic aid. The purpose of this review is to evaluate machine learning models that use text data for diagnosis and to assess the diversity of the included study populations.
Methods We conducted a systematic literature review on three public databases. Two authors reviewed every abstract for inclusion. Articles were included if they used or developed machine learning algorithms to aid in diagnosis. Articles focusing on imaging informatics were excluded.
Results From 2,260 identified papers, we included 78. Neural networks were the most frequently used machine learning models (44.9%). Studies had a median population of 661.5 patients, and diseases and disorders of 10 different body systems were studied. Race data were included in 35.9% (N = 28) of papers; among these, 57.1% (N = 16) of study populations were majority White, 14.3% were majority Asian, and 7.1% were majority Black. In 75% (N = 21) of the papers that included race data, White was the largest racial group represented. Of all included papers, 43.6% (N = 34) reported the sex ratio of the patient population.
Discussion With the power to build robust algorithms supported by massive quantities of clinical data, machine learning is shaping the future of diagnostics. Limitations of the underlying data create potential biases, especially if patient demographics are unknown or not included in the training.
Conclusion As the movement toward clinical reliance on machine learning accelerates, both recording demographic information and using diverse training sets should be emphasized. Extrapolating algorithms to demographics beyond the original study population leaves large gaps for potential biases.
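The percentages in the Results use two different denominators: all 78 included papers, and the 28 papers that included race data. The short Python sketch below recomputes the reported figures from the stated counts. The names `pct` and `implied_count` are illustrative helpers, not part of the study, and the counts for the Asian-majority, Black-majority, and neural network groups are back-calculated from the reported percentages rather than stated in the abstract.

```python
# Counts stated in the Results paragraph of the abstract.
TOTAL_PAPERS = 78            # papers included in the review
PAPERS_WITH_RACE = 28        # papers that included race data
MAJORITY_WHITE = 16          # study populations that were majority White
WHITE_LARGEST_GROUP = 21     # papers where White was the largest racial group
PAPERS_WITH_SEX_RATIO = 34   # papers that included the sex ratio


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


def implied_count(percent: float, whole: int) -> int:
    """Back-calculate the nearest whole count from a reported percentage.

    Assumption: the result is an inference, not a figure given in the abstract.
    """
    return round(percent / 100 * whole)


# Figures whose denominator is all included papers.
print("race data included:", pct(PAPERS_WITH_RACE, TOTAL_PAPERS), "%")
print("sex ratio included:", pct(PAPERS_WITH_SEX_RATIO, TOTAL_PAPERS), "%")

# Figures whose denominator is the subset of papers with race data.
print("majority White:", pct(MAJORITY_WHITE, PAPERS_WITH_RACE), "%")
print("White largest group:", pct(WHITE_LARGEST_GROUP, PAPERS_WITH_RACE), "%")

# Inferred counts (not stated in the abstract) for percentages given without N.
print("majority Asian, implied N:", implied_count(14.3, PAPERS_WITH_RACE))
print("majority Black, implied N:", implied_count(7.1, PAPERS_WITH_RACE))
print("neural networks, implied N:", implied_count(44.9, TOTAL_PAPERS))
```

Keeping the denominator explicit in each call makes it clear that the 75% figure refers to the 28 papers with race data, not to all 78 papers.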
Keywords
machine learning - diagnosis - computer assisted - clinical decision-making - electronic health records - gender data
Protection of Human and Animal Subjects
Human subjects were not included in this project.
Publication History
Received: 09 November 2021
Accepted: 04 April 2022
Article published online: 25 May 2022
© 2022. Thieme. All rights reserved.
Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany