DOI: 10.1055/s-0043-1776917
A Machine Learning Algorithm using Clinical and Demographic Data for All-Cause Preterm Birth Prediction
Funding None.
Abstract
Objective Preterm birth remains the predominant cause of perinatal mortality in the United States and worldwide, with well-documented racial and socioeconomic disparities. This study aimed to develop and validate a machine learning algorithm that predicts all-cause preterm birth from clinical, demographic, and laboratory data.
Study Design We performed a cohort study of pregnant individuals delivering at a single institution, using prospectively collected information on clinical conditions, patient demographics, laboratory data, and health care utilization. Our primary outcome was all-cause preterm birth before 37 weeks. The dataset was randomly divided into a derivation cohort (70%) and a separate validation cohort (30%). Predictor variables were selected from among 33 candidates previously identified in the literature (directed machine learning). In the derivation cohort, both a statistical model (logistic regression) and a machine learning model (XG-Boost) were fit and compared by C-statistic to find the best fit, and the selected model was then evaluated in the validation cohort. We measured model discrimination with the C-statistic and assessed model performance and calibration to determine whether the model offered clinical decision-making benefit.
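The two mechanical steps in this design, the random 70/30 split and the C-statistic, can be illustrated with a minimal sketch. It is not the authors' code: the data are synthetic, the helper names `split_70_30` and `c_statistic` are hypothetical, and the "risk score" merely stands in for the output of a fitted model such as logistic regression or XG-Boost. The C-statistic is computed here as the share of case/non-case pairs in which the case receives the higher score, with ties counted as one half.

```python
import random


def split_70_30(records, seed=0):
    """Randomly split records into derivation (70%) and validation (30%) cohorts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def c_statistic(labels, scores):
    """Fraction of (case, non-case) pairs where the case has the higher score.

    Ties count as 0.5. For a binary outcome this equals the area under the
    ROC curve.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    non_cases = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                concordant += 1.0
            elif c == n:
                concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


# Synthetic cohort (an assumption for illustration only): each record is
# (outcome, risk_score), where a higher score makes the outcome more likely.
rng = random.Random(42)
records = []
for _ in range(1000):
    risk_score = rng.random()
    outcome = 1 if rng.random() < 0.05 + 0.25 * risk_score else 0
    records.append((outcome, risk_score))

derivation, validation = split_70_30(records)
for name, cohort in (("derivation", derivation), ("validation", validation)):
    labels = [y for y, _ in cohort]
    scores = [s for _, s in cohort]
    print(name, len(cohort), round(c_statistic(labels, scores), 2))
```

The pairwise loop is quadratic in cohort size, which keeps the definition visible; for a cohort of the size reported here, a rank-based computation or a library routine would be the more practical choice.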
Results The cohort included 12,440 deliveries among 12,071 individuals. Preterm birth occurred in 2,037 deliveries (16.4%). The derivation cohort comprised 8,708 deliveries (70%) and the validation cohort 3,732 (30%). XG-Boost was chosen for its robustness and its ability to handle missing data and collinearity between predictor variables. The top five predictor variables identified as drivers of preterm birth, ranked by feature importance, were multiple gestation, number of emergency department visits in the year before the index pregnancy, unknown initial body mass index, gravidity, and prior preterm delivery. Test performance characteristics were similar between the two cohorts (area under the curve [AUC] = 0.70 in the derivation cohort vs. 0.63 in the validation cohort).
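The reported counts and percentages can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures stated in the Results; the variable names are illustrative.

```python
# Figures as reported in the Results paragraph.
deliveries = 12_440
preterm = 2_037
derivation_n = 8_708
validation_n = 3_732
auc_derivation = 0.70
auc_validation = 0.63

# The two cohorts should account for every delivery.
assert derivation_n + validation_n == deliveries

preterm_pct = 100 * preterm / deliveries
derivation_pct = 100 * derivation_n / deliveries
validation_pct = 100 * validation_n / deliveries
auc_gap = round(auc_derivation - auc_validation, 2)

print(f"preterm rate: {preterm_pct:.1f}%")        # 16.4%
print(f"derivation share: {derivation_pct:.1f}%")  # 70.0%
print(f"validation share: {validation_pct:.1f}%")  # 30.0%
print(f"AUC difference: {auc_gap}")                # 0.07
```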
Conclusion Clinical, demographic, and laboratory information can be useful to predict all-cause preterm birth with moderate precision.
Key Points
- Machine learning can be used to create models to predict preterm birth.
- In our model, all-cause preterm birth can be predicted with moderate precision.
- Clinical, demographic, and laboratory information can be useful to predict all-cause preterm birth.
Keywords
preterm birth - machine learning - social determinants of health - XG-Boost - predictive algorithm
Note
This study was presented at the Society for Maternal-Fetal Medicine 42nd Annual Meeting, Virtual Poster, February 2022.
Publication History
Received: 12 December 2022
Accepted: 18 October 2023
Article published online: 04 December 2023
© 2023. Thieme. All rights reserved.
Thieme Medical Publishers, Inc.
333 Seventh Avenue, 18th Floor, New York, NY 10001, USA