Jack Hester



Aditi Deokar
CEPC 0911
Professor Hester
August 11, 2020

Prediction of Gestational Diabetes Mellitus with Machine Learning using Visceral Adipose Tissue Measurements

Abstract:
Introduction: In gestational diabetes mellitus (GDM), temporary glucose intolerance develops during pregnancy and can cause a wide variety of adverse maternal and fetal outcomes, from morbidity to future obesity and diabetes in the mother or the child. The incidence of GDM is increasing, and early detection methods for GDM will help patients receive treatment sooner to avert some of these outcomes.
Methods: This study used a variety of machine learning techniques to predict the risk of GDM from clinical information in a cohort of 100 patients of varying gestational ages. Notably, we used visceral adipose tissue (VAT) measurements as a risk factor, which, to our knowledge, have not previously been used in a machine learning predictive model for GDM.
Results and Discussion: Of the machine learning models tested, gradient boosting performed best, with a cross-validation recall of 71.4% and an AUC-ROC score of 0.864. VAT was the most important feature for the gradient boosting algorithm, indicating its importance in gestational diabetes prediction. Future work can further investigate the changes in VAT and other predictive factors over different gestational ages and develop a model that accounts for these changes.

Introduction:
Gestational diabetes mellitus (GDM) is a disease in which temporary glucose intolerance develops during pregnancy (Buchanan & Xiang, 2005). The incidence of GDM is increasing in several ethnic groups, including non-Hispanic whites, Hispanics, African Americans, and Asians (Dabelea et al., 2005). This is most likely due to the increase in obesity throughout the world, as fast food consumption is a risk factor for GDM (Dominguez et al., 2014). GDM is associated with an increased risk of maternal and fetal morbidity, as well as more specific complications including preterm delivery, need for cesarean section, infants being large for their gestational age, and future obesity and diabetes in the mother and the child (Buchanan & Xiang, 2005; Xiong et al., 2001).

GDM is currently diagnosed using an oral glucose tolerance test (OGTT) between 24 and 28 weeks of pregnancy (Sukumaran et al., 2014), but because this period falls after fetal and placental development, earlier detection and treatment may better alleviate the detrimental outcomes associated with GDM (Ye et al., 2020). Machine learning can be used to determine from clinical factors which patients are at higher risk for GDM and should be tested and monitored early in pregnancy. This is a cost-effective alternative to testing all pregnant women earlier: because GDM manifests in mid to late pregnancy for most women, earlier universal testing would require multiple rounds of testing in all women (Ye et al., 2020).

Ye et al. (2020) compared the performance of several other machine learning models with that of logistic regression in predicting the risk of GDM from 104 clinical factors. They found that none of the machine learning models outperformed logistic regression in area under the receiver operating characteristic curve (AUC-ROC). Among the machine learning models, gradient boosting did best, with an AUC-ROC score of 0.709 compared with 0.7351 for logistic regression.

Risk factors for GDM include high blood pressure (Hedderson & Ferrara, 2008), first fasting glucose (Corrado et al., 2012), age > 35 years, and obesity (Xiong et al., 2001). Although first fasting glucose is indicative of glucose intolerance, it is a predictive, not diagnostic, factor for GDM because there is no clear cut-off value (Corrado et al., 2012). This study aimed to use information on these common risk factors, along with visceral adipose tissue measurements, which are predictive of GDM ADDIN CSL_CITATION {"citationItems":[{"id":"ITEM-1","itemData":{"DOI":"10.2337/dc09-0290","ISSN":"01495992","PMID":"19389819","abstract":"OBJECTIVE - To assess whether abdominal adiposity in early pregnancy is associated with a higher risk of glucose intolerance at a later gestational stage. RESEARCH DESIGN AND METHODS - Subcutaneous and visceral fat was measured with ultrasonography at ~12 weeks' gestation. A 50-g glucose challenge test (GCT) was performed between 24 and 28 weeks' gestation. The risk of having a positive GCT (≥7.8 mmol/l) was determined in association with subcutaneous and visceral adipose tissue depths above their respective upper-quartile values relative to their bottom three quartile values. RESULTS - Sixty-two women underwent GCTs. A visceral adipose tissue depth above the upper quartile value was significantly associated with a positive GCT in later pregnancy (adjusted odds ratio 16.9 [95% CI 1.5-194.6]). 
No associations were seen for subcutaneous adipose tissue. CONCLUSIONS - Measurement of visceral adipose tissue depth in early pregnancy may be associated with glucose intolerance later in pregnancy. ? 2009 by the American Diabetes Association.","author":[{"dropping-particle":"","family":"Martin","given":"Aisling Mary","non-dropping-particle":"","parse-names":false,"suffix":""},{"dropping-particle":"","family":"Berger","given":"Howard","non-dropping-particle":"","parse-names":false,"suffix":""},{"dropping-particle":"","family":"Nisenbaum","given":"Rosane","non-dropping-particle":"","parse-names":false,"suffix":""},{"dropping-particle":"","family":"Lausman","given":"Andrea Y.","non-dropping-particle":"","parse-names":false,"suffix":""},{"dropping-particle":"","family":"MacGarvie","given":"Sharon","non-dropping-particle":"","parse-names":false,"suffix":""},{"dropping-particle":"","family":"Crerar","given":"Carrie","non-dropping-particle":"","parse-names":false,"suffix":""},{"dropping-particle":"","family":"Ray","given":"Joel G.","non-dropping-particle":"","parse-names":false,"suffix":""}],"container-title":"Diabetes Care","id":"ITEM-1","issue":"7","issued":{"date-parts":[["2009"]]},"page":"1308-1310","title":"Abdominal visceral adiposity in the first trimester predicts glucose intolerance in later pregnancy","type":"article-journal","volume":"32"},"uris":[""]}],"mendeley":{"formattedCitation":"(Martin et al., 2009)","plainTextFormattedCitation":"(Martin et al., 2009)","previouslyFormattedCitation":"(Martin et al., 2009)"},"properties":{"noteIndex":0},"schema":""}(Martin et al., 2009) but have not to our knowledge been used in machine learning models, to predict the risk of gestational diabetes in pregnant women.Methods:We used the database “Visceral adipose tissue measurements during pregnancy” ADDIN CSL_CITATION {"citationItems":[{"id":"ITEM-1","itemData":{"DOI":"10.13026/p729-7p53","author":[{"dropping-particle":"","family":"Rocha","given":"A. d. 
S.","non-dropping-particle":"","parse-names":false,"suffix":""},{"dropping-particle":"","family":"Diemen","given":"L.","non-dropping-particle":"von","parse-names":false,"suffix":""},{"dropping-particle":"","family":"Kretzer","given":"D.","non-dropping-particle":"","parse-names":false,"suffix":""},{"dropping-particle":"","family":"Matos","given":"S.","non-dropping-particle":"","parse-names":false,"suffix":""},{"dropping-particle":"","family":"Rombaldi Bernardi","given":"J.","non-dropping-particle":"","parse-names":false,"suffix":""},{"dropping-particle":"","family":"Magalh?es","given":"J. A.","non-dropping-particle":"","parse-names":false,"suffix":""}],"id":"ITEM-1","issued":{"date-parts":[["2020"]]},"number":"1.0.0","publisher":"PhysioNet","title":"Visceral adipose tissue measurements during pregnancy","type":"article"},"uris":[""]}],"mendeley":{"formattedCitation":"(Rocha et al., 2020)","plainTextFormattedCitation":"(Rocha et al., 2020)","previouslyFormattedCitation":"(Rocha et al., 2020)"},"properties":{"noteIndex":0},"schema":""}(Rocha et al., 2020) from PhysioNet ADDIN CSL_CITATION {"citationItems":[{"id":"ITEM-1","itemData":{"author":[{"dropping-particle":"","family":"Goldberger","given":"A.","non-dropping-particle":"","parse-names":false,"suffix":""},{"dropping-particle":"","family":"Amaral","given":"L.","non-dropping-particle":"","parse-names":false,"suffix":""},{"dropping-particle":"","family":"Glass","given":"L.","non-dropping-particle":"","parse-names":false,"suffix":""},{"dropping-particle":"","family":"Hausdorff","given":"J.","non-dropping-particle":"","parse-names":false,"suffix":""},{"dropping-particle":"","family":"Ivanov","given":"P. C.","non-dropping-particle":"","parse-names":false,"suffix":""},{"dropping-particle":"","family":"Mark","given":"R.","non-dropping-particle":"","parse-names":false,"suffix":""},{"dropping-particle":"","family":"Mietus","given":"J. 
E.","non-dropping-particle":"","parse-names":false,"suffix":""},{"dropping-particle":"","family":"Moody","given":"G. B.","non-dropping-particle":"","parse-names":false,"suffix":""},{"dropping-particle":"","family":"Peng","given":"C. K.","non-dropping-particle":"","parse-names":false,"suffix":""},{"dropping-particle":"","family":"Stanley","given":"H. E.","non-dropping-particle":"","parse-names":false,"suffix":""}],"id":"ITEM-1","issue":"23","issued":{"date-parts":[["2000"]]},"page":"e215–e220","title":"PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals.","type":"article-journal","volume":"101"},"uris":[""]}],"mendeley":{"formattedCitation":"(Goldberger et al., 2000)","plainTextFormattedCitation":"(Goldberger et al., 2000)","previouslyFormattedCitation":"(Goldberger et al., 2000)"},"properties":{"noteIndex":0},"schema":""}(Goldberger et al., 2000), which contains clinical information on 133 mothers with and without gestational diabetes mellitus (GDM). The clinical information included maternal age, existence of previous diabetes mellitus (DM), blood pressure, visceral adipose tissue (VAT) measurement in the periumbilical region, gestational age at time of inclusion, number of pregnancies, level of first fasting glucose, and pregestational body mass index (BMI). The following pregnancy outcomes were also included: gestational age at birth, type of delivery (vaginal or caesarean section), child birth weight and the diagnosis of GDM.Patients who had missing data were excluded, leaving 101 patients in the database. As there was only one mother with previous DM, she was also excluded, and the existence of previous DM was not used in the models. Of the remaining 100 patients, 13 had GDM.These 100 patients were then divided into training (80%) and testing (20%) sets. The training set contained 10 patients with GDM, and the testing set contained 3. 
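The 80/20 split described above can be sketched in a few lines of plain Python. This is only an illustrative sketch: the text does not say how the split was performed, so splitting each class separately (stratification), the helper name `stratified_split`, and the random seed are all assumptions. Under those assumptions, a cohort of 100 patients with 13 GDM cases yields the same class counts as reported (10 GDM patients in training, 3 in testing).

```python
import random


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split indices into train/test sets, sampling each class separately.

    Hypothetical helper for illustration; not the authors' code.
    """
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        # Collect and shuffle the indices that belong to this class.
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Send the requested fraction of this class to the test set.
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


# 100 patients, 13 with GDM (label 1), matching the cohort size in the text.
labels = [1] * 13 + [0] * 87
train_idx, test_idx = stratified_split(labels)

n_train_gdm = sum(labels[i] for i in train_idx)
n_test_gdm = sum(labels[i] for i in test_idx)
print(len(train_idx), len(test_idx))  # 80 20
print(n_train_gdm, n_test_gdm)        # 10 3
```

With so few positive cases, an unstratified random split could easily leave the test set with one GDM patient or none, which is why a per-class split is a natural choice here.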
All of the data were then standardized using the Python scikit-learn library’s StandardScaler, which subtracts the mean and divides by the standard deviation for each of the clinical features. Because the data were imbalanced, data for the GDM patients were upsampled using the Python imblearn library before the hyperparameters of each machine learning model (described below) were tuned using scikit-learn’s GridSearchCV. Upsampling increased the weight given to the GDM patient data so that the two classes were more comparable during model training, while the true class distribution was used during validation. GridSearchCV performs a cross-validated grid search over all of the given hyperparameter combinations and returns the best-scoring combination. Machine learning models from the Python scikit-learn library, along with XGBoost from the Python XGBoost library, were then trained with these hyperparameters and cross-validated with 7 folds. Seven folds were chosen so that both the training and the test folds contained a reasonable number of observations. These models are described in detail below.

Logistic Regression: Logistic regression is a binary classification model that forms a weighted linear combination of the input features and then maps the result to a probability between 0 and 1 using a sigmoid function. This probability is converted into a binary output by comparing it with a threshold value. The weights of the linear combination are fit by the solver.

Support Vector Machine: Support vector machines are linear classifiers that aim to maximize the margin between the two classes. In two dimensions, this means that the support vector machine finds the line that divides the two classes with the greatest margin; in higher dimensions, the same principle applies to a separating hyperplane. 
For nonlinear boundaries, the kernel trick can be applied: the data are implicitly mapped to a higher-dimensional space (for example, through polynomial or other nonlinear functions of the features) in which a linear boundary exists.

K-Nearest Neighbors: K-Nearest Neighbors is a classification model that calculates the Euclidean distance between each new test point and all of the training points using all of the features, and assigns the point to the class held by the majority of its k closest training points (where k is specified by the programmer).

Decision Tree: The decision tree is a relatively simple classification model. It begins with all of the data at the root node. The data are then split into two groups (child nodes) based on the feature that best separates the classes. This partitioning continues until either every node contains only one class or the specified maximum depth is reached. The tree can then be used to classify new data based on the features used in the tree.

Random Forest: Random forest is a decision tree-based model that employs the ensemble technique of bagging. A random sample of the data is drawn, and a decision tree is built from that sample. This process is repeated (sampling with replacement, so any data point can be selected each time) for the desired number of trees. When predicting, a data point is classified independently by each tree, and the final classification is the majority vote of the trees.

Gradient Boosting: Gradient boosting is another decision tree-based model, but it employs the ensemble technique of boosting rather than bagging. In gradient boosting, a decision tree is first fit to the data. Then another tree is fit to the errors (the gradient of the loss) of the current ensemble, and its predictions are added to the ensemble to correct those errors. 
This process repeats for the desired number of iterations.

XGBoost: XGBoost is an optimized implementation of gradient boosting that is more flexible and computationally efficient than regular gradient boosting.

Multi-Layer Perceptron: The multi-layer perceptron is a type of neural network, specifically a feedforward artificial neural network. A neural network consists of an input layer (the input features), an output layer (the classification), and hidden layers. Each hidden layer is connected to the previous and next layers by weights and applies a nonlinear activation function to its weighted inputs; the weights are optimized during training to best predict the classifications. Multi-layer perceptrons are feedforward networks, meaning that information flows in one direction, from the input layer through the hidden layers to the output layer, with no cycles.

The F1 score, accuracy, precision, recall, and AUC-ROC score were used to assess the quality of the cross-validated models. These metrics are all commonly used in evaluating machine learning models. The accuracy is the proportion of all data points that were classified correctly. For an imbalanced dataset such as this one, the accuracy can be deceptively high, for example if all data points are classified as negative. The precision is the proportion of data points classified as positive that were actually positive, and the recall is the proportion of actual positives that were classified as positive. Precision and recall are more informative for this study. In the cross-validated grid search with GridSearchCV that determined the best hyperparameters, recall was optimized because, for medical diagnosis, false positives are preferable to false negatives, which makes recall more informative than the other metrics. 
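The accuracy pitfall described above can be made concrete with a small plain-Python sketch. The class counts (100 patients, 13 with GDM) come from the Methods; the "hypothetical screen" predictions and the helper function names are invented purely for illustration and are not results from this study. Precision is reported as 0.0 when no positives are predicted, which is a convention chosen here rather than part of its definition.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = GDM)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def precision(y_true, y_pred):
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    # Convention for this sketch: 0.0 when nothing is predicted positive.
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(y_true, y_pred):
    tp, _, fn, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


# 13 GDM patients followed by 87 patients without GDM.
y_true = [1] * 13 + [0] * 87

# A model that labels every patient as negative.
all_negative = [0] * 100

# Hypothetical screen: flags 10 of the 13 GDM patients and 20 of the 87 others.
flagged = [1] * 10 + [0] * 3 + [1] * 20 + [0] * 67

for name, y_pred in [("all negative", all_negative), ("hypothetical screen", flagged)]:
    print(
        name,
        round(accuracy(y_true, y_pred), 3),
        round(precision(y_true, y_pred), 3),
        round(recall(y_true, y_pred), 3),
    )
# all negative 0.87 0.0 0.0
# hypothetical screen 0.77 0.333 0.769
```

The all-negative model has the higher accuracy even though it misses every GDM patient, while the hypothetical screen trades accuracy and precision for a recall of about 0.77. This is the trade-off that motivates optimizing recall.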
The F1 score is the harmonic mean of precision and recall, which weights the two equally, and the AUC-ROC score is the area under the receiver operating characteristic curve, which measures the ability of the model to separate the classes. An AUC-ROC of 0.5 indicates no separability, an AUC-ROC of 1.0 indicates perfect separability, and an AUC-ROC of 0 indicates that the model separates the classes perfectly but in the opposite direction. After obtaining these metrics, we also recorded the feature importances for the decision tree, gradient boosting, XGBoost, and random forest models.

Results and Discussion:
Of the machine learning models we tested, the gradient boosting, random forest, and support vector machine algorithms performed best in recall (71.4%). XGBoost performed best in accuracy (80.9%), and gradient boosting performed best in F1 score (48.6%), precision (44.0%), and AUC-ROC score (0.864). Thus, gradient boosting was the best machine learning model overall (Table 1). The hyperparameters for all of our models are listed in Table 2.

Ye et al. also found that gradient boosting was the best machine learning model, as measured by AUC-ROC score (0.709), but their logistic regression model was their best predictor (0.7351). However, our gradient boosting model (0.864) outperformed both our logistic regression model (0.753) and Ye et al.’s logistic regression model in AUC-ROC score. Our gradient boosting model also slightly outperformed Artzi et al.’s gradient boosting model (AUC-ROC score of 0.85), which was built on electronic health records.

The feature importances for each of the clinical features in the decision tree, gradient boosting, XGBoost, and random forest models are shown in Figure 1. VAT measurement is the most important feature for the decision tree, gradient boosting, and random forest models. Pregestational body mass index is the most important feature for the XGBoost model, with VAT a close second. 
As VAT had not previously, to our knowledge, been used in predictive machine learning models for GDM, it is notable that VAT was the most important feature for the decision tree-based models overall. Our inclusion of VAT may be the reason that our gradient boosted trees outperformed Ye et al.’s logistic regression and gradient boosted trees and Artzi et al.’s gradient boosted trees (Artzi et al., 2020; Ye et al., 2020).

Table 1 – Model evaluation metrics for cross-validation of machine learning
models predicting gestational diabetes on scaled and upsampled clinical patient data (best scores in test set bolded)

| Metric | Logistic Regression | Support Vector Machine | K-Nearest Neighbors | Decision Tree | Random Forest | Gradient Boosting | XGBoost | Multi-Layer Perceptron |
|---|---|---|---|---|---|---|---|---|
| Test F1 Score | 0.302 ± 0.217 | 0.378 ± 0.086 | 0.314 ± 0.148 | 0.343 ± 0.256 | 0.386 ± 0.202 | **0.486 ± 0.239** | 0.460 ± 0.224 | 0.179 ± 0.238 |
| Test Accuracy | 0.770 ± 0.078 | 0.699 ± 0.108 | 0.669 ± 0.078 | 0.750 ± 0.114 | 0.690 ± 0.156 | 0.800 ± 0.132 | **0.809 ± 0.123** | 0.731 ± 0.102 |
| Test Precision | 0.231 ± 0.179 | 0.279 ± 0.107 | 0.219 ± 0.109 | 0.326 ± 0.346 | 0.286 ± 0.173 | **0.440 ± 0.307** | 0.439 ± 0.307 | 0.119 ± 0.159 |
| Test Recall | 0.500 ± 0.408 | **0.714 ± 0.267** | 0.643 ± 0.378 | 0.571 ± 0.450 | **0.714 ± 0.393** | **0.714 ± 0.393** | 0.643 ± 0.378 | 0.357 ± 0.476 |
| Test AUC-ROC score | 0.753 ± 0.173 | 0.811 ± 0.161 | 0.670 ± 0.129 | 0.726 ± 0.146 | 0.685 ± 0.219 | **0.864 ± 0.096** | 0.862 ± 0.044 | 0.765 ± 0.184 |
| Train F1 Score | 0.541 ± 0.037 | 0.407 ± 0.044 | 1.000 ± 0.000 | 0.555 ± 0.064 | 0.384 ± 0.032 | 0.637 ± 0.034 | 0.639 ± 0.060 | 0.773 ± 0.065 |
| Train Accuracy | 0.822 ± 0.022 | 0.723 ± 0.028 | 1.000 ± 0.000 | 0.812 ± 0.046 | 0.700 ± 0.047 | 0.857 ± 0.024 | 0.872 ± 0.025 | 0.922 ± 0.027 |
| Train Precision | 0.408 ± 0.033 | 0.282 ± 0.031 | 1.000 ± 0.000 | 0.409 ± 0.063 | 0.264 ± 0.027 | 0.478 ± 0.036 | 0.506 ± 0.052 | 0.634 ± 0.089 |
| Train Recall | 0.807 ± 0.064 | 0.729 ± 0.094 | 1.000 ± 0.000 | 0.896 ± 0.152 | 0.719 ± 0.131 | 0.961 ± 0.049 | 0.871 ± 0.089 | 1.000 ± 0.000 |
| Train AUC-ROC score | 0.899 ± 0.021 | 0.784 ± 0.025 | 1.000 ± 0.000 | 0.880 ± 0.058 | 0.734 ± 0.038 | 0.947 ± 0.008 | 0.944 ± 0.014 | 0.987 ± 0.005 |

Table 2 – Hyperparameters selected for machine learning models by GridSearchCV

| Machine Learning Model | Hyperparameters Selected by GridSearchCV |
|---|---|
| Logistic Regression | penalty = l2, solver = newton-cg, max_iter = 50, class_weight = None |
| Support Vector Machine | kernel = sigmoid |
| K-Nearest Neighbors | algorithm = auto, n_neighbors = 7, weights = distance |
| Decision Tree | max_depth = 2, n_estimators = 5, splitter = random |
| Random Forest | max_depth = 2, n_estimators = 50, criterion = gini, max_features = auto |
| Gradient Boosting | max_depth = 2, n_estimators = 5, criterion = mae |
| XGBoost | max_depth = 2, subsample = 0.25, n_estimators = 50, colsample_bytree = 0.25, learning_rate = 0.05 |
| Multi-Layer Perceptron | hidden_layer_sizes = (20, ), max_iter = 50032435819316100 |

Figure 1 – Feature importances for decision tree-based models (decision tree, gradient boosting, XGBoost, and random forest). The horizontal axis represents the clinical features used in the models, while the vertical axis represents the average of the feature importances over the 7 folds of the cross-validation for each of the respective models. Central Armellini fat (VAT) is the most important feature for the decision tree, gradient boosting, and random forest models, while pregestational body mass index is the most important feature for the XGBoost model.

Shortcomings of this study included the limited total number of patients, particularly GDM patients, and the fact that all of the patients came from a single institution in Brazil. Further validation on a larger dataset and on data from other institutions is necessary to identify and reduce any bias present in our dataset. Also, the clinical features used in this study were some of the most common risk factors for GDM, but other features such as diet, socioeconomic status, environmental factors, and lifestyle choices are also risk factors for GDM and may add value to future machine learning predictive models (Carroll et al., 2018).

Additionally, data from patients in this study included gestational ages from 6 weeks to 32 weeks. This variation in gestational age likely affected the predictive capabilities of visceral adipose tissue measurements and first fasting glucose levels. Visceral adipose tissue measurements increase during pregnancy for all women, not just those with GDM (Gunderson et al., 2008), and fasting glucose levels are also affected by gestational age, dropping during the first trimester, remaining constant in the second trimester, and again dropping in the third trimester 
In this Perspectives in Diabetes article, we highlight how these new data have broadened our understanding of gestational metabolism, and emphasize the importance of future studies to elucidate differences between GDM and T2D.","author":[{"dropping-particle":"","family":"Angueira","given":"Anthony R.","non-dropping-particle":"","parse-names":false,"suffix":""},{"dropping-particle":"","family":"Ludvik","given":"Anton E.","non-dropping-particle":"","parse-names":false,"suffix":""},{"dropping-particle":"","family":"Reddy","given":"Timothy E.","non-dropping-particle":"","parse-names":false,"suffix":""},{"dropping-particle":"","family":"Wicksteed","given":"Barton","non-dropping-particle":"","parse-names":false,"suffix":""},{"dropping-particle":"","family":"Lowe","given":"William L.","non-dropping-particle":"","parse-names":false,"suffix":""},{"dropping-particle":"","family":"Layden","given":"Brian T.","non-dropping-particle":"","parse-names":false,"suffix":""}],"container-title":"Diabetes","id":"ITEM-1","issue":"2","issued":{"date-parts":[["2015"]]},"page":"327-334","title":"New insights into gestational glucose metabolism: Lessons learned from 21st century approaches","type":"article-journal","volume":"64"},"uris":[""]}],"mendeley":{"formattedCitation":"(Angueira et al., 2015)","plainTextFormattedCitation":"(Angueira et al., 2015)","previouslyFormattedCitation":"(Angueira et al., 2015)"},"properties":{"noteIndex":0},"schema":""}(Angueira et al., 2015). While our models did use current gestational age as a feature, future work on a larger dataset might use it more explicitly, perhaps by developing separate models for use at different gestational ages.References:ADDIN Mendeley Bibliography CSL_BIBLIOGRAPHY Angueira, A. R., Ludvik, A. E., Reddy, T. E., Wicksteed, B., Lowe, W. L., & Layden, B. T. (2015). New insights into gestational glucose metabolism: Lessons learned from 21st century approaches. Diabetes, 64(2), 327–334. , N. 
S., Shilo, S., Hadar, E., Rossman, H., Barbash-Hazan, S., Ben-Haroush, A., Balicer, R. D., Feldman, B., Wiznitzer, A., & Segal, E. (2020). Prediction of gestational diabetes based on nationwide electronic health records. Nature Medicine, 26(1), 71–76. , T. a, & Xiang, A. H. (2005). Gestational diabetes mellitus. The Journal of Clinical Investigation, 115(3), 485–491. , X., Liang, X., Zhang, W., Zhang, W., Liu, G., Turner, N., & Leeper-Woodford, S. (2018). Socioeconomic, environmental and lifestyle factors associated with gestational diabetes mellitus: A matched case-control study in Beijing, China. Scientific Reports, 8(1), 1–10. , F., D’Anna, R., Cannata, M. L., Interdonato, M. L., Pintaudi, B., & Di Benedetto, A. (2012). Correspondence between first-trimester fasting glycaemia, and oral glucose tolerance test in gestational diabetes diagnosis. Diabetes and Metabolism, 38(5), 458–461. , D., Snell-Bergeon, J. K., Hartsfield, C. L., Bischoff, K. J., Hamman, R. F., & McDuffie, R. S. (2005). Increasing prevalence of gestational diabetes mellitus (GDM) over time and by birth cohort: Kaiser Permanente of Colorado GDM screening program. Diabetes Care, 28(3), 579–584. , L. J., Martínez-González, M. A., Basterra-Gortari, F. J., Gea, A., Barbagallo, M., & Bes-Rastrollo, M. (2014). Fast food consumption and gestational diabetes incidence in the SUN project. PLoS ONE, 9(9), 1–7. , A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., Mietus, J. E., Moody, G. B., Peng, C. K., & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. 101(23), e215–e220.Gunderson, E. P., Sternfeld, B., Wellons, M. F., Whitmer, R. A., Chiang, V., Quesenberry Jr, C. P., Lewis, C. E., & Sidney, S. (2008). Childbearing may increase visceral adipose tissue independent of overall increase in body fat. Obesity, 16(5), 1078–1084. , M. M., & Ferrara, A. (2008). 
High blood pressure before and during early pregnancy is associated with an increased risk of gestational diabetes mellitus. Diabetes Care, 31(12), 2362–2367. , A. M., Berger, H., Nisenbaum, R., Lausman, A. Y., MacGarvie, S., Crerar, C., & Ray, J. G. (2009). Abdominal visceral adiposity in the first trimester predicts glucose intolerance in later pregnancy. Diabetes Care, 32(7), 1308–1310. [dataset] Rocha, A. d. S., von Diemen, L., Kretzer, D., Matos, S., Rombaldi Bernardi, J., & Magalh?es, J. A. (2020). Visceral adipose tissue measurements during pregnancy (1.0.0). PhysioNet. , S., Madhuvrata, P., Bustani, R., Song, S., & Farrell, T. A. (2014). Screening, diagnosis and management of gestational diabetes mellitus: A national survey. Obstetric Medicine, 7(3), 111–115. , X., Saunders, L. D., Wang, F. L., & Demianczuk, N. N. (2001). Gestational diabetes mellitus: Prevalence, risk factors, maternal and infant outcomes. International Journal of Gynecology and Obstetrics, 75(3), 221–228. (01)00496-9Ye, Y., Xiong, Y., Zhou, Q., Wu, J., Li, X., & Xiao, X. (2020). Comparison of Machine Learning Methods and Conventional Logistic Regressions for Predicting Gestational Diabetes Using Routine Clinical Data: A Retrospective Cohort Study. Journal of Diabetes Research, 2020, 1–10. ................