Predicting mental health problems in adolescence using machine learning techniques
2020 (English). In: PLOS ONE, E-ISSN 1932-6203, Vol. 15, no 4, article id e0230389. Article in journal (Refereed). Published
Abstract [en]
BACKGROUND: Predicting which children will go on to develop mental health symptoms as adolescents is critical for early intervention and for preventing severe negative outcomes later on. Although many aspects of a child's life, personality, and symptoms have been flagged as indicators, no model currently exists to screen the general population for the risk of developing mental health problems. Additionally, the advent of machine learning techniques offers a way to potentially improve upon the standard prediction modelling technique, logistic regression. Therefore, we aimed to (I) develop a model that can predict mental health problems in mid-adolescence and (II) investigate whether machine learning techniques (random forest, support vector machines, neural network, and XGBoost) outperform logistic regression.
METHODS: In 7,638 twins from the Child and Adolescent Twin Study in Sweden, we used 474 predictors derived from parental report and register data. The outcome, mental health problems, was determined by the Strengths and Difficulties Questionnaire. Model performance was measured by the area under the receiver operating characteristic curve (AUC).
RESULTS: Although model performance varied somewhat, the confidence intervals of all models overlapped, indicating that the superiority of the random forest model (AUC = 0.739, 95% CI 0.708-0.769) was not statistically significant. It was followed closely by support vector machines (AUC = 0.735, 95% CI 0.707-0.764).
CONCLUSION: Ultimately, our top-performing model would not be suitable for clinical use; however, it lays important groundwork for future models seeking to predict general mental health outcomes. Future studies should make use of parent-rated assessments when possible. Additionally, it may not be necessary for similar studies to forgo logistic regression in favor of other, more complex methods.
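The evaluation described in the abstract (an AUC per model, a 95% confidence interval around it, and a check for overlapping intervals) can be illustrated with a minimal Python sketch. The abstract does not state how the confidence intervals were obtained, so the percentile bootstrap used here is an assumption, and the function names `auc`, `bootstrap_ci`, and `intervals_overlap` are hypothetical. The sketch also ignores the clustering of twins within families, which a real analysis of twin data would have to address.

```python
import random


def auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method, not the paper's)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacked one of the classes; draw again
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def intervals_overlap(a, b):
    """True when two (lower, upper) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Intervals reported in the abstract: random forest vs. support vector machines.
print(intervals_overlap((0.708, 0.769), (0.707, 0.764)))  # True
```

Overlapping intervals are only an informal indication that two models do not differ reliably; a formal comparison would test the difference in AUC directly on the same held-out cases.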
Place, publisher, year, edition, pages
PLOS, 2020. Vol. 15, no 4, article id e0230389
National Category
Public Health, Global Health, Social Medicine and Epidemiology
Identifiers
URN: urn:nbn:se:oru:diva-81105
DOI: 10.1371/journal.pone.0230389
ISI: 000535955000012
PubMedID: 32251439
Scopus ID: 2-s2.0-85082942139
OAI: oai:DiVA.org:oru-81105
DiVA, id: diva2:1422458
Funder
Forte, Swedish Research Council for Health, Working Life and Welfare, 2012-1678
Swedish Research Council, 2017-02552, 2016-01989, 2017-00641
Note
Funding Agencies:
Swedish Council for Working Life, funds under the ALF
Söderstrom Königska Foundation
European Union (EU) 721567
Swedish Initiative for Research on Microdata in the Social And Medical Sciences (SIMSAM) 340-2013-5867
2020-04-07, 2020-04-07, 2021-06-14. Bibliographically approved