Posters

An application of machine learning for the identification of adolescent smoking risk factors

Abstract

Purpose: Smoking is a modifiable risk behavior that causes various health problems, including cancer and respiratory disease. Moreover, the literature shows that adolescent smoking behaviors are likely to persist through adulthood, and this is the case in countries worldwide. In South Korea, despite many efforts to reduce smoking among adolescents, this modifiable risk behavior remains a significant social problem. An effective intervention to modify adolescents' smoking behavior must understand and address the factors that underlie and influence that behavior. These factors can be surfaced in data using an appropriate approach. Machine learning is well suited to revealing patterns of information in large, complex datasets that are useful in predicting outcomes (Chekround, 2016). For example, machine learning has been used to predict readmission among in-patients (Mortazavi, 2016; Frizzell, 2016). However, this approach had not yet been applied to adolescent risk behaviors such as smoking. Therefore, the goal of this study was to identify the predictors of adolescents' smoking behaviors in South Korea using a machine-learning approach.

Methods: The 2015 Korean Youth Risk Behaviors Web-based Survey (KYRBS) was used as the data source for this study. The KYRBS is an annual, nationwide survey conducted in South Korea to examine health behaviors that include cigarette smoking, individual hygiene, and alcohol consumption. Data for the 2015 KYRBS were collected via self-report questionnaires completed by 68,043 students in grades 7 through 12 in 800 randomly selected schools in South Korea. For this study, we used the 5,123 surveys in which the questionnaire items concerning smoking were completed. This study utilized the machine-learning pipeline developed by Fayyad (1996) and Yoon (2015). To reduce the "curse of dimensionality," in which a high number of inter-related variables in a large dataset interferes with the accuracy of the machine-learning model, we selected clinically meaningful features based on the conceptual framework for adolescent risk behaviors (Jessor, 1991). Then, we applied three machine-learning algorithms embedded in Weka (i.e., J48, Naïve Bayes, and Logistic Regression) to build a predictive model for the smoking behavior of the adolescents represented by the KYRBS dataset. The final model was selected based not only on the accuracy of the predictive model but also on the F-measure, which is calculated from precision and recall.
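The F-measure used for model selection combines precision and recall into a single score. The following Python sketch shows the standard calculation from confusion-matrix counts. It is an illustration only: the study itself used Weka, the function name and the counts below are hypothetical, and the abstract does not state whether a per-class or a weighted-average F-measure was reported.

```python
def precision_recall_f(tp, fp, fn, beta=1.0):
    """Return (precision, recall, F-beta) from confusion-matrix counts.

    tp: true positives, fp: false positives, fn: false negatives.
    beta=1.0 gives the usual F1 score (harmonic mean of precision and recall).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    b2 = beta * beta
    f_score = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_score


# Hypothetical counts for illustration; NOT taken from the KYRBS analysis.
p, r, f = precision_recall_f(tp=80, fp=20, fn=20)
print(f"precision={p:.3f} recall={r:.3f} F-measure={f:.3f}")
```

Reporting the F-measure alongside accuracy matters when one class (for example, non-smokers) dominates the sample, because accuracy alone can look high even when the minority class is poorly identified.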

Results: Through the feature selection process, we classified 40 features into three predictive categories. Among the three machine-learning algorithms we applied, the Logistic Regression algorithm demonstrated the highest level of accuracy (i.e., 84.0% of adolescent smokers were correctly classified; F-measure = 0.795). In this model, grade (-0.06) and alcohol consumption (-0.56) were the top two features with the highest coefficients. In other words, being a middle school student and never having drunk alcohol were highly associated with smoking.
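One way to read the reported coefficients is to convert them to odds ratios. The Python sketch below assumes the values are log-odds coefficients from a standard logistic regression, so that exponentiating a coefficient gives the multiplicative change in the odds of the modelled class per one-unit increase in the feature. The abstract does not state how the variables were coded or which class was modelled, so the direction of each effect here is an assumption, and the helper name is hypothetical.

```python
import math


def odds_ratio(coefficient):
    """Convert a logistic-regression log-odds coefficient to an odds ratio."""
    return math.exp(coefficient)


# Coefficients as reported in the abstract; variable coding is not stated.
reported = {"grade": -0.06, "alcohol consumption": -0.56}

for feature, coef in reported.items():
    print(f"{feature}: coefficient {coef:+.2f}, odds ratio {odds_ratio(coef):.2f}")
```

An odds ratio below 1 means that higher values of the feature go with lower odds of the modelled outcome, which matches the direction of the interpretation given in the Results under the assumed coding.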

Conclusion: Our study demonstrates that a machine-learning approach is effective in identifying behavioral predictors from a large, complex dataset, in this case the behavioral predictors associated with smoking in the KYRBS. However, our study results were inconsistent with those reported in the literature. Previous studies showed that increasing grade and previous alcohol consumption were associated with adolescents' smoking behaviors (Mendol, 2013; Talip, 2015). Further study of the association between smoking behaviors and alcohol consumption among Korean adolescents is needed. Although this study did have some limitations (e.g., the data from the KYRBS are cross-sectional), our machine-learning approach shows promise, and subsequent research using longitudinal data can take into account the trends of association implicit in creating a predictive model.

Sophia Chung
Youngji Li

Author Details

Sophia J. Chung, PhD, MSN, RN; Youngji Li

Sigma Membership

Gamma

Type

Poster

Format Type

Text-based Document

Study Design/Type

N/A

Research Approach

N/A

Keywords:

Adolescents, Cigarette Smoking, Machine Learning

Recommended Citation

Chung, Sophia and Li, Youngji, "An application of machine learning for the identification of adolescent smoking risk factors" (2017). INRC (Congress). 230.
https://www.sigmarepository.org/inrc/2017/posters_2017/230

Conference Name

28th International Nursing Research Congress

Conference Host

Sigma Theta Tau International

Conference Location

Dublin, Ireland

Conference Year

2017

Rights Holder

All permission requests should be directed accordingly and not to the Sigma Repository.

All submitting authors or publishers have affirmed that when using material in their work where they do not own copyright, they have obtained permission of the copyright holder prior to submission and the rights holder has been acknowledged as necessary.

Acquisition

Proxy-submission

