Child morbidity and mortality in resource-limited settings are major public health problems. Previous studies were mainly concerned with determining the prevalence of child deaths and identifying associated factors. Few have used supervised machine learning algorithms to extract knowledge and discover hidden patterns in child data. Therefore, this study aimed to predict under-five child death using the best-performing supervised machine learning algorithm.
A total of 1813 samples from the 2019 Ethiopian Demographic and Health Survey dataset were used. Of the total instances, 70% were used to train the models and 30% to measure the performance of each algorithm, with 10-fold cross-validation. Five supervised machine learning algorithms were considered for model building and comparison. All included algorithms were evaluated using metrics derived from the confusion matrix. Information gain was used to select the attributes most important for predicting child deaths. If/then logical association was used to generate rules from the relationships among attributes, using Weka version 3.8.6.
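The confusion matrix elements mentioned above can be turned into comparable scores in a few lines. The following is a minimal sketch, not the study's Weka workflow: the class name, the model names, and all counts are hypothetical, and the positive class is assumed to be "child died".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary classifier; positive class assumed to be 'child died'."""
    tp: int  # died, predicted died
    fn: int  # died, predicted survived
    fp: int  # survived, predicted died
    tn: int  # survived, predicted survived

    @property
    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fn + self.fp + self.tn)

    @property
    def sensitivity(self) -> float:
        # Also called recall: share of actual deaths that were predicted.
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Share of actual survivors that were predicted to survive.
        return self.tn / (self.tn + self.fp)

    @property
    def precision(self) -> float:
        # Share of predicted deaths that were actual deaths.
        return self.tp / (self.tp + self.fp)

    @property
    def f1(self) -> float:
        p, r = self.precision, self.sensitivity
        return 2 * p * r / (p + r)


# Hypothetical counts for two models on the same held-out set (not the study's results).
models = {
    "model_a": ConfusionMatrix(tp=40, fn=10, fp=5, tn=45),
    "model_b": ConfusionMatrix(tp=30, fn=20, fp=15, tn=35),
}
best = max(models, key=lambda name: models[name].accuracy)
```

The sketch ranks models by accuracy only; with an imbalanced outcome such as child death, sensitivity and F1 may be more informative. It also does not guard against empty rows or columns, which would raise a division error.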
Random forest was the best-performing algorithm for predicting child death, with 93.9% accuracy, followed by J48 with 77.8%. Late initiation of breastfeeding, mothers with no formal education, short birth intervals, poor wealth status of the mother, and no exposure to media were the five most important attributes for predicting child deaths. Six association rules were generated that estimate the probability of child death. Under one of these rules, children who were rural residents, had a short birth interval, and were born as multiples (twins) had an 83.6% probability of death.
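One common way to attach a probability to an if/then rule is its confidence: among records that satisfy the "if" part, the share that also satisfy the "then" part. The sketch below assumes that the reported 83.6% is a confidence of this kind, which the text does not state. The function names, field names, and records are hypothetical toy values, not survey data.

```python
def matches(record, conditions):
    """True when the record has every attribute value listed in conditions."""
    return all(record.get(key) == value for key, value in conditions.items())


def rule_confidence(records, antecedent, consequent):
    """Share of records matching the antecedent that also match the consequent.

    Returns None when no record matches the antecedent, since the
    confidence is undefined in that case.
    """
    covered = [r for r in records if matches(r, antecedent)]
    if not covered:
        return None
    return sum(matches(r, consequent) for r in covered) / len(covered)


# Hypothetical toy records (not EDHS data).
records = [
    {"residence": "rural", "birth_interval": "short", "multiple_birth": True, "died": True},
    {"residence": "rural", "birth_interval": "short", "multiple_birth": True, "died": True},
    {"residence": "rural", "birth_interval": "short", "multiple_birth": True, "died": True},
    {"residence": "rural", "birth_interval": "short", "multiple_birth": True, "died": False},
    {"residence": "urban", "birth_interval": "long", "multiple_birth": False, "died": False},
    {"residence": "rural", "birth_interval": "long", "multiple_birth": False, "died": False},
]

antecedent = {"residence": "rural", "birth_interval": "short", "multiple_birth": True}
consequent = {"died": True}
confidence = rule_confidence(records, antecedent, consequent)
```

In this toy data, four records satisfy the antecedent and three of them died, so the rule's confidence is 3/4.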
Five machine learning algorithms were compared to predict child deaths and generate rules, and random forest was the best of them at predicting child deaths. However, the study was limited because some important attributes were not available in the data source and irrelevant values were found. Researchers are therefore encouraged to apply machine learning algorithms in future studies that include important attributes for predicting child death. The current findings would be useful for stakeholders’ preparedness and for taking proactive childcare interventions. Programs that promote women’s education, media access, and economic development are essential interventions for reducing child deaths.
Under-five mortality is the most important indicator of children’s health status and a key marker of a country’s development. The under-five mortality rate is the probability of a child dying before the fifth birthday. Globally, nearly 44% of all under-five deaths occurred within the first month of life, and an estimated 4.1 million child deaths occurred in 2017. According to the Centers for Disease Control and Prevention, child mortality in the United States in 2020 was estimated at 5.4 deaths per 1000 live births.
The risk of under-five mortality is highest in low-income countries. The under-five mortality rate in low-income countries was estimated at 69 deaths per 1000 live births in 2017, almost 14 times the rate in high-income countries [6,7]. In Bangladesh, 522 under-five children died per 1000 live births. In 2001, under-five mortality in Nepal was estimated at 91 deaths per 1000 live births. Although under-five mortality in Ethiopia fell from 166 to 67 per 1000 live births over a period of 16 years, the country appears to have the fifth-highest number of newborn deaths in the world. An estimated 472,000 children die annually in Ethiopia before their fifth birthday, which places Ethiopia sixth in the world by number of under-five deaths [7,12]. According to the WHO (2017), more than half of under-five deaths are due to infectious diseases that are easily preventable and treatable through simple and affordable interventions. Under-five mortality is also caused by undernutrition, which further leads to stunting and wasting.
Conclusions and recommendations
This study aimed to identify the best supervised machine learning algorithm for predicting the death of children before their fifth birthday and to select the attributes most important to that prediction. In line with these objectives, five supervised machine learning algorithms were compared. Metrics derived from the confusion matrix were used to compare the candidate algorithms. Based on the results, random forest was the best-performing model for predicting the death of children before their fifth birthday. Late initiation of breastfeeding, mothers with no formal education, short birth interval, poor wealth status of the mother, and no exposure to media were the most important attributes for predicting child deaths.
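Attribute ranking by information gain can be illustrated with a short calculation: the gain of an attribute is the entropy of the outcome minus the weighted entropy of the outcome within each attribute value. The following is a minimal sketch for nominal attributes, not the Weka evaluator used in the study. The function names and the toy records are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their weighted entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy records (not EDHS data): one outcome and two nominal attributes.
died = ["yes", "yes", "yes", "no", "no", "no", "no", "no"]
breastfeeding = ["late", "late", "late", "early", "early", "early", "early", "late"]
residence = ["rural", "urban", "rural", "urban", "rural", "urban", "rural", "urban"]

gains = {
    "breastfeeding": information_gain(breastfeeding, died),
    "residence": information_gain(residence, died),
}
# Attributes sorted from most to least informative about the outcome.
ranked = sorted(gains, key=gains.get, reverse=True)
```

In this toy data, breastfeeding timing separates the outcome far better than residence, so it ranks first. Numeric attributes would need to be discretized before this calculation applies.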
Generating association rules for child death was another objective of the study. Accordingly, six rules associated with the death of children before their fifth birthday were generated. The findings have practical implications: they can support policymakers and stakeholders in developing childcare interventions and in preparing to care for children as early as possible. Stakeholders are encouraged to promote timely initiation of breastfeeding among mothers. Improving mothers’ wealth status, closing the gap in media access, and raising awareness among mothers would be critical interventions to improve child survival in Ethiopia. The generated rules also have theoretical implications, since they extract and represent knowledge from the data. Moreover, researchers could use this study as a baseline and framework for further research that includes other important attributes for predicting child mortality in low-income countries.