UiTM STA555 Project Report Sample
1.0 Introduction
5.0 Methodology
5.1 Data Collection
5.2 Data Description
5.3 Model 1: Logistic Regression Model
5.4 Model 2: Decision Tree Model
5.5 Model 3: Neural Network Model
7.0 Conclusion
8.0 References
9.0 Appendices
1.0 Introduction
Mental health problems are very common: five of the ten leading causes
of disability worldwide are mental health problems. Around 450 million people
suffer from mental disorders, and at any point in time one in four families has
at least one member with a mental disorder.
Work is good for mental health but a negative working environment can lead to
physical and mental health problems. There are many risk factors for mental
health that may be present in the working environment. Most risks relate to
interactions between type of work, the organizational environment, the skills of
employees, and the support available for employees to carry out their work.
For major depression, one of the most common mental health problems, around
56 per cent of sufferers never seek any treatment. Several factors influence
people not to seek treatment for their mental health problems. The most common
reason is a sense of shame in being mentally unfit, because there is a great
deal of stigma and discrimination associated with such disorders. In addition,
some people lack support from those around them, such as family, friends and
co-workers, who are unwilling to accept or acknowledge the mental health issue
a family member or a friend is suffering from. They prefer living in denial
rather than accepting the condition and seeking treatment.
3.0 Research Objectives
3.1 Objective 1
To identify the most suitable model among three candidates: the logistic
regression model, the decision tree model and the neural network model.
3.2 Objective 2
4.0 Literature Review
4.1 Introduction
Access to mental health treatment remains a major problem globally, and is
even more pronounced in developing countries. Although mental health problems
are acknowledged as major contributors to the global burden of disease, they
receive little attention at global, regional and local levels compared with
other illnesses such as communicable diseases. Approximately one in four
adults will experience a mental health problem at some point during their
lives. Over the past decade, government policies and funding have been aimed
at improving access to mental health treatment. However, barriers to accessing
care remain.
Among the most important factors influencing access to mental health treatment
is the perception of the causes of mental illness. In addition, mental health
treatment is scarce for most of the population, forcing patients and their
families to use whatever is available and to travel long distances to access
services. Efforts to improve access to mental health treatment should
therefore be approached holistically, as access is influenced by social,
family and health system factors.
Several previous studies have used logistic regression models to investigate
mental health issues. Posttraumatic stress disorder is one such problem. A
2007 study of mental health treatment seeking by military members with
posttraumatic stress disorder found that although most eventually seek
treatment, about one in three never do. The logistic regression results showed
that about two-thirds (62.2%) of military members with posttraumatic stress
disorder sought some form of mental health treatment in their lifetime, while
a significant portion (35.2%) never sought any form of mental health
treatment.
Another study, in 2014, on mental health treatment in the primary care setting
found that 30% of the adult population has a mental health disorder within any
12-month period, and that most of these disorders are diagnosed, treated and
managed in primary care. Of 184,636 patients, 8.1% had poor mental health.
Within this group, 49.5% obtained care only from a primary care physician,
5.0% only from a mental health provider, and just 13.6% from both mental
health and primary care providers.
Approximately 28.6% of adults with better mental health did not report any
mental health treatment visits, compared with 17.7% of adults with poor mental
health. The study also found that patients who obtained care solely from
primary care providers tended to be female, of lower income, with less
schooling, and older than those who obtained care solely from mental health
providers.
A study using a neural network model for child mental health disorders was
conducted in 2011. The researchers ran many experiments to find a network
structure suited to a child mental health intelligent diagnosis system. They
concluded that a fully connected topology suits the medical system better, and
that adding a suitable number of hidden nodes can improve convergence and
reduce network error, although adding more hidden layers does not always
improve convergence under the experimental conditions.
The study found that the diagnosis and therapy system can diagnose 61 kinds of
child mental health disorders, covering more than 95% of child mental health
disorders such as hyperactivity, conduct disorder, tic disorder, depression
and anxiety. Moreover, after each diagnosis the computer suggests a treatment
method. Compared with the diagnoses of senior child psychiatrists, the
computer's diagnoses were consistent 99% of the time.
Another study, in 2002, analyzed the factors behind common mental disorders
using neural networks, specifically a multilayer perceptron trained with a
simulated annealing algorithm. The study found that the variables most
strongly related to common mental disorders were years of schooling, marital
status, sex, working conditions, home ownership, income and age. The variable
most associated with common mental disorders was years of schooling, at
89.29%.
5.0 Methodology
5.1 Data Collection
The dataset was taken from kaggle.com. The data was made public, which gives
us an interesting opportunity to analyze the attitudes of tech workers from
48 different countries towards mental health. The data is ordered by date,
from August 2014 until February 2016, and there are 1260 responses with 26
different variables in the dataset.
5.2 Data Description
The data relates to attitudes towards mental health and the frequency of
mental health disorders in the tech workplace. The survey asked how mental
health is perceived at tech workplaces by employees and their employers. In
this study, one dependent variable and seven independent variables were
selected from the dataset.
5.3 Model 1: Logistic Regression Model
The logistic regression model relates the probability p that an event occurs
to the predictors through the logit:

logit = ln(p / (1 − p)) = B0 + B1X

while the odds of the logistic function are

Odds = p / (1 − p) = e^(B0 + B1X)
Logistic regression coefficients can be used to estimate odds ratios for each
of the independent variables in the model. The goal is to estimate the
probability that an event occurs, p. A method called maximum likelihood is
used to find the best-fit line for logistic regression. Logistic regression
does not rely on distributional assumptions in the same sense that
discriminant analysis does. As with other forms of regression, however,
multicollinearity among the predictors can lead to biased estimates and
inflated standard errors.
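The relationship between the logit, the odds and the probability can be sketched in Python; the coefficient values here are arbitrary placeholders for illustration, not the fitted model:

```python
import math

def logit_to_probability(b0, b1, x):
    """Convert a linear logit B0 + B1*X into odds and a probability."""
    logit = b0 + b1 * x     # log-odds: ln(p / (1 - p))
    odds = math.exp(logit)  # odds: p / (1 - p) = e^(B0 + B1*X)
    p = odds / (1 + odds)   # invert: p = odds / (1 + odds)
    return odds, p

# Hypothetical coefficients, for illustration only.
odds, p = logit_to_probability(b0=-1.0, b1=0.8, x=2.0)
```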
Method selection allows the user to specify how independent variables are
entered into the analysis. Three different methods can construct a variety of
regression models from the same set of variables: forward, backward and
stepwise selection. The significance values in the output are based on fitting
a single model, so they are generally invalid when a stepwise method is used.
By default, all selected independent variables are added to a single
regression model, but different entry methods can be specified for different
subsets of variables.
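The selection procedures above share one greedy skeleton; here is a schematic sketch of the forward variant, with the model-refitting step abstracted into a `score` function (the toy criterion and variable names are made up for illustration):

```python
def forward_selection(candidates, score):
    """Greedy forward selection: repeatedly add the candidate variable
    that most improves the model score, until no addition helps.

    `score` maps a list of variable names to a fit criterion (higher is
    better); it stands in for refitting the regression each time."""
    selected = []
    best = score(selected)
    improved = True
    while improved:
        improved = False
        for var in candidates:
            if var in selected:
                continue
            trial = score(selected + [var])
            if trial > best:
                best, chosen, improved = trial, var, True
        if improved:
            selected.append(chosen)
    return selected

# Toy criterion: reward the (hypothetical) useful variables "a" and "b",
# with a small penalty per variable entered.
toy_score = lambda vars_: len(set(vars_) & {"a", "b"}) - 0.1 * len(vars_)
chosen_vars = forward_selection(["a", "b", "c"], toy_score)
```

Backward elimination runs the same loop in reverse, starting from the full model and dropping the variable whose removal helps most.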
5.4 Model 2: Decision Tree Model
A decision tree uses the target variable to determine how each input should be
partitioned. In the end, the decision tree breaks the data into segments
defined by the splitting rules at each step; taken together, the rules for all
the segments form the decision tree model.
The algorithm repeatedly splits the data set according to a criterion that
maximizes the separation of the data, resulting in a tree-like structure. The
most common criterion is information gain, meaning that at each split the
decrease in entropy due to that split is maximized.
The goal is to build a tree that uses the values of the input fields to create
rules resulting in leaves that do a good job of assigning a target value to
each record. The first task is to split the records into children by creating
a rule on the input variables. To perform the split, the algorithm considers
all possible splits on all input variables. The measure used to evaluate a
potential split is the purity of the target variable in the children. The best
split is the one that increases purity in the children by the greatest amount
while creating nodes of similar size, or at least not creating nodes that
contain very few records.
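As a minimal sketch of the information-gain criterion described above (the class counts are made up for illustration):

```python
import math

def entropy(counts):
    """Shannon entropy of a class-count distribution, in bits."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent, children):
    """Decrease in entropy from splitting `parent` into `children`.

    `parent` and each child are lists of class counts."""
    n = sum(parent)
    weighted = sum(sum(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted

# Hypothetical split: 10 positives and 10 negatives in the parent,
# separated into a mostly-positive and a mostly-negative child.
gain = information_gain([10, 10], [[9, 2], [1, 8]])
```

A pure parent (all one class) has entropy 0, so splits that produce purer children yield higher gain.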
5.5 Model 3: Neural Network Model
A neural network can have any number of hidden layers, but in general one
hidden layer is sufficient. The wider the layer, the greater the capacity of
the network to recognize patterns; however, too much capacity lets the network
memorize the patterns in the training set, which results in overfitting.
Neural networks are good for prediction and estimation problems. A good
problem has inputs that are well understood, meaning the user has a good idea
of which features of the data are important, but not necessarily how to
combine them. The output should also be well understood: the user should know
what they are trying to model.
There are a few keys to using neural networks successfully. The most important
is choosing the right training set. Second, the data must be represented in
such a way as to maximize the ability of the network to recognize patterns in
it. Next, the results produced by the network must be interpreted. Finally,
the user should understand some specific details of how networks operate, such
as the network topology and the parameters controlling training.
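A one-hidden-layer network of the kind described above computes its output roughly as follows; all weights here are arbitrary illustrative values, not the trained model:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, hidden_weights, output_weights):
    """Forward pass of a network with one hidden layer.

    hidden_weights: one weight list per hidden node (bias last).
    output_weights: weights for the single output node (bias last)."""
    hidden = [sigmoid(sum(w * x for w, x in zip(ws[:-1], inputs)) + ws[-1])
              for ws in hidden_weights]
    return sigmoid(sum(w * h for w, h in zip(output_weights[:-1], hidden))
                   + output_weights[-1])

# Two inputs, two hidden nodes, one output; weights are made up.
p = forward([1.0, 0.5],
            hidden_weights=[[0.4, -0.6, 0.1], [0.3, 0.8, -0.2]],
            output_weights=[1.2, -0.7, 0.05])
```

Training consists of adjusting those weights to reduce prediction error on the training set.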
6.0 Results and Discussion
6.1 Model 1: Logistic Regression Model
Confusion Matrix
1. Backward Elimination

TRAIN
          Predicted 1  Predicted 0  Total
Actual 1        296           54     350
Actual 0        105          236     341
Total           401          290     691

VALIDATE
          Predicted 1  Predicted 0  Total
Actual 1        237           50     287
Actual 0        104          177     281
Total           341          227     568
i) True Positive Rate (TPR), sensitivity:
Train: TP / (TP + FN) = 296 / (296 + 54) = 0.84571
Conclusion: The model's ability to predict a positive outcome correctly for
train is 0.84571.
Validate: TP / (TP + FN) = 237 / (237 + 50) = 0.82578
Conclusion: The model's ability to predict a positive outcome correctly for
validate is 0.82578.

ii) True Negative Rate (TNR), specificity:
Train: TN / (TN + FP) = 236 / (236 + 105) = 0.69208
Conclusion: The model's ability to predict a negative outcome correctly for
train is 0.69208.
Validate: TN / (TN + FP) = 177 / (177 + 104) = 0.62989
Conclusion: The model's ability to predict a negative outcome correctly for
validate is 0.62989.

iii) Accuracy:
Train: (TP + TN) / (TP + TN + FP + FN) = (296 + 236) / (296 + 236 + 105 + 54)
= 0.76990
Conclusion: The model's ability to predict both positive and negative outcomes
for train is 0.76990.
Validate: (TP + TN) / (TP + TN + FP + FN) = (237 + 177) / (237 + 177 + 104 + 50)
= 0.72887
Conclusion: The model's ability to predict both positive and negative outcomes
for validate is 0.72887.
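The three metrics above follow directly from the confusion-matrix counts; using the backward-elimination TRAIN counts as input:

```python
def confusion_metrics(tp, fn, fp, tn):
    """Sensitivity (TPR), specificity (TNR) and accuracy from counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return sensitivity, specificity, accuracy

# Counts from the backward-elimination TRAIN confusion matrix.
tpr, tnr, acc = confusion_metrics(tp=296, fn=54, fp=105, tn=236)
```

The same helper reproduces the validate figures and the forward, stepwise, decision tree and neural network tables.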
2. Forward Selection
TRAIN
          Predicted 1  Predicted 0  Total
Actual 1        292           58     350
Actual 0         98          243     341
Total           390          301     691

VALIDATE
          Predicted 1  Predicted 0  Total
Actual 1        235           52     287
Actual 0        101          180     281
Total           336          232     568
i) True Positive Rate (TPR), sensitivity:
Train: TP / (TP + FN) = 292 / (292 + 58) = 0.83429
Conclusion: The model's ability to predict a positive outcome correctly for
train is 0.83429.
Validate: TP / (TP + FN) = 235 / (235 + 52) = 0.81882
Conclusion: The model's ability to predict a positive outcome correctly for
validate is 0.81882.

ii) True Negative Rate (TNR), specificity:
Train: TN / (TN + FP) = 243 / (243 + 98) = 0.71261
Conclusion: The model's ability to predict a negative outcome correctly for
train is 0.71261.
Validate: TN / (TN + FP) = 180 / (180 + 101) = 0.64057
Conclusion: The model's ability to predict a negative outcome correctly for
validate is 0.64057.

iii) Accuracy:
Train: (TP + TN) / (TP + TN + FP + FN) = (292 + 243) / (292 + 243 + 98 + 58)
= 0.77424
Conclusion: The model's ability to predict both positive and negative outcomes
for train is 0.77424.
Validate: (TP + TN) / (TP + TN + FP + FN) = (235 + 180) / (235 + 180 + 101 + 52)
= 0.73063
Conclusion: The model's ability to predict both positive and negative outcomes
for validate is 0.73063.
3. Stepwise Regression
TRAIN
          Predicted 1  Predicted 0  Total
Actual 1        296           54     350
Actual 0        105          236     341
Total           401          290     691

VALIDATE
          Predicted 1  Predicted 0  Total
Actual 1        237           50     287
Actual 0        104          177     281
Total           341          227     568
i) True Positive Rate (TPR), sensitivity:
Train: TP / (TP + FN) = 296 / (296 + 54) = 0.84571
Conclusion: The model's ability to predict a positive outcome correctly for
train is 0.84571.
Validate: TP / (TP + FN) = 237 / (237 + 50) = 0.82578
Conclusion: The model's ability to predict a positive outcome correctly for
validate is 0.82578.

ii) True Negative Rate (TNR), specificity:
Train: TN / (TN + FP) = 236 / (236 + 105) = 0.69208
Conclusion: The model's ability to predict a negative outcome correctly for
train is 0.69208.
Validate: TN / (TN + FP) = 177 / (177 + 104) = 0.62989
Conclusion: The model's ability to predict a negative outcome correctly for
validate is 0.62989.

iii) Accuracy:
Train: (TP + TN) / (TP + TN + FP + FN) = (296 + 236) / (296 + 236 + 105 + 54)
= 0.76990
Conclusion: The model's ability to predict both positive and negative outcomes
for train is 0.76990.
Validate: (TP + TN) / (TP + TN + FP + FN) = (237 + 177) / (237 + 177 + 104 + 50)
= 0.72887
Conclusion: The model's ability to predict both positive and negative outcomes
for validate is 0.72887.
Model Interpretation
1. Backward Elimination
From the output above, the significant variables (p-value < 0.05) are
IMP_REP_work_interfere, benefits, family_history and leave.
Odds ratios and their interpretation:
IMP_REP_work_interfere (Never vs Sometimes): (0.079 − 1) × 100 = −92.1%. The
odds of 'Never' having work interference are 92.1% lower than for 'Sometimes'.
IMP_REP_work_interfere (Often vs Sometimes): (2.001 − 1) × 100 = +100.1%. The
odds of 'Often' having work interference are 100.1% higher than for
'Sometimes'.
IMP_REP_work_interfere (Rarely vs Sometimes): (1.638 − 1) × 100 = +63.8%. The
odds of 'Rarely' having work interference are 63.8% higher than for
'Sometimes'.
benefits (Don't know vs Yes): (0.334 − 1) × 100 = −66.6%. The odds for
benefits 'Don't know' are 66.6% lower than for benefits 'Yes'.
benefits (No vs Yes): (0.458 − 1) × 100 = −54.2%. The odds for benefits 'No'
are 54.2% lower than for benefits 'Yes'.
family_history (No vs Yes): (0.308 − 1) × 100 = −69.2%. The odds for those
without a family history of mental health problems are 69.2% lower than for
those with one.
leave (Don't know vs Very easy): (0.787 − 1) × 100 = −21.3%. The odds for
leave 'Don't know' are 21.3% lower than for leave 'Very easy'.
leave (Somewhat difficult vs Very easy): (1.718 − 1) × 100 = +71.8%. The odds
for leave 'Somewhat difficult' are 71.8% higher than for leave 'Very easy'.
leave (Somewhat easy vs Very easy): (0.564 − 1) × 100 = −43.6%. The odds for
leave 'Somewhat easy' are 43.6% lower than for leave 'Very easy'.
leave (Very difficult vs Very easy): (2.132 − 1) × 100 = +113.2%. The odds for
leave 'Very difficult' are 113.2% higher than for leave 'Very easy'.
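The percentage changes in the table follow directly from the odds ratios; a small helper makes the conversion explicit:

```python
def odds_ratio_change(odds_ratio):
    """Percentage change in the odds implied by an odds ratio.

    Negative values mean the odds are lower than at the reference level."""
    return (odds_ratio - 1) * 100

# Odds ratios taken from the backward-elimination output.
never_vs_sometimes = odds_ratio_change(0.079)            # about -92.1
very_difficult_vs_very_easy = odds_ratio_change(2.132)   # about 113.2
```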
Logistic function:
ln(p / (1 − p)) = 0.3871 − 2.1966*IMP_REP_work_interfere(Never)
+ 1.0303*IMP_REP_work_interfere(Often) + 0.8297*IMP_REP_work_interfere(Rarely)
− 0.4708*benefits(Don't know) − 0.1552*benefits(No)
− 0.5880*family_history(No) − 0.3371*leave(Don't know)
+ 0.4441*leave(Somewhat difficult) − 0.6700*leave(Somewhat easy)
+ 0.6600*leave(Very difficult)
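Given such a fitted logit, the predicted probability of seeking treatment is recovered with the inverse-logit transform. The profile below is a made-up example, assuming reference-cell coding in which the reference levels contribute zero to the logit:

```python
import math

def predicted_probability(logit):
    """Inverse logit: turn ln(p / (1 - p)) back into p."""
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical profile: work_interfere = Often, benefits = Yes,
# family_history = Yes, leave = Very difficult. Only the matching
# indicator terms of the backward-elimination model are switched on.
logit = 0.3871 + 1.0303 + 0.6600
p = predicted_probability(logit)
```

For this profile the model predicts a high probability of seeking treatment (p close to 0.9).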
17
2. Forward Selection
Odds ratios and their interpretation:
IMP_REP_work_interfere (Never vs Sometimes): (0.082 − 1) × 100 = −91.8%. The
odds of 'Never' having work interference are 91.8% lower than for 'Sometimes'.
IMP_REP_work_interfere (Often vs Sometimes): (1.964 − 1) × 100 = +96.4%. The
odds of 'Often' having work interference are 96.4% higher than for
'Sometimes'.
IMP_REP_work_interfere (Rarely vs Sometimes): (1.613 − 1) × 100 = +61.3%. The
odds of 'Rarely' having work interference are 61.3% higher than for
'Sometimes'.
benefits (Don't know vs Yes): (0.348 − 1) × 100 = −65.2%. The odds for
benefits 'Don't know' are 65.2% lower than for benefits 'Yes'.
benefits (No vs Yes): (0.480 − 1) × 100 = −52.0%. The odds for benefits 'No'
are 52.0% lower than for benefits 'Yes'.
family_history (No vs Yes): (0.324 − 1) × 100 = −67.6%. The odds for those
without a family history of mental health problems are 67.6% lower than for
those with one.
leave (Don't know vs Very easy): (0.771 − 1) × 100 = −22.9%. The odds for
leave 'Don't know' are 22.9% lower than for leave 'Very easy'.
leave (Somewhat difficult vs Very easy): (1.548 − 1) × 100 = +54.8%. The odds
for leave 'Somewhat difficult' are 54.8% higher than for leave 'Very easy'.
leave (Somewhat easy vs Very easy): (0.568 − 1) × 100 = −43.2%. The odds for
leave 'Somewhat easy' are 43.2% lower than for leave 'Very easy'.
leave (Very difficult vs Very easy): (2.136 − 1) × 100 = +113.6%. The odds for
leave 'Very difficult' are 113.6% higher than for leave 'Very easy'.
Logistic function:
ln(p / (1 − p)) = 3.8142 − 2.1678*IMP_REP_work_interfere(Never)
+ 1.0134*IMP_REP_work_interfere(Often) + 0.8162*IMP_REP_work_interfere(Rarely)
− 3.0183*REP_Gender(Female) − 3.5612*REP_Gender(Male)
− 0.4597*benefits(Don't know) − 0.1371*benefits(No)
− 0.5642*family_history(No) − 0.3336*leave(Don't know)
+ 0.3627*leave(Somewhat difficult) − 0.6402*leave(Somewhat easy)
+ 0.6850*leave(Very difficult)
3. Stepwise Regression
Odds ratios and their interpretation:
IMP_REP_work_interfere (Never vs Sometimes): (0.079 − 1) × 100 = −92.1%. The
odds of 'Never' having work interference are 92.1% lower than for 'Sometimes'.
IMP_REP_work_interfere (Often vs Sometimes): (2.001 − 1) × 100 = +100.1%. The
odds of 'Often' having work interference are 100.1% higher than for
'Sometimes'.
IMP_REP_work_interfere (Rarely vs Sometimes): (1.638 − 1) × 100 = +63.8%. The
odds of 'Rarely' having work interference are 63.8% higher than for
'Sometimes'.
benefits (Don't know vs Yes): (0.334 − 1) × 100 = −66.6%. The odds for
benefits 'Don't know' are 66.6% lower than for benefits 'Yes'.
Logistic function:
ln(p / (1 − p)) = 0.3871 − 2.1966*IMP_REP_work_interfere(Never)
+ 1.0303*IMP_REP_work_interfere(Often) + 0.8297*IMP_REP_work_interfere(Rarely)
− 0.4708*benefits(Don't know) − 0.1552*benefits(No)
− 0.5880*family_history(No) − 0.3371*leave(Don't know)
+ 0.4441*leave(Somewhat difficult) − 0.6700*leave(Somewhat easy)
+ 0.6600*leave(Very difficult)
Model Selection
There is no underfit model, since the gap in mean squared error between the
train and validation partitions is never negative. There is also no clearly
overfit model, since no selection method shows a markedly larger
validation-train gap than the others.
6.2 Model 2: Decision Tree Model
Confusion Matrix
TRAIN
          Predicted 1  Predicted 0  Total
Actual 1        331           19     350
Actual 0         93          248     341
Total           424          267     691

VALIDATE
          Predicted 1  Predicted 0  Total
Actual 1        272           15     287
Actual 0         86          195     281
Total           358          210     568
i) True Positive Rate (TPR), sensitivity:
Train: TP / (TP + FN) = 331 / (331 + 19) = 0.94571
Conclusion: The model's ability to predict a positive outcome correctly for
train is 0.94571.
Validate: TP / (TP + FN) = 272 / (272 + 15) = 0.94774
Conclusion: The model's ability to predict a positive outcome correctly for
validate is 0.94774.

ii) True Negative Rate (TNR), specificity:
Train: TN / (TN + FP) = 248 / (248 + 93) = 0.72727
Conclusion: The model's ability to predict a negative outcome correctly for
train is 0.72727.
Validate: TN / (TN + FP) = 195 / (195 + 86) = 0.69395
Conclusion: The model's ability to predict a negative outcome correctly for
validate is 0.69395.

iii) Accuracy:
Train: (TP + TN) / (TP + TN + FP + FN) = (331 + 248) / (331 + 248 + 93 + 19)
= 0.83792
Conclusion: The model's ability to predict both positive and negative outcomes
for train is 0.83792.
Validate: (TP + TN) / (TP + TN + FP + FN) = (272 + 195) / (272 + 195 + 86 + 15)
= 0.82218
Conclusion: The model's ability to predict both positive and negative outcomes
for validate is 0.82218.
Model Interpretation
*------------------------------------------------------------*
Node = 4
*------------------------------------------------------------*
if work_interfere <= NA or MISSING
AND Replacement: work_interfere <= NEVER or MISSING
then
Tree Node Identifier = 4
Number of Observations = 151
Predicted: treatment=Yes = 0.01
Predicted: treatment=No = 0.99
*------------------------------------------------------------*
Node = 7
*------------------------------------------------------------*
if benefits IS ONE OF: YES
AND Replacement: work_interfere >= OFTEN
then
Tree Node Identifier = 7
Number of Observations = 178
Predicted: treatment=Yes = 0.89
Predicted: treatment=No = 0.11
*------------------------------------------------------------*
Node = 8
*------------------------------------------------------------*
if work_interfere >= NEVER
AND mental_health_consequence IS ONE OF: NO, YES or MISSING
AND Replacement: work_interfere <= NEVER or MISSING
then
Tree Node Identifier = 8
Number of Observations = 74
Predicted: treatment=Yes = 0.08
Predicted: treatment=No = 0.92
*------------------------------------------------------------*
Node = 9
*------------------------------------------------------------*
if work_interfere >= NEVER
AND mental_health_consequence IS ONE OF: MAYBE
AND Replacement: work_interfere <= NEVER or MISSING
then
Tree Node Identifier = 9
Number of Observations = 42
Predicted: treatment=Yes = 0.26
Predicted: treatment=No = 0.74
*------------------------------------------------------------*
Node = 10
*------------------------------------------------------------*
if family_history IS ONE OF: NO
AND benefits IS ONE OF: NO, DON'T KNOW or MISSING
AND Replacement: work_interfere >= OFTEN
then
Tree Node Identifier = 10
Number of Observations = 122
Predicted: treatment=Yes = 0.60
Predicted: treatment=No = 0.40
*------------------------------------------------------------*
Node = 11
*------------------------------------------------------------*
if family_history IS ONE OF: YES or MISSING
AND benefits IS ONE OF: NO, DON'T KNOW or MISSING
AND Replacement: work_interfere >= OFTEN
then
Tree Node Identifier = 11
Number of Observations = 124
Predicted: treatment=Yes = 0.81
Predicted: treatment=No = 0.19
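The English rules above translate mechanically into code. Here is a hand-rolled sketch of the leaf predictions for nodes 7, 10 and 11, treating the rule text as the spec; the ordinal ordering of work_interfere levels is an assumption, not something the output states:

```python
# Assumed ordinal coding of work_interfere for the ">= OFTEN" comparison.
ORDER = {"NEVER": 0, "RARELY": 1, "SOMETIMES": 2, "OFTEN": 3}

def score(work_interfere, benefits, family_history):
    """P(treatment = Yes) for a few leaves of the tree above.

    Returns None for records falling into leaves not sketched here."""
    often_or_more = ORDER[work_interfere] >= ORDER["OFTEN"]
    if benefits == "YES" and often_or_more:
        return 0.89                                        # Node 7
    if benefits in ("NO", "DON'T KNOW") and often_or_more:
        return 0.60 if family_history == "NO" else 0.81    # Nodes 10 / 11
    return None

p_yes = score("OFTEN", "YES", "NO")
```

This also shows why decision trees are easy to interpret: each leaf is a plain conjunction of conditions with an attached probability.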
6.3 Model 3: Neural Network Model
Confusion Matrix
TRAIN
          Predicted 1  Predicted 0  Total
Actual 1        281           69     350
Actual 0         73          268     341
Total           354          337     691

VALIDATE
          Predicted 1  Predicted 0  Total
Actual 1        224           63     287
Actual 0        101          180     281
Total           325          243     568
i) True Positive Rate (TPR), sensitivity:
Train: TP / (TP + FN) = 281 / (281 + 69) = 0.8029
Conclusion: The model's ability to predict a positive outcome correctly for
train is 0.8029.
Validate: TP / (TP + FN) = 224 / (224 + 63) = 0.7805
Conclusion: The model's ability to predict a positive outcome correctly for
validate is 0.7805.

ii) True Negative Rate (TNR), specificity:
Train: TN / (TN + FP) = 268 / (268 + 73) = 0.7860
Conclusion: The model's ability to predict a negative outcome correctly for
train is 0.7860.
Validate: TN / (TN + FP) = 180 / (180 + 101) = 0.6406
Conclusion: The model's ability to predict a negative outcome correctly for
validate is 0.6406.

iii) Accuracy:
Train: (TP + TN) / (TP + TN + FP + FN) = (281 + 268) / (281 + 268 + 73 + 69)
= 0.7945
Conclusion: The model's ability to predict both positive and negative outcomes
for train is 0.7945.
Validate: (TP + TN) / (TP + TN + FP + FN) = (224 + 180) / (224 + 180 + 101 + 63)
= 0.7113
Conclusion: The model's ability to predict both positive and negative outcomes
for validate is 0.7113.
Based on the output above, a few independent variables are not significant in
the neural network model, namely age, work interfere and leave.
None of the three models is underfit, because none has a positive ROC index
gap. The most overfit of the three is the neural network model, because it has
the largest gap in misclassification rate and average squared error.
The best model is the decision tree, because it has the smallest gaps in
misclassification rate, average squared error and ROC index. The independent
variables that are significant in the decision tree model are work interfere,
mental health consequence, family history and benefits.
7.0 Conclusion
Seeking mental health treatment is very important for people who suffer from
mental health problems. In this study, we identified the most suitable model
among the three candidates: the logistic regression model, the decision tree
model and the neural network model. The best model among the three is the
decision tree, because it has the smallest gaps in misclassification rate,
average squared error and ROC index.
Besides that, we determined which independent variables are significant for
predicting whether employees will seek treatment for a mental health
condition. Based on the decision tree model, we found that work interfere,
mental health consequence, family history and benefits are the significant
variables.
8.0 References
Malaysia Mental Health Association (MMHA). What is Mental Health? Retrieved
from http://mmha.org.my/what-is-mental-health/
Chen, Fan, Zhou, & Li. (November 2011). Neural network structure study in
child mental health disorders intelligent diagnosis system. Retrieved from
https://www.sciencedirect.com/science/article/pii/S1878029611007390?via%3Dihub
Lopes, C. R. S. (2002). Neural networks for the analysis of common mental
disorders factors. Retrieved from
https://www.computer.org/csdl/proceedings/sbrn/2002/1709/00/17090114.pdf
Petterson, Miller, Payne, & Phillips. (June 2014). Mental health treatment in
the primary care setting: patterns and pathways. Retrieved from
https://www.ncbi.nlm.nih.gov/pubmed/24773273
Fikretoglu, Brunet, Guay, & Pedlar. (February 2007). Mental health treatment
seeking by military members with posttraumatic stress disorder: findings on
rates, characteristics, and predictors from a nationally representative
Canadian military sample. Retrieved from
https://www.ncbi.nlm.nih.gov/pubmed/17375866
Dataset: https://www.kaggle.com/diegocalvo/data-mining-of-mental-health/data
9.0 Appendices
Model 2: Decision Tree Model
Model 3: Neural Network Model