UiTM STA555 Project Report Sample

CONTENTS

1.0 Introduction
2.0 Research Background
3.0 Research Objectives
    3.1 Objective 1
    3.2 Objective 2
4.0 Literature Review
    4.1 Introduction
    4.2 Model 1: Logistic Regression Model
    4.3 Model 2: Decision Tree Model
    4.4 Model 3: Neural Network Model
5.0 Methodology
    5.1 Data Collection
    5.2 Data Description
    5.3 Model 1: Logistic Regression Model
    5.4 Model 2: Decision Tree Model
    5.5 Model 3: Neural Network Model
6.0 Results and Discussion
    6.1 Model 1: Logistic Regression Model
    6.2 Model 2: Decision Tree Model
    6.3 Model 3: Neural Network Model
    6.4 Model Comparison and Best Model
7.0 Conclusion
8.0 References
9.0 Appendixes

1.0 Introduction

Mental health is defined as a state of well-being in which every individual realizes their own potential, can cope with the normal stresses of life, can work productively and fruitfully, and is able to make a contribution to their community.

Mental health problems are actually very common. Five of the ten leading causes of disability worldwide are mental health problems. Around 450 million people suffer from mental disorders, and one in four families has at least one member with a mental disorder at any point in time.

Statistics on mental health in Malaysia show that 3 in 10 adults aged 16 years and above have some sort of mental health problem. The prevalence of mental health problems among adults increased from 10.7% in 1996 to 11.2% in 2006 and 29.2% in 2015.

2.0 Research Background

Work is good for mental health but a negative working environment can lead to
physical and mental health problems. There are many risk factors for mental
health that may be present in the working environment. Most risks relate to
interactions between type of work, the organizational environment, the skills of
employees, and the support available for employees to carry out their work.

A healthy workplace can be described as one where workers and managers actively contribute to the working environment by promoting and protecting the health, safety and well-being of all employees. An important element of achieving a healthy workplace is the development of strategies and policies, such as informing staff that mental health support and benefits are available to them. The organization should also provide appropriate mental health training so that employees feel more confident discussing mental health matters with their employers.

For major depression, one of the most common mental health problems, the rate of people who do not seek any treatment is 56 per cent. Several factors influence people not to seek treatment for their mental health problems. The most common reason is a sense of shame about being mentally unfit, because there is a great deal of stigma and discrimination associated with such disorders. Others lack support from the people around them, such as family, friends and co-workers, who are not willing to accept or acknowledge the mental health issue a family member or a friend is suffering from. They prefer living in denial rather than accepting the problem and seeking treatment.

3.0 Research Objectives

This study aims to achieve the following objectives:

3.1 Objective 1

To identify the most suitable model among the three candidate models: the logistic regression model, the decision tree model and the neural network model.

3.2 Objective 2

To determine which independent variables are significant in predicting whether employees will seek treatment for a mental health condition, which is the dependent variable.

4.0 Literature Review

4.1 Introduction

Access to mental health treatment remains a major problem globally, but it is more obvious in developing countries. Although mental health problems are acknowledged as great contributors to the global burden of disease, they receive little attention at global, regional and local levels compared to other illnesses such as communicable diseases. Approximately 1 in 4 adults will experience a mental health problem at some point during their lives. Over the past decade, government policies and funding have been aimed at improving access to mental health treatment. However, barriers to accessing care still remain.

The most important factors that could influence access to mental health treatment among people with mental health problems include the perception of the causes of mental illness. In addition, mental health treatment is scarce for most of the population, resulting in patients and their families using whatever is available and travelling long distances to access services. Efforts to improve access to mental health treatment should be approached holistically, as access is influenced by social, family and health system factors.

4.2 Model 1: Logistic Regression Model

Several previous researchers have used logistic regression models in studies on mental health issues. Posttraumatic stress disorder is one such mental health problem. A 2007 study of mental health treatment seeking by military members with posttraumatic stress disorder found that although a significant portion of military members with the disorder sought mental health treatment, 1 in 3 never did. The results of the logistic regression showed that about two-thirds (62.2%) of military members with posttraumatic stress disorder sought some form of mental health treatment in their lifetime, while a significant portion (35.2%) never sought any form of mental health treatment.

Another study, conducted in 2014 on mental health treatment in the primary care setting, found that 30% of the adult population has a mental health disorder within any 12-month period and that most of them will be diagnosed, treated, and managed in primary care. Of 184,636 patients, 8.1% had poor mental health. Within this group, 49.5% obtained care from only a primary care physician, 5.0% obtained care from only a mental health provider, and just 13.6% received care from both mental health and primary care providers.

Approximately 28.6% of adults with better mental health did not report any mental health treatment visits, compared with 17.7% of adults with poor mental health. The study also found that patients who obtained care solely from primary care providers tended to be female, of lower income, with less schooling, and older than persons who obtained care solely from mental health providers.

4.3 Model 2: Decision Tree Model

Based on a previous study, the MacNeil-Lichtenberg Decision Tree (MLDT) was developed to guide decision making. The MLDT is composed of a cognitive component and an affective component. The objectives of the cognitive component are to identify those patients with a high probability of cognitive impairment, to quickly target specific referral questions, and to reduce the number of unnecessary mental health referrals.

In Study 1 (utility of the cognitive component of the MLDT), data from a sample of 173 inpatients were used to evaluate the cognitive component of the MLDT. In that study, the MLDT was measured with three tests: the Benton Temporal Orientation Test, an animal naming task, and psychosocial considerations. Based on these three tests, if a patient scores in the impaired range on either animal naming or orientation, and reports at least one positive psychosocial indicator (a "yes" response to either indicator 1 or 2; a "no" response to indicator 3), a referral for a complete cognitive assessment is recommended. Study 1 showed that while the MMSE and MDRS were significantly correlated with education, the MLDT cognitive measures were not.

Study 2 examined the utility of the emotional status component of the MLDT. The sensitivity of the GDS was 76% and its specificity was 80%. The positive predictive power, that is, the percentage of individuals classified as depressed who were actually depressed based on the GDS total score, was 57%. The decision tree was designed to help health care professionals quickly triage the need for cognitive assessment and depression assessment in older adults.

4.4 Model 3: Neural Network Model

A study of child mental health disorders using a neural network model was conducted in 2011. The researchers ran many experiments to find a neural network structure better suited to a child mental health intelligent diagnosis system. They concluded that a fully connected mode suits the medical system better and that adding a suitable number of hidden nodes can improve the convergence of the network and reduce its error. However, adding more hidden layers did not always improve network convergence under the experimental conditions.

The study found that the diagnosis and therapy system for child mental health disorders can diagnose 61 kinds of child mental health disorders. This covers more than 95% of child mental health disorders, such as hyperactivity, conduct disorder, tic disorder, depression and anxiety. Moreover, after each diagnosis, the computer gives a suggested treatment method. Comparing the computer's diagnoses with those of senior child psychiatrists, the diagnostic agreement rate is 99%.

Another study, conducted in 2002, analysed common mental disorder factors using neural networks. The aim of the study was to analyse common mental disorder factors using a multilayer perceptron trained with a simulated annealing algorithm. The study found that, using the neural network model, the variables most strongly related to common mental disorders were years of schooling, marital status, sex, working conditions, home ownership, income and age. The variable most associated with common mental disorders was years of schooling, at 89.29%.

5.0 Methodology

5.1 Data Collection

The dataset was taken from kaggle.com. The data were made public, which gives us an interesting opportunity to analyze the attitudes of tech workers from 48 different countries towards mental health. The data are ordered by date, from August 2014 until February 2016. There are 1,260 responses with 26 different variables in the dataset.

5.2 Data Description

The data relate to attitudes towards mental health and the frequency of mental health disorders in the tech workplace. The survey contained questions about how mental health is perceived at tech workplaces by employees and their employers. In this study, 1 dependent variable and 7 independent variables are selected from the dataset.

The dependent variable, or target variable, chosen is treatment, which represents a binary outcome: whether or not the employee seeks treatment for a mental health condition. The independent variables, or input variables, chosen are age, gender, work interfere, family history, benefits, leave and mental health consequence. Age is measured at the interval level, while gender is nominal. Work interfere captures whether employees feel that their mental health condition interferes with their work, if they have a mental health condition; it is measured at the nominal level. Next, family history records the employee's family history of mental illness and is binary. Benefits concerns the mental health benefits provided by the employer. Leave concerns how easy it is for employees to take medical leave for a mental health condition, and mental health consequence captures the employees' opinion on whether discussing a mental health issue with their employer would have negative consequences. Benefits, leave and mental health consequence are all nominal variables.
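
As a minimal sketch of this setup (the file name survey.csv and the exact column names are assumptions based on the public Kaggle export, not the report's working file), the eight variables could be loaded as follows:

```python
# Hypothetical loading step for the survey described in Sections 5.1 and 5.2.
import pandas as pd

df = pd.read_csv("survey.csv")          # public export of the tech-workplace survey

cols = [
    "treatment",                        # dependent variable: sought treatment (Yes/No)
    "Age", "Gender", "work_interfere", "family_history",
    "benefits", "leave", "mental_health_consequence",
]
df = df[cols]

print(df.shape)                         # roughly 1,260 responses in the export
print(df["treatment"].value_counts())   # distribution of the target variable
```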

5.3 Model 1: Logistic Regression Model

Logistic regression predicts the presence or absence of a characteristic or outcome based on the values of a set of predictor variables. It is similar to a linear regression model but is suited to models where the dependent variable is a binary outcome taking one of two values, 1 or 0.

The estimated logistic regression model is

$$\operatorname{logit}(p) = \ln\left(\frac{p}{1-p}\right) = B_0 + B_1 X$$

The corresponding odds of the logistic function are

$$\text{Odds} = \frac{p}{1-p} = e^{B_0 + B_1 X}$$

Logistic regression coefficients can be used to estimate odds ratios for each of
the independent variables in the model.

The goal is to estimate the probability that an event occurs, p. A method called maximum likelihood is used to find the best-fitting coefficients for the logistic regression.
Logistic regression does not rely on distributional assumptions in the same
sense that discriminant analysis does. As with other forms of regression,
multicollinearity among the predictors can lead to biased estimates and inflated
standard errors.

Method selection allows the user to specify how independent variables are entered into the analysis. Three different methods can construct a variety of regression models from the same set of variables: the forward, backward and stepwise selection methods. The significance values in the output are based on fitting a single model, so they are generally invalid when a stepwise method is used. All selected independent variables are added to a single regression model, although different entry methods can be specified for different subsets of variables.
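
As an illustration only (the report's models were fitted in a data-mining tool, so the formula, reference levels and row filtering below are assumptions), such a model could be fitted and its odds ratios obtained as follows, reusing the DataFrame df from the Section 5.2 sketch:

```python
# Illustrative logistic regression fit, not the report's actual estimation run.
import numpy as np
import statsmodels.formula.api as smf

data = df.dropna(subset=["work_interfere", "benefits", "family_history", "leave"]).copy()
data["y"] = (data["treatment"] == "Yes").astype(int)

logit_model = smf.logit(
    "y ~ C(work_interfere) + C(benefits) + C(family_history) + C(leave)",
    data=data,
).fit()

print(logit_model.summary())        # coefficients and p-values for each level
print(np.exp(logit_model.params))   # exp(coefficient) gives the odds ratio per level
```

Forward, backward and stepwise selection are not part of this single call; they correspond to repeatedly adding or removing terms and refitting the model.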

5.4 Model 2: Decision Tree Model

A decision tree is a hierarchical collection of rules that describes how to divide a large collection of records into successively smaller groups of records. With each successive division, the members of the resulting segments become more and more similar to one another with respect to the target.

Decision tree uses the target variable to determine how each input should be
partitioned. In the end, the decision tree breaks the data into segments, defined
by the splitting rules at each step. Taken together, the rules for all the segments
form the decision tree model.

Decision tree repeatedly splits the data set according to a criterion that
maximizes the separation of the data, resulting in a tree-like structure. The most
common criterion is information gain. This means that at each split, the
decrease in entropy due to this split is maximized.
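
A small worked example of that criterion is sketched below; the split and its counts are purely illustrative. Information gain is the entropy of the parent node minus the weighted entropy of its children.

```python
# Illustrative information-gain calculation for a hypothetical binary split.
import numpy as np

def entropy(p_yes: float) -> float:
    """Entropy (in bits) of a node with a given proportion of 'Yes' records."""
    p = np.array([p_yes, 1.0 - p_yes])
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

parent = entropy(350 / 691)                       # 350 Yes and 341 No before splitting

# hypothetical children of sizes 424 and 267 with 331 and 19 Yes records
children = (424 / 691) * entropy(331 / 424) + (267 / 691) * entropy(19 / 267)

print(round(parent - children, 3))                # information gain of this split
```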

The goal is to build a tree that uses the values of the input fields to create rules that result in leaves that do a good job of assigning a target value to each record. The first task is to split the records into children by creating a rule on the input variables. To perform the split, the algorithm considers all possible splits on all input variables. The measure used to evaluate a potential split is the purity of the target variable in the children. The best split is the one that increases purity in the children by the greatest amount, creates nodes of similar size, or at least does not create nodes containing very few records.
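
A minimal sketch of such a tree, assuming one-hot encoded inputs X (a DataFrame) and a binary target y; the split ratio and the tree settings are illustrative rather than the report's exact configuration:

```python
# Illustrative decision tree fit; X, y and the hyperparameters are assumptions.
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.45, stratify=y, random_state=1)

tree = DecisionTreeClassifier(criterion="entropy",   # information-gain style splits
                              min_samples_leaf=30,   # avoid leaves with very few records
                              max_depth=3,
                              random_state=1)
tree.fit(X_train, y_train)

print(tree.score(X_train, y_train))   # training accuracy
print(tree.score(X_valid, y_valid))   # validation accuracy
```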

5.5 Model 3: Neural Network Model

The structure of a typical neural network consists of an input layer, a hidden layer, and an output layer. Data enter the network through the input layer. The hidden layer is composed of artificial neurons, each of which receives multiple inputs from the input layer. The output layer combines the results summarized by the artificial neurons.

A neural network can have any number of hidden layers, but in general one hidden layer is sufficient. The wider the layer, the greater the capacity of the network to recognize patterns; however, a wider network also has a greater capacity to memorize the patterns in the training set, which can result in overfitting.

Neural networks are good for prediction and estimation problems. A good problem has inputs that are well understood, meaning the user has a good idea of which features of the data are important but not necessarily how to combine them, and an output that is well understood, meaning the user knows what they are trying to model.

There are a few keys to using neural networks successfully. The most important is choosing the right training set. Second, the data must be represented in such a way as to maximize the ability of the network to recognize patterns in it. Next, the results produced by the network must be interpreted. Finally, the user should understand some specific details about how neural networks work, such as the network topology and the parameters controlling training.
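
A minimal sketch with a single small hidden layer, in line with the guidance above; it reuses the X_train/X_valid split from the Section 5.4 sketch and assumes the inputs are already scaled and one-hot encoded. The settings are illustrative.

```python
# Illustrative network; the layer size, activation and iteration limit are assumptions.
from sklearn.neural_network import MLPClassifier

net = MLPClassifier(hidden_layer_sizes=(3,),   # one hidden layer, as discussed above
                    activation="tanh",
                    max_iter=2000,
                    random_state=1)
net.fit(X_train, y_train)

print(net.score(X_train, y_train))   # a much higher training score than validation
print(net.score(X_valid, y_valid))   # score would suggest overfitting
```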

6.0 Results and Discussion

6.1 Model 1: Logistic Regression Model

Confusion Matrix

1. Backward Elimination

TRAIN
Predicted
1 0 Total
Actual 1 296 54 350
0 105 236 341
Total 401 290 691

VALIDATE
Predicted
1 0 Total
Actual 1 237 50 287
0 104 177 281
Total 341 227 568

i) True Positive Rate (TPR), sensitivity:

Train:
TP / (TP + FN) = 296 / (296 + 54) = 0.84571

Conclusion: The model's ability to predict a positive outcome correctly for the training set is 0.84571.

Validate:
TP / (TP + FN) = 237 / (237 + 50) = 0.82578

Conclusion: The model's ability to predict a positive outcome correctly for the validation set is 0.82578.

ii) True Negative Rate (TNR), specificity:

Train:
TN / (TN + FP) = 236 / (236 + 105) = 0.69208

Conclusion: The model's ability to predict a negative outcome correctly for the training set is 0.69208.

Validate:
TN / (TN + FP) = 177 / (177 + 104) = 0.62989

Conclusion: The model's ability to predict a negative outcome correctly for the validation set is 0.62989.

iii) Accuracy:

Train:
(TP + TN) / (TP + TN + FP + FN) = (296 + 236) / (296 + 236 + 105 + 54) = 0.76990

Conclusion: The model's ability to predict both positive and negative outcomes correctly for the training set is 0.76990.

Validate:
(TP + TN) / (TP + TN + FP + FN) = (237 + 177) / (237 + 177 + 104 + 50) = 0.72887

Conclusion: The model's ability to predict both positive and negative outcomes correctly for the validation set is 0.72887.
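
The same three rates can be reproduced directly from any of the confusion matrices above; a short sketch using the backward-elimination training counts:

```python
# Counts taken from the backward-elimination TRAIN confusion matrix above.
tp, fn = 296, 54
fp, tn = 105, 236

sensitivity = tp / (tp + fn)                   # true positive rate
specificity = tn / (tn + fp)                   # true negative rate
accuracy = (tp + tn) / (tp + tn + fp + fn)

print(f"{sensitivity:.5f}")   # 0.84571
print(f"{specificity:.5f}")   # 0.69208
print(f"{accuracy:.5f}")      # 0.76990
```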

2. Forward Selection

TRAIN
Predicted
1 0 Total
Actual 1 292 58 350
0 98 243 341
Total 390 301 691

VALIDATE
Predicted
1 0 Total
Actual 1 235 52 287
0 101 180 281
Total 336 232 568

i) True Positive Rate (TPR), sensitivity:

Train:
TP / (TP + FN) = 292 / (292 + 58) = 0.83429

Conclusion: The model's ability to predict a positive outcome correctly for the training set is 0.83429.

Validate:
TP / (TP + FN) = 235 / (235 + 52) = 0.81882

Conclusion: The model's ability to predict a positive outcome correctly for the validation set is 0.81882.

ii) True Negative Rate (TNR), specificity:

Train:
TN / (TN + FP) = 243 / (243 + 98) = 0.71261

Conclusion: The model's ability to predict a negative outcome correctly for the training set is 0.71261.

Validate:
TN / (TN + FP) = 180 / (180 + 101) = 0.64057

Conclusion: The model's ability to predict a negative outcome correctly for the validation set is 0.64057.

iii) Accuracy:

Train:
(TP + TN) / (TP + TN + FP + FN) = (292 + 243) / (292 + 243 + 98 + 58) = 0.77424

Conclusion: The model's ability to predict both positive and negative outcomes correctly for the training set is 0.77424.

Validate:
(TP + TN) / (TP + TN + FP + FN) = (235 + 180) / (235 + 180 + 101 + 52) = 0.73063

Conclusion: The model's ability to predict both positive and negative outcomes correctly for the validation set is 0.73063.

3. Stepwise Regression

TRAIN
Predicted
1 0 Total
Actual 1 296 54 350
0 105 236 341
Total 401 290 691

VALIDATE
Predicted
1 0 Total
Actual 1 237 50 287
0 104 177 281
Total 341 227 568

i) True Positive Rate (TPR), sensitivity:

Train:
TP / (TP + FN) = 296 / (296 + 54) = 0.84571

Conclusion: The model's ability to predict a positive outcome correctly for the training set is 0.84571.

Validate:
TP / (TP + FN) = 237 / (237 + 50) = 0.82578

Conclusion: The model's ability to predict a positive outcome correctly for the validation set is 0.82578.

ii) True Negative Rate (TNR), specificity:

Train:
TN / (TN + FP) = 236 / (236 + 105) = 0.69208

Conclusion: The model's ability to predict a negative outcome correctly for the training set is 0.69208.

Validate:
TN / (TN + FP) = 177 / (177 + 104) = 0.62989

Conclusion: The model's ability to predict a negative outcome correctly for the validation set is 0.62989.

iii) Accuracy:

Train:
(TP + TN) / (TP + TN + FP + FN) = (296 + 236) / (296 + 236 + 105 + 54) = 0.76990

Conclusion: The model's ability to predict both positive and negative outcomes correctly for the training set is 0.76990.

Validate:
(TP + TN) / (TP + TN + FP + FN) = (237 + 177) / (237 + 177 + 104 + 50) = 0.72887

Conclusion: The model's ability to predict both positive and negative outcomes correctly for the validation set is 0.72887.

Model Interpretation

1. Backward Elimination

From the output above, the variables that are significant (p-value < 0.05) are IMP_REP_work_interfere, benefits, family_history and leave.

IMP_REP_work_interfere (Never vs Sometimes): (0.079 - 1) * 100% = -92.1%. The odds of seeking treatment are 92.1% lower for employees who never experience work interference than for those who sometimes do.
IMP_REP_work_interfere (Often vs Sometimes): (2.001 - 1) * 100% = +100.1%. The odds of seeking treatment are 100.1% higher for employees who often experience work interference than for those who sometimes do.
IMP_REP_work_interfere (Rarely vs Sometimes): (1.638 - 1) * 100% = +63.8%. The odds of seeking treatment are 63.8% higher for employees who rarely experience work interference than for those who sometimes do.
benefits (Don't know vs Yes): (0.334 - 1) * 100% = -66.6%. The odds of seeking treatment are 66.6% lower for benefits 'Don't know' than for benefits 'Yes'.
benefits (No vs Yes): (0.458 - 1) * 100% = -54.2%. The odds of seeking treatment are 54.2% lower for benefits 'No' than for benefits 'Yes'.
family_history (No vs Yes): (0.308 - 1) * 100% = -69.2%. The odds of seeking treatment are 69.2% lower for employees with no family history of mental illness than for those with a family history.
leave (Don't know vs Very easy): (0.787 - 1) * 100% = -21.3%. The odds of seeking treatment are 21.3% lower for leave 'Don't know' than for leave 'Very easy'.
leave (Somewhat difficult vs Very easy): (1.718 - 1) * 100% = +71.8%. The odds of seeking treatment are 71.8% higher for leave 'Somewhat difficult' than for leave 'Very easy'.
leave (Somewhat easy vs Very easy): (0.564 - 1) * 100% = -43.6%. The odds of seeking treatment are 43.6% lower for leave 'Somewhat easy' than for leave 'Very easy'.
leave (Very difficult vs Very easy): (2.132 - 1) * 100% = +113.2%. The odds of seeking treatment are 113.2% higher for leave 'Very difficult' than for leave 'Very easy'.

Logistic function:

ln(p / (1 - p)) = 0.3871 - 2.1966*IMP_REP_work_interfere(Never) + 1.0303*IMP_REP_work_interfere(Often) + 0.8297*IMP_REP_work_interfere(Rarely) - 0.4708*benefits(Don't know) - 0.1552*benefits(No) - 0.5880*family_history(No) - 0.3371*leave(Don't know) + 0.4441*leave(Somewhat difficult) - 0.6700*leave(Somewhat easy) + 0.6600*leave(Very difficult)
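
As a worked example (the employee profile is hypothetical, and the coding of reference levels follows the comparisons listed in the table above), the fitted function can be converted into a predicted probability of seeking treatment:

```python
# Hypothetical employee: work_interfere = Often, benefits = Yes,
# family_history = Yes, leave = Very difficult. The reference levels
# (Sometimes, Yes, Yes, Very easy) contribute 0 to the linear predictor.
import math

linear_predictor = (0.3871      # intercept
                    + 1.0303    # IMP_REP_work_interfere(Often)
                    + 0.6600)   # leave(Very difficult)

p = 1 / (1 + math.exp(-linear_predictor))
print(round(p, 3))              # estimated probability of seeking treatment
```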

2. Forward Selection

Based on the output above, the significant variables are IMP_REP_work_interfere, REP_Gender, benefits, family_history and leave.

IMP_REP_work_interfere (Never vs Sometimes): (0.082 - 1) * 100% = -91.8%. The odds of seeking treatment are 91.8% lower for employees who never experience work interference than for those who sometimes do.
IMP_REP_work_interfere (Often vs Sometimes): (1.964 - 1) * 100% = +96.4%. The odds of seeking treatment are 96.4% higher for employees who often experience work interference than for those who sometimes do.
IMP_REP_work_interfere (Rarely vs Sometimes): (1.613 - 1) * 100% = +61.3%. The odds of seeking treatment are 61.3% higher for employees who rarely experience work interference than for those who sometimes do.
benefits (Don't know vs Yes): (0.348 - 1) * 100% = -65.2%. The odds of seeking treatment are 65.2% lower for benefits 'Don't know' than for benefits 'Yes'.
benefits (No vs Yes): (0.480 - 1) * 100% = -52.0%. The odds of seeking treatment are 52% lower for benefits 'No' than for benefits 'Yes'.
family_history (No vs Yes): (0.324 - 1) * 100% = -67.6%. The odds of seeking treatment are 67.6% lower for employees with no family history of mental illness than for those with a family history.
leave (Don't know vs Very easy): (0.771 - 1) * 100% = -22.9%. The odds of seeking treatment are 22.9% lower for leave 'Don't know' than for leave 'Very easy'.
leave (Somewhat difficult vs Very easy): (1.548 - 1) * 100% = +54.8%. The odds of seeking treatment are 54.8% higher for leave 'Somewhat difficult' than for leave 'Very easy'.
leave (Somewhat easy vs Very easy): (0.568 - 1) * 100% = -43.2%. The odds of seeking treatment are 43.2% lower for leave 'Somewhat easy' than for leave 'Very easy'.
leave (Very difficult vs Very easy): (2.136 - 1) * 100% = +113.6%. The odds of seeking treatment are 113.6% higher for leave 'Very difficult' than for leave 'Very easy'.

Logistic function:

ln(p / (1 - p)) = 3.8142 - 2.1678*IMP_REP_work_interfere(Never) + 1.0134*IMP_REP_work_interfere(Often) + 0.8162*IMP_REP_work_interfere(Rarely) - 3.0183*REP_Gender(Female) - 3.5612*REP_Gender(Male) - 0.4597*benefits(Don't know) - 0.1371*benefits(No) - 0.5642*family_history(No) - 0.3336*leave(Don't know) + 0.3627*leave(Somewhat difficult) - 0.6402*leave(Somewhat easy) + 0.6850*leave(Very difficult)

3. Stepwise Regression

Based on the output above, the significant variables are IMP_REP_work_interfere, benefits, family_history and leave.

IMP_REP_work_interfere (Never vs Sometimes): (0.079 - 1) * 100% = -92.1%. The odds of seeking treatment are 92.1% lower for employees who never experience work interference than for those who sometimes do.
IMP_REP_work_interfere (Often vs Sometimes): (2.001 - 1) * 100% = +100.1%. The odds of seeking treatment are 100.1% higher for employees who often experience work interference than for those who sometimes do.
IMP_REP_work_interfere (Rarely vs Sometimes): (1.638 - 1) * 100% = +63.8%. The odds of seeking treatment are 63.8% higher for employees who rarely experience work interference than for those who sometimes do.
benefits (Don't know vs Yes): (0.334 - 1) * 100% = -66.6%. The odds of seeking treatment are 66.6% lower for benefits 'Don't know' than for benefits 'Yes'.
benefits (No vs Yes): (0.458 - 1) * 100% = -54.2%. The odds of seeking treatment are 54.2% lower for benefits 'No' than for benefits 'Yes'.
family_history (No vs Yes): (0.308 - 1) * 100% = -69.2%. The odds of seeking treatment are 69.2% lower for employees with no family history of mental illness than for those with a family history.
leave (Don't know vs Very easy): (0.787 - 1) * 100% = -21.3%. The odds of seeking treatment are 21.3% lower for leave 'Don't know' than for leave 'Very easy'.
leave (Somewhat difficult vs Very easy): (1.718 - 1) * 100% = +71.8%. The odds of seeking treatment are 71.8% higher for leave 'Somewhat difficult' than for leave 'Very easy'.
leave (Somewhat easy vs Very easy): (0.564 - 1) * 100% = -43.6%. The odds of seeking treatment are 43.6% lower for leave 'Somewhat easy' than for leave 'Very easy'.
leave (Very difficult vs Very easy): (2.132 - 1) * 100% = +113.2%. The odds of seeking treatment are 113.2% higher for leave 'Very difficult' than for leave 'Very easy'.

Logistic function:

ln(p / (1 - p)) = 0.3871 - 2.1966*IMP_REP_work_interfere(Never) + 1.0303*IMP_REP_work_interfere(Often) + 0.8297*IMP_REP_work_interfere(Rarely) - 0.4708*benefits(Don't know) - 0.1552*benefits(No) - 0.5880*family_history(No) - 0.3371*leave(Don't know) + 0.4441*leave(Somewhat difficult) - 0.6700*leave(Somewhat easy) + 0.6600*leave(Very difficult)

Model Selection

Model                  Misclassification Rate            Mean Square Error                 ROC Index
                       Valid     Train     Gap           Valid     Train     Gap           Valid    Train    Gap
Backward Elimination   0.27113   0.23010   0.04103       0.17872   0.16018   0.01854       0.805    0.843    -0.038
Forward Selection      0.26937   0.22576   0.04361       0.17636   0.15851   0.01785       0.810    0.849    -0.039
Stepwise Regression    0.27113   0.23010   0.04103       0.17872   0.16018   0.01854       0.805    0.843    -0.038

The best model among Backward Elimination, Forward Selection and Stepwise Regression is Forward Selection. It has the smallest gap in mean square error between the validation and training sets, and it also achieves the best validation performance on all three measures: the lowest validation misclassification rate, the lowest validation mean square error and the highest validation ROC index.

None of the three methods is underfitted, since no model has a negative gap in mean square error between the validation and training sets, and none of them stands out as overfitted, since the gaps between the validation and training results are all small and of similar size.
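
A small sketch of the gap comparison used in this table; the figures are copied from the table above, and each gap is simply the validation value minus the training value:

```python
# (train, valid) pairs copied from the model selection table above.
results = {
    "Backward Elimination": {"misclassification": (0.23010, 0.27113),
                             "mean square error": (0.16018, 0.17872),
                             "ROC index":         (0.843, 0.805)},
    "Forward Selection":    {"misclassification": (0.22576, 0.26937),
                             "mean square error": (0.15851, 0.17636),
                             "ROC index":         (0.849, 0.810)},
    "Stepwise Regression":  {"misclassification": (0.23010, 0.27113),
                             "mean square error": (0.16018, 0.17872),
                             "ROC index":         (0.843, 0.805)},
}

for model, metrics in results.items():
    gaps = {name: round(valid - train, 5) for name, (train, valid) in metrics.items()}
    print(model, gaps)   # small train-validation gaps suggest a stable fit
```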

6.2 Model 2: Decision Tree Model

Confusion Matrix

TRAIN
Predicted
Actual 1 0 Total
1 331 19 350
0 93 248 341
Total 424 267 691

VALIDATE
Predicted
Actual 1 0 Total
1 272 15 287
0 86 195 281
Total 358 210 568

i) True Positive Rate (TPR), sensitivity:

Train:
TP / (TP + FN) = 331 / (331 + 19) = 0.94571

Conclusion: The model's ability to predict a positive outcome correctly for the training set is 0.94571.

Validate:
TP / (TP + FN) = 272 / (272 + 15) = 0.94774

Conclusion: The model's ability to predict a positive outcome correctly for the validation set is 0.94774.

ii) True Negative Rate (TNR), specificity:

Train:
TN / (TN + FP) = 248 / (248 + 93) = 0.7273

Conclusion: The model's ability to predict a negative outcome correctly for the training set is 0.7273.

Validate:
TN / (TN + FP) = 195 / (195 + 86) = 0.69395

Conclusion: The model's ability to predict a negative outcome correctly for the validation set is 0.69395.

iii) Accuracy:

Train:
(TP + TN) / (TP + TN + FP + FN) = (331 + 248) / (331 + 248 + 93 + 19) = 0.8379

Conclusion: The model's ability to predict both positive and negative outcomes correctly for the training set is 0.8379.

Validate:
(TP + TN) / (TP + TN + FP + FN) = (272 + 195) / (272 + 195 + 86 + 15) = 0.82218

Conclusion: The model's ability to predict both positive and negative outcomes correctly for the validation set is 0.82218.

Model Interpretation

The most important variable is Replacement: work interfere.

There are 6 rules, represented by the number of leaves, and the depth of the tree is 2.

Nine variables are ranked by the value of the importance column. Five of them are used as a splitting variable only once; the rest are not used as splitting variables at all.

*------------------------------------------------------------*
Node = 4
*------------------------------------------------------------*
if work_interfere <= NA or MISSING
AND Replacement: work_interfere <= NEVER or MISSING
then
Tree Node Identifier = 4
Number of Observations = 151
Predicted: treatment=Yes = 0.01
Predicted: treatment=No = 0.99

*------------------------------------------------------------*
Node = 7
*------------------------------------------------------------*
if benefits IS ONE OF: YES
AND Replacement: work_interfere >= OFTEN
then
Tree Node Identifier = 7
Number of Observations = 178
Predicted: treatment=Yes = 0.89
Predicted: treatment=No = 0.11

*------------------------------------------------------------*
Node = 8
*------------------------------------------------------------*
if work_interfere >= NEVER
AND mental_health_consequence IS ONE OF: NO, YES or MISSING
AND Replacement: work_interfere <= NEVER or MISSING
then
Tree Node Identifier = 8
Number of Observations = 74
Predicted: treatment=Yes = 0.08
Predicted: treatment=No = 0.92

*------------------------------------------------------------*
Node = 9
*------------------------------------------------------------*
if work_interfere >= NEVER
AND mental_health_consequence IS ONE OF: MAYBE
AND Replacement: work_interfere <= NEVER or MISSING
then
Tree Node Identifier = 9
Number of Observations = 42
Predicted: treatment=Yes = 0.26
Predicted: treatment=No = 0.74

*------------------------------------------------------------*
Node = 10
*------------------------------------------------------------*
if family_history IS ONE OF: NO
AND benefits IS ONE OF: NO, DON'T KNOW or MISSING
AND Replacement: work_interfere >= OFTEN
then
Tree Node Identifier = 10
Number of Observations = 122
Predicted: treatment=Yes = 0.60
Predicted: treatment=No = 0.40
*------------------------------------------------------------*
Node = 11
*------------------------------------------------------------*
if family_history IS ONE OF: YES or MISSING
AND benefits IS ONE OF: NO, DON'T KNOW or MISSING
AND Replacement: work_interfere >= OFTEN
then
Tree Node Identifier = 11
Number of Observations = 124
Predicted: treatment=Yes = 0.81
Predicted: treatment=No = 0.19

There are 3 profiles for treatment = Yes


if benefits IS ONE OF: YES
AND Replacement: work_interfere >= OFTEN

if family_history IS ONE OF: NO


AND benefits IS ONE OF: NO, DON'T KNOW or MISSING
AND Replacement: work_interfere >= OFTEN

if family_history IS ONE OF: YES or MISSING


AND benefits IS ONE OF: NO, DON'T KNOW or MISSING
AND Replacement: work_interfere >= OFTEN

There are 3 profiles for treatment = No

if work_interfere <= NA or MISSING
AND Replacement: work_interfere <= NEVER or MISSING

if work_interfere >= NEVER


AND mental_health_consequence IS ONE OF: NO, YES or MISSING
AND Replacement: work_interfere <= NEVER or MISSING

if work_interfere >= NEVER


AND mental_health_consequence IS ONE OF: MAYBE
AND Replacement: work_interfere <= NEVER or MISSING
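
The rules above come from the data-mining tool's own output; as a hedged sketch, similar leaf rules and a variable importance ranking could be listed from the scikit-learn tree fitted in the Section 5.4 sketch:

```python
# Assumes the fitted `tree` and the DataFrame `X_train` from the Section 5.4 sketch.
from sklearn.tree import export_text

rules = export_text(tree, feature_names=list(X_train.columns), show_weights=True)
print(rules)   # one if/then path per leaf, analogous to the node rules above

# variables ranked by importance, analogous to the importance column mentioned earlier
ranking = sorted(zip(X_train.columns, tree.feature_importances_),
                 key=lambda pair: pair[1], reverse=True)
for name, score in ranking:
    print(f"{name:35s}{score:.3f}")
```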

6.3 Model 3: Neural Network Model

Confusion Matrix

TRAIN
Predicted
Actual 1 0 Total
1 281 69 350
0 73 268 341
Total 354 337 691

VALIDATE
Predicted
Actual 1 0 Total
1 224 63 287
0 101 180 281
Total 325 243 568

i) True Positive Rate (TPR), sensitivity:

Train:
TP / (TP + FN) = 281 / (281 + 69) = 0.8029

Conclusion: The model's ability to predict a positive outcome correctly for the training set is 0.8029.

Validate:
TP / (TP + FN) = 224 / (224 + 63) = 0.7805

Conclusion: The model's ability to predict a positive outcome correctly for the validation set is 0.7805.

ii) True Negative Rate (TNR), specificity:

Train:
TN / (TN + FP) = 268 / (268 + 73) = 0.7860

Conclusion: The model's ability to predict a negative outcome correctly for the training set is 0.7860.

Validate:
TN / (TN + FP) = 180 / (180 + 101) = 0.6406

Conclusion: The model's ability to predict a negative outcome correctly for the validation set is 0.6406.

iii) Accuracy:

Train:
(TP + TN) / (TP + TN + FP + FN) = (281 + 268) / (281 + 268 + 73 + 69) = 0.7945

Conclusion: The model's ability to predict both positive and negative outcomes correctly for the training set is 0.7945.

Validate:
(TP + TN) / (TP + TN + FP + FN) = (224 + 180) / (224 + 180 + 101 + 63) = 0.7113

Conclusion: The model's ability to predict both positive and negative outcomes correctly for the validation set is 0.7113.

Based on the output above, a few independent variables are not significant in the neural network model: age, work interfere and leave.

6.4 Model Comparison and Best Model

Model                 Misclassification Rate            Average Square Error              ROC Index
                      Valid     Train     Gap           Valid     Train     Gap           Valid    Train    Gap
Logistic Regression   0.26937   0.22576   0.04361       0.17872   0.16018   0.01854       0.810    0.849    -0.039
Decision Tree         0.17782   0.16208   0.01574       0.12799   0.11872   0.00927       0.879    0.896    -0.017
Neural Network        0.28873   0.20550   0.08323       0.19317   0.13547   0.05770       0.785    0.886    -0.101

No model is underfitted, because none of the three models has a positive ROC index gap. The model that overfits the most is the neural network model, because it has the largest gaps in misclassification rate and average squared error.

The best model is the decision tree, because it has the smallest gaps in misclassification rate, average squared error and ROC index. The independent variables that are significant in the decision tree model are work interfere, mental health consequence, family history and benefits.

7.0 Conclusion

Seeking mental health treatment is very important for people who suffer from mental health problems. In this study, we were able to identify the most suitable model among the three candidates: the logistic regression model, the decision tree model and the neural network model. The best model among the three is the decision tree, because it has the smallest gaps in misclassification rate, average squared error and ROC index between the training and validation sets.

Besides that, we were able to determine which independent variables are significant in predicting whether employees will seek treatment for a mental health condition. Based on the decision tree model, we found that work interfere, mental health consequence, family history and benefits are the variables that are significant to the model.

8.0 References

Malaysia Mental Health Association (MMHA). What is mental health? Retrieved from http://mmha.org.my/what-is-mental-health/

World Health Organization. (2017, September). Mental health in the workplace. Retrieved from http://www.who.int/mental_health/in_the_workplace/en/

Harvey, S. B. (2014, November). Developing a mentally healthy workplace: A review of the literature. Retrieved from https://www.headsup.org.au/docs/default-source/resources/developing-a-mentally-healthy-workplace_final-november-2014.pdf?sfvrsn=8

Chen, Fan, Zhou, & Li. (2011, November). Neural network structure study in child mental health disorders intelligent diagnosis system. Retrieved from https://www.sciencedirect.com/science/article/pii/S1878029611007390?via%3Dihub

Lopes, C. R. S. (2002). Neural networks for the analysis of common mental disorders factors. Retrieved from https://www.computer.org/csdl/proceedings/sbrn/2002/1709/00/17090114.pdf

Petterson, Miller, Payne, & Phillips. (2014, June). Mental health treatment in the primary care setting: Patterns and pathways. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/24773273

Fikretoglu, Brunet, Guay, & Pedlar. (2007, February). Mental health treatment seeking by military members with posttraumatic stress disorder: Findings on rates, characteristics, and predictors from a nationally representative Canadian military sample. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/17375866

MacNeill, S. E., & Lichtenberg, P. A. (2000). The MacNeill-Lichtenberg Decision Tree: A unique method of triaging mental health problems in older medical rehabilitation patients. Arch Phys Med Rehabil, 81, 618-622.

Kaggle. Data mining of mental health (dataset). Retrieved from https://www.kaggle.com/diegocalvo/data-mining-of-mental-health/data

9.0 Appendixes

Model 1: Logistic Regression Model

Model 2: Decision Tree Model

Model 3: Neural Network Model
