UiTM STA555 Project Report Sample
1.0 Introduction
5.0 Methodology
5.1 Data Collection
5.2 Data Description
5.3 Model 1: Logistic Regression Model
5.4 Model 2: Decision Tree Model
5.5 Model 3: Neural Network Model
7.0 Conclusion
8.0 References
9.0 Appendices
1.0 Introduction
Mental health problems are very common: five of the ten leading causes
of disability worldwide are mental health problems. Around 450 million people
suffer from mental disorders, and at any point in time one in four families has
at least one member with a mental disorder.
Work is good for mental health but a negative working environment can lead to
physical and mental health problems. There are many risk factors for mental
health that may be present in the working environment. Most risks relate to
interactions between type of work, the organizational environment, the skills of
employees, and the support available for employees to carry out their work.
For major depression, one of the most common mental health problems, around
56 per cent of sufferers never seek any treatment. Several factors influence
people not to seek treatment for their mental health problems. The most common
reason is a sense of shame in being mentally unfit, because there is a great
deal of stigma and discrimination associated with such disorders. In addition,
some people lack support from those around them, such as family, friends and
co-workers, who are unwilling to accept or acknowledge the mental health issue
a family member or a friend is suffering from. They prefer living in denial
rather than accepting the condition and seeking treatment.
3.0 Research Objectives
3.1 Objective 1
To identify the most suitable model among three candidates: the logistic
regression model, the decision tree model and the neural network model.
3.2 Objective 2
4.0 Literature Review
4.1 Introduction
Access to mental health treatment remains a major problem globally, and is
even more pronounced in developing countries. Although mental health problems
are acknowledged as major contributors to the global burden of disease, they
receive little attention at global, regional and local levels compared with
other illnesses such as communicable diseases. Approximately one in four
adults will experience a mental health problem at some point during their
lives. Over the past decade, government policies and funding have been aimed
at improving access to mental health treatment. However, barriers to accessing
care remain.
Among the most important factors influencing access to mental health treatment
is the perception of the causes of mental illness. In addition, mental health
treatment is scarce for most of the population, forcing patients and their
families to use whatever is available and to travel long distances to access
services. Efforts to improve access to mental health treatment should
therefore be approached holistically, as access is influenced by social,
family and health system factors.
Several previous studies have used logistic regression models to investigate
mental health issues. Posttraumatic stress disorder is one such problem. A
2007 study of mental health treatment seeking by military members with
posttraumatic stress disorder found that although most eventually seek
treatment, about one in three never do. The logistic regression results showed
that about two-thirds (62.2%) of military members with posttraumatic stress
disorder sought some form of mental health treatment in their lifetime, while
a significant portion (35.2%) never sought any form of mental health
treatment.
Another study, in 2014, on mental health treatment in the primary care setting
found that 30% of the adult population has a mental health disorder within any
12-month period, and that most of these disorders are diagnosed, treated and
managed in primary care. Of 184,636 patients, 8.1% had poor mental health.
Within this group, 49.5% obtained care only from a primary care physician,
5.0% only from a mental health provider, and just 13.6% from both mental
health and primary care providers.
Approximately 28.6% of adults with better mental health did not report any
mental health treatment visits, compared with 17.7% of adults with poor mental
health. The study also found that patients who obtained care solely from
primary care providers tended to be female, of lower income, with less
schooling, and older than those who obtained care solely from mental health
providers.
A study using a neural network model for child mental health disorders was
conducted in 2011. The researchers ran many experiments to find a network
structure suited to a child mental health intelligent diagnosis system. They
concluded that a fully connected topology suits the medical system better, and
that adding a suitable number of hidden nodes can improve convergence and
reduce network error, although adding more hidden layers does not always
improve convergence under the experimental conditions.
The study found that the diagnosis and therapy system can diagnose 61 kinds of
child mental health disorders, covering more than 95% of child mental health
disorders such as hyperactivity, conduct disorder, tic disorder, depression
and anxiety. Moreover, after each diagnosis the computer suggests a treatment
method. Compared with the diagnoses of senior child psychiatrists, the
computer's diagnoses were consistent 99% of the time.
Another study, in 2002, analyzed the factors behind common mental disorders
using neural networks, specifically a multilayer perceptron trained with a
simulated annealing algorithm. The study found that the variables most
strongly related to common mental disorders were years of schooling, marital
status, sex, working conditions, home ownership, income and age. The variable
most associated with common mental disorders was years of schooling, at
89.29%.
5.0 Methodology
5.1 Data Collection
The dataset was taken from kaggle.com. The data was made public, which gives
us an interesting opportunity to analyze the attitudes of tech workers from
48 different countries towards mental health. The data is ordered by date,
from August 2014 until February 2016, and there are 1260 responses with 26
different variables in the dataset.
5.2 Data Description
The data relates to attitudes towards mental health and the frequency of
mental health disorders in the tech workplace. The survey asked how mental
health is perceived at tech workplaces by employees and their employers. In
this study, one dependent variable and seven independent variables were
selected from the dataset.
5.3 Model 1: Logistic Regression Model
The logistic regression model relates the probability p that an event occurs
to the predictors through the logit:

logit = ln(p / (1 − p)) = B0 + B1X

while the odds of the logistic function are

Odds = p / (1 − p) = e^(B0 + B1X)
Logistic regression coefficients can be used to estimate odds ratios for each
of the independent variables in the model. The goal is to estimate the
probability that an event occurs, p. A method called maximum likelihood is
used to find the best-fit line for logistic regression. Logistic regression
does not rely on distributional assumptions in the same sense that
discriminant analysis does. As with other forms of regression, however,
multicollinearity among the predictors can lead to biased estimates and
inflated standard errors.
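The relationship between the logit, the odds and the probability can be sketched in Python; the coefficient values here are arbitrary placeholders for illustration, not the fitted model:

```python
import math

def logit_to_probability(b0, b1, x):
    """Convert a linear logit B0 + B1*X into odds and a probability."""
    logit = b0 + b1 * x     # log-odds: ln(p / (1 - p))
    odds = math.exp(logit)  # odds: p / (1 - p) = e^(B0 + B1*X)
    p = odds / (1 + odds)   # invert: p = odds / (1 + odds)
    return odds, p

# Hypothetical coefficients, for illustration only.
odds, p = logit_to_probability(b0=-1.0, b1=0.8, x=2.0)
```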
Method selection allows the user to specify how independent variables are
entered into the analysis. Three different methods can construct a variety of
regression models from the same set of variables: forward, backward and
stepwise selection. The significance values in the output are based on fitting
a single model, so they are generally invalid when a stepwise method is used.
By default, all selected independent variables are added to a single
regression model, but different entry methods can be specified for different
subsets of variables.
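The selection procedures above share one greedy skeleton; here is a schematic sketch of the forward variant, with the model-refitting step abstracted into a `score` function (the toy criterion and variable names are made up for illustration):

```python
def forward_selection(candidates, score):
    """Greedy forward selection: repeatedly add the candidate variable
    that most improves the model score, until no addition helps.

    `score` maps a list of variable names to a fit criterion (higher is
    better); it stands in for refitting the regression each time."""
    selected = []
    best = score(selected)
    improved = True
    while improved:
        improved = False
        for var in candidates:
            if var in selected:
                continue
            trial = score(selected + [var])
            if trial > best:
                best, chosen, improved = trial, var, True
        if improved:
            selected.append(chosen)
    return selected

# Toy criterion: reward the (hypothetical) useful variables "a" and "b",
# with a small penalty per variable entered.
toy_score = lambda vars_: len(set(vars_) & {"a", "b"}) - 0.1 * len(vars_)
chosen_vars = forward_selection(["a", "b", "c"], toy_score)
```

Backward elimination runs the same loop in reverse, starting from the full model and dropping the variable whose removal helps most.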
5.4 Model 2: Decision Tree Model
A decision tree uses the target variable to determine how each input should be
partitioned. In the end, the decision tree breaks the data into segments
defined by the splitting rules at each step; taken together, the rules for all
the segments form the decision tree model.
The algorithm repeatedly splits the data set according to a criterion that
maximizes the separation of the data, resulting in a tree-like structure. The
most common criterion is information gain, meaning that at each split the
decrease in entropy due to that split is maximized.
The goal is to build a tree that uses the values of the input fields to create
rules resulting in leaves that do a good job of assigning a target value to
each record. The first task is to split the records into children by creating
a rule on the input variables. To perform the split, the algorithm considers
all possible splits on all input variables. The measure used to evaluate a
potential split is the purity of the target variable in the children. The best
split is the one that increases purity in the children by the greatest amount
while creating nodes of similar size, or at least not creating nodes that
contain very few records.
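As a minimal sketch of the information-gain criterion described above (the class counts are made up for illustration):

```python
import math

def entropy(counts):
    """Shannon entropy of a class-count distribution, in bits."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent, children):
    """Decrease in entropy from splitting `parent` into `children`.

    `parent` and each child are lists of class counts."""
    n = sum(parent)
    weighted = sum(sum(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted

# Hypothetical split: 10 positives and 10 negatives in the parent,
# separated into a mostly-positive and a mostly-negative child.
gain = information_gain([10, 10], [[9, 2], [1, 8]])
```

A pure parent (all one class) has entropy 0, so splits that produce purer children yield higher gain.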
5.5 Model 3: Neural Network Model
A neural network can have any number of hidden layers, but in general one
hidden layer is sufficient. The wider the layer, the greater the capacity of
the network to recognize patterns; however, too much capacity lets the network
memorize the patterns in the training set, which results in overfitting.
Neural networks are good for prediction and estimation problems. A good
problem has inputs that are well understood, meaning the user has a good idea
of which features of the data are important, but not necessarily how to
combine them. The output should also be well understood: the user should know
what they are trying to model.
There are a few keys to using neural networks successfully. The most important
is choosing the right training set. Second, the data must be represented in
such a way as to maximize the ability of the network to recognize patterns in
it. Next, the results produced by the network must be interpreted. Finally,
the user should understand some specific details of how networks operate, such
as the network topology and the parameters controlling training.
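A one-hidden-layer network of the kind described above computes its output roughly as follows; all weights here are arbitrary illustrative values, not the trained model:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, hidden_weights, output_weights):
    """Forward pass of a network with one hidden layer.

    hidden_weights: one weight list per hidden node (bias last).
    output_weights: weights for the single output node (bias last)."""
    hidden = [sigmoid(sum(w * x for w, x in zip(ws[:-1], inputs)) + ws[-1])
              for ws in hidden_weights]
    return sigmoid(sum(w * h for w, h in zip(output_weights[:-1], hidden))
                   + output_weights[-1])

# Two inputs, two hidden nodes, one output; weights are made up.
p = forward([1.0, 0.5],
            hidden_weights=[[0.4, -0.6, 0.1], [0.3, 0.8, -0.2]],
            output_weights=[1.2, -0.7, 0.05])
```

Training consists of adjusting those weights to reduce prediction error on the training set.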
6.0 Results and Discussion
6.1 Model 1: Logistic Regression Model
Confusion Matrix
1. Backward Elimination

TRAIN
          Predicted 1  Predicted 0  Total
Actual 1        296           54     350
Actual 0        105          236     341
Total           401          290     691

VALIDATE
          Predicted 1  Predicted 0  Total
Actual 1        237           50     287
Actual 0        104          177     281
Total           341          227     568
i) True Positive Rate (TPR), sensitivity:
Train: TP / (TP + FN) = 296 / (296 + 54) = 0.84571
Conclusion: The model's ability to predict a positive outcome correctly for
train is 0.84571.
Validate: TP / (TP + FN) = 237 / (237 + 50) = 0.82578
Conclusion: The model's ability to predict a positive outcome correctly for
validate is 0.82578.

ii) True Negative Rate (TNR), specificity:
Train: TN / (TN + FP) = 236 / (236 + 105) = 0.69208
Conclusion: The model's ability to predict a negative outcome correctly for
train is 0.69208.
Validate: TN / (TN + FP) = 177 / (177 + 104) = 0.62989
Conclusion: The model's ability to predict a negative outcome correctly for
validate is 0.62989.

iii) Accuracy:
Train: (TP + TN) / (TP + TN + FP + FN) = (296 + 236) / (296 + 236 + 105 + 54)
= 0.76990
Conclusion: The model's ability to predict both positive and negative outcomes
for train is 0.76990.
Validate: (TP + TN) / (TP + TN + FP + FN) = (237 + 177) / (237 + 177 + 104 + 50)
= 0.72887
Conclusion: The model's ability to predict both positive and negative outcomes
for validate is 0.72887.
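The three metrics above follow directly from the confusion-matrix counts; using the backward-elimination TRAIN counts as input:

```python
def confusion_metrics(tp, fn, fp, tn):
    """Sensitivity (TPR), specificity (TNR) and accuracy from counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return sensitivity, specificity, accuracy

# Counts from the backward-elimination TRAIN confusion matrix.
tpr, tnr, acc = confusion_metrics(tp=296, fn=54, fp=105, tn=236)
```

The same helper reproduces the validate figures and the forward, stepwise, decision tree and neural network tables.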
2. Forward Selection
TRAIN
          Predicted 1  Predicted 0  Total
Actual 1        292           58     350
Actual 0         98          243     341
Total           390          301     691

VALIDATE
          Predicted 1  Predicted 0  Total
Actual 1        235           52     287
Actual 0        101          180     281
Total           336          232     568
i) True Positive Rate (TPR), sensitivity:
Train: TP / (TP + FN) = 292 / (292 + 58) = 0.83429
Conclusion: The model's ability to predict a positive outcome correctly for
train is 0.83429.
Validate: TP / (TP + FN) = 235 / (235 + 52) = 0.81882
Conclusion: The model's ability to predict a positive outcome correctly for
validate is 0.81882.

ii) True Negative Rate (TNR), specificity:
Train: TN / (TN + FP) = 243 / (243 + 98) = 0.71261
Conclusion: The model's ability to predict a negative outcome correctly for
train is 0.71261.
Validate: TN / (TN + FP) = 180 / (180 + 101) = 0.64057
Conclusion: The model's ability to predict a negative outcome correctly for
validate is 0.64057.

iii) Accuracy:
Train: (TP + TN) / (TP + TN + FP + FN) = (292 + 243) / (292 + 243 + 98 + 58)
= 0.77424
Conclusion: The model's ability to predict both positive and negative outcomes
for train is 0.77424.
Validate: (TP + TN) / (TP + TN + FP + FN) = (235 + 180) / (235 + 180 + 101 + 52)
= 0.73063
Conclusion: The model's ability to predict both positive and negative outcomes
for validate is 0.73063.
3. Stepwise Regression
TRAIN
          Predicted 1  Predicted 0  Total
Actual 1        296           54     350
Actual 0        105          236     341
Total           401          290     691

VALIDATE
          Predicted 1  Predicted 0  Total
Actual 1        237           50     287
Actual 0        104          177     281
Total           341          227     568
i) True Positive Rate (TPR), sensitivity:
Train: TP / (TP + FN) = 296 / (296 + 54) = 0.84571
Conclusion: The model's ability to predict a positive outcome correctly for
train is 0.84571.
Validate: TP / (TP + FN) = 237 / (237 + 50) = 0.82578
Conclusion: The model's ability to predict a positive outcome correctly for
validate is 0.82578.

ii) True Negative Rate (TNR), specificity:
Train: TN / (TN + FP) = 236 / (236 + 105) = 0.69208
Conclusion: The model's ability to predict a negative outcome correctly for
train is 0.69208.
Validate: TN / (TN + FP) = 177 / (177 + 104) = 0.62989
Conclusion: The model's ability to predict a negative outcome correctly for
validate is 0.62989.

iii) Accuracy:
Train: (TP + TN) / (TP + TN + FP + FN) = (296 + 236) / (296 + 236 + 105 + 54)
= 0.76990
Conclusion: The model's ability to predict both positive and negative outcomes
for train is 0.76990.
Validate: (TP + TN) / (TP + TN + FP + FN) = (237 + 177) / (237 + 177 + 104 + 50)
= 0.72887
Conclusion: The model's ability to predict both positive and negative outcomes
for validate is 0.72887.
Model Interpretation
1. Backward Elimination
From the output above, the significant variables (p-value < 0.05) are
IMP_REP_work_interfere, benefits, family_history and leave.
Odds ratios and their interpretation:
IMP_REP_work_interfere (Never vs Sometimes): (0.079 − 1) × 100 = −92.1%. The
odds of 'Never' having work interference are 92.1% lower than for 'Sometimes'.
IMP_REP_work_interfere (Often vs Sometimes): (2.001 − 1) × 100 = +100.1%. The
odds of 'Often' having work interference are 100.1% higher than for
'Sometimes'.
IMP_REP_work_interfere (Rarely vs Sometimes): (1.638 − 1) × 100 = +63.8%. The
odds of 'Rarely' having work interference are 63.8% higher than for
'Sometimes'.
benefits (Don't know vs Yes): (0.334 − 1) × 100 = −66.6%. The odds for
benefits 'Don't know' are 66.6% lower than for benefits 'Yes'.
benefits (No vs Yes): (0.458 − 1) × 100 = −54.2%. The odds for benefits 'No'
are 54.2% lower than for benefits 'Yes'.
family_history (No vs Yes): (0.308 − 1) × 100 = −69.2%. The odds for those
without a family history of mental health problems are 69.2% lower than for
those with one.
leave (Don't know vs Very easy): (0.787 − 1) × 100 = −21.3%. The odds for
leave 'Don't know' are 21.3% lower than for leave 'Very easy'.
leave (Somewhat difficult vs Very easy): (1.718 − 1) × 100 = +71.8%. The odds
for leave 'Somewhat difficult' are 71.8% higher than for leave 'Very easy'.
leave (Somewhat easy vs Very easy): (0.564 − 1) × 100 = −43.6%. The odds for
leave 'Somewhat easy' are 43.6% lower than for leave 'Very easy'.
leave (Very difficult vs Very easy): (2.132 − 1) × 100 = +113.2%. The odds for
leave 'Very difficult' are 113.2% higher than for leave 'Very easy'.
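The percentage changes in the table follow directly from the odds ratios; a small helper makes the conversion explicit:

```python
def odds_ratio_change(odds_ratio):
    """Percentage change in the odds implied by an odds ratio.

    Negative values mean the odds are lower than at the reference level."""
    return (odds_ratio - 1) * 100

# Odds ratios taken from the backward-elimination output.
never_vs_sometimes = odds_ratio_change(0.079)            # about -92.1
very_difficult_vs_very_easy = odds_ratio_change(2.132)   # about 113.2
```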
Logistic function:
ln(p / (1 − p)) = 0.3871 − 2.1966*IMP_REP_work_interfere(Never)
+ 1.0303*IMP_REP_work_interfere(Often) + 0.8297*IMP_REP_work_interfere(Rarely)
− 0.4708*benefits(Don't know) − 0.1552*benefits(No)
− 0.5880*family_history(No) − 0.3371*leave(Don't know)
+ 0.4441*leave(Somewhat difficult) − 0.6700*leave(Somewhat easy)
+ 0.6600*leave(Very difficult)
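Given such a fitted logit, the predicted probability of seeking treatment is recovered with the inverse-logit transform. The profile below is a made-up example, assuming reference-cell coding in which the reference levels contribute zero to the logit:

```python
import math

def predicted_probability(logit):
    """Inverse logit: turn ln(p / (1 - p)) back into p."""
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical profile: work_interfere = Often, benefits = Yes,
# family_history = Yes, leave = Very difficult. Only the matching
# indicator terms of the backward-elimination model are switched on.
logit = 0.3871 + 1.0303 + 0.6600
p = predicted_probability(logit)
```

For this profile the model predicts a high probability of seeking treatment (p close to 0.9).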
17
2. Forward Selection
Odds ratios and their interpretation:
IMP_REP_work_interfere (Never vs Sometimes): (0.082 − 1) × 100 = −91.8%. The
odds of 'Never' having work interference are 91.8% lower than for 'Sometimes'.
IMP_REP_work_interfere (Often vs Sometimes): (1.964 − 1) × 100 = +96.4%. The
odds of 'Often' having work interference are 96.4% higher than for
'Sometimes'.
IMP_REP_work_interfere (Rarely vs Sometimes): (1.613 − 1) × 100 = +61.3%. The
odds of 'Rarely' having work interference are 61.3% higher than for
'Sometimes'.
benefits (Don't know vs Yes): (0.348 − 1) × 100 = −65.2%. The odds for
benefits 'Don't know' are 65.2% lower than for benefits 'Yes'.
benefits (No vs Yes): (0.480 − 1) × 100 = −52.0%. The odds for benefits 'No'
are 52.0% lower than for benefits 'Yes'.
family_history (No vs Yes): (0.324 − 1) × 100 = −67.6%. The odds for those
without a family history of mental health problems are 67.6% lower than for
those with one.
leave (Don't know vs Very easy): (0.771 − 1) × 100 = −22.9%. The odds for
leave 'Don't know' are 22.9% lower than for leave 'Very easy'.
leave (Somewhat difficult vs Very easy): (1.548 − 1) × 100 = +54.8%. The odds
for leave 'Somewhat difficult' are 54.8% higher than for leave 'Very easy'.
leave (Somewhat easy vs Very easy): (0.568 − 1) × 100 = −43.2%. The odds for
leave 'Somewhat easy' are 43.2% lower than for leave 'Very easy'.
leave (Very difficult vs Very easy): (2.136 − 1) × 100 = +113.6%. The odds for
leave 'Very difficult' are 113.6% higher than for leave 'Very easy'.
Logistic function:
ln(p / (1 − p)) = 3.8142 − 2.1678*IMP_REP_work_interfere(Never)
+ 1.0134*IMP_REP_work_interfere(Often) + 0.8162*IMP_REP_work_interfere(Rarely)
− 3.0183*REP_Gender(Female) − 3.5612*REP_Gender(Male)
− 0.4597*benefits(Don't know) − 0.1371*benefits(No)
− 0.5642*family_history(No) − 0.3336*leave(Don't know)
+ 0.3627*leave(Somewhat difficult) − 0.6402*leave(Somewhat easy)
+ 0.6850*leave(Very difficult)
3. Stepwise Regression
Odds ratios and their interpretation:
IMP_REP_work_interfere (Never vs Sometimes): (0.079 − 1) × 100 = −92.1%. The
odds of 'Never' having work interference are 92.1% lower than for 'Sometimes'.
IMP_REP_work_interfere (Often vs Sometimes): (2.001 − 1) × 100 = +100.1%. The
odds of 'Often' having work interference are 100.1% higher than for
'Sometimes'.
IMP_REP_work_interfere (Rarely vs Sometimes): (1.638 − 1) × 100 = +63.8%. The
odds of 'Rarely' having work interference are 63.8% higher than for
'Sometimes'.
benefits (Don't know vs Yes): (0.334 − 1) × 100 = −66.6%. The odds for
benefits 'Don't know' are 66.6% lower than for benefits 'Yes'.
Logistic function:
ln(p / (1 − p)) = 0.3871 − 2.1966*IMP_REP_work_interfere(Never)
+ 1.0303*IMP_REP_work_interfere(Often) + 0.8297*IMP_REP_work_interfere(Rarely)
− 0.4708*benefits(Don't know) − 0.1552*benefits(No)
− 0.5880*family_history(No) − 0.3371*leave(Don't know)
+ 0.4441*leave(Somewhat difficult) − 0.6700*leave(Somewhat easy)
+ 0.6600*leave(Very difficult)
Model Selection
There is no underfit model, since the gap in mean squared error between the
train and validation partitions is never negative. There is also no clearly
overfit model, since no selection method shows a markedly larger
validation-train gap than the others.
6.2 Model 2: Decision Tree Model
Confusion Matrix
TRAIN
          Predicted 1  Predicted 0  Total
Actual 1        331           19     350
Actual 0         93          248     341
Total           424          267     691

VALIDATE
          Predicted 1  Predicted 0  Total
Actual 1        272           15     287
Actual 0         86          195     281
Total           358          210     568
i) True Positive Rate (TPR), sensitivity:
Train: TP / (TP + FN) = 331 / (331 + 19) = 0.94571
Conclusion: The model's ability to predict a positive outcome correctly for
train is 0.94571.
Validate: TP / (TP + FN) = 272 / (272 + 15) = 0.94774
Conclusion: The model's ability to predict a positive outcome correctly for
validate is 0.94774.

ii) True Negative Rate (TNR), specificity:
Train: TN / (TN + FP) = 248 / (248 + 93) = 0.72727
Conclusion: The model's ability to predict a negative outcome correctly for
train is 0.72727.
Validate: TN / (TN + FP) = 195 / (195 + 86) = 0.69395
Conclusion: The model's ability to predict a negative outcome correctly for
validate is 0.69395.

iii) Accuracy:
Train: (TP + TN) / (TP + TN + FP + FN) = (331 + 248) / (331 + 248 + 93 + 19)
= 0.83792
Conclusion: The model's ability to predict both positive and negative outcomes
for train is 0.83792.
Validate: (TP + TN) / (TP + TN + FP + FN) = (272 + 195) / (272 + 195 + 86 + 15)
= 0.82218
Conclusion: The model's ability to predict both positive and negative outcomes
for validate is 0.82218.
Model Interpretation
*------------------------------------------------------------*
Node = 4
*------------------------------------------------------------*
if work_interfere <= NA or MISSING
AND Replacement: work_interfere <= NEVER or MISSING
then
Tree Node Identifier = 4
Number of Observations = 151
Predicted: treatment=Yes = 0.01
Predicted: treatment=No = 0.99
*------------------------------------------------------------*
Node = 7
*------------------------------------------------------------*
if benefits IS ONE OF: YES
AND Replacement: work_interfere >= OFTEN
then
Tree Node Identifier = 7
Number of Observations = 178
Predicted: treatment=Yes = 0.89
Predicted: treatment=No = 0.11
*------------------------------------------------------------*
Node = 8
*------------------------------------------------------------*
if work_interfere >= NEVER
AND mental_health_consequence IS ONE OF: NO, YES or MISSING
AND Replacement: work_interfere <= NEVER or MISSING
then
Tree Node Identifier = 8
Number of Observations = 74
Predicted: treatment=Yes = 0.08
Predicted: treatment=No = 0.92
*------------------------------------------------------------*
Node = 9
*------------------------------------------------------------*
if work_interfere >= NEVER
AND mental_health_consequence IS ONE OF: MAYBE
AND Replacement: work_interfere <= NEVER or MISSING
then
Tree Node Identifier = 9
Number of Observations = 42
Predicted: treatment=Yes = 0.26
Predicted: treatment=No = 0.74
*------------------------------------------------------------*
Node = 10
*------------------------------------------------------------*
if family_history IS ONE OF: NO
AND benefits IS ONE OF: NO, DON'T KNOW or MISSING
AND Replacement: work_interfere >= OFTEN
then
Tree Node Identifier = 10
Number of Observations = 122
Predicted: treatment=Yes = 0.60
Predicted: treatment=No = 0.40
*------------------------------------------------------------*
Node = 11
*------------------------------------------------------------*
if family_history IS ONE OF: YES or MISSING
AND benefits IS ONE OF: NO, DON'T KNOW or MISSING
AND Replacement: work_interfere >= OFTEN
then
Tree Node Identifier = 11
Number of Observations = 124
Predicted: treatment=Yes = 0.81
Predicted: treatment=No = 0.19
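The English rules above translate mechanically into code. Here is a hand-rolled sketch of the leaf predictions for nodes 7, 10 and 11, treating the rule text as the spec; the ordinal ordering of work_interfere levels is an assumption, not something the output states:

```python
# Assumed ordinal coding of work_interfere for the ">= OFTEN" comparison.
ORDER = {"NEVER": 0, "RARELY": 1, "SOMETIMES": 2, "OFTEN": 3}

def score(work_interfere, benefits, family_history):
    """P(treatment = Yes) for a few leaves of the tree above.

    Returns None for records falling into leaves not sketched here."""
    often_or_more = ORDER[work_interfere] >= ORDER["OFTEN"]
    if benefits == "YES" and often_or_more:
        return 0.89                                        # Node 7
    if benefits in ("NO", "DON'T KNOW") and often_or_more:
        return 0.60 if family_history == "NO" else 0.81    # Nodes 10 / 11
    return None

p_yes = score("OFTEN", "YES", "NO")
```

This also shows why decision trees are easy to interpret: each leaf is a plain conjunction of conditions with an attached probability.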
6.3 Model 3: Neural Network Model
Confusion Matrix
TRAIN
          Predicted 1  Predicted 0  Total
Actual 1        281           69     350
Actual 0         73          268     341
Total           354          337     691

VALIDATE
          Predicted 1  Predicted 0  Total
Actual 1        224           63     287
Actual 0        101          180     281
Total           325          243     568
i) True Positive Rate (TPR), sensitivity:
Train: TP / (TP + FN) = 281 / (281 + 69) = 0.8029
Conclusion: The model's ability to predict a positive outcome correctly for
train is 0.8029.
Validate: TP / (TP + FN) = 224 / (224 + 63) = 0.7805
Conclusion: The model's ability to predict a positive outcome correctly for
validate is 0.7805.

ii) True Negative Rate (TNR), specificity:
Train: TN / (TN + FP) = 268 / (268 + 73) = 0.7860
Conclusion: The model's ability to predict a negative outcome correctly for
train is 0.7860.
Validate: TN / (TN + FP) = 180 / (180 + 101) = 0.6406
Conclusion: The model's ability to predict a negative outcome correctly for
validate is 0.6406.

iii) Accuracy:
Train: (TP + TN) / (TP + TN + FP + FN) = (281 + 268) / (281 + 268 + 73 + 69)
= 0.7945
Conclusion: The model's ability to predict both positive and negative outcomes
for train is 0.7945.
Validate: (TP + TN) / (TP + TN + FP + FN) = (224 + 180) / (224 + 180 + 101 + 63)
= 0.7113
Conclusion: The model's ability to predict both positive and negative outcomes
for validate is 0.7113.
Based on the output above, a few independent variables are not significant in
the neural network model, namely age, work interfere and leave.
None of the three models is underfit, because none has a positive ROC index
gap. The most overfit of the three is the neural network model, because it has
the largest gap in misclassification rate and average squared error.
The best model is the decision tree, because it has the smallest gaps in
misclassification rate, average squared error and ROC index. The independent
variables that are significant in the decision tree model are work interfere,
mental health consequence, family history and benefits.
7.0 Conclusion
Seeking mental health treatment is very important for people who suffer from
mental health problems. In this study, we identified the most suitable model
among the three candidates: the logistic regression model, the decision tree
model and the neural network model. The best model among the three is the
decision tree, because it has the smallest gaps in misclassification rate,
average squared error and ROC index.
Besides that, we determined which independent variables are significant for
predicting whether employees will seek treatment for a mental health
condition. Based on the decision tree model, we found that work interfere,
mental health consequence, family history and benefits are the significant
variables.
8.0 References
Malaysia Mental Health Association (MMHA). What is Mental Health? Retrieved
from http://mmha.org.my/what-is-mental-health/
Chen, Fan, Zhou, & Li. (November 2011). Neural network structure study in
child mental health disorders intelligent diagnosis system. Retrieved from
https://www.sciencedirect.com/science/article/pii/S1878029611007390?via%3Dihub
Lopes, C. R. S. (2002). Neural networks for the analysis of common mental
disorders factors. Retrieved from
https://www.computer.org/csdl/proceedings/sbrn/2002/1709/00/17090114.pdf
Petterson, Miller, Payne, & Phillips. (June 2014). Mental health treatment in
the primary care setting: patterns and pathways. Retrieved from
https://www.ncbi.nlm.nih.gov/pubmed/24773273
Fikretoglu, Brunet, Guay, & Pedlar. (February 2007). Mental health treatment
seeking by military members with posttraumatic stress disorder: findings on
rates, characteristics, and predictors from a nationally representative
Canadian military sample. Retrieved from
https://www.ncbi.nlm.nih.gov/pubmed/17375866
Dataset: https://www.kaggle.com/diegocalvo/data-mining-of-mental-health/data
9.0 Appendices
Model 2: Decision Tree Model
Model 3: Neural Network Model