Poisson Regression Model Using SPSS Software
Article · January 2025
Abolfazl Ghoodjani
Independent Author
GraphPad.ir
Statistical Analysis
SPSS (Poisson regression model)
What's the Matter?
We use the Poisson regression model when the dependent quantity is a count, that is,
the number of times an event occurs. The model applies to data in which Y has a Poisson
Distribution and we want to obtain a regression relationship between Y and the
independent quantities X. Poisson regression theory is a broad topic. I have given a
brief explanation of it at this link (https://graphpad.ir/poisson-regression-prism/),
which you may find worth reading.
In this article, we explain how to fit the Poisson regression model using the
Generalized Linear Models (GLM) menu. We can also perform Poisson regression using
the Generalized Estimating Equations (GEE) path. This path in SPSS software provides
us with many capabilities for performing Poisson analysis.
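For reference, a Poisson regression with the usual log link can be written as follows. This is the standard textbook form; the symbols \mu_i and \beta_0, ..., \beta_p are notation introduced here for illustration and do not appear in the SPSS dialogs.

```latex
Y_i \sim \mathrm{Poisson}(\mu_i), \qquad
\log(\mu_i) = \beta_0 + \beta_1 X_{i1} + \dots + \beta_p X_{ip}
\quad\Longleftrightarrow\quad
\mu_i = \exp\left(\beta_0 + \beta_1 X_{i1} + \dots + \beta_p X_{ip}\right)
```

Because the model is linear on the log scale, each coefficient acts multiplicatively on the expected count, which is why exponentiated coefficients are reported later in the results.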
Assumptions
Before analyzing data with Poisson Regression, we need to know the assumptions of this
analysis, because Poisson Regression should be used only when these assumptions hold.
They are not complicated and are met in most studies with a count dependent quantity
(having a Poisson distribution). Let's review them.
Assumption (1)
To use the Poisson regression model, the response quantity must be a count (a number
or frequency of events). Note that if the counts are large, it may be better to use
another regression model, such as linear or gamma regression.
Assumption (2)
A regression model needs independent variables, or Xs. So, a Poisson regression model
must contain one or more Independent Variables (IVs). IVs can be numeric, ordinal, or
nominal. They are the factors that affect the dependent variable (DV), and the aim is
to determine how, and by how much, they affect the response quantity.
Assumption (3)
Observations must be independent of each other. That is, one observation cannot
provide information or findings about another observation. This is a very important
assumption. Note that the lack of independent observations is more of a study design
issue.
Assumption (4)
The response quantity data should have a Poisson Distribution. The assumption that
the data is Poisson can be easily tested using a goodness of fit test. You can see this
link in this regard.
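As a rough illustration of such a goodness of fit test, the sketch below computes a Pearson chi-square statistic that compares observed count frequencies with those expected under a Poisson distribution whose mean is estimated from the data. The function names, the pooling rule (`max_bin`), and the sample counts are all hypothetical; they are not the article's data or the exact procedure behind the link.

```python
import math
from collections import Counter


def poisson_pmf(k, lam):
    """Probability of observing exactly k events when the mean is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def poisson_gof(counts, max_bin):
    """Pearson chi-square statistic for H0: counts follow a Poisson distribution.

    Bins are 0, 1, ..., max_bin - 1 plus one pooled bin for values >= max_bin.
    Returns (statistic, degrees_of_freedom).
    """
    n = len(counts)
    lam = sum(counts) / n  # the Poisson mean is estimated from the sample
    observed = Counter(min(c, max_bin) for c in counts)
    stat = 0.0
    for k in range(max_bin + 1):
        if k < max_bin:
            p = poisson_pmf(k, lam)
        else:  # pooled upper tail
            p = 1.0 - sum(poisson_pmf(j, lam) for j in range(max_bin))
        expected = n * p
        stat += (observed.get(k, 0) - expected) ** 2 / expected
    df = (max_bin + 1) - 1 - 1  # bins, minus 1, minus one estimated parameter
    return stat, df


# Hypothetical counts for 100 subjects (not the tumor data of this article).
counts = [0] * 37 + [1] * 37 + [2] * 18 + [3] * 6 + [4] * 2
stat, df = poisson_gof(counts, max_bin=3)
print(f"chi-square = {stat:.3f} on {df} degrees of freedom")
```

The statistic is then compared with a chi-square distribution on the returned degrees of freedom; a small statistic (large P-value) means the Poisson assumption is not rejected.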
Assumption (5)
One of the characteristics of the Poisson distribution is that its mean and variance
are equal. So, when you want to use the Poisson regression model, check the descriptive
statistics of the response quantity. Of course, when we work with real data, it is very
rare for the mean and variance to be exactly the same, even when a test does not reject
the Poisson hypothesis.
Real data always deviate somewhat from the Poisson distribution (or any other
statistical distribution), which is why a P-value is calculated and reported.
So we can restate assumption number 5 as follows: the mean and variance of the data
are acceptably close to each other.
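A quick way to check this on raw data is to divide the sample variance by the sample mean. The sketch below is a minimal illustration using only the Python standard library; the function name and the ten counts are hypothetical and are not taken from the study.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Sample variance divided by sample mean; close to 1 for Poisson-like data."""
    return variance(counts) / mean(counts)


# Hypothetical counts, for illustration only.
counts = [0, 0, 1, 0, 2, 1, 0, 3, 1, 0]
ratio = dispersion_ratio(counts)
print(f"variance / mean = {ratio:.2f}")
```

A ratio well above 1 hints at overdispersion and a ratio well below 1 at underdispersion, but the check should be repeated on the fitted model, because the predictors may explain part of the extra variation.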
In my article titled Types of Generalized Linear Models GLM and GEE in SPSS software,
I explained the models available in the software. The Poisson regression model that I
want to talk about in this article is one of those types of generalized models. So read
that article first. Now let's start with an example of Poisson regression.
Example with Software
In a study of 100 patients, the following were recorded for each person: the number of
tumor recurrences, the treatment group, the number of tumors at the beginning of the
study, and the size of the largest tumor at the beginning of the study.
Our goal in this study is to obtain a relationship between the number of tumor
recurrences and the treatment group, the number of tumors at baseline, and the size of
the largest tumor at baseline. Since our response quantity is a count, which has a
Poisson distribution, we use a Poisson regression model. In the image below, you can
see part of the data for this example.
You can download the data file for this article from here Poisson Regression.
Poisson Regression Example Data
As you can see, the data is presented in four columns. The column labeled Recurrences
is the response quantity Y of the Poisson regression model, which itself has a Poisson
distribution and shows the number of tumor recurrences per patient.
The Treatment column shows whether the individual is in the treatment group with code
1 or in the control group with code 0.
The Number of Tumors at Baseline and Size of Largest Tumor at Baseline columns show
the number and size of the largest tumor in each individual at the beginning of the
study, respectively.
Tip. To obtain the Poisson regression model in SPSS software, we use the following path.
Analyze → Generalized Linear Models → Generalized Linear Models
How to perform Poisson regression in SPSS software
Software settings in the Poisson model
When we follow the path above in SPSS software, the following window, called
Generalized Linear Models, opens.
Generalized Linear Models window and selecting the Poisson loglinear option
In this window, in the Counts section, we select the Poisson loglinear option. We
choose this option because our response quantity is a count and has a Poisson
distribution. Below, I will go through the other tabs needed in this window.
Response
Click on the Response tab. You will enter the following window.
Response tab in the Generalized Linear Models window
In this tab, it is necessary to determine the response quantity of the Poisson regression
model. Since we want to obtain the relationship between the number of tumor
recurrences and other Xs, we place the column named Number of Recurrences in the
Dependent Variable box.
Predictors
Click on the Predictors tab. This will take you to the window below.
Predictors tab in the Generalized Linear Models window
In the Predictors tab, we introduce the Xs of the Poisson regression model to the
software. Usually, nominal (and sometimes ordinal) quantities are placed as Factors and
numerical Scale quantities are placed as Covariates. We did the same in the window
above. That is, Treatment, which indicates the treatment or control group, is placed in
the Factors section, and the number of tumors and the size of the largest tumor are
placed in the Covariates section.
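To make the distinction concrete: a Factor is turned into indicator (dummy) columns, while a Covariate enters the model as its numeric value. The sketch below builds one row of such a design matrix. It assumes that the treatment group (code 1) is the reference level, which matches the later Parameter Estimates discussion where the reported coefficient belongs to the control group; the function name and the example values are hypothetical.

```python
def design_row(treatment, number, size, reference_level=1):
    """Return [intercept, control_indicator, number, size] for one patient.

    The factor Treatment becomes a single 0/1 indicator that equals 1 when
    the patient is NOT in the reference level (assumed here: treatment = 1).
    The covariates (number of tumors, tumor size) are used as they are.
    """
    indicator = 0.0 if treatment == reference_level else 1.0
    return [1.0, indicator, float(number), float(size)]


# Hypothetical patients: one control (code 0) and one treated (code 1).
control_row = design_row(treatment=0, number=3, size=2)
treated_row = design_row(treatment=1, number=1, size=4)
print(control_row)
print(treated_row)
```

A factor with k levels would need k - 1 such indicator columns, one for each non-reference level.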
This window has another section called Offset. An Offset is a “structural” predictor:
it is entered into the model, but its regression coefficient is not estimated by the
model (it is fixed at 1). It is included because it can improve the estimates for the
other Xs in the model. This is especially useful in Poisson regression models where
each individual can have a different level of exposure to the event of interest (in
this case, tumor recurrence).
For example, if patients had been followed for different lengths of time, the logarithm
of the follow-up time could be entered into the model as the Offset Variable, because a
patient who is observed for longer has more opportunity for a recurrence. A
characteristic such as the patient's age, which changes the risk itself rather than the
amount of exposure, is better entered as a Covariate so that its coefficient is
estimated. Also note that a quantity used as an Offset cannot also be entered into the
model as a Covariate or Factor, and vice versa.
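In general terms, an offset is added to the linear predictor with its coefficient fixed at 1, and with a log link it is usually the logarithm of an exposure measure. Writing the exposure of individual i as t_i (a symbol introduced here for illustration only), the model becomes:

```latex
\log(\mu_i) = \log(t_i) + \beta_0 + \beta_1 X_{i1} + \dots + \beta_p X_{ip}
\quad\Longleftrightarrow\quad
\frac{\mu_i}{t_i} = \exp\left(\beta_0 + \beta_1 X_{i1} + \dots + \beta_p X_{ip}\right)
```

In other words, with an offset the regression describes the event rate per unit of exposure rather than the raw count.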
Model
In the next step, click on the Model tab to enter the window below.
Model tab in the Generalized Linear Models window
In this window, we put all three X of the Poisson regression model, namely treatment,
number and tumor size, in the Model box.
We also only want to obtain their main effects and do not deal with Interactions, such
as the interaction effect of treatment * size or the interaction effect of number * size.
For this reason, we select the Main effects option in the Type section. We also select the
Include intercept in model option, which is the default in SPSS software.
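To show what a main-effects Poisson model with an intercept amounts to numerically, here is a minimal sketch of the maximum likelihood fit by Newton-Raphson, written with the Python standard library only. It is not SPSS's implementation; the function names `solve` and `fit_poisson` and the toy data are assumptions made for illustration. The toy design has an intercept and one 0/1 group indicator, so the answer can be checked by hand: the fitted group means must equal the observed group means.

```python
import math


def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_poisson(X, y, iterations=25):
    """Maximum likelihood estimates for log(mu) = X beta (first column of X is 1)."""
    p = len(X[0])
    beta = [0.0] * p
    beta[0] = math.log(sum(y) / len(y))  # start from the intercept-only fit
    for _ in range(iterations):
        mu = [math.exp(sum(b * v for b, v in zip(beta, row))) for row in X]
        # Score vector X'(y - mu) and information matrix X' diag(mu) X.
        score = [sum(row[j] * (yi - mi) for row, yi, mi in zip(X, y, mu))
                 for j in range(p)]
        info = [[sum(row[j] * row[k] * mi for row, mi in zip(X, mu))
                 for k in range(p)] for j in range(p)]
        step = solve(info, score)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


# Toy data (hypothetical): group 0 has mean count 2, group 1 has mean count 4.
X = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
y = [1, 2, 3, 3, 4, 5]
beta = fit_poisson(X, y)
print("intercept:", round(beta[0], 4), "group effect:", round(beta[1], 4))
```

Adding covariates only means adding columns to X. For the log link used here, Newton-Raphson and Fisher scoring give the same iterations, so this sketch does not depend on which of the two is chosen.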
This window also has a section called Build Nested Term, which lets you create nested
terms for your model. Nested effects are useful for modeling the effect of a factor or
covariate whose values do not interact with the levels of another factor. For example,
a grocery store chain might track the spending of its customers at several store
locations. Since each customer shops at only one of these locations, the customer
effect is nested within the store location effect. Therefore, customer can be entered
into the model as a term nested within store location.
The Estimation tab contains methods and options for estimating model parameters. See
the image below.
Estimation tab in the Generalized Linear Models window
Usually, the software defaults are kept in this window. For more detail, you may be
interested in seeing this link on the IBM site.
You can also see the Statistics tab in the window below. Here, too, we accept the
software defaults. In addition to the defaults, I have selected the Include exponential
parameter estimates option, which adds Exp(B) to the Parameter Estimates table. For
more detail, you may be interested in seeing this link.
Statistics tab in the Generalized Linear Models window
EM Means
In the EM Means tab, we can view the estimated Marginal Means for each of the
different groups and levels of factors in the model or the model interactions. We can
also compare different levels and groups of interactions with each other in this tab.
EM Means tab in the Generalized Linear Models window
Of course, in this example we have not had an interaction effect. However, by selecting
the Pairwise option, we can compare the levels of different factors with each other.
At the bottom of the EM Means window, there is another box called Scale. With the
options in this section, we can obtain the marginal means of the response quantity
either on the original scale of the response or on the transformed scale of the linear
predictor (based on the selected link function).
The other section of this window, labeled Adjustment for Multiple Comparisons, adjusts
the significance levels when many comparisons are made. Doing so reduces the likelihood
of obtaining significant results purely by chance, so that only comparisons with
sufficiently small probability values are reported as significant.
In the drop-down list of this section, we can see a variety of adjustment methods.
Perhaps the most famous of them is Bonferroni. You can see this link on this topic.
Types of adjustment methods for multiple comparisons
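The Bonferroni method itself is simple enough to write out. In the sketch below, each P-value is multiplied by the number of comparisons and capped at 1; the function name and the three P-values are hypothetical and serve only as an illustration.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted P-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical P-values from three pairwise comparisons.
adjusted = bonferroni([0.01, 0.04, 0.5])
print(adjusted)
```

With only one comparison, as in the two-group example of this article, the adjustment leaves the P-value unchanged.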
In the Save tab, we can obtain more findings and information from the analysis results
in the form of a data file. Selecting any of the options in this window will add a new
column to the data file and place the results of the selected option in that column. See
this link.
Save tab in the Generalized Linear Models window
Finally, the Export tab appears in the Generalized Linear Models window. Using this
tab, you can view the outputs and some of the model parameter estimates in the form
of a new data file. You can see this link in this regard.
Poisson Regression Results
At the beginning of the results and outputs of the SPSS software is the Model
Information table.
Model Information Table
This table states that the Dependent Variable of the study is Number of Recurrences.
The probability distribution is Poisson, because the response quantity is a count. The
link function is Log (the natural logarithm).
In the Categorical Variable Information table, descriptive information including number
and percentage for each of the groups of the study factor, i.e., Treatment (which plays
the role of Independent Variable), has been obtained.
Table of Categorical Variable Information
The results of the table above show that 48 people are in the control group and 52
people are in the treatment group.
In the Continuous Variable Information table, descriptive statistics are obtained for the
continuous quantities of the study, namely the number of recurrences, the number of
tumors, and the size of the largest tumor.
Table of Continuous Variable Information
The table above shows that the mean number of tumor recurrences is 0.96 and the
standard deviation is 1.348. The main use of this table is to check whether there might
be overdispersion in your analysis (Poisson regression assumption #5). You can do this
by taking the ratio of the variance (the square of the “Std. Deviation” column) to
the mean (the “Mean” column) of the dependent quantity. In our example, we get the
following result.
$$\frac{(sd)^2}{M} = \frac{(1.348)^2}{0.96} = 1.89$$
The resulting value of 1.89 (the ratio equals one for an exact Poisson distribution)
suggests that there is some overdispersion before we add the Xs. However, we need to
check this assumption again once all the independent variables have been added to the
Poisson regression.
The Goodness of Fit table provides results on the goodness of fit.
Goodness of Fit Table
This table provides criteria that are useful for comparing with other models. In any
model where indicators such as AIC and BIC are lower, it can be concluded that that
model is better. Of course, there are other decision-making tools.
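For readers who want to see where such criteria come from, the sketch below computes the Poisson log-likelihood of a set of fitted means and the usual definitions AIC = -2 logL + 2k and BIC = -2 logL + k ln(n), where k is the number of estimated parameters and n the number of observations. The function names and the three-observation example are hypothetical, and the log-likelihood used is the full one, including the factorial term.

```python
import math


def poisson_loglik(y, mu):
    """Full Poisson log-likelihood of counts y given fitted means mu."""
    return sum(yi * math.log(mi) - mi - math.lgamma(yi + 1)
               for yi, mi in zip(y, mu))


def aic(loglik, k):
    """Akaike information criterion for a model with k parameters."""
    return -2.0 * loglik + 2.0 * k


def bic(loglik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return -2.0 * loglik + k * math.log(n)


# Hypothetical example: three counts and an intercept-only fit with mean 1.
y = [0, 1, 2]
mu = [1.0, 1.0, 1.0]
ll = poisson_loglik(y, mu)
print(round(ll, 4), round(aic(ll, k=1), 4), round(bic(ll, k=1, n=len(y)), 4))
```

Both criteria reward a higher likelihood and penalize extra parameters, which is why the model with the lower value is preferred when comparing models fitted to the same data.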
In addition, Value/df, the ratio of the test statistic to its degrees of freedom, is a
useful indicator of the suitability of the fitted model. For Poisson regression this
ratio should be close to 1.0. That is, the test statistic should be about as large as
its degrees of freedom.
It is good to know that the degrees of freedom of the test are obtained from the
following relationship.
$$df = n - \left[(k - 1) + 1 + Cov + MD\right]$$
In this relation, n is the number of samples, k is the number of levels of a factor in
the model (the term k − 1 is counted once for each of the p study factors), the 1
accounts for the intercept, Cov is the number of covariates in the model, and MD is the
number of missing data. For example, here we have 100 samples. There is one factor with
two levels (control and treatment) and two covariates in the model. We also have no
missing data. Therefore,
$$df = 100 - \left[(2 - 1) + 1 + 2 + 0\right] = 96.$$
In our example, the Value/df ratio is 1.208 (based on the Pearson Chi-Square test). A
value of 1 indicates equidispersion (variance equal to the mean), values greater than 1
indicate overdispersion, and values less than 1 indicate underdispersion.
In practice, the most common violation of this assumption is overdispersion, that is,
observing values greater than 1. In this example, however, the value of 1.208 is
unlikely to be a serious violation of this assumption, and we assume that the data have
a Poisson distribution with equal mean and variance.
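The Pearson Value/df ratio can be reproduced from the observed counts and the fitted means of any Poisson model. The sketch below is a minimal illustration; the function name and the four observations are hypothetical, and `n_params` counts every estimated coefficient including the intercept.

```python
def pearson_dispersion(y, mu, n_params):
    """Pearson chi-square divided by its residual degrees of freedom.

    For a Poisson model the variance equals the mean, so each squared
    residual (y - mu)^2 is scaled by mu.
    """
    chi2 = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    df = len(y) - n_params
    return chi2 / df


# Hypothetical observed counts and fitted means from a two-parameter model.
y = [0, 2, 1, 3]
mu = [1.0, 1.0, 2.0, 2.0]
ratio = pearson_dispersion(y, mu, n_params=2)
print(ratio)
```

In the article's example the same calculation uses 100 observations and 4 parameters, which gives the 96 degrees of freedom behind the reported ratio of 1.208.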
The table above does not include a statistical test, and therefore it has no
probability value, which SPSS software shows in a column called Sig. The next table of
results, called Omnibus Test, does include a statistical test. See it.
Omnibus Test Table
In this table, the fitted model (in this case, Poisson regression) is compared with a
model containing only the intercept (i.e., without any of the Xs). The result of the
Omnibus Test shows that the fitted regression model is significant. This implies that
at least one of the Xs has a significant effect on the response quantity, the number of
tumor recurrences.
Now that you know that adding all the independent quantities creates a statistically
significant Poisson model, you want to know which Xs are statistically significant. This is
discussed in the next section.
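A comparison of this kind is a likelihood ratio test: twice the difference between the log-likelihoods of the fitted model and the intercept-only model is referred to a chi-square distribution whose degrees of freedom equal the number of added parameters (three in this example: one for Treatment and one for each covariate). The sketch below implements that calculation with the standard library only; the function names and the two log-likelihood values are hypothetical and are not taken from the SPSS output.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)             # exact tail for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))  # exact tail for df = 1
    # Raise df by 2 at a time: Q(a + 1, h) = Q(a, h) + h^a e^(-h) / Gamma(a + 1).
    while a < df / 2.0 - 1e-9:
        q += h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(loglik_null, loglik_full, df):
    """Return the likelihood ratio chi-square statistic and its P-value."""
    stat = 2.0 * (loglik_full - loglik_null)
    return stat, chi2_sf(stat, df)


# Hypothetical log-likelihoods for the intercept-only and the full model.
stat, p_value = likelihood_ratio_test(loglik_null=-140.0, loglik_full=-130.0, df=3)
print(f"LR chi-square = {stat:.1f}, P-value = {p_value:.5f}")
```

A small P-value says only that the set of predictors improves the fit as a whole; it does not say which predictor is responsible.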
In the next table, Tests of Model Effects, the effect of each of the Xs on Y (number of
recurrences) is tested separately. See it.
Results of the Tests of Model Effects table
The results of the table above show that treatment has no significant effect on Y
(P-value=0.515). However, the number of tumors (P-value<0.001) and the size of the
largest tumor (P-value=0.028) have a significant effect on the number of recurrences.
The next table, Parameter Estimates, is given below. See the image.
Table of Parameter Estimates in Poisson Regression
The positive regression coefficient of Size (B=0.102) indicates that a larger tumor
size is associated with a higher expected number of tumor recurrences. This result is
significant (P-value = 0.028). The rate ratio, Exp(B) = 1.107, shows that each one-unit
increase in tumor size multiplies the expected number of recurrences by about 1.11 (an
increase of about 11%).
Another finding is that the number of tumors also has a significant and strong effect on
tumor recurrence in the studied individuals (B=0.253, P-value < 0.001). Here the rate
ratio is Exp(B) = 1.287. This number shows that each additional tumor at baseline
multiplies the expected number of recurrences by about 1.29.
However, as we have said before, treatment has no significant effect on recurrence.
Although the regression coefficient for the control group was negative (B=-0.137),
suggesting fewer expected recurrences in the control group than in the treatment group,
this finding was not significant (P-value = 0.515).
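The Exp(B) column can be checked by exponentiating the coefficients reported above. The sketch below does this for the three B values quoted in the text; the function names and dictionary keys are chosen here for illustration. Because the quoted B values are rounded to three decimals, the recomputed ratios can differ from the SPSS table in the last digit (for example, exp(0.253) rounds to 1.288 rather than 1.287).

```python
import math


def rate_ratio(b):
    """Multiplicative change in the expected count per one-unit increase in X."""
    return math.exp(b)


def percent_change(b):
    """The same effect expressed as a percentage change in the expected count."""
    return (math.exp(b) - 1.0) * 100.0


# Coefficients as quoted in the text (rounded to three decimals).
coefficients = {"Size": 0.102, "Number": 0.253, "Treatment (control)": -0.137}
for name, b in coefficients.items():
    print(f"{name}: Exp(B) = {rate_ratio(b):.3f}, change = {percent_change(b):+.1f}%")
```

Read this way, a coefficient of 0.102 corresponds to roughly an 11% higher expected number of recurrences per unit of tumor size, holding the other predictors fixed.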
The next part of the SPSS output is titled Estimated Marginal Means: Treatment.
Marginal Means Estimation Table
This table shows the marginal means of tumor recurrence for each of the control and
treatment groups. The standard error and 95% confidence intervals are also estimated.
The results of this table show that the estimated mean number of tumor recurrences is
lower in the control group than in the treatment group. This agrees with the Parameter
Estimates table, where the Poisson regression coefficient of the control group is
negative relative to the treatment group.
Another result of this section is the Pairwise Comparisons table. In this table, the
treatment factor groups are compared with each other in a pairwise comparison.
Pairwise Comparisons Table
The treatment factor has only two groups, so this table contains a single comparison.
With a factor that had more levels, the table would contain more comparisons.
Finally, the software output includes a table called Overall Test Results.
Overall Test Results Table
In this table, you can see the overall test of all pairwise comparisons in the
Pairwise Comparisons table. The probability value obtained in this table is the same as
the one in the Pairwise Comparisons table, because that table compared only two groups,
control and treatment. With a single comparison, the overall test and the pairwise
comparison are the same test.
“In this article, we discussed how to fit the Poisson regression model using the
Generalized Linear Models path in SPSS software.”
Abolfazl Ghoodjani
01/09/2024