Descriptive Statistics
Table: 1
Descriptive Analysis of Marks in Mathematics and Marks in Science
Marks in Mathematics
N Valid 200
Missing 0
Mean 54.51
Median 54.00
Mode 52
Std. Deviation 6.058
Variance 36.693
Skewness .160
Std. Error of Skewness .172
Kurtosis .106
Std. Error of Kurtosis .342
Range 32
Minimum 40
Maximum 72
Sum 10902
Percentiles 25 50.00
50 54.00
75 58.75
Marks in science
N Valid 200
Missing 0
Mean 64.96
Std. Error of Mean .353
Median 65.00
Mode 66
Std. Deviation 4.986
Variance 24.857
Skewness .151
Std. Error of Skewness .172
Kurtosis -.054
Std. Error of Kurtosis .342
Range 29
Minimum 53
Maximum 82
Sum 12991
Percentiles 25 61.00
50 65.00
75 68.00
The above table shows the statistical measures of marks in science. Out of 200 valid
responses, the average mark in science is 64.96, the median is 65 and the mode is 66. The
standard deviation of marks in science is 4.986, the variance is 24.857, the skewness is 0.151
(standard error 0.172) and the kurtosis is -0.054 (standard error 0.342). The range, i.e., the
difference between the highest and lowest marks of the participants, is 29, with a minimum of 53
and a maximum of 82, and the sum of all marks is 12991. The first, second and third quartiles
are 61, 65 and 68 respectively.
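Several of the figures reported for marks in science can be cross-checked against each other. The sketch below recomputes the mean, variance, standard error of the mean and range from the other values in Table 1. The variable names are illustrative, and the interquartile range is an extra quantity that the table itself does not print.

```python
import math

# Values copied from the "Marks in science" block of Table 1.
n = 200
total = 12991
sd = 4.986
minimum, maximum = 53, 82
q1, q3 = 61.0, 68.0

mean = total / n                 # sum divided by the number of valid cases
variance = sd ** 2               # variance is the squared standard deviation
se_mean = sd / math.sqrt(n)      # standard error of the mean
value_range = maximum - minimum  # highest mark minus lowest mark
iqr = q3 - q1                    # interquartile range (not printed in Table 1)

print(f"mean        = {mean:.3f}")
print(f"variance    = {variance:.3f}")
print(f"SE of mean  = {se_mean:.3f}")
print(f"range       = {value_range}")
print(f"IQR         = {iqr}")
```

Small differences in the last decimal place (for example in the variance) are expected, because the table reports a rounded standard deviation.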
Figure: 1
Pie-Chart of Occupation of Parents
The figure alongside shows the occupation of the respondents' parents: 45 percent are
government job holders, 35 percent are private job holders and 20 percent are physical
workers. Government job holders form the largest group, whereas physical workers form the
smallest.
Figure: 2
Pie-Chart of Caste of Student
The figure alongside shows the caste of the respondents: 45 percent are Brahmin, 35 percent
are Vaisya and 20 percent belong to other castes. Brahmin is the largest group, whereas other
castes form the smallest.
Figure: 3
Bar Chart of Caste of Student
The above bar graph shows the percentage of students in each caste. Out of the total
respondents, 45% are Brahmin, which is the maximum, 35% are Vaisya and 20% are other, which is
the least.
Figure: 4
Bar Chart of Occupation of Parents
The above bar graph shows the count of parents in each occupation: 90 parents are government
job holders, 70 are private job holders and 40 are physical workers. Government job holders
are more than double the number of physical workers.
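The counts in this bar chart and the percentages quoted for the pie chart of parents' occupation describe the same 200 respondents. A minimal sketch of the conversion follows. The dictionary keys are labels chosen here for illustration.

```python
# Counts of parents by occupation, as read from the bar chart description.
counts = {
    "Government job holder": 90,
    "Private job holder": 70,
    "Physical worker": 40,
}

total = sum(counts.values())

# Convert each count to a percentage of all respondents.
percentages = {job: 100 * count / total for job, count in counts.items()}

# Ratio of the largest group to the smallest group.
ratio = counts["Government job holder"] / counts["Physical worker"]

for job, pct in percentages.items():
    print(f"{job}: {pct:.0f}%")
print(f"Government to physical worker ratio: {ratio}")
```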
Figure: 5
Simple Boxplot of Weight of Student
The figure alongside shows the box-and-whisker plot of the weight of students. The median of
the dataset is 40. The variable has no outliers.
Figure: 6
Simple Boxplot of Marks in Mathematics
The figure alongside shows the box-and-whisker plot of the marks in mathematics. The median of
the dataset is 65. The variable has one outlier (7), which is significantly lower than the rest
of the data. The outlier should be excluded from the original dataset before analyzing the data.
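A boxplot usually flags a point as an outlier when it lies more than 1.5 times the interquartile range beyond the box. The sketch below applies that convention to a small hypothetical list of marks. The data, the function name and the "inclusive" quartile method are all assumptions made for illustration, and the quartiles produced by the statistical package that drew the figure may differ slightly.

```python
from statistics import quantiles

def tukey_outliers(values):
    """Return (lower_fence, upper_fence, outliers) using the 1.5 * IQR rule."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return lower_fence, upper_fence, outliers

# Hypothetical marks, NOT the survey data: one value is far below the rest.
marks = [38, 58, 60, 62, 63, 64, 65, 65, 66, 67, 68, 70, 72]

lower, upper, outliers = tukey_outliers(marks)
print("fences:", lower, upper)
print("outliers:", outliers)
```

Listing the flagged values first makes it possible to check whether they are data-entry errors before deciding to exclude them.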
Figure: 7
Clustered Boxplot of Weight of Student by Caste of Student by Occupation of Parent
The figure above represents the clustered boxplot of weight of student by caste of student.
The median weight is 39 for Brahmin students, 41 for Vaisya students and 40 for students of
other castes, so students' weights are similar across castes. The data have no outliers.
Figure: 8
Simple Histogram of Distance of Home from School
The histogram displayed above represents the distribution of the participants' distance of
home from school. Most of the data fall under the normal curve, with only a few data points
outside it. It is suggested to exclude the outlying data for a more accurate representation of
the distribution of distance of home from school.
Figure: 9
Simple Histogram of Marks in Mathematics
The histogram displayed alongside represents the distribution of the participants' marks in
mathematics. Most of the data fall under the normal curve, with only a few data points outside
it. It is suggested to exclude the outlying data for a more accurate representation of the
distribution of marks in mathematics.
Perform normal distributions of scale data (at least 2)
Table: 2
Normal Distribution Scale Data of Distance of Home from School
Distance of Home from School
N Valid 200
Missing 0
Mean 5.02
Std. Error of Mean .128
Median 5.00
Mode 5
Std. Deviation 1.816
Variance 3.296
Skewness .016
Std. Error of Skewness .172
Kurtosis -.155
Std. Error of Kurtosis .342
Range 10
Minimum 0
Maximum 10
Sum 1004
Percentiles 10 3.00
20 4.00
25 4.00
30 4.00
40 5.00
50 5.00
60 5.00
70 6.00
75 6.00
80 7.00
90 7.00
The above table shows that the mean and median of distance of home from school are very
close, i.e., 5.02 and 5 respectively, which suggests that the data are approximately normally
distributed. The kurtosis is -0.155, which lies between -2 and +2, and the skewness is 0.016,
which is very near to zero (0). Both values support the conclusion that distance of home from
school is approximately normally distributed. Let’s draw a normal curve of distance of home
from school.
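Another common rule of thumb divides skewness and kurtosis by their standard errors and checks whether the resulting ratios fall within ±1.96. This check is not part of the original output. The cut-off and the function name below are assumptions, shown only as a sketch using the Table 2 values.

```python
# Values copied from Table 2 (Distance of Home from School).
skewness, se_skewness = 0.016, 0.172
kurtosis, se_kurtosis = -0.155, 0.342

# Assumed cut-off: ratios inside +/-1.96 are treated as consistent with normality.
Z_CUTOFF = 1.96

def z_ratio(statistic, standard_error):
    """Statistic divided by its standard error."""
    return statistic / standard_error

z_skew = z_ratio(skewness, se_skewness)
z_kurt = z_ratio(kurtosis, se_kurtosis)
looks_normal = abs(z_skew) < Z_CUTOFF and abs(z_kurt) < Z_CUTOFF

print(f"z (skewness) = {z_skew:.3f}")
print(f"z (kurtosis) = {z_kurt:.3f}")
print("consistent with normality:", looks_normal)
```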
Figure: 10
Normal Distribution of Distance of Home from School
This normal distribution curve presents the distance of home from school. The mean and
standard deviation are 5.02 and 1.816 respectively. For a normal distribution, approximately 68%
of the data lie within ±1 standard deviation of the mean, i.e., approximately 68% of the data
belong to 5.02±1.816⇒3.204-6.836. In the same way, about 95% of the data are expected to lie
within (μ±2σ) and about 99.7% within (μ±3σ). Another indication of a normal distribution is
that the curve appears symmetrical about the mean.
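The intervals μ±σ, μ±2σ and μ±3σ can be worked out directly from the mean and standard deviation reported for distance of home from school. The helper name below is illustrative.

```python
def sigma_interval(mean, sd, k):
    """Return the (lower, upper) bounds of mean +/- k standard deviations."""
    return (mean - k * sd, mean + k * sd)

# Mean and standard deviation copied from Table 2.
mean, sd = 5.02, 1.816

for k, share in [(1, "about 68%"), (2, "about 95%"), (3, "about 99.7%")]:
    lower, upper = sigma_interval(mean, sd, k)
    print(f"{share} of a normal distribution lies in {lower:.3f} to {upper:.3f}")
```

The lower bound for three standard deviations is negative, which a distance cannot be. This is a reminder that the normal curve is only an approximation to these data.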
Table:3
Normal Distribution Scale Data of Marks in Mathematics of Participants
marks in Mathematics
N Valid 200
Missing 0
Mean 69.30
Std. Error of Mean .418
Median 69.50
Mode 71
Std. Deviation 5.911
Variance 34.943
Skewness .022
Std. Error of Skewness .172
Kurtosis -.272
Std. Error of Kurtosis .342
Range 31
Minimum 55
Maximum 86
Sum 13859
Percentiles 10 62.00
20 64.00
25 65.00
30 66.00
40 68.00
50 69.50
60 71.00
70 72.00
75 74.00
80 75.00
90 77.00
The above table shows that the mean and median of marks in mathematics are very close, i.e.,
69.30 and 69.50 respectively, which suggests that the data are approximately normally
distributed. The kurtosis is -0.272, which lies between -2 and +2, and the skewness is 0.022,
which is near zero. Both values support the conclusion that the marks in mathematics are
approximately normally distributed. Let’s draw a normal curve of marks in mathematics of
respondents.
Figure: 11
Simple Histogram of Marks in Mathematics
This normal distribution curve presents the marks in mathematics. The mean and standard
deviation are 69.30 and 5.911 respectively. For a normal distribution, approximately 68% of the
data lie within ±1 standard deviation of the mean, i.e., approximately 68% of the data belong
to 69.30±5.911⇒63.389-75.211. In the same way, about 95% of the data are expected to lie within
(μ±2σ) and about 99.7% within (μ±3σ). Another indication of a normal distribution is that the
curve appears symmetrical about the mean.
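The percentages quoted for one, two and three standard deviations come from the normal curve itself. As a sketch, they can be reproduced with Python's standard `statistics.NormalDist`, using the mean and standard deviation from Table 3. The function name `coverage` is illustrative.

```python
from statistics import NormalDist

# Mean and standard deviation copied from Table 3 (marks in Mathematics).
mean, sd = 69.30, 5.911
dist = NormalDist(mu=mean, sigma=sd)

def coverage(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return dist.cdf(mean + k * sd) - dist.cdf(mean - k * sd)

for k in (1, 2, 3):
    print(f"within {k} SD: {100 * coverage(k):.2f}%")
```

These are theoretical shares. The observed shares in the sample of 200 students can differ somewhat even when the data are close to normal.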
Do some custom tables and cross-tabulation (At least 2)
Table: 4
Custom Tables of Writing Hand*Caste of Student
                                           Writing Hand
                            Right handed                  Left hand
                          Caste of student             Caste of student
                       Brahmin  Vaisya  Other       Brahmin  Vaisya  Other
                        Count   Count   Count        Count   Count   Count
Writing  Right handed     60      20      20           0       0       0
Hand     Left hand         0       0       0          30      50      20
The above table shows the custom table of writing hand by caste of student. Among right-handed
students, the highest number are Brahmin, with 60 students, and the lowest are Vaisya and
"other", with 20 students each. Among left-handed students, the highest number are Vaisya, with
50 students, and the lowest are from the "other" caste, with 20 students. By caste: Brahmin, 60
students are right-handed and 30 are left-handed; Vaisya, 20 students are right-handed and 50
are left-handed; other, 20 students are right-handed and 20 are left-handed. The table shows
that there is a clear difference in the distribution of writing hand by caste. Brahmin students
are more likely to be right-handed, Vaisya students are more likely to be left-handed, and
students from the "other" caste are evenly split. This difference may be due to a number of
factors, including cultural or genetic differences.
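Raw counts are hard to compare when the castes differ in size, so it helps to express writing hand as a percentage within each caste. The sketch below does this from the Table 4 counts. The dictionary layout is an illustrative choice.

```python
# Counts copied from Table 4.
right_handed = {"Brahmin": 60, "Vaisya": 20, "Other": 20}
left_handed = {"Brahmin": 30, "Vaisya": 50, "Other": 20}

# Percentage of right-handed students within each caste.
pct_right = {
    caste: 100 * right_handed[caste] / (right_handed[caste] + left_handed[caste])
    for caste in right_handed
}

for caste, pct in pct_right.items():
    print(f"{caste}: {pct:.1f}% right-handed, {100 - pct:.1f}% left-handed")
```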
Table: 5
Cross-tabulation of Writing Hand*Caste of Student
                                  Caste of student
                              Brahmin  Vaisya  Other   Total
Writing Hand  Right handed       60      20      20     100
              Left hand          30      50      20     100
Total                            90      70      40     200
The table shows the cross-tabulation of caste of student and writing hand. The highest
number of right-handed students are Brahmin, with 60 students. The lowest number of
right-handed students are Vaisya and other caste, with 20 students each. The highest
number of left-handed students are Vaisya, with 50 students. The lowest number of
left-handed students are other caste, with 20 students.
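A cross-tabulation is often followed by a chi-square test of independence, which asks whether writing hand and caste are related. That test is not part of the original output. The sketch below computes it by hand from the Table 5 counts, and relies on the fact that for exactly 2 degrees of freedom the upper-tail probability of the chi-square distribution is exp(-x/2).

```python
import math

# Observed counts from Table 5: rows are right/left hand,
# columns are Brahmin, Vaisya, Other.
observed = [[60, 20, 20],
            [30, 50, 20]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        # Expected count if writing hand and caste were independent.
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (count - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)

# The closed form below is valid only for df == 2.
p_value = math.exp(-chi_square / 2) if df == 2 else None

print(f"chi-square = {chi_square:.3f}, df = {df}, p = {p_value:.6f}")
```

A small p-value would indicate that the distribution of writing hand differs across castes, but it would not say why.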
Develop two scatter plots of any variables (simple and clustered).
Figure: 12
Simple Scatter with Fit Line of Weight of Participants by Age of Participants
The above scatter plot presents the relationship between the weight of student and marks in
mathematics. The points are spread across the whole graph without a clear trend or pattern,
which shows that there is no linear relationship between the variables. There are some clusters
in the lower middle part of the figure, which represent groups of students with varying scores
on both variables. The main finding from this figure is that there is no relationship between
the marks in mathematics and the weight of participants.
Figure: 13
Grouped Scatter of Marks in Science and Weight of Student by Occupation of Parent
The above scatter plot presents the relationship between marks in science and weight of
student, grouped by occupation of parents. The points are spread across the whole graph without
a clear trend or pattern, which shows that there is no linear relationship between the
variables. There are some clusters in the lower middle part of the figure, which represent
groups of students with varying scores on both variables. The main finding from this figure is
that there is no relationship between marks in science and weight of student in any of the
parental occupation groups.
Perform correlation analysis (at least 2)
Table: 6
Correlation between Marks in Science and Marks in Mathematics
                                            Marks in   Marks in
                                            Science    Mathematics
Marks in Science      Pearson Correlation   1          .108
                      Sig. (2-tailed)                  .129
                      N                     200        200
Marks in Mathematics  Pearson Correlation   .108       1
                      Sig. (2-tailed)       .129
                      N                     200        200
The table shows the degree of correlation between marks in science and marks in mathematics.
The Pearson r-value is 0.108, which lies between 0 and 0.25, so there is a very weak positive
correlation between the variables. The correlation is not statistically significant
(p = 0.129 > 0.05), which means that marks in science are not meaningfully related to the
respondents' marks in mathematics.
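The significance of a Pearson correlation depends on both r and the sample size, through the statistic t = r·sqrt(n-2)/sqrt(1-r²). The sketch below applies this to the values in Table 6. The critical value 1.972 (two-tailed, 5% level, 198 degrees of freedom) is an approximate tabulated value supplied here as an assumption, and the function name is illustrative.

```python
import math

def correlation_t(r, n):
    """t statistic for testing whether a Pearson correlation differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Values copied from Table 6.
r, n = 0.108, 200

# Assumed critical value: approximate two-tailed 5% point of t with 198 df.
T_CRITICAL = 1.972

t_stat = correlation_t(r, n)
r_squared = r ** 2          # share of variance the two marks have in common
significant = abs(t_stat) > T_CRITICAL

print(f"t = {t_stat:.3f}, r squared = {r_squared:.4f}, significant: {significant}")
```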
Table: 7
Correlation between Age of Participants and Weight of Participants
                                              Age of         Weight of
                                              Participants   Participants
Age of Participants     Pearson Correlation   1              .067
                        Sig. (2-tailed)                      .348
                        N                     200            200
Weight of Participants  Pearson Correlation   .067           1
                        Sig. (2-tailed)       .348
                        N                     200            200
The table shows the degree of correlation between the age of participants and the weight of
participants. The Pearson r-value is 0.067, which lies between 0 and 0.25, so there is a very
weak positive correlation between the variables. The correlation is not statistically
significant (p = 0.348 > 0.05), which means that the age of participants is not meaningfully
related to the weight of participants.
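For reference, the Pearson coefficient reported in the correlation tables is the covariance of the two variables divided by the product of their spreads. The sketch below computes it from raw pairs. The function name and the five age and weight pairs are hypothetical, since the survey's raw data are not reproduced in this report.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values for illustration only, NOT the survey data.
ages = [14, 15, 15, 16, 17]
weights = [38, 41, 39, 42, 40]

print(f"r = {pearson_r(ages, weights):.3f}")
```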
Perform regression analysis (at least 2)
Table: 8
Regression of Marks in Science on Marks in Mathematics
Model                        Unstandardized        Standardized   t        Sig.
                             Coefficients          Coefficients
                             B        Std. Error   Beta
1   (Constant)               57.520   4.826                       11.918   .000
    Marks in Mathematics     0.134    .088         .108           1.525    .129
a. Dependent Variable: Marks in Science
The table above provides a regression model of marks in science on marks in
mathematics. The model found F(1, 198) = 2.325, p = 0.129 > 0.05, with R2 = 0.012 and
coefficient = 0.134. The R-square value indicates that 1.2% of the variability of marks in
science can be explained by the variability of marks in mathematics.
We have the regression equation:
Y = a + bX
Score in Science = 57.520 + 0.134*(Score in Mathematics)
This regression model shows a positive but statistically non-significant relationship between
the independent variable and the dependent variable. That is, if the score in mathematics
increases by 1 mark, the predicted score in science increases by 0.134 marks, or if the score
in mathematics increases by 10 marks, the predicted score in science increases by 1.34 marks.
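The fitted equation can be used to predict a science score from a mathematics score, and the reported F and R-square follow from the t value and the standardized coefficient when there is a single predictor. The sketch below uses the Table 8 values. The function name and the example mathematics score of 60 are illustrative.

```python
# Coefficients copied from Table 8.
INTERCEPT = 57.520
SLOPE = 0.134

def predict_science(math_score):
    """Predicted science score for a given mathematics score."""
    return INTERCEPT + SLOPE * math_score

# With one predictor, F equals t squared and R-square equals Beta squared.
t_value, beta = 1.525, 0.108
f_value = t_value ** 2
r_squared = beta ** 2

print(f"predicted science score at 60 in mathematics: {predict_science(60):.2f}")
print(f"F = {f_value:.3f}, R squared = {r_squared:.3f}")
```

Because the slope is not statistically significant, such predictions stay close to the overall science mean whatever the mathematics score.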
Table: 9
Regression of Age of Participants on Weight of Participants
Model                          Unstandardized        Standardized   t       Sig.
                               Coefficients          Coefficients
                               B        Std. Error   Beta
1   (Constant)                 14.665   1.561                       9.397   .000
    Weight of Participants     .039     .041         .067           .940    .348
a. Dependent Variable: Age of Participants
The table above provides a regression model of age of participants on weight of
participants. The model found F(1, 198) = 0.884, p = 0.348 > 0.05, with R2 = 0.004 and
coefficient = 0.039. The R-square value indicates that 0.4% of the variability of age of
participants can be explained by the variability of weight of participants.
We have the regression equation:
Y = a + bX
Age of Participants = 14.665 + 0.039*(Weight of Participants)
This regression model shows a positive but statistically non-significant relationship between
the independent variable and the dependent variable. That is, if the weight of participants
increases by 1 unit, the predicted age of participants increases by 0.039 units, or if the
weight of participants increases by 10 units, the predicted age increases by 0.39 units.
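One way to see why this slope is not significant is to build an approximate 95% confidence interval around it from the coefficient and its standard error. The interval is not part of the original output. The critical value 1.972 (two-tailed, 5% level, 198 degrees of freedom) is an approximate tabulated value supplied here as an assumption.

```python
# Slope and standard error copied from Table 9.
slope = 0.039
std_error = 0.041

# Assumed critical value: approximate two-tailed 5% point of t with 198 df.
T_CRITICAL = 1.972

margin = T_CRITICAL * std_error
ci_lower = slope - margin
ci_upper = slope + margin

# If the interval contains zero, the slope is not significant at the 5% level.
contains_zero = ci_lower <= 0 <= ci_upper

print(f"95% CI for the slope: {ci_lower:.4f} to {ci_upper:.4f}")
print("interval contains zero:", contains_zero)
```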