The document summarizes a group project analyzing the relationship between height and weight using a dataset of 30 female students. Descriptive statistics were calculated for height and weight variables. Inferential analyses included a 95% confidence interval for average height, hypothesis testing for average height and weight percentages, and a simple linear regression model to analyze the relationship between height and weight.

Uploaded by

khangtdds170082

MAS291 Final Project

Group: 7

Member:

{Nguyễn Thái Nguyên - QS170069; Lương Hoàng Duy - DE170114; Trần Đinh Khang - DS170082; Bùi Nữ Vân Nhi - DA160062}

I. Dataset:

The dataset provided by the teacher contains the weight (kg) and height (cm) of 30 female students.

The dataset contains only these two variables. In this project we carry out a descriptive and an inferential statistical analysis of the data, and then fit a regression model to describe the mathematical relationship between the two variables.

II. Descriptive and Inferential Statistical Analysis:

1. Descriptive Analysis:

Using the Data Analysis tool provided by Microsoft Excel, a descriptive statistics table was generated.

Statistic                  Weight (kg)   Height (cm)
Mean                       59.73333      165.6667
Standard Error             1.231701      1.132928
Median                     59.5          166
Mode                       58            170
Standard Deviation         6.746306      6.2053
Sample Variance            45.51264      38.50575
Kurtosis                   0.524973      -0.12615
Skewness                   0.837299      -0.10425
Range                      27            25
Minimum                    50            154
Maximum                    77            179
Sum                        1792          4970
Count                      30            30
Confidence Level (95.0%)   2.519112      2.317097
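Several entries in the table are linked by standard identities (mean = sum / n, variance = standard deviation squared, standard error = standard deviation / sqrt(n), range = maximum - minimum). The sketch below checks them as a sanity test. It hard-codes the summary values from the table because the raw data are not reproduced here; the dictionary layout and the function name `check_summary` are illustrative assumptions.

```python
import math

# Summary values copied from the descriptive table (raw data are not shown).
weight = {"mean": 59.73333, "se": 1.231701, "sd": 6.746306, "var": 45.51264,
          "min": 50, "max": 77, "range": 27, "sum": 1792, "n": 30}
height = {"mean": 165.6667, "se": 1.132928, "sd": 6.2053, "var": 38.50575,
          "min": 154, "max": 179, "range": 25, "sum": 4970, "n": 30}


def check_summary(stats, tol=1e-3):
    """Return True if the summary values satisfy the standard identities."""
    n = stats["n"]
    return all([
        math.isclose(stats["sum"] / n, stats["mean"], abs_tol=tol),
        math.isclose(stats["sd"] ** 2, stats["var"], abs_tol=tol),
        math.isclose(stats["sd"] / math.sqrt(n), stats["se"], abs_tol=tol),
        stats["max"] - stats["min"] == stats["range"],
    ])


weight_ok = check_summary(weight)
height_ok = check_summary(height)
print(weight_ok, height_ok)
```

A check like this catches transcription errors when copying spreadsheet output into a report.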

a. Variable Weight (kg):

The average weight of the students is approximately 59.73 kg, with a standard deviation of 6.75 kg, indicating moderate variability. The mode is 58 kg, the weight that appears most often in the dataset.

The skewness of 0.84 indicates that the weight distribution is positively skewed: the tail extends towards higher weights, suggesting that a few students are noticeably heavier than the majority.

The weights span a range of 27 kg, from a minimum of 50 kg to a maximum of 77 kg. The sample variance, a measure of dispersion, is approximately 45.51 kg^2.

[Figure: Histogram of weight (kg). Bins: 50, 55.4, 60.8, 66.2, 71.6, More. Axes: Frequency and Cumulative %.]

Most students weigh between 55.4 kg and 71.6 kg, with the highest frequency in the 55.4 kg to 60.8 kg range. Frequencies fall off in the higher weight bins, so the bulk of the distribution sits at the lower weights with a tail to the right, consistent with the positive skewness of 0.84.

b. Variable Height (cm):

The average height of the students is approximately 165.67 cm, with a standard deviation of 6.21 cm, indicating moderate variability around the mean. The height distribution is nearly symmetric, as shown by the small negative skewness of -0.10.

The heights span a range of 25 cm, from a minimum of 154 cm to a maximum of 179 cm. The mode is 170 cm, the most common height among the students. The sample variance, a measure of dispersion, is approximately 38.51 cm^2.

[Figure: Histogram of height (cm). Bins: 154, 159, 164, 169, 174, More. Axes: Frequency and Cumulative %.]

The most frequent heights fall in the 164 cm to 169 cm range. Frequencies are lower in the taller bins, although the skewness of -0.10 indicates that the distribution is close to symmetric overall.

2. Inferential Statistics:

Hypothesis Testing and Confidence Interval of the mean of a population

Population: All female students

Sample: This dataset

Analysis: Construct a 95% confidence interval (alpha = 5%) for the average height of all female students. We use the t-distribution because the population standard deviation (sigma) is unknown.

C.I. for the average height of all female students

Use t-distribution (sigma is unknown)
alpha                      5%
n                          30
sample mean x              165.67
sample stdev s             6.21
t(alpha/2, n-1)            2.05
left bound                 163.35
right bound                167.98
C.I. for average height:   163.35 <= height <= 167.98

Based on this data, a 95% confidence interval for the average height of all female students is (163.35, 167.98) cm.
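The interval follows from mean ± t(alpha/2, n-1) * s / sqrt(n). The sketch below reproduces it from the summary statistics. The critical value 2.045 is the standard table value for t(0.025, 29), hard-coded here as an assumption because the Python standard library has no t-distribution; the variable names are illustrative.

```python
import math

n = 30
sample_mean = 165.6667   # cm, from the descriptive table
sample_sd = 6.2053       # cm, from the descriptive table
t_crit = 2.045           # t(0.025, 29), taken from a standard t table (assumption)

# Margin of error and interval bounds.
margin = t_crit * sample_sd / math.sqrt(n)
lower = sample_mean - margin
upper = sample_mean + margin

print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

The margin of about 2.32 cm agrees with the "Confidence Level (95.0%)" entry for height in the Excel output.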

Research question for hypothesis testing: Is the average height of all female students 164 cm? We test this claim at the 10% significance level based on the data.

Hypotheses:

Null Hypothesis (H0): The average height of all female students is 164 cm.

Alternative Hypothesis (H1): The average height of all female students is not 164 cm.

alpha                  10%
n                      30
sample mean x          165.67
sample stdev s         6.21
mean0                  164
Test statistic t0      1.471115
t(alpha/2, n-1)        1.70
-t(alpha/2, n-1)       -1.70

Because t0 = 1.47 lies inside the acceptance region (-1.70, 1.70), we fail to reject the null hypothesis. There is not enough evidence to conclude that the average height of all female students differs from 164 cm.
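The test statistic is t0 = (sample mean - mean0) / (s / sqrt(n)). A minimal sketch of the two-sided test follows; the critical value 1.699 is the standard table value for t(0.05, 29), hard-coded as an assumption, and the variable names are illustrative.

```python
import math

n = 30
sample_mean = 4970 / 30   # sum / count from the descriptive table
sample_sd = 6.2053        # cm
mean0 = 164               # hypothesized mean under H0
t_crit = 1.699            # t(0.05, 29), from a standard t table (assumption)

# One-sample t statistic.
t0 = (sample_mean - mean0) / (sample_sd / math.sqrt(n))

# Two-sided decision rule: reject H0 only if |t0| exceeds the critical value.
reject_h0 = abs(t0) > t_crit

print(f"t0 = {t0:.4f}, reject H0: {reject_h0}")
```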

Hypothesis Testing for population proportion P

Population: All female students

Sample: This data

Analysis: Test the claim that 60% of all female students weigh under 65 kg, at the 10% significance level.

Hypotheses:

Null Hypothesis (H0): The percentage of all female students who weigh under 65 kg is equal to 60%.

Alternative Hypothesis (H1): The percentage of all female students who weigh under 65 kg is not equal to 60%.

alpha                            10%
n                                30
p0                               0.60
p hat                            0.80
test statistic z0                2.24
z(alpha/2) = right critical      1.645
-z(alpha/2) = left critical      -1.645
Because z0 = 2.24 lies outside the acceptance region, we reject the null hypothesis (H0). There is evidence that the percentage of all female students who weigh under 65 kg is different from 60%.
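The test statistic is z0 = (p hat - p0) / sqrt(p0 * (1 - p0) / n). The sketch below computes it together with the two-sided critical value for alpha = 10%, using the standard library's `statistics.NormalDist`. The variable names are illustrative, and the p-value is an addition not reported in the original output.

```python
import math
from statistics import NormalDist

n = 30
p0 = 0.60      # hypothesized proportion under H0
p_hat = 0.80   # sample proportion of students under 65 kg
alpha = 0.10

# z statistic for a one-sample proportion test (normal approximation).
z0 = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Two-sided critical value and p-value from the standard normal distribution.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
p_value = 2 * (1 - NormalDist().cdf(abs(z0)))
reject_h0 = abs(z0) > z_crit

print(f"z0 = {z0:.2f}, z_crit = {z_crit:.3f}, p = {p_value:.4f}, reject H0: {reject_h0}")
```

The normal approximation is commonly considered acceptable here because n * p0 = 18 and n * (1 - p0) = 12 are both at least 5.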

III. Constructing Simple Linear Regression and Analyzing Result:

For the linear regression analysis, we set height (cm) as the predictor "X" and weight (kg) as the response "Y". Using the Regression tool from the Data Analysis tool in Excel, a summary output was generated:

Regression Statistics
Multiple R           0.875052
R Square             0.765716
Adjusted R Square    0.757349
Standard Error       3.323203
Observations         30

The multiple correlation coefficient (R) measures the strength of the linear relationship between the predictor and the response. With a single predictor, it equals the absolute value of the correlation between height and weight; the direction of the relationship is given by the sign of the slope. Here the Multiple R value is approximately 0.88 and the slope is positive, which indicates a strong positive correlation between height and weight.

The R Square value is approximately 0.77, indicating that around 77% of the variability in weight is explained by height in the regression model.

The adjusted R Square takes into account the number of predictor variables and the sample size, penalizing models that add predictors without improving the fit. The adjusted R Square of approximately 0.76 is only slightly below R Square, as expected for a model with one predictor and 30 observations.
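The three fit statistics are related: Multiple R is the square root of R Square, and adjusted R Square is 1 - (1 - R^2)(n - 1)/(n - k - 1) for k predictors. A minimal sketch checking these relations against the Excel output follows; the variable names are illustrative.

```python
import math

r_square = 0.765716   # from the regression output
n = 30                # observations
k = 1                 # number of predictors (height only)

# Multiple R is the square root of R Square.
multiple_r = math.sqrt(r_square)

# Adjusted R Square penalizes for the number of predictors.
adj_r_square = 1 - (1 - r_square) * (n - 1) / (n - k - 1)

print(f"Multiple R = {multiple_r:.6f}, adjusted R Square = {adj_r_square:.6f}")
```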

The standard error of the regression is the estimated standard deviation of the residuals. Here it is approximately 3.32 kg, which gives the typical distance between the observed weights and the weights predicted by the regression line.
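For simple linear regression, the standard error can be recovered from the summary statistics as s_y * sqrt((1 - R^2)(n - 1)/(n - 2)). The sketch below is a consistency check under that identity, not the procedure Excel itself follows; the variable names are illustrative.

```python
import math

sd_weight = 6.746306   # standard deviation of Y (kg), from the descriptive table
r_square = 0.765716    # from the regression output
n = 30

# Residual sum of squares = (1 - R^2) * total sum of squares,
# and total sum of squares = (n - 1) * s_y^2; divide by n - 2 degrees of freedom.
residual_variance = (1 - r_square) * (n - 1) * sd_weight ** 2 / (n - 2)
std_error = math.sqrt(residual_variance)

print(f"Standard error of the regression = {std_error:.4f} kg")
```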

The number of observations is the sample size used in the regression analysis. In this case, the analysis is based on 30 observations.

ANOVA table:

An ANOVA (analysis of variance) was performed to assess the significance of the regression model. The regression is highly significant (p < 0.001), indicating that height, the single predictor in the model, is strongly associated with weight.
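The ANOVA table itself is not reproduced here, but for a model with one predictor the F statistic can be derived from R Square as F = R^2 / (1 - R^2) * (n - 2). The sketch below computes it from the reported R Square, so the result is a derived value rather than a figure quoted from the Excel output. The 5% critical value 4.20 for F(1, 28) is a standard table value, hard-coded as an assumption.

```python
r_square = 0.765716   # from the regression output
n = 30
df_regression = 1     # one predictor
df_residual = n - 2   # 28 residual degrees of freedom

# F = (explained variance per df) / (unexplained variance per df).
f_stat = (r_square / df_regression) / ((1 - r_square) / df_residual)

f_crit_5pct = 4.20    # F(0.05; 1, 28) from a standard F table (assumption)
significant = f_stat > f_crit_5pct

print(f"F = {f_stat:.2f}, significant at 5%: {significant}")
```

An F value this far above the critical value is in line with the reported p < 0.001.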

The coefficient table gives an intercept of approximately -97.87 and a significant positive slope for height (referred to as X Variable 1 in the output) of approximately 0.95. The slope implies that each additional centimetre of height is associated with an expected increase in weight of approximately 0.95 kg. The intercept has no direct physical interpretation, because a height of 0 cm lies far outside the observed range of 154 cm to 179 cm.
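In simple linear regression, the slope equals r * s_y / s_x and the intercept equals mean(Y) - slope * mean(X). The sketch below checks the reported coefficients against the summary statistics. Taking r as positive is an assumption justified by the positive slope; the variable names are illustrative.

```python
# Summary statistics from the descriptive table and regression output.
r = 0.875052             # Multiple R, taken as positive
mean_height = 165.6667   # cm (X)
mean_weight = 59.73333   # kg (Y)
sd_height = 6.2053       # cm
sd_weight = 6.746306     # kg

# Least-squares coefficients expressed through summary statistics.
slope = r * sd_weight / sd_height
intercept = mean_weight - slope * mean_height

print(f"slope = {slope:.4f} kg/cm, intercept = {intercept:.2f} kg")
```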

From the regression output, the fitted equation for predicting weight Y (kg) from height X (cm) is:

Y = 0.951343 * X - 97.87254

Example: A student in the class is 170 cm tall. What is her predicted weight?

Answer: Her predicted weight is Y = 0.951343 * 170 - 97.87254, which is approximately 63.86 kg.
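The prediction can be wrapped in a small helper. This is a minimal sketch using the coefficients from the fitted equation; the function name `predict_weight` and the range warning are illustrative assumptions, added because predictions outside the observed heights (154 cm to 179 cm) are extrapolations.

```python
SLOPE = 0.951343       # kg per cm, from the regression output
INTERCEPT = -97.87254  # kg, from the regression output
HEIGHT_MIN, HEIGHT_MAX = 154, 179  # observed height range in the sample (cm)


def predict_weight(height_cm):
    """Return (predicted weight in kg, True if height is inside the observed range)."""
    in_range = HEIGHT_MIN <= height_cm <= HEIGHT_MAX
    return SLOPE * height_cm + INTERCEPT, in_range


weight_170, in_range_170 = predict_weight(170)
print(f"Predicted weight at 170 cm: {weight_170:.2f} kg (within sample range: {in_range_170})")
```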

Using this equation, we compared the predicted weights with the observed weights to see how well the regression model predicts weight from height. The line fit plot is given below.

[Figure: X Variable 1 Line Fit Plot. Observed Y (weight, kg) and Predicted Y plotted against X Variable 1 (height, cm).]

As the plot shows, the predicted values follow the observed values closely. The regression model therefore describes the relationship between weight and height well for the 30 students examined.

IV. Conclusion:

In conclusion, this data analysis project examined the heights and weights of 30 female students. The project used descriptive statistics to summarize the data, inferential statistics to draw conclusions about the population, and visual representations to communicate the findings.

The descriptive analysis showed that the average height of the female students was approximately 165.67 cm, with a standard deviation of 6.21 cm, indicating a moderate level of variability. The average weight was approximately 59.73 kg, with a standard deviation of 6.75 kg. The height distribution was nearly symmetric, while the weight distribution was moderately right-skewed.

Inferential statistics were used to estimate the population mean height and to test claims about the mean height and the proportion of students under 65 kg. The regression analysis indicated a strong positive correlation between height and weight.

Visual representations, including histograms, tables, and a line fit plot, were used to present the data and make the findings easier to interpret.

Overall, this project analyzed the heights and weights of the female students and provided insight into their distributions and the relationship between them. The findings can inform further research, interventions, or decision-making related to height and weight management among female students.
