Academic Performance Insights

The document describes a student score visualization project that aims to analyze and visualize academic performance data. It generates simulated student data with unique math and science scores. These scores are then cleaned and analyzed using descriptive statistics, graphs, regression models, and ANOVA tests to explore patterns and relationships in student performance. The project provides visualizations of individual and grouped student scores to offer insights for educators.

Uploaded by

Sajan Hegde

Student Score Visualisation

ABSTRACT
The "Student Performance Analysis and Visualization" project aims to analyze and visualize
the academic performance of a group of students based on their math and science scores. The
project begins by generating simulated student data, including student names, unique math
scores, and unique science scores. These scores are then cleaned to remove missing values
and compute a total score.

The analysis encompasses various aspects of the dataset, including descriptive statistics,
subject-wise score comparisons using bar graphs, a scatter plot with a linear regression line to
explore the relationship between math and science scores, and a multiple linear regression
model to predict total scores based on math and science scores.

In addition to these analyses, the project offers a comprehensive visualization suite. It
includes individual bar graphs for each student's scores, a box plot to visualize score
distributions, a correlation matrix heatmap to examine relationships between variables, and a
residual plot to assess the assumptions of the regression model. Histograms and pairwise
scatterplots provide further insights into score distributions and relationships.

Furthermore, the project conducts ANOVA tests to investigate potential differences in math
and science scores based on student names, shedding light on any statistically significant
variations among students.

This project provides a comprehensive exploration of student performance data, offering
valuable insights and visualizations for educators and researchers interested in understanding
academic achievement patterns.

Dept. Of CSE, DSATM 2022-2023 1


INTRODUCTION

In today's educational landscape, understanding student performance and academic
achievement is crucial for educators, administrators, and policymakers. Data analysis and
visualization play a pivotal role in gaining insights into student scores and performance
trends. The "Student Performance Analysis and Visualization" project serves as a
comprehensive exploration of academic data, providing a valuable toolkit for examining and
interpreting student performance.

This project begins by simulating student data, each with unique math and science scores,
mirroring the diversity of academic profiles encountered in real-world educational settings.
Through data cleaning and aggregation, we prepare the dataset for analysis, ensuring
accuracy and completeness.

PROGRAM CODE


# Load required packages
library(dplyr)
library(ggplot2)
library(gridExtra) # For arranging plots
library(tidyr)

# Simulated student data with unique math and science scores
set.seed(123) # For reproducibility
student_data <- data.frame(
  student_id = 1:100,
  name = sample(c("Rohan", "Aisha", "Sanya", "Vikram", "Raj"), 100, replace = TRUE)
)

# Generate math and science scores that are unique within each name group
student_data <- student_data %>%
  group_by(name) %>%
  mutate(
    math_score = sample(40:95, n(), replace = FALSE),
    science_score = sample(40:100, n(), replace = FALSE)
  ) %>%
  ungroup()

# Data Cleaning
cleaned_data <- student_data %>%
  filter(!is.na(math_score) & !is.na(science_score)) %>%
  mutate(total_score = math_score + science_score)

# Descriptive Statistics
summary_stats <- cleaned_data %>%
  summarize(
    avg_math = mean(math_score),
    avg_science = mean(science_score),
    avg_total = mean(total_score),
    max_total = max(total_score)
  )
print(summary_stats)

# Bar graph: Subject-wise scores comparison
bar_data <- cleaned_data %>%
  gather(key = "subject", value = "score", math_score, science_score)
ggplot(bar_data, aes(x = subject, y = score, fill = subject)) +
  geom_bar(stat = "identity", position = "dodge") +
  labs(title = "Subject-wise Scores Comparison",
       x = "Subject", y = "Score", fill = "Subject") +
  theme_minimal()

# Scatter plot with linear regression line
ggplot(cleaned_data, aes(x = math_score, y = science_score)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "blue") +
  labs(title = "Math vs. Science Scores",
       x = "Math Score", y = "Science Score")

# Multiple Linear Regression
# Note: total_score is defined as math_score + science_score, so this model fits exactly
multi_reg <- lm(total_score ~ math_score + science_score, data = cleaned_data)

# Summary of the multiple linear regression model
summary(multi_reg)

# Individual bar graphs for each student's scores
individual_plots <- list()
for (student_name in unique(student_data$name)) {
  # Reshape this name's scores to long format: one row per subject score
  student_scores <- student_data %>%
    filter(name == student_name) %>%
    select(name, math_score, science_score) %>%
    gather(key = "subject", value = "score", math_score, science_score)
  individual_plots[[student_name]] <- ggplot(student_scores, aes(x = subject, y = score, fill = subject)) +
    geom_bar(stat = "identity", position = "dodge") +
    labs(title = paste("Scores for", student_name),
         x = "Subject", y = "Score", fill = "Subject") +
    theme_minimal()
}

# Combine and display all plots; grid.arrange() draws the layout itself
combined_plots <- grid.arrange(grobs = individual_plots, ncol = 2)

# Box plots for math and science scores
boxplot_data <- cleaned_data %>%
  gather(key = "subject", value = "score", math_score, science_score)
ggplot(boxplot_data, aes(x = subject, y = score, fill = subject)) +
  geom_boxplot() +
  labs(title = "Box Plot of Math and Science Scores",
       x = "Subject", y = "Score", fill = "Subject") +
  theme_minimal()

# Correlation matrix heatmap
correlation_matrix <- cor(cleaned_data[, c("math_score", "science_score", "total_score")])

# Reshape the matrix to long format with columns Var1, Var2, and value
correlation_long <- as.data.frame(as.table(correlation_matrix), responseName = "value")
correlation_matrix_plot <- ggplot(data = correlation_long,
                                  aes(x = Var1, y = Var2, fill = value)) +
  geom_tile() +
  scale_fill_gradient(low = "white", high = "blue") +
  labs(title = "Correlation Matrix Heatmap",
       x = "Variable 1", y = "Variable 2", fill = "Correlation") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
print(correlation_matrix_plot)

# Residual plot for multiple linear regression
# Store residuals in a data frame column instead of shadowing the residuals() function
residual_data <- data.frame(
  total_score = cleaned_data$total_score,
  residual = residuals(multi_reg)
)
ggplot(data = residual_data, aes(x = total_score, y = residual)) +
  geom_point() +
  geom_hline(yintercept = 0, color = "red", linetype = "dashed") +
  labs(title = "Residual Plot for Multiple Linear Regression",
       x = "Total Score", y = "Residuals") +
  theme_minimal()

# Histogram of total scores
ggplot(cleaned_data, aes(x = total_score)) +
  geom_histogram(binwidth = 5, fill = "blue", color = "black") +
  labs(title = "Histogram of Total Scores",
       x = "Total Score", y = "Frequency") +
  theme_minimal()

# Pairwise scatterplots
pairs(cleaned_data[, c("math_score", "science_score", "total_score")])

# ANOVA for math scores by student name
anova_math <- aov(math_score ~ name, data = cleaned_data)
summary(anova_math)

# ANOVA for science scores by student name
anova_science <- aov(science_score ~ name, data = cleaned_data)
summary(anova_science)

Flowchart:

LIST OF FIGURES:

Fig 1: Residual plot for multiple linear regression

Fig 2: Box plot of math and science scores

Fig 3: Subject-wise scores comparison

Fig 4: Student score visualisation

Fig 5: Math vs. science scores


OUTPUT
            Df Sum Sq Mean Sq F value Pr(>F)
name         4    427   106.9   0.342  0.849
Residuals   95  29690   312.5
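
The columns of this table are linked by the standard one-way ANOVA definitions. The following is a brief worked check in textbook notation; the symbols k (number of name groups), N (number of students), SS, and MS are assumptions for illustration and do not appear in the report. With k = 5 names and N = 100 students:

```latex
\begin{aligned}
df_{\text{name}} &= k - 1 = 5 - 1 = 4, \qquad
df_{\text{resid}} = N - k = 100 - 5 = 95, \\
MS_{\text{resid}} &= \frac{SS_{\text{resid}}}{df_{\text{resid}}} = \frac{29690}{95} \approx 312.5, \\
F &= \frac{MS_{\text{name}}}{MS_{\text{resid}}} = \frac{106.9}{312.5} \approx 0.342.
\end{aligned}
```

An F value well below 1 means the variation between name groups is smaller than the variation within them, which is consistent with the large p-value of 0.849.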


OUTPUT SCREENSHOTS

Fig 1


Fig 2

Fig 3


Fig 4

Fig 5


CONCLUSION

In this project, we examined student academic performance. We began by simulating a group of
students with varied math and science scores, then cleaned the data by removing records with
missing values so that it could be analysed reliably.

First, we computed descriptive statistics for the scores: the average math, science, and total
scores, and the highest total score.

Then, we used bar graphs, box plots, and a scatter plot to display the math and science scores.
These helped show how scores compare between the two subjects and whether the subjects are related.

We also fitted a multiple linear regression model that predicts a student's total score from the
math and science scores. Models of this kind can help teachers and schools plan better.
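
One property of this regression is worth illustrating: because the total score is defined as the sum of the math and science scores, the fitted model reproduces that definition rather than estimating an unknown relationship. The sketch below uses a small hypothetical dataset (`toy`, 30 rows, not the report's data) to show the idea. The seed and sample size are arbitrary assumptions.

```r
# Hypothetical toy data (not the report's dataset): total is the exact sum
set.seed(1)
toy <- data.frame(
  math_score = sample(40:95, 30),
  science_score = sample(40:100, 30)
)
toy$total_score <- toy$math_score + toy$science_score

# Same model form as in the report
fit <- lm(total_score ~ math_score + science_score, data = toy)

# Coefficients are numerically close to intercept 0 and slopes 1 and 1
print(round(coef(fit), 6))

# Residuals are zero up to floating-point error
print(max(abs(residuals(fit))))
```

For a model whose fit carries information, the response would need to be something not computed directly from the predictors, for example a score in a third subject.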

Beyond summary numbers, we plotted individual bar graphs showing how scores differ from
student to student, which makes individual patterns easier to see.

Lastly, we used ANOVA tests to check whether math and science scores differ significantly across student names.
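
As a rough cross-check of the ANOVA table in the OUTPUT section, the p-value can be recomputed from the printed F value and degrees of freedom using base R's `pf()`. This is a minimal sketch: the three input numbers are copied from that table, and the 0.05 significance level is an assumption, since the report does not state one.

```r
# Values copied from the ANOVA table in the OUTPUT section
f_value <- 0.342
df_name <- 4
df_resid <- 95

# Upper-tail probability of the F distribution at the observed F value
p_value <- pf(f_value, df1 = df_name, df2 = df_resid, lower.tail = FALSE)
print(round(p_value, 3))

# Assumed significance level; the report does not specify one
alpha <- 0.05
significant <- p_value < alpha
print(significant)
```

The recomputed value should agree with the Pr(>F) column up to rounding. A p-value this far above the assumed threshold gives no evidence that scores differ by name, which is expected because names were assigned at random in the simulation.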

In summary, the project showed that data analysis and visualisation can help in understanding
how students are performing. This can support teachers and schools in making better decisions to
help students succeed. There is more to explore, and this project is only a starting point.


