Major Project Report
On
Course Recommendation System
Using Machine Learning
Submitted in partial fulfilment for
Internship
in
Department of Computer Science and Engineering
NIT Patna
Submitted by
Kamran Farogh (201900252)
B.Tech
Computer Science Engineering
Sikkim Manipal Institute of Technology
Under the supervision of
External Supervisor
Md. Tanwir Uddin Haider
Associate Professor
Department of Computer Science and Engineering
National Institute of Technology Patna
May 2023
CERTIFICATE
Department of Computer Science and Engineering
National Institute of Technology Patna
This is to certify that Kamran Farogh (201900252), B.Tech
Computer Science Engineering, Sikkim Manipal Institute of
Technology, Sikkim, has carried out the 8th-semester Internship
Major Project entitled “Course Recommendation System
using Machine Learning” under the supervision of Md. Tanwir
Uddin Haider, Associate Professor, Department of Computer
Science and Engineering, NIT Patna.
……………………………..
Md. Tanwir Uddin Haider
Associate Professor
CSE Department
NIT Patna
DECLARATION
I hereby declare that this 8th-semester Internship Major Project
entitled “Course Recommendation System using Machine
Learning” has been carried out by me in the Department of
Computer Science and Engineering of National Institute of
Technology Patna under the guidance of Md. Tanwir Uddin
Haider, Associate Professor, Department of Computer Science
and Engineering, NIT Patna. No part of this work has been
submitted to any institute other than Sikkim Manipal
Institute of Technology.
Name Signature
1. Kamran Farogh (201900252) ……………………………
B.Tech
Computer Science Engineering
Sikkim Manipal Institute of Technology
Place: NIT Patna Date: ……………………..
LIST OF CONTENTS
Certificate
Declaration
Acknowledgement
SYNOPSIS
1. INTRODUCTION
1.1 Overview
1.2 Importance
2. LITERATURE SURVEY
2.1 Challenges
3. METHODOLOGY
3.1 Proposed Framework
3.2 Framework Modules
3.3 Technology Used
4. CONCLUSION AND FUTURE WORK
REFERENCES
ACKNOWLEDGEMENT
I express my deepest gratitude towards my project guide Md. Tanwir
Uddin Haider, Associate Professor, Department of Computer Science and
Engineering, National Institute of Technology Patna, for his valuable
suggestions, insightful criticism and direction during the development
of this project. He has always encouraged me to come out of my shell
and do something innovative despite my limitations. I wish to convey my
sincere gratitude to the Head of Department and all the faculty members
of the Computer Science and Engineering Department who have
enlightened me during my studies. The facilities and cooperation received
from the technical staff of the Department of Computer Science and
Engineering are thankfully acknowledged.
Date: 02-05-2023
…………………………………..
Kamran Farogh (201900252)
B.Tech
Computer Science
Engineering
Sikkim Manipal Institute of
Technology
SYNOPSIS
E-learning has recently gained wide appeal as an alternative to
conventional classroom-based education. The resulting abundance of online
courses and content complicates learners' decisions when they select
courses suited to their individual needs.
This study proposes using data analytics to develop a personalized
course recommendation system. The system analyses an individual's
academic background and personal interests to provide recommendations
tailored to their objectives, and informs students of the prerequisites
before they enrol in a course. It also points students to supplementary
materials and resources that can improve the learning experience.
The proposed recommendation system therefore has considerable potential
to streamline students' course selection and enable fulfilling,
personalized learning paths. With data analytics playing an integral part,
tailored course suggestions for each student mark a notable step in the
growth of e-learning, giving learners the opportunity to engage more fully
with their online studies. We used a course dataset from Kaggle as raw
data. After importing the raw data, we cleaned it to correct errors;
cleaning produces consistent data that can be used for further analysis
and modeling. We then derive the links between entities and relationships
and vectorize the data, translating it from its native format into
numerical vectors that machine learning algorithms can process. We use
cosine and linear similarity to quantify the similarity between vectors,
and the resulting scores are used to generate recommendations. We also
perform exploratory data analysis (EDA), using visualization and
statistics to summarize the data and identify patterns and relationships.
Finally, Flask is used to build a web application with an easy-to-use
interface so that users of any age group can use it without difficulty.
Chapter 1
INTRODUCTION
1.1 Overview
The popularity of e-learning as an alternative to traditional
classroom learning has risen sharply in the past few years. The
COVID-19 pandemic changed the education system significantly:
students from all around the world shifted from classroom learning to
online learning platforms. Even now that the pandemic is over and
everything is going back to normal, many students still prefer
e-learning platforms. These platforms reduce the time learners spend
searching for learning content, provide recommendations that align
with their goals and interests, and increase learners' interest.
E-learning platforms also have a flaw, however: users tend to
encounter information overload because of the large number of courses
with similar specifications.
1.2 Importance
The purpose of this project is to develop a recommendation system that
suggests courses similar to the course the user requires, ranked by the
number of subscribers, the level of the course and the rating of the
course. The recommendation model uses the provided course information
to suggest the best-matching courses, which helps the user to complete
the desired course with ease.
There are three main types of filtering methods: collaborative
filtering, content-based filtering, and hybrid filtering. A
collaborative filtering model recommends items such as articles,
videos or products according to the preferences of users who are
similar to the active user. A content-based filtering model customizes
content for a user based on what that user has previously learned or
preferred. Hybrid filtering combines the two approaches, drawing on
information about both learners and learning objects. We focus on
applying the hybrid filtering method to improve the quality of
e-learning recommendations.
The easy-to-use interface lets users of any age group operate the
model without any trouble.
Chapter 2
LITERATURE SURVEY
[1] Klašnja-Milićević, A., Ivanović, M., Vesin, B., & Budimac, Z.
(2017). Enhancing e-learning systems with personalized
recommendation based on collaborative tagging techniques.
The paper discusses the use of collaborative tagging systems to enhance
recommender systems in e-learning. The authors propose using tag-
based recommendations to personalize the learning experience for
individual learners. They analyze different techniques for applying tag-
based recommendations and modify the most appropriate model
ranking to gain the most efficient recommendation results. They also
propose reducing tag space with clustering techniques based on a
learning style model to improve execution time and decrease memory
requirements while preserving the quality of the recommendations. The
authors evaluate the reduced model for providing tag-based
recommendations in a programming tutoring system. Overall, the paper
suggests that collaborative tagging systems can provide enhancements
to recommender systems in e-learning.
[2] Dwivedi, P., Kant, V., & Bharadwaj, K. K. (2017). Learning path
recommendation based on modified variable length genetic
algorithm. Education and Information Technologies.
The paper presents a personalized learning path recommendation
system for e-learners using a variable length genetic algorithm. The
system considers learners' preferences such as knowledge levels,
learning styles, and emotions to recommend a sequence of learning
materials in an appropriate order with a starting and ending point. The
proposed approach has been evaluated through experiments on a dataset
collected through a survey. The results show that the system is effective
in recommending personalized learning paths for individuals.
[3] George, G., & Lal, A. M. (2019). Review of ontology-based
recommender systems in e-learning. Computers & Education.
The paper discusses the use of ontology in recommender systems for
personalized resource recommendation in the e-learning domain. The
paper concludes that ontology-based recommender systems show more
promising results in generating recommendations for e-learning as
compared to conventional recommender systems used in e-commerce
and other domains. Ontology-based recommender systems allow for
more detailed information about the learner and learning objects to be
factored in, which leads to more relevant recommendations. However,
like any technique, ontology has its pros and cons.
[4] Kolekar, S. V., Pai, R. M., & M. M., M. P. (2018). Rule based
adaptive user interface for adaptive E-learning system.
The paper proposes a generic approach to provide learning contents
with Adaptive User Interface (AUI) components based on the learning
styles of the learners. The approach defines generic rules that are
generated automatically for any online course with adaptive contents.
An experiment conducted with engineering students on a particular
online course shows that the user interface components and contents
adapt well to the learners' learning styles. The result indicates that
the proposed approach can provide customized resources and interfaces
to learners, which can improve their learning experience.
[5] Khaled, A., Ouchani, S., & Chohra, C. (2018). Recommendations-
based on semantic analysis of social networks in learning
environments.
The paper presents a framework called iLearn that uses the semantic
web and social networks to improve the learning process and
recommendation quality. iLearn is a fully automatic learning platform
that helps tutors understand the limitations of students and provides
appropriate resources and guidance. The platform uses two ontologies,
one for understanding the feelings of users towards resources and
recommendations, and the other for categorizing different resources.
iLearn also uses semantic analysis algorithms to detect different
learning communities and provide appropriate recommendations. The
paper concludes that iLearn advances the state of the art in crowd
intelligence and anticipatory computing.
2.1 Challenges
2.1.1 Cleaning and identifying the link between relations and entity
as per user needs
The existing system does not tell users what foundational knowledge a
course requires before they apply for it. Our system therefore first
asks the user which course he/she wants to study and generates
recommendations related to that course, so that the user can complete
the desired course without facing any trouble.
2.1.2 Developing fixed path recommendations for the course
The existing system does not limit the number of recommendations: it
recommends far too many courses to the user, which might lead to
information overload. Our system therefore generates the top 6
recommendations as per the user's needs, so that he/she can choose
among courses that are top rated by other users when deciding which
course to study.
Chapter 3
METHODOLOGY
3.1 Proposed Framework
A framework consisting of seven modules has been proposed to develop
this project, as shown in Fig: 1
Fig 1: The proposed framework of the project
3.2 Framework Modules
3.2.1 Input Raw Data
Raw data refers to unprocessed and unanalyzed data that must be cleaned,
organized, and analyzed before meaningful insights can be extracted from
it. In its unrefined state, raw data is typically disorderly, which makes
it difficult to interpret and to draw conclusions from. We used a course
dataset from Kaggle as raw data. This data consists of 18 columns
(unnamed, course_title, url, is_paid, price, num_subscribers, num_reviews,
num_lectures, level, content_duration, published_timestamp, subject,
profit, published_date, published_time, year, month, day) as shown in
Fig: 2.
Fig 2: Snapshot of raw data
3.2.2 Data Preprocessing Module
3.2.2.1 Data Cleaning
The data cleaning process identifies and corrects errors in the data.
This contributes to the data's accuracy and dependability. It produces
consistent data that may be utilized for additional analysis and modelling.
The cleaned data consists of 11 columns (course_title, url,
num_subscribers, num_reviews, num_lectures, level, content_duration,
published_timestamp, subject, year, clean_title) as shown in Fig: 3
Fig 3: Snapshot of clean data
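The cleaning step described above can be illustrated with a minimal sketch. It is not the project's actual code: the stop-word list and the helper names clean_title and clean_rows are assumptions made for illustration, and only the column names course_title and clean_title come from the report.

```python
import re

# Hypothetical stop-word list (an assumption); a real project would use a larger one.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}


def clean_title(title: str) -> str:
    """Lower-case a title and drop punctuation and stop words."""
    # Replace every character that is not a letter, digit or whitespace with a space.
    text = re.sub(r"[^a-z0-9\s]", " ", title.lower())
    words = [word for word in text.split() if word not in STOP_WORDS]
    return " ".join(words)


def clean_rows(rows):
    """Drop rows without a title, drop duplicate titles, and add clean_title."""
    seen = set()
    cleaned = []
    for row in rows:
        title = (row.get("course_title") or "").strip()
        if not title or title in seen:
            continue  # skip missing or repeated titles
        seen.add(title)
        cleaned.append({**row, "course_title": title,
                        "clean_title": clean_title(title)})
    return cleaned
```

Each row is assumed to be a dictionary keyed by column name, for example as produced by csv.DictReader.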
3.2.2.2 Entity Relationship
Entities refer to the items we are interested in examining. Relationships
are the links between entities. Modelling entities and relationships aids
in the development of reliable models capable of making forecasts and
suggestions. As shown in Fig: 4 the link between course and subject has
been formed.
Fig 4: Snapshot of developed Entity and relationship
3.2.3 Vectorization of Data
Vectorization is the act of translating data from its native format, which in
this case is text, into numerical vectors that machine learning algorithms
can understand. This enables the algorithms to handle and analyze data
more effectively, as well as conduct operations like clustering and
classification. Fig: 5 shows the vectorization of data.
Fig: 6 Snapshot of output of performed vectorization of the data
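One simple way to turn text into numerical vectors is a bag-of-words count, sketched below in plain Python. This is an illustrative assumption about the kind of vectorization involved, not the project's implementation; the function names and the three sample titles are hypothetical.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map each distinct word to a column index, in sorted order."""
    words = sorted({word for doc in documents for word in doc.split()})
    return {word: index for index, word in enumerate(words)}


def vectorize(document, vocabulary):
    """Return the term-count vector of one document."""
    counts = Counter(document.split())
    vector = [0] * len(vocabulary)
    for word, count in counts.items():
        if word in vocabulary:  # words outside the vocabulary are ignored
            vector[vocabulary[word]] = count
    return vector


# Hypothetical cleaned titles used only to demonstrate the idea.
titles = ["learn python web development", "learn piano", "python python basics"]
vocabulary = build_vocabulary(titles)
vectors = [vectorize(title, vocabulary) for title in titles]
```

Every title becomes a vector of the same length, with one position per vocabulary word, so any two titles can be compared numerically.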
3.2.4 Metric Module
3.2.4.1 Cosine Similarity
Cosine similarity assesses the similarity of two vectors. It computes the
cosine of the angle formed by the two vectors and returns a number between
-1 and 1. A value of 1 indicates that the vectors point in the same
direction, 0 indicates that they are orthogonal (for text, no terms in
common), and -1 indicates that they point in opposite directions. For
non-negative vectors such as term counts, the value lies between 0 and 1.
It is frequently employed in text analysis and recommendation systems.
Fig: 7 shows the output generated by the cosine similarity model.
Fig: 7 Snapshot of output of performed cosine similarity
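The measure can be written as cos(a, b) = (a . b) / (|a| |b|). A minimal plain-Python sketch follows; the function name and the choice to return 0.0 for an all-zero vector are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # The angle is undefined for a zero vector; 0.0 is a convention chosen here.
        return 0.0
    return dot / (norm_a * norm_b)
```

Because the result depends only on direction, a short title and a long title that use the same words in the same proportions score close to 1.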
3.2.4.2 Linear Similarity
Linear similarity quantifies the similarity of two sets of data. It works
by establishing the linear relationship between the two sets of data and
measuring how well they match. This method is widely used in picture and
audio recognition, recommendation systems, and natural language processing.
Fig: 8 shows the output generated by the linear similarity model.
Fig: 8 Snapshot of output of performed linear similarity
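The report does not define linear similarity precisely. The sketch below assumes it refers to a linear kernel, that is, the plain dot product of two vectors; this interpretation and the function names are assumptions, not something the report confirms.

```python
import math


def linear_similarity(a, b):
    """Dot product of two equal-length vectors (a linear kernel, by assumption)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)
```

Under this assumption, the dot product of two unit-length vectors equals the cosine of the angle between them, which is why a linear kernel is often applied to normalized text vectors.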
3.2.5 Exploratory Data Analysis (EDA)
EDA analyzes and summarizes datasets. It helps to identify patterns,
relationships, and irregularities in data, as well as to determine which
variables are important. EDA involves the use of visualizations, statistics,
and other methods to gain insight into data and prepare it for modeling.
Fig: 9 shows the statistics developed by the EDA model: the number of
subscribers domain wise and the number of courses level wise. Fig: 10
shows the number of subscribers enrolled per subject category.
Fig: 9 Snapshot of statistics generated by EDA model
Fig: 10 Snapshot of statistics generated by EDA model
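The two summaries described for the EDA step (subscribers per subject and courses per level) amount to simple group-by aggregations. The sketch below shows the idea in plain Python; the function names are hypothetical, and rows are assumed to be dictionaries keyed by the column names subject, level and num_subscribers from the dataset.

```python
from collections import Counter, defaultdict


def subscribers_per_subject(rows):
    """Sum num_subscribers for each subject."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["subject"]] += int(row["num_subscribers"])
    return dict(totals)


def courses_per_level(rows):
    """Count how many courses exist at each level."""
    return dict(Counter(row["level"] for row in rows))
```

The resulting dictionaries can be passed directly to a plotting library to draw bar charts of the kind the figures show.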
3.2.6 Recommend
This module displays the recommendations generated by the recommendation
model as per the user's needs so that the user can choose the specific
course he/she wants to learn. Fig: 11 shows the recommendations generated
by the model.
Fig: 11 Snapshot of recommendations generated by the model
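Selecting the recommendations can be sketched as ranking all courses by their similarity to the requested course and keeping the best few. The function below is an illustrative assumption rather than the project's code; the default of 6 follows the top 6 recommendations mentioned in Section 2.1.2, and the names recommend, titles, scores and query_index are hypothetical.

```python
def recommend(titles, scores, query_index, top_n=6):
    """Return the top_n (title, score) pairs most similar to the query course.

    scores[i] is the similarity between the query course and course i.
    """
    # Exclude the query course itself, which is trivially the most similar.
    candidates = [(score, index) for index, score in enumerate(scores)
                  if index != query_index]
    # Highest score first; ties keep the original dataset order.
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(titles[index], score) for score, index in candidates[:top_n]]
```

Capping the list at a small fixed number is what keeps the output from overwhelming the user with near-identical courses.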
3.2.7 Display
The course that the user selects from the recommendations is opened on
the user's device by redirecting to the course website. Fig: 12 shows
the selected course after the redirect to the course website.
Fig: 12 Snapshot of redirection to selected course
3.3 Technology Used
3.3.1 Jupyter Notebook:
o Jupyter Notebook (formerly known as IPython Notebook) is a free
and open-source web application that lets you create and share
documents with live code, equations, visualisations, and narrative text.
o Jupyter Notebook is used for data analysis, scientific computing, and
machine learning, and supports numerous programming languages,
including Python, R, Julia, and Scala.
o Jupyter Notebook documents are referred to as "notebooks" and are
divided into cells. Each cell can contain code, Markdown text, or raw
text.
o Code cells allow you to write and run code interactively, whereas
Markdown cells allow you to enter styled text in Markdown.
Unformatted material is written in raw cells.
3.3.2 Flask:
o Flask is a micro web framework written in Python that lets
developers build web applications quickly.
o It is called "micro" because it has a small footprint and supplies
only the essential features needed to build a web app.
o Flask relies on the Werkzeug WSGI toolkit and the Jinja2 template
engine. It supports HTTP requests and responses, routing, session
management, and templating.
o The strength of Flask lies in its simplicity and flexibility.
o It does not enforce any specific project layout or require
specific libraries, which gives developers full freedom to structure
applications according to their preferences.
o Flask is also lightweight, which makes it a good option for
developing web applications.
3.3.3 Jinja2:
o Jinja2 is a widely used templating engine for Python web
applications.
o It lets developers build dynamic web pages by combining static
template files with dynamically generated content.
o Its syntax is similar to Python and supports template inheritance,
loops, macros, and conditional constructs.
Chapter 4
Conclusion and Future Work
Course Recommendation System using Machine Learning is a web app that
helps users find the course that best meets their needs. It provides a
recommendation system that suggests foundational and related courses
similar to the required course, so that the user does not face any
difficulty while completing the course. The objective of the project is to
minimize the time taken to find a suitable course by suggesting courses
which are top rated and recommended.
In future, the software can be enhanced in several ways. The
recommendation model could generate learning patterns based on user
feedback. Users could upload their own courses for other users to access.
Cross-platform support could be added, along with a native-app-like feel
using React. The app could also be extended to generic courses; for now it
focuses on four subjects only, i.e. Finance, Graphic Designing, Musical
Instruments and Web Development.
REFERENCES
[1] Klašnja-Milićević, A., Ivanović, M., Vesin, B., & Budimac, Z.
(2017). Enhancing e-learning systems with personalized
recommendation based on collaborative tagging techniques. Applied
Intelligence, 48(6), 1519–1535. doi:10.1007/s10489-017-1051-8.
[2] Dwivedi, P., Kant, V., & Bharadwaj, K. K. (2017). Learning path
recommendation based on modified variable length genetic
algorithm. Education and Information Technologies, 23(2), 819–
836. doi:10.1007/s10639-017-9637-7.
[3] George, G., & Lal, A. M. (2019). Review of ontology-based
recommender systems in e-learning. Computers & Education,
103642. doi:10.1016/j.compedu.2019.103642.
[4] Kolekar, S. V., Pai, R. M., & M. M., M. P. (2018). Rule based
adaptive user interface for adaptive E-learning system. Education
and Information Technologies. doi:10.1007/s10639-018-9788-1.
[5] Khaled, A., Ouchani, S., & Chohra, C. (2018). Recommendations-
based on semantic analysis of social networks in learning
environments. Computers in Human Behavior.
doi:10.1016/j.chb.2018.08.051.