
RS Notes1

Recommender systems are machine learning tools designed to predict user preferences and suggest relevant products or content, significantly enhancing user experience and business revenue. They can be categorized into content-based, collaborative filtering, knowledge-based, and hybrid models, each with distinct advantages and challenges. Hybrid models, which combine multiple techniques, are increasingly favored for their ability to improve recommendation accuracy and user satisfaction.

Uploaded by

manya.intellect

A recommendation system (or recommender system) is a class of machine learning system that uses
data to help predict, narrow down, and find what people are looking for among an exponentially
growing number of options.

Nowadays, people buy products online more than from stores. Previously, people bought
products based on the reviews given by relatives or friends, but now that the options have increased
and we can buy anything digitally, people need assurance that a product is good and that they will
like it. Recommender systems were built to give that confidence in buying products.

Uses of Recommender systems:

1) Recommender systems are one of the most successful and widespread applications of
machine learning technologies in business.
2) Recommendation systems help to increase the business revenue and help customers to buy
the most suitable product for them.

What is a Recommendation engine?

Recommendation engines filter out the products that a particular customer would be interested
in or would buy based on his or her previous buying history. The more data available about a
customer the more accurate the recommendations.
But if the customer is new this method will fail as we have no previous data for that customer.
So, to tackle this issue different methods are used; for example, often the most popular
products are recommended. These recommendations are not very accurate, as they are
not customer dependent and are the same for all new customers.
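
The popularity fallback described above can be sketched in a few lines. This is a minimal illustration, not a production method: the `(customer_id, product)` purchase log, the function names, and the alphabetical tie-breaking are all assumptions made for the example.

```python
from collections import Counter

def most_popular(purchases, top_n=3):
    """Return the top_n most frequently purchased products.

    purchases is a list of (customer_id, product) pairs (a hypothetical log format);
    ties are broken alphabetically so the result is deterministic.
    """
    counts = Counter(product for _, product in purchases)
    ranked = sorted(counts, key=lambda product: (-counts[product], product))
    return ranked[:top_n]

def recommend(customer_id, purchases, top_n=3):
    """Fall back to popular products when the customer has no purchase history."""
    history = {product for cid, product in purchases if cid == customer_id}
    if not history:
        return most_popular(purchases, top_n)
    # A personalised method would go here; this sketch only hides owned products.
    candidates = most_popular(purchases, len(purchases))
    return [product for product in candidates if product not in history][:top_n]

purchases = [
    ("u1", "phone"), ("u2", "phone"), ("u2", "case"),
    ("u3", "phone"), ("u3", "charger"), ("u1", "charger"),
]
new_customer_recs = recommend("new_customer", purchases, top_n=2)
```

Every new customer receives the same list here, which is exactly the limitation the paragraph points out.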

Some businesses ask new customers their interests so that they can recommend more
precisely.

Content-based filtering

This filtering is based on the description or other data provided for each product. The system
finds the similarity between products based on their content or description. The user’s previous
history is taken into account to find similar products the user may like.

For example, if a user likes movies such as ‘Mission Impossible’ then we can recommend other
movies starring ‘Tom Cruise’ or movies in the ‘Action’ genre.
In this filtering, two types of data are used. The first is data about the user: likes, interests,
personal information such as age and, sometimes, the user’s history. This data is represented by
the user vector. The second is information about the products, known as the item vector. The item
vector contains the features of all items, from which the similarity between them can be calculated.
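
To make the user vector and item vector concrete, here is a minimal sketch using NumPy. The three feature columns and their values are invented for illustration, and using the liked movie's own vector as the user vector is a simplifying assumption.

```python
import numpy as np

# Hypothetical item vectors: one row per movie.
# Columns (assumed for this sketch): [action, romance, stars_tom_cruise]
item_names = ["Mission Impossible", "Top Gun", "The Notebook", "John Wick"]
item_vectors = np.array([
    [1.0, 0.0, 1.0],
    [1.0, 0.5, 1.0],
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
])

def cosine(a, b):
    """Cosine of the angle between two non-zero vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def recommend(user_vector, seen, top_n=2):
    """Rank items the user has not seen by cosine similarity to the user vector."""
    scores = [
        (name, cosine(user_vector, vector))
        for name, vector in zip(item_names, item_vectors)
        if name not in seen
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]

# Simplification: the user vector is the vector of the one movie the user liked.
user_vector = item_vectors[0]
ranked = recommend(user_vector, seen={"Mission Impossible"})
```

With richer profiles, the user vector would typically combine several liked items and explicit interests rather than copy a single item.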

Advantages

1. The user gets recommended the types of items they love.

2. The user is satisfied by the type of recommendation.

3. New items can be recommended; just data for that item is required.

Disadvantages

1. The user will never be recommended different types of items.

2. Business cannot be expanded as the user does not try a different type of product.

3. If the user matrix or item matrix is changed the cosine similarity matrix needs to be
calculated again.

Collaborative filtering

Two types of collaborative filtering techniques are used:

1. User-User collaborative filtering

2. Item-Item collaborative filtering


User-User collaborative filtering

In this approach, each user vector holds the ratings the user has given to the items they have
purchased. Similarity between users is stored in an n × n matrix, where n is the number of users,
and is computed with the same cosine similarity formula. A recommendation score is then computed
for every item the target user has not yet rated: each rating given to the item by another user is
multiplied by the similarity between that user and the target user, and the products are summed.
The scores are sorted in descending order, and the top items are recommended to the target user.

If a new user arrives, or an existing user changes a rating or provides new ratings, the
recommendations may change.
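
The user-user procedure can be sketched with NumPy as follows. The ratings matrix is made up, a rating of 0 is assumed to mean "not rated", and the function names are hypothetical.

```python
import numpy as np

# Hypothetical ratings: rows are users, columns are items, 0 means "not rated".
ratings = np.array([
    [5.0, 4.0, 0.0, 0.0],   # user 0 (the target user)
    [5.0, 5.0, 4.0, 1.0],   # user 1
    [1.0, 0.0, 2.0, 5.0],   # user 2
])

def user_similarity(r):
    """n x n matrix of cosine similarities between the users' rating rows."""
    norms = np.linalg.norm(r, axis=1, keepdims=True)
    unit = r / np.where(norms == 0, 1.0, norms)   # avoid dividing by zero
    return unit @ unit.T

def recommend_user_based(user, r, top_n=2):
    """Score unrated items by the similarity-weighted sum of other users' ratings."""
    similarity = user_similarity(r)[user].copy()
    similarity[user] = 0.0                 # ignore the user's own ratings
    scores = similarity @ r                # one score per item
    unseen = np.flatnonzero(r[user] == 0)
    ranked = unseen[np.argsort(-scores[unseen])]
    return [int(item) for item in ranked[:top_n]]

top_items = recommend_user_based(0, ratings)
```

Many implementations also divide each score by the sum of the similarities used, which turns it into a predicted rating; the ranking step stays the same.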

Item-Item collaborative filtering
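
In item-item filtering, similarity is computed between items (the columns of the ratings matrix) instead of between users, and a user is recommended items similar to those they already rated highly. A minimal sketch, assuming a made-up ratings matrix in which 0 means "not rated" and hypothetical function names:

```python
import numpy as np

# Hypothetical ratings: rows are users, columns are items, 0 means "not rated".
ratings = np.array([
    [5.0, 4.0, 0.0, 0.0],   # user 0 (the target user)
    [4.0, 5.0, 5.0, 1.0],
    [5.0, 4.0, 4.0, 0.0],
    [1.0, 0.0, 1.0, 5.0],
])

def item_similarity(r):
    """m x m matrix of cosine similarities between the items' rating columns."""
    columns = r.T
    norms = np.linalg.norm(columns, axis=1, keepdims=True)
    unit = columns / np.where(norms == 0, 1.0, norms)
    return unit @ unit.T

def recommend_item_based(user, r, top_n=2):
    """Score each unrated item by its similarity to the items the user has rated."""
    similarity = item_similarity(r)
    np.fill_diagonal(similarity, 0.0)      # an item should not vote for itself
    scores = r[user] @ similarity          # the user's ratings weighted by similarity
    unseen = np.flatnonzero(r[user] == 0)
    ranked = unseen[np.argsort(-scores[unseen])]
    return [int(item) for item in ranked[:top_n]]

top_items = recommend_item_based(0, ratings)
```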


Advantages

1. New products can be introduced to the user.

2. Business can be expanded and can popularise new products.

Disadvantages

1. User’s previous history is required or data for products is required based on the type of
collaborative method used.

2. The new item cannot be recommended if no user has purchased or rated it.

Both recommendation algorithms have their advantages and disadvantages. To make more accurate
recommendations, a hybrid recommendation algorithm is now commonly used; that is, products are
recommended using content-based and collaborative filtering together. Combining the two lets each
method offset the other's weaknesses, such as the lack of variety in content-based filtering and
the new-item problem in collaborative filtering.

Goals of Recommender systems

Recommender systems are designed to provide personalized recommendations to users, helping
them discover relevant content, products, or services. The goals of recommender systems include:

1. Personalization

• Goal: Tailor recommendations to individual users based on their preferences, behaviour, and
history.

• Purpose: Enhance the user experience by providing relevant and customized suggestions.

2. User Engagement

• Goal: Increase user interaction and engagement with a platform.

• Purpose: By presenting users with content or products that interest them, recommender
systems keep users active and engaged, increasing the time they spend on the platform.

3. Sales and Revenue Growth

• Goal: Boost sales and revenue by recommending products or services that users are likely to
purchase.

• Purpose: For e-commerce platforms, this directly translates to increased conversion rates
and higher sales.

4. Discovery and Exploration

• Goal: Help users discover new and diverse content or products that they might not find on
their own.

• Purpose: Encourage users to explore beyond their usual preferences, leading to a richer user
experience.

5. Customer Retention

• Goal: Improve user loyalty and retention by consistently providing value through relevant
recommendations.

• Purpose: Satisfied users are more likely to return to a platform, increasing customer lifetime
value.

6. Efficiency

• Goal: Reduce the time and effort users spend searching for content or products.

• Purpose: Streamline the user journey by quickly presenting options that align with the user’s
needs and interests.

7. User Satisfaction

• Goal: Enhance overall user satisfaction by meeting or exceeding user expectations with
relevant recommendations.

• Purpose: Satisfied users are more likely to have a positive perception of the platform,
contributing to positive word-of-mouth and brand loyalty.

8. Supporting Decision Making

• Goal: Aid users in making informed decisions by presenting them with options that align with
their preferences and needs.

• Purpose: Whether it’s choosing a movie, book, or product, recommender systems help users
make choices more confidently.

9. Content Utilization

• Goal: Maximize the use of available content or products by effectively distributing attention
across the inventory.

• Purpose: Ensure that the platform’s offerings are fully utilized, preventing certain items from
being overlooked.

Recommender systems aim to balance these goals, optimizing for a positive user experience and
business outcomes. The specific focus may vary depending on the application context, such as e-
commerce, content streaming, social media, or other domains.

Basic models of RS

Recommender Systems (RS) rely on various models to suggest items to users. Here are some of the
basic models used in recommender systems:

1. Collaborative Filtering

• Overview: Collaborative Filtering (CF) recommends items based on the preferences of similar
users or the similarity between items.

• Types:
o User-Based Collaborative Filtering: Recommends items by finding users similar to
the target user and suggesting items they liked.

o Item-Based Collaborative Filtering: Recommends items by finding items similar to
those the target user has liked.

• Example: If User A and User B have similar taste in movies, and User A likes a movie that
User B hasn't seen, that movie will be recommended to User B.

• Strengths:

o Works well when user preferences are well documented.

o Doesn't require detailed information about the items.

• Weaknesses:

o Struggles with cold start problems (new users or new items with little data).

o Can suffer from sparsity issues if the user-item interaction matrix is large and sparse.

2. Content-Based Filtering

• Overview: Content-Based Filtering recommends items by analysing the characteristics of
items that a user has liked in the past.

• How It Works:

o It uses item features (like genre, director, and actors for movies) to suggest similar
items to those a user has interacted with.

• Example: If a user likes science fiction books, the system will recommend other science
fiction books based on genre, author, or other metadata.

• Strengths:

o Does not require data from other users.

o Works well for new items, overcoming the cold start problem for items.

• Weaknesses:

o Requires detailed item features.

o May lack diversity, leading to "filter bubbles" where the user is only exposed to
similar types of content.

3. Hybrid Models

• Overview: Hybrid models combine multiple recommendation techniques to leverage their
strengths and mitigate their weaknesses.

• Types of Hybridization:

o Weighted Hybrid: Combines scores from different models by assigning weights to
each model.

o Switching Hybrid: Switches between different models based on certain criteria (e.g.,
user data availability).

o Feature Combination: Combines content-based and collaborative filtering features
into a single model.

o Cascade: Applies one model first and then refines the recommendations using
another model.

o Mixed Hybrid: Presents results from different recommenders together.

• Strengths:

o Can improve recommendation accuracy by addressing weaknesses of individual
models.

o Offers flexibility in adapting to different domains and user needs.

• Weaknesses:

o Can be more complex to implement and tune.

o May require more computational resources.
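
Two of the hybridization types listed above, weighted and switching, can be sketched in plain Python. The score dictionaries, the 0.5 weight, and the five-rating threshold are illustrative assumptions, not values from any particular system.

```python
def weighted_hybrid(cf_scores, cb_scores, cf_weight=0.7):
    """Blend two score dictionaries; an item missing from one model scores 0 there."""
    items = set(cf_scores) | set(cb_scores)
    return {
        item: cf_weight * cf_scores.get(item, 0.0)
        + (1.0 - cf_weight) * cb_scores.get(item, 0.0)
        for item in items
    }

def switching_hybrid(n_user_ratings, cf_scores, cb_scores, min_ratings=5):
    """Use collaborative scores only once the user has rated enough items."""
    return cf_scores if n_user_ratings >= min_ratings else cb_scores

# Hypothetical scores from a collaborative model (cf) and a content-based model (cb).
cf = {"A": 0.9, "B": 0.2}
cb = {"A": 0.4, "B": 0.8, "C": 0.6}

blended = weighted_hybrid(cf, cb, cf_weight=0.5)
best_item = max(blended, key=blended.get)
```

In practice the weight and the switching threshold would be tuned on held-out data, which is part of the extra tuning effort noted under the weaknesses.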

4. Matrix Factorization

• Overview: Matrix Factorization is a collaborative filtering technique that factors the user-
item interaction matrix into lower-dimensional matrices representing users and items.

• How It Works:

o Decomposes the user-item matrix into two lower-dimensional matrices: one
representing users and the other representing items.

o The product of these matrices approximates the original interaction matrix, helping
to predict missing interactions.

• Example: Techniques like Singular Value Decomposition (SVD) were popularized for
recommendation by the Netflix Prize competition.

• Strengths:

o Efficiently handles large, sparse datasets.

o Can capture latent factors that explain user preferences.

• Weaknesses:

o Can be computationally intensive.

o Requires careful tuning of parameters.
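
The decomposition can be illustrated with a small stochastic gradient descent loop. This is one common way to fit the two factor matrices, shown as a sketch: the ratings, the rank, the learning rate, and the regularization strength are all assumed values, and entries equal to 0 are treated as missing.

```python
import numpy as np

def factorize(ratings, rank=2, epochs=1000, lr=0.01, reg=0.02, seed=0):
    """Learn user factors P and item factors Q so that P @ Q.T fits observed ratings.

    Entries equal to 0 are treated as missing (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    P = rng.normal(scale=0.1, size=(n_users, rank))
    Q = rng.normal(scale=0.1, size=(n_items, rank))
    observed = np.argwhere(ratings > 0)
    for _ in range(epochs):
        for u, i in observed:
            error = ratings[u, i] - P[u] @ Q[i]
            p_old = P[u].copy()
            P[u] += lr * (error * Q[i] - reg * P[u])
            Q[i] += lr * (error * p_old - reg * Q[i])
    return P, Q

def rmse(ratings, P, Q):
    """Root mean squared error over the observed (non-zero) entries only."""
    mask = ratings > 0
    return float(np.sqrt(np.mean((ratings - P @ Q.T)[mask] ** 2)))

R = np.array([
    [5.0, 3.0, 0.0, 1.0],
    [4.0, 0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 5.0],
    [1.0, 0.0, 0.0, 4.0],
    [0.0, 1.0, 5.0, 4.0],
])
P, Q = factorize(R)
predicted = P @ Q.T   # filled-in matrix, including the missing interactions
```

The entries of `predicted` at positions where `R` is 0 are the model's guesses for the missing interactions.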

5. Neural Network-Based Models

• Overview: These models use deep learning techniques to capture complex patterns in user-
item interactions.

• Types:
o Deep Neural Networks (DNNs): Use multiple layers of neurons to model non-linear
interactions between users and items.

o Recurrent Neural Networks (RNNs): Used for sequence-based recommendations,
like predicting the next item in a sequence.

o Convolutional Neural Networks (CNNs): Used for content-based recommendations,
especially for items with rich features like images or text.

o Autoencoders: Used for collaborative filtering by learning compressed
representations of user preferences.

• Strengths:

o Can model complex user-item interactions.

o Capable of handling large-scale data with diverse features.

• Weaknesses:

o Requires significant computational resources and large amounts of data.

o More challenging to interpret compared to traditional models.

6. Knowledge-Based Systems

• Overview: Knowledge-based systems use explicit knowledge about users and items, such as
rules, constraints, or preferences, to generate recommendations.

• How It Works:

o Leverages domain knowledge and user preferences explicitly provided by the user or
derived from external sources.

• Example: A travel recommendation system that suggests destinations based on specific user
constraints like budget, time, and activities.

• Strengths:

o Can handle situations where collaborative or content-based methods are not
applicable.

o Provides transparent and explainable recommendations.

• Weaknesses:

o Requires extensive domain knowledge.

o May not scale well with a large number of items.

Summary

Recommender systems can employ various models depending on the context and goals, each with its
strengths and weaknesses. Hybrid models are often used in practice to combine the benefits of
multiple approaches, improving recommendation accuracy and user satisfaction.

Recommender systems are tools that help users discover products or content by predicting their
preferences. There are several approaches to building recommender systems, each with its strengths
and challenges. Let's explore four key types: content-based, knowledge-based, ensemble-based,
and hybrid recommender systems.

1. Content-Based Recommender Systems

Concept: Content-based recommender systems suggest items similar to those a user has liked in the
past. They rely on the features of the items and the user's preferences, creating a profile of the user’s
interests.

How It Works:

• User Profile Creation: A user profile is built using the features of items the user has
interacted with (e.g., liked, viewed, or purchased).

• Item Representation: Items are represented by their features (e.g., keywords, tags,
categories).

• Similarity Calculation: The system calculates the similarity between items the user has liked
and other items in the database using techniques like cosine similarity, Euclidean distance, or
others.

• Recommendation: Items that are most similar to those in the user's profile are
recommended.

Example: A movie recommender system that suggests movies based on genres, actors, or directors
that the user has previously enjoyed.

Advantages:

• Doesn’t require data from other users, avoiding the cold-start problem for new items.

• Can explain recommendations by pointing to item features.

Challenges:

• Limited to recommending items similar to those the user has already shown interest in,
leading to limited diversity.

• Requires a detailed feature representation of items.

2. Knowledge-Based Recommender Systems

Concept: Knowledge-based recommender systems suggest items based on explicit knowledge about
how certain item features meet user needs and preferences. They rely on domain-specific knowledge
rather than past user behavior.

How It Works:

• Knowledge Base: Contains rules or heuristics about the domain (e.g., "If a user wants a quiet
restaurant, recommend places with low noise levels").

• User Preferences: Users express their needs or preferences explicitly, and the system
matches them to the knowledge base.
• Recommendation: Items that match the user's criteria are recommended.

Example: A travel recommendation system that suggests destinations based on user-defined criteria
such as budget, climate preferences, or travel purpose.
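
A knowledge-based recommender of this kind can be as simple as a constraint filter over a catalog. The sketch below assumes a tiny invented catalog and three user-stated criteria; the `Destination` fields and the "cheapest first" ordering are illustrative choices.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Destination:
    name: str
    cost_per_day: float        # hypothetical currency units
    climate: str
    activities: frozenset

# Hypothetical catalog; in a real system this would come from the knowledge base.
CATALOG = [
    Destination("Alpine Village", 180.0, "cold", frozenset({"skiing", "hiking"})),
    Destination("Coastal Town", 120.0, "warm", frozenset({"swimming", "diving"})),
    Destination("Desert Camp", 90.0, "warm", frozenset({"hiking"})),
]

def recommend(max_cost_per_day, climate, wanted_activities):
    """Return names of destinations meeting every stated constraint, cheapest first."""
    matches = [
        d for d in CATALOG
        if d.cost_per_day <= max_cost_per_day
        and d.climate == climate
        and set(wanted_activities) <= d.activities
    ]
    return [d.name for d in sorted(matches, key=lambda d: d.cost_per_day)]

warm_hiking = recommend(150.0, "warm", {"hiking"})
```

No past behavior is used: the result depends only on the stated criteria and the catalog, which is why this approach still works when user feedback is scarce.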

Advantages:

• Effective for domains where user feedback is scarce or irrelevant (e.g., rare purchases, luxury
items).

• Can provide personalized recommendations based on specific user needs.

Challenges:

• Requires a well-maintained knowledge base, which can be difficult and costly to create.

• Doesn’t learn from user behavior over time unless integrated with machine learning
techniques.

3. Ensemble-Based Recommender Systems

Concept: Ensemble-based recommender systems combine multiple algorithms or models to make
more accurate and robust predictions. This approach leverages the strengths of various methods
while compensating for their weaknesses.

How It Works:

• Diverse Models: Multiple recommender models are trained independently, each potentially
using a different approach (e.g., collaborative filtering, content-based).

• Combination Strategies: The outputs of these models are combined using methods like
weighted averaging, voting, stacking, or more sophisticated machine learning techniques.

• Recommendation: The combined result is used to generate recommendations.
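
As one example of a voting-style combination strategy, the ranked lists from several models can be merged with a Borda count. The choice of Borda count, the three input rankings, and the item names are assumptions made for this sketch.

```python
from collections import defaultdict

def borda_combine(rankings, top_n=3):
    """Merge ranked lists: an item at position p in a list of length n earns n - p points."""
    points = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, item in enumerate(ranking):
            points[item] += n - position
    # Sort by points (descending), then by name so that ties are deterministic.
    ordered = sorted(points, key=lambda item: (-points[item], item))
    return ordered[:top_n]

# Hypothetical top-3 lists produced by three independently trained models.
collaborative = ["book_a", "book_b", "book_c"]
content_based = ["book_b", "book_a", "book_d"]
knowledge_based = ["book_b", "book_d", "book_a"]

combined = borda_combine([collaborative, content_based, knowledge_based])
```

Weighted averaging or stacking would replace the point scheme with numeric scores or a learned model, respectively.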

Example: A recommendation system that combines collaborative filtering, content-based filtering,
and knowledge-based methods to recommend books.

Advantages:

• Improved accuracy and robustness by leveraging different models.

• Can reduce the impact of individual model weaknesses, such as overfitting or bias.

Challenges:

• Increased computational complexity and resource requirements.

• More difficult to interpret the recommendations since multiple models contribute.

4. Hybrid Recommender Systems

Concept: Hybrid recommender systems combine two or more recommendation techniques (e.g.,
collaborative filtering, content-based, knowledge-based) to enhance performance, particularly in
dealing with the limitations of individual methods.

How It Works:
• Sequential Hybrid: One recommender’s output serves as input for another (e.g., a content-
based filter followed by a collaborative filter).

• Parallel Hybrid: Multiple recommenders run simultaneously, and their results are combined
(e.g., through weighted averaging).

• Switching Hybrid: Different recommenders are applied to different parts of the user base or
item catalog (e.g., content-based for new users, collaborative for existing users).

Example: Netflix uses a hybrid system that combines collaborative filtering (based on user behavior)
with content-based methods (analyzing item features) to recommend movies and shows.

Advantages:

• Can address the cold-start problem by using content-based filtering for new items and
collaborative filtering for existing users.

• Offers more diverse recommendations, balancing the strengths of various methods.

Challenges:

• Complexity in design and implementation.

• Balancing the contributions of different components can be challenging.

Summary

• Content-Based Recommender Systems focus on the features of items and user profiles.

• Knowledge-Based Recommender Systems rely on domain knowledge and user-specified
criteria.

• Ensemble-Based Recommender Systems combine multiple algorithms to improve accuracy
and robustness.

• Hybrid Recommender Systems integrate different recommendation techniques to overcome
the limitations of individual methods and provide more effective recommendations.

Each approach has its advantages and challenges, and the choice of method depends on the specific
requirements and context of the application.

Neighborhood-based methods (clustering, dimensionality reduction, regression modelling and graph models)

Neighborhood-based methods are a class of algorithms used in
recommender systems, particularly within collaborative filtering
approaches.
These methods focus on making predictions based on the similarities
between items or users.
1. Clustering
Concept: Clustering is a technique used to group similar items or
users together. In the context of recommender systems, clustering
can be applied to users or items to identify groups with similar
preferences.
How It Works:
• User Clustering: Users with similar rating patterns are grouped
into clusters. Recommendations for a user are then made based
on the preferences of other users within the same cluster.
• Item Clustering: Items that are similar based on user ratings or
features are grouped together. A user is recommended items
from the cluster that contains items they have already rated
highly.
Example: A music streaming service might cluster users based on
their listening habits and recommend songs that are popular within
the user's cluster.
Advantages:
• Reduces the complexity of the recommendation process by
limiting the search space to a relevant cluster.
• Can handle scalability better by dividing the dataset into
smaller, more manageable clusters.
Challenges:
• The quality of recommendations depends on the quality of the
clustering.
• Determining the optimal number of clusters can be difficult.
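
User clustering can be sketched with a small k-means loop in NumPy. The play counts are invented, k is fixed at 2, and using the first k users as initial centroids is a simplification chosen to keep the example deterministic.

```python
import numpy as np

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return distances.argmin(axis=1)

def kmeans(points, k, iterations=10):
    """Plain k-means; the first k points serve as initial centroids (a simplification)."""
    centroids = points[:k].astype(float)
    for _ in range(iterations):
        labels = assign(points, centroids)
        for c in range(k):
            members = points[labels == c]
            if len(members) > 0:          # keep the old centroid if a cluster empties
                centroids[c] = members.mean(axis=0)
    return assign(points, centroids)

def recommend_from_cluster(user, plays, labels):
    """Most-played song within the user's cluster that the user has not played yet."""
    totals = plays[labels == labels[user]].sum(axis=0).astype(float)
    totals[plays[user] > 0] = -np.inf     # hide songs the user already knows
    return int(totals.argmax())

# Hypothetical play counts: rows are users, columns are songs.
plays = np.array([
    [5, 0, 0, 0],
    [0, 0, 5, 4],
    [4, 5, 0, 0],
    [0, 1, 4, 5],
    [5, 4, 1, 0],
    [1, 0, 5, 5],
])
labels = kmeans(plays, k=2)
song_for_user_0 = recommend_from_cluster(0, plays, labels)
```

Only the users in the same cluster are consulted, which is how clustering limits the search space.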

2. Dimensionality Reduction
Concept: Dimensionality reduction is used to reduce the number of
features or variables in the data, making the data easier to process
while preserving its essential structure. In recommender systems, it
helps in simplifying the user-item interaction matrix.
How It Works:
• Matrix Factorization: Techniques like Singular Value
Decomposition (SVD) or Principal Component Analysis (PCA) are
used to reduce the dimensions of the user-item matrix.
• Latent Factors: The reduced dimensions represent latent
factors, which capture underlying patterns in user preferences
and item characteristics.
Example: Netflix uses dimensionality reduction techniques to identify
latent factors such as "comedy vs. drama" or "action vs. romance" in
movies and make recommendations accordingly.
Advantages:
• Helps in handling the sparsity of the user-item matrix by
projecting it into a lower-dimensional space.
• Improves the efficiency and scalability of the recommendation
process.
Challenges:
• May lose some information during the dimensionality reduction
process.
• Requires careful tuning to ensure that important patterns are
preserved.
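
A truncated SVD shows both the reduction and the information loss mentioned above. This sketch uses `numpy.linalg.svd` on a small invented matrix and treats unrated entries as zeros, which is a simplifying assumption.

```python
import numpy as np

def low_rank_approximation(matrix, k):
    """Rank-k approximation of the matrix via truncated SVD."""
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

def user_latent_factors(matrix, k):
    """Coordinates of each user in the k-dimensional latent space."""
    U, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return U[:, :k] * s[:k]

# Hypothetical user-item matrix; zeros stand in for missing ratings.
R = np.array([
    [5.0, 3.0, 0.0, 1.0],
    [4.0, 0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 5.0],
    [1.0, 0.0, 0.0, 4.0],
    [0.0, 1.0, 5.0, 4.0],
])

approx = low_rank_approximation(R, k=2)
factors = user_latent_factors(R, k=2)
error_k1 = np.linalg.norm(R - low_rank_approximation(R, 1))
error_k2 = np.linalg.norm(R - approx)
```

Keeping more dimensions lowers the reconstruction error, so choosing k is the tuning step that decides how much information is preserved.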
3. Regression Modeling
Concept: Regression modeling is used to predict the ratings or
preferences of users for items. In neighborhood-based methods,
regression can be employed to model the relationship between a
user's known preferences and their unknown preferences.
How It Works:
• Linear Regression: A simple linear model is built to predict a
user's rating for an item based on the ratings of similar items or
users.
• Regularization: Techniques like Ridge or Lasso regression are
often used to prevent overfitting by penalizing large coefficients
in the model.
Example: A book recommendation system might use regression to
predict how a user would rate a book based on how similar users
have rated similar books.
Advantages:
• Provides a clear and interpretable model for making
predictions.
• Can incorporate various features and user/item attributes into
the recommendation process.
Challenges:
• May not capture complex interactions between users and
items.
• Requires sufficient data to build an accurate model.
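
The regression idea can be sketched with closed-form ridge regression in NumPy. The setup is an assumption made for illustration: each row of `X` holds the ratings two similar users gave to a book, `y` holds the target user's rating of the same book, and no intercept term is fitted.

```python
import numpy as np

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression: w = (X^T X + alpha * I)^-1 X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

# Hypothetical training data: ratings by two similar users (columns) for four books (rows).
X = np.array([
    [5.0, 4.0],
    [3.0, 2.0],
    [1.0, 2.0],
    [4.0, 5.0],
])
# The target user's ratings for the same four books.
y = np.array([4.5, 2.5, 1.5, 4.5])

weights = ridge_fit(X, y, alpha=1.0)

# Predict the target user's rating for a new book the two similar users rated 4 and 5.
predicted_rating = float(np.array([4.0, 5.0]) @ weights)
```

Setting `alpha` to 0 gives ordinary least squares; larger values shrink the weights, which is the penalty on large coefficients described above.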
4. Graph Models
Concept: Graph models represent the relationships between users
and items as a graph, where users and items are nodes, and edges
represent interactions such as ratings, purchases, or likes.
How It Works:
• Bipartite Graphs: A common approach is to represent users and
items in a bipartite graph, where edges connect users to the
items they have interacted with.
A bipartite graph is a type of graph in graph theory that can be
divided into two distinct sets of vertices such that:
1. Every edge in the graph connects a vertex in one set to a vertex
in the other set.
2. No edge exists between two vertices within the same set.

• Graph-Based Algorithms: Techniques like Random Walks,
PageRank, or Graph Convolutional Networks (GCNs) can be
used to propagate information through the graph and make
recommendations.
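
As a much simpler stand-in for random walks or PageRank, the sketch below counts three-step paths (user, item, other user, new item) in a bipartite graph. The edge list and names are hypothetical, and path counting is used here only because it fits in a few lines.

```python
from collections import Counter

# Hypothetical interaction edges of a bipartite user-item graph.
edges = [
    ("alice", "item_1"), ("alice", "item_2"),
    ("bob", "item_1"), ("bob", "item_2"), ("bob", "item_3"),
    ("carol", "item_2"), ("carol", "item_3"), ("carol", "item_4"),
]

def build_adjacency(edge_list):
    """Adjacency sets for both sides of the bipartite graph."""
    user_items, item_users = {}, {}
    for user, item in edge_list:
        user_items.setdefault(user, set()).add(item)
        item_users.setdefault(item, set()).add(user)
    return user_items, item_users

def recommend(user, edge_list, top_n=2):
    """Rank new items by the number of paths user -> item -> other user -> new item."""
    user_items, item_users = build_adjacency(edge_list)
    seen = user_items.get(user, set())
    path_counts = Counter()
    for item in seen:
        for neighbour in item_users[item] - {user}:
            for candidate in user_items[neighbour] - seen:
                path_counts[candidate] += 1
    # Sort by path count (descending), then by name so that ties are deterministic.
    ranked = sorted(path_counts, key=lambda i: (-path_counts[i], i))
    return ranked[:top_n]

alice_recs = recommend("alice", edges)
```

Every edge connects a user to an item and never two users or two items, which is the bipartite property defined above.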
Example: A social media platform might use a graph model to
recommend friends, content, or products by analyzing the
connections between users and their interactions.
Advantages:
• Can capture complex relationships and dependencies between
users and items.
• Naturally incorporates various types of interactions (e.g., co-
purchases, co-ratings).
Challenges:
• Graph construction and processing can be computationally
expensive, especially for large datasets.
• Requires sophisticated algorithms to effectively exploit the
graph structure.
Summary
• Clustering groups similar users or items to narrow down the
recommendation search space, making the process more
efficient.
• Dimensionality Reduction simplifies the data by reducing the
number of variables, helping to uncover latent factors that
influence user preferences.
• Regression Modeling predicts user preferences by establishing
relationships between known and unknown data points, often
using techniques like linear regression.
• Graph Models represent the interactions between users and
items as a graph, enabling the exploration of complex
relationships for making recommendations.
Each of these neighborhood-based methods contributes to building
more accurate, scalable, and interpretable recommender systems.
They can be used individually or combined to leverage their
respective strengths in different application contexts.

Cosine similarity algorithm: Deep dive


Cosine similarity is a measure of similarity between two non-zero
vectors of an inner product space based on the cosine of the angle
between them, resulting in a value between -1 and 1. A value of -1
means that the vectors point in opposite directions, 0 represents
orthogonal vectors, and 1 means that the vectors point in the same
direction.
It is often used to measure document similarity in text analysis.

Similarity = (A · B) / (||A|| × ||B||)

Cosine similarity can be easily calculated using the mathematical
formula. But what if the data you have becomes too large and you
want to calculate the similarities fast? Python is a popular
programming language for such tasks, partly because of its
extensive range of libraries. For calculating cosine similarity,
the most popular ones are:
• NumPy: the fundamental package for scientific computing in
Python, which has functions for dot product and vector
magnitude, both necessary for the cosine similarity formula.
• SciPy: a library used for scientific and technical computing. It
has a function that can calculate the cosine distance, which
equals 1 minus the cosine similarity.
• Scikit-learn: offers simple and efficient tools for predictive data
analysis and has a function to directly and efficiently compute
cosine similarity.
Of the above-mentioned libraries, only scikit-learn directly
calculates the cosine similarity between two vectors or matrices,
making it an excellent tool for data analysts and machine learning
enthusiasts. It provides the sklearn.metrics.pairwise.cosine_similarity
function to do that.
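
The formula can be written directly with NumPy's dot product and vector norm. This is a minimal sketch; the function name and the choice to raise an error for zero vectors are assumptions of the example.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity (A · B) / (||A|| × ||B||) of two non-zero vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    norm_product = np.linalg.norm(a) * np.linalg.norm(b)
    if norm_product == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return float(np.dot(a, b) / norm_product)

same_direction = cosine_similarity([1, 2, 3], [2, 4, 6])
orthogonal = cosine_similarity([1, 0], [0, 1])
opposite = cosine_similarity([1, 1], [-1, -1])
```

With SciPy, `scipy.spatial.distance.cosine(a, b)` returns 1 minus this value, and scikit-learn's `cosine_similarity` expects 2-D arrays and returns a matrix of pairwise similarities.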

Practical use cases of cosine similarity


Cosine similarity is used in various applications, mostly by data
scientists, to perform tasks for machine learning, natural language
processing, or similar projects. Its applications include:
• Text analysis, where it measures the similarity between
documents and offers crucial functionality for search
engines and information retrieval systems.
• Recommendation systems, to recommend similar items based
on user preferences or to suggest similar users in social network
applications. An example is to recommend the next page on
product documentation based on the text similarity found.
• Data clustering, where it acts as a metric for classifying or
clustering similar data points and so helps make data-driven
decisions.
• Semantic similarity, which, when paired with word embedding
techniques like Word2Vec, is used to determine the semantic
similarity between words or documents.
