
ML End-Sem

Unsupervised learning
It is a type of machine learning where algorithms are used to uncover patterns or hidden
structures in unlabeled data. Unlike supervised learning, where the algorithm learns from labeled
data (input-output pairs), unsupervised learning deals with input data that doesn't have
corresponding output labels.

There are several approaches to unsupervised learning, each serving different purposes:
1. Clustering: Clustering algorithms aim to partition data points into groups or clusters based on
similarities in their features. Some popular clustering algorithms include:
• K-means: Divides data into K clusters, where each data point belongs to the cluster with the
nearest mean.
• Hierarchical clustering: Builds a hierarchy of clusters by either merging or splitting them
based on distance metrics.
• DBSCAN: Density-Based Spatial Clustering of Applications with Noise identifies clusters in
high-density areas separated by low-density regions.
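
A minimal K-means sketch in Python (illustrative only; it assumes scikit-learn is installed and uses synthetic two-dimensional data invented for the example):

# Minimal K-means example (illustrative; assumes scikit-learn is available).
import numpy as np
from sklearn.cluster import KMeans

# Synthetic 2-D data: two loose groups of points.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, size=(50, 2)),
               rng.normal(5, 1, size=(50, 2))])

# Fit K-means with K = 2 clusters.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

print(kmeans.cluster_centers_)   # the two cluster means
print(kmeans.labels_[:10])       # cluster assignments of the first 10 points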

2. Dimensionality Reduction: These techniques aim to reduce the number of features (dimensions) in a dataset while preserving important information. They help in visualizing and compressing data, as well as reducing computational complexity. Common methods include:
• Principal Component Analysis (PCA): Finds linear combinations of features that maximize
variance.

3. Association Rule Learning: This technique discovers interesting relationships or associations between variables in large datasets. A famous algorithm is the Apriori algorithm, used for mining frequent itemsets in transactional databases, often applied in market basket analysis.

Frequent itemset mining
It is a fundamental concept in machine learning and data mining used to discover interesting
associations or relationships between items in a dataset. It's commonly applied in market basket
analysis, recommendation systems, and other areas where understanding co-occurrences or
patterns among items is crucial.
Here's an explanation of frequent itemset mining:
1. Support and Itemset
• Itemset: A collection of one or more items grouped together. For instance, in a market
basket dataset, an itemset could be {bread, milk, eggs}.
• Support: It is a measure indicating how frequently an itemset appears in a dataset.
Mathematically, support is defined as the proportion of transactions in the dataset that
contain the itemset.
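
For example, if 3 out of 10 transactions contain the itemset {bread, milk}, then support({bread, milk}) = 3/10 = 0.3.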

2. Frequent Itemset Mining
• Objective: Discover itemsets that have support greater than or equal to a predefined
minimum support threshold.
• Apriori Principle: This principle suggests that if an itemset is frequent, then all of its subsets
must also be frequent. This property helps in reducing the search space while mining
frequent itemsets.
• Apriori Algorithm: A widely used algorithm for frequent itemset mining. It operates in
iterations, gradually finding itemsets with higher support.
o Initially, it finds frequent individual items (singletons) by scanning the dataset to
calculate their support.
o Then, it uses these singletons to generate candidate itemsets of length 2 (pairs) and
checks their support in the dataset.
o The algorithm continues this process, creating larger candidate itemsets by joining
frequent itemsets of length k to create candidates of length k+1, and then checking their
support.
o It stops when no new frequent itemsets can be found or when no candidate itemsets
meet the minimum support threshold.
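
A small, self-contained sketch of this level-wise search in Python (illustrative only; the four transactions and the 0.5 minimum support threshold are made up for the example):

# Apriori-style level-wise frequent itemset mining (illustrative, not optimized).
transactions = [
    {"bread", "milk"},
    {"bread", "eggs"},
    {"bread", "milk", "eggs"},
    {"milk", "eggs"},
]
min_support = 0.5            # itemset must appear in at least half the transactions
n = len(transactions)

def support(itemset):
    # Fraction of transactions that contain every item in the itemset.
    return sum(itemset <= t for t in transactions) / n

# Level 1: frequent single items (singletons).
items = {i for t in transactions for i in t}
frequent = [{frozenset([i]) for i in items if support(frozenset([i])) >= min_support}]

# Level k+1: join frequent k-itemsets and keep candidates that meet min_support.
k = 1
while frequent[-1]:
    candidates = {a | b for a in frequent[-1] for b in frequent[-1] if len(a | b) == k + 1}
    frequent.append({c for c in candidates if support(c) >= min_support})
    k += 1

for level in frequent:
    for itemset in sorted(level, key=len):
        print(set(itemset), support(itemset))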

3. Association Rule Generation
• Once frequent itemsets are discovered, association rules can be generated from these
itemsets. An association rule is an implication of the form X ➞ Y, where X and Y are
itemsets.
• Two common metrics used for association rules are:
o Confidence: Measures the likelihood of item Y being purchased when itemset X is
purchased. It's calculated as support(X ∪ Y) / support(X).
o Lift: Measures the strength of a rule by comparing the observed support of X and Y
appearing together to what would be expected if they were independent. Lift =
support(X ∪ Y) / (support(X) * support(Y)).
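
For example, if support({bread} ∪ {milk}) = 0.2 and support({bread}) = 0.4, the rule {bread} ➞ {milk} has confidence 0.2 / 0.4 = 0.5; if support({milk}) = 0.25, its lift is 0.2 / (0.4 × 0.25) = 2, meaning the two items appear together twice as often as expected if they were independent.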

Applications:
• Market Basket Analysis: Understanding which items are frequently bought together to
drive product placement, marketing strategies, or bundle offerings.
• Recommendation Systems: Generating recommendations by analyzing user-item
interactions and suggesting items based on co-occurrence patterns.

Principal Component Analysis (PCA)
1. Dimensionality Reduction Technique: PCA is a technique used for reducing the
dimensionality of data by transforming it into a new coordinate system.
2. Maximizing Variance: It identifies the directions (principal components) in which the data
varies the most.
3. Orthogonal Components: The principal components are mutually orthogonal (uncorrelated), each capturing a different aspect of the variation present in the data.
4. Preserving Information: PCA reorients data to preserve as much variance as possible in a
lower-dimensional space, often by selecting the top principal components that retain most
of the variance.
5. Mathematical Process: Involves eigenvalue decomposition or Singular Value Decomposition
(SVD) to compute the principal components.
6. Applications:
• Reducing dimensionality for visualization and computational efficiency.
• Feature extraction by transforming high-dimensional data into a lower-dimensional
space while retaining important information.
7. Assumptions:
• Linearity: PCA assumes a linear relationship between variables.
• Gaussian Distribution: Assumes the data follows a Gaussian distribution.
8. Limitations:
• Assumes linear relationships which might not hold in all datasets.
• Might not perform well if the variance does not represent important information.
9. Use Cases:
• Image and signal processing.
• Preprocessing step in machine learning pipelines to reduce computational complexity.
• Exploratory data analysis to visualize high-dimensional data.
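
A minimal PCA sketch in Python (illustrative only; it assumes scikit-learn is installed, the data is synthetic, and the choice of 2 components is arbitrary):

# Minimal PCA example (illustrative; assumes scikit-learn is available).
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data: 100 samples with 5 correlated features built from 2 latent factors.
rng = np.random.default_rng(0)
base = rng.normal(size=(100, 2))
X = base @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(100, 5))

# Project onto the top 2 principal components.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print(pca.explained_variance_ratio_)  # fraction of variance captured by each component
print(X_reduced.shape)                # (100, 2)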

Ensemble methods
Ensemble methods in machine learning refer to techniques that combine predictions from
multiple individual models to produce a stronger, more accurate predictive model. These methods
aim to improve the overall performance and robustness compared to using a single model.
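
A minimal ensemble sketch in Python using majority voting (illustrative only; it assumes scikit-learn, and the three base models are arbitrary choices):

# Minimal voting ensemble example (illustrative; assumes scikit-learn is available).
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, random_state=0)

ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("dt", DecisionTreeClassifier(random_state=0)),
        ("knn", KNeighborsClassifier()),
    ],
    voting="hard",  # each model votes; the majority class is the ensemble prediction
)
ensemble.fit(X, y)
print(ensemble.score(X, y))
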
Reinforcement Learning (RL)
It is a type of machine learning paradigm where an agent learns to make sequential decisions by
interacting with an environment to achieve a specific goal. In RL, the agent learns through a trial-
and-error process by receiving feedback in the form of rewards or penalties based on its actions.
Key components of reinforcement learning:
1. Agent: The learner or decision-maker that interacts with the environment. It observes the
environment, takes actions, and receives feedback.
2. Environment: The external system with which the agent interacts. It responds to the actions
taken by the agent and provides feedback in the form of rewards or penalties.
3. Actions: Choices made by the agent that influence the state of the environment.
4. State: Represents the current situation or configuration of the environment, which the agent
perceives before taking actions.
5. Rewards: Feedback signals provided by the environment to the agent after each action.
Rewards guide the agent toward maximizing cumulative reward over time, aligning with its
goal.
6. Policy: The strategy or set of rules that the agent uses to decide actions in different states.
7. Value Function: Estimates the expected cumulative reward an agent can obtain from a
particular state or action, helping the agent make better decisions.
8. Learning Process: The agent learns by interacting with the environment, using experiences
(state, action, reward) to update its policy or value function to make better decisions over
time.
9. Exploration vs. Exploitation: Balancing between exploring new actions and exploiting known
actions to maximize rewards while learning.
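
A toy sketch of the agent-environment interaction loop in Python (illustrative only; the LineWorld environment and the simple policy are invented for the example):

# Minimal agent-environment loop (illustrative toy example).
import random

# Toy environment: the agent walks on positions 0..4 and is rewarded at position 4.
class LineWorld:
    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):            # action: -1 (left) or +1 (right)
        self.state = min(max(self.state + action, 0), 4)
        reward = 1.0 if self.state == 4 else 0.0
        done = self.state == 4
        return self.state, reward, done

# A trivial policy: move right most of the time, explore occasionally.
def policy(state, epsilon=0.1):
    return random.choice([-1, 1]) if random.random() < epsilon else 1

env = LineWorld()
state, total_reward, done = env.reset(), 0.0, False
while not done:
    action = policy(state)                   # agent chooses an action in the current state
    state, reward, done = env.step(action)   # environment returns next state and reward
    total_reward += reward                   # cumulative reward over the episode
print("Episode return:", total_reward)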

Applications of reinforcement learning:
• Game playing (e.g., AlphaGo, Atari games).
• Robotics (e.g., controlling robotic arms).
• Autonomous vehicles.
• Recommendation systems.
• Finance (e.g., portfolio management).

Temporal Difference (TD) learning
TD is a type of reinforcement learning technique used for estimating value functions or learning
from experiences in an environment without requiring a model of the environment's dynamics
(model-free learning). TD learning combines elements of Monte Carlo methods and dynamic
programming.
1. Advantages:
• More sample-efficient compared to Monte Carlo methods as it updates value estimates at
each time step rather than waiting until the end of an episode.
• Suitable for online and incremental learning scenarios.

2. Types of TD Learning:
• SARSA (State-Action-Reward-State-Action): An on-policy TD algorithm that updates value estimates using the current state-action pair, the observed reward, and the next state-action pair actually chosen by the current policy.
• Q-learning: An off-policy TD algorithm that updates the value of the current state-action pair toward the observed reward plus the maximum estimated value over actions in the next state.

TD learning algorithms, such as SARSA and Q-learning, are fundamental in reinforcement learning. They enable agents to learn from experiences by iteratively updating value functions based on observed rewards and transitions between states, facilitating efficient learning and decision-making in various environments.
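
A minimal tabular Q-learning sketch in Python (illustrative only; the toy corridor environment and the hyperparameter values are made up for the example):

# Minimal tabular Q-learning (illustrative; hyperparameters are arbitrary).
import random

n_states, actions = 5, [-1, +1]           # positions 0..4; move left or right
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma, epsilon = 0.1, 0.9, 0.1     # learning rate, discount factor, exploration rate

def step(s, a):                           # the goal state is position 4
    s2 = min(max(s + a, 0), n_states - 1)
    reward = 1.0 if s2 == n_states - 1 else 0.0
    done = s2 == n_states - 1
    return s2, reward, done

for episode in range(500):
    s, done = 0, False
    while not done:
        # Epsilon-greedy action selection (ties broken at random).
        if random.random() < epsilon:
            a = random.choice(actions)
        else:
            a = max(actions, key=lambda act: (Q[(s, act)], random.random()))
        s2, r, done = step(s, a)
        # TD update: move Q(s, a) toward r + gamma * max_a' Q(s', a').
        target = r + (0.0 if done else gamma * max(Q[(s2, act)] for act in actions))
        Q[(s, a)] += alpha * (target - Q[(s, a)])
        s = s2

# Greedy action learned for each non-terminal state (should be +1, i.e. move right).
print({s: max(actions, key=lambda act: Q[(s, act)]) for s in range(n_states - 1)})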
