# ML With Unstructured Data

---

## **Citations and Notes in Machine Learning**

### 🔹 **Purpose of Citations in ML**

Citations serve to:

* Credit original ideas, algorithms, datasets, or tools.

* Provide background or theoretical foundations (e.g., referencing neural networks or decision trees).

* Support claims with prior studies or benchmarks.

* Point readers to relevant tools/libraries (e.g., TensorFlow, scikit-learn).

### 🔹 **Common Citation Styles in ML**

* **IEEE** (used in engineering/CS conferences)

* **ACM** (used in computing research)

* **APA/MLA/Chicago** (used more in interdisciplinary work)

> **Note:** Most ML papers follow **IEEE** or **ACM** formats. In arXiv or NeurIPS papers, citations often appear in square brackets \[1], \[2].

---

## 📘 **Example Citations in Machine Learning**


Here are some examples in **IEEE style**, which is common in ML conferences like NeurIPS, ICML, and
CVPR:

### 🔸 1. Algorithm/Model Citation

**Text:**

The backpropagation algorithm revolutionized training of neural networks \[1].

**Citation (IEEE):**

\[1] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning representations by back-propagating errors,” *Nature*, vol. 323, no. 6088, pp. 533–536, 1986.

---

### 🔸 2. Dataset Citation

**Text:**

We used the CIFAR-10 dataset for image classification experiments \[2].

**Citation:**

\[2] A. Krizhevsky, “Learning Multiple Layers of Features from Tiny Images,” University of Toronto, Tech.
Rep., 2009.

---

### 🔸 3. Software Library Citation


**Text:**

All experiments were conducted using the PyTorch deep learning library \[3].

**Citation:**

\[3] A. Paszke et al., “PyTorch: An imperative style, high-performance deep learning library,” in
*Advances in Neural Information Processing Systems (NeurIPS)*, vol. 32, 2019.

---

### 🔸 4. Benchmark Paper Citation

**Text:**

Transformers have become the backbone of modern NLP models \[4].

**Citation:**

\[4] A. Vaswani et al., “Attention is all you need,” in *Proc. NeurIPS*, 2017.

---

## 📝 Notes (Footnotes/Endnotes)

In ML writing, **footnotes** are rare in formal papers but may be used:

* For clarifications or exceptions.

* To credit non-academic contributions (e.g., open-source projects or blog posts).


* To explain implementation details not central to the main argument.

**Example footnote (APA style):**

> The model was trained using a batch size of 128.¹
>
> ¹ We also tested batch sizes of 64 and 256, which yielded similar results.

---

## 🧾 Foundational References in Machine Learning (You Can Cite)

| Topic                   | Reference                                                        |
| ----------------------- | ---------------------------------------------------------------- |
| Neural Networks         | Rumelhart et al., 1986                                            |
| SVMs                    | Cortes & Vapnik, 1995                                             |
| Random Forests          | Breiman, 2001                                                     |
| Deep Learning           | LeCun, Bengio & Hinton, 2015                                      |
| CNNs                    | Krizhevsky et al., 2012 (AlexNet)                                 |
| Transformers            | Vaswani et al., 2017                                              |
| Reinforcement Learning  | Sutton & Barto, 2018                                              |
| Datasets                | Krizhevsky (CIFAR), Deng et al. (ImageNet), Bowman et al. (SNLI)  |
| Software                | Paszke et al. (PyTorch), Abadi et al. (TensorFlow)                |

---

## 🧠 **Evaluation of Text Classification**

Text classification refers to assigning predefined categories to textual data (e.g., spam detection,
sentiment analysis, topic labeling). Evaluation measures how well a model performs this task.

---

## 🔹 **1. Common Evaluation Metrics**

### ✅ Accuracy

* **Definition**: Proportion of correctly classified examples out of total examples.

* **Use case**: Good for balanced datasets.

* **Formula**:

$$
\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}
$$

### ⚖️ Precision, Recall, and F1 Score


* Useful for **imbalanced datasets** or when **false positives/negatives** matter.

| Metric    | Definition                                         | Formula                           |
| --------- | -------------------------------------------------- | --------------------------------- |
| Precision | Fraction of predicted positives that are correct   | $\frac{TP}{TP + FP}$              |
| Recall    | Fraction of actual positives that are identified   | $\frac{TP}{TP + FN}$              |
| F1 Score  | Harmonic mean of precision and recall              | $\frac{2 \cdot P \cdot R}{P + R}$ |

### 📊 Confusion Matrix

* A table showing counts of TP, TN, FP, FN.

* Helps visualize where the classifier makes errors.

### Macro, Micro, and Weighted Averages

* For multi-class tasks (see the sketch below):

* **Macro**: Unweighted mean of the per-class metric (every class counts equally).

* **Micro**: Computes the metric globally by pooling TP, FP, and FN across all instances.

* **Weighted**: Mean of per-class metrics weighted by support (class frequency).
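
A minimal sketch of how these metrics can be computed with scikit-learn; the toy labels below are made up for illustration:

```python
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_recall_fscore_support)

# Hypothetical predictions for a 3-class problem (illustration only).
y_true = [0, 0, 1, 1, 2, 2, 2, 1, 0, 2]
y_pred = [0, 1, 1, 1, 2, 2, 0, 1, 0, 2]

print("Accuracy:", accuracy_score(y_true, y_pred))

# Macro, micro, and weighted averages of precision, recall, and F1.
for avg in ("macro", "micro", "weighted"):
    p, r, f1, _ = precision_recall_fscore_support(y_true, y_pred, average=avg)
    print(f"{avg:>8}: precision={p:.2f}  recall={r:.2f}  F1={f1:.2f}")

# Confusion matrix: rows = true classes, columns = predicted classes.
print(confusion_matrix(y_true, y_pred))
```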

---

## 🔹 **2. Model Evaluation Methodology**


### 🧪 Cross-Validation

* K-fold cross-validation increases robustness of results.

* Useful when dataset size is small.
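
A brief sketch of k-fold cross-validation with scikit-learn, assuming a simple TF-IDF + logistic-regression pipeline; the corpus and labels are placeholders:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Placeholder corpus and labels (1 = spam, 0 = not spam), illustration only.
texts = ["free prize inside", "meeting at noon", "win cash now",
         "project update attached", "claim your reward", "lunch tomorrow?"]
labels = [1, 0, 1, 0, 1, 0]

model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))

# 3-fold cross-validation; each fold is held out once for evaluation.
scores = cross_val_score(model, texts, labels, cv=3, scoring="f1")
print("Per-fold F1:", scores, "mean:", scores.mean())
```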

### 📉 Holdout Validation (Train/Dev/Test Split)

* Common: 80% training / 10% validation / 10% test.

* Prevents overfitting and ensures generalization.
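
One way to realize an 80/10/10 split is two chained calls to scikit-learn's train_test_split; a sketch with placeholder data:

```python
from sklearn.model_selection import train_test_split

# X and y stand in for feature vectors and labels (placeholders).
X = list(range(100))
y = [i % 2 for i in range(100)]

# First split off 10% as the test set, then 1/9 of the remainder (~10% overall) as the dev set.
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.10, random_state=42, stratify=y)
X_train, X_dev, y_train, y_dev = train_test_split(
    X_rest, y_rest, test_size=1 / 9, random_state=42, stratify=y_rest)

print(len(X_train), len(X_dev), len(X_test))  # roughly 80 / 10 / 10
```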

### 📈 Learning Curves

* Plots performance vs. number of training samples.

* Helps diagnose underfitting vs. overfitting.
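
scikit-learn's learning_curve can generate the numbers behind such a plot; a sketch using a stand-in dataset and classifier:

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in dataset and classifier (illustration only).
X, y = load_digits(return_X_y=True)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Score the model at increasing training-set sizes with 5-fold cross-validation.
sizes, train_scores, val_scores = learning_curve(
    model, X, y, train_sizes=np.linspace(0.1, 1.0, 5), cv=5)

# A large gap between train and validation scores suggests overfitting;
# low scores on both suggest underfitting.
for n, tr, va in zip(sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    print(f"n={n:4d}  train={tr:.3f}  validation={va:.3f}")
```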

---

## 🔹 **3. Specialized Considerations**

| Task Type                   | Additional Evaluation Tools   |
| --------------------------- | ----------------------------- |
| Multi-label                 | Hamming loss, subset accuracy |
| Imbalanced data             | ROC-AUC, PR-AUC curves        |
| Hierarchical classification | Hierarchical precision/recall |
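
For imbalanced binary data, ROC-AUC and PR-AUC can be computed from predicted scores; a short sketch with scikit-learn (the labels and scores are illustrative):

```python
from sklearn.metrics import average_precision_score, roc_auc_score

# Illustrative labels and predicted probabilities for the positive class
# (imbalanced: 2 positives out of 10).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_score = [0.1, 0.2, 0.15, 0.3, 0.05, 0.4, 0.2, 0.35, 0.8, 0.6]

print("ROC-AUC:", roc_auc_score(y_true, y_score))
print("PR-AUC (average precision):", average_precision_score(y_true, y_score))
```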


---

## **Clustering Task in Text Analysis in Machine Learning**

In machine learning, a text clustering task groups similar text documents into
clusters based on their semantic content, using unsupervised learning because it
does not require labeled data. The process involves representing texts as numerical
vectors, computing their similarity, and then applying a clustering algorithm to form
groups where documents within a cluster share common themes and are dissimilar
to documents in other clusters. Key applications include organizing large document
collections, improving information retrieval, aiding in topic modeling, and creating
datasets for other NLP tasks.

### How Text Clustering Works


1. **Data Preprocessing:** Text data is cleaned by removing stop words (common words like "the," "a") and punctuation, and by applying stemming or lemmatization to standardize word forms.

2. **Feature Extraction:** Texts are converted into numerical representations called vectors or embeddings. Common methods include:

   * **Bag-of-Words:** Counts the frequency of each word in a document.

   * **TF-IDF:** Weights words by their importance within a document and across the collection.

   * **Word Embeddings:** Uses models like Word2Vec or GloVe to capture semantic relationships between words.

   * **Contextual Embeddings:** Advanced methods, such as those from BERT or LLMs, capture a word's meaning based on its surrounding context.

3. **Similarity Computation:** A distance metric (e.g., Euclidean distance) or similarity measure is used to determine how close two text vectors are in the feature space.

4. **Clustering Algorithm:** An algorithm is applied to group the vectors based on their proximity. Popular algorithms include:

   * **K-Means:** Partitions data into a pre-defined number of clusters (k).

   * **Hierarchical Clustering:** Builds a hierarchy of clusters, merging or splitting them iteratively.

5. **Evaluation:** The quality of the clusters is assessed using metrics like the silhouette coefficient, which measures how similar a document is to its own cluster compared to other clusters. A minimal end-to-end sketch follows this list.
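
A minimal end-to-end sketch of these steps with scikit-learn; the toy corpus and the choice of k = 2 clusters are assumptions for illustration:

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import silhouette_score

# Toy corpus (illustration only).
docs = [
    "the stock market fell sharply today",
    "investors worry about rising interest rates",
    "the team won the championship game",
    "the striker scored twice in the final",
]

# Steps 1-2: preprocess and vectorize (TF-IDF with English stop-word removal).
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(docs)

# Steps 3-4: cluster the vectors; k = 2 is an assumption here.
km = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = km.fit_predict(X)

# Step 5: evaluate with the silhouette coefficient (higher is better, max 1.0).
print("Cluster labels:", labels)
print("Silhouette:", silhouette_score(X, labels))
```
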
### Key Applications

* **Document Organization:** Groups large collections of documents, making them easier to navigate and manage.

* **Information Retrieval:** Improves search engines by grouping related documents, making it easier to find relevant information.

* **Topic Modeling:** Identifies underlying themes and topics within a large corpus of text.

* **Data Summarization:** Helps condense and organize information by identifying key themes across documents.

* **Recommendation Systems:** Used to recommend similar content or items based on shared textual themes.

---

## **The General Clustering Problem**


The clustering problem in machine learning is an unsupervised learning task that involves grouping unlabeled data into clusters, where data points within a cluster are more similar to each other than to those in other clusters. The goal is to discover inherent patterns and structures within the data by maximizing within-cluster similarity and minimizing between-cluster similarity. Common challenges include selecting the right similarity metric, determining the appropriate number of clusters, and handling high-dimensional or noisy data.

### Key Aspects of the Clustering Problem

* **Unsupervised Learning:** Unlike supervised learning, clustering does not use labeled data to learn a target function.

* **Data Partitioning:** The process divides a dataset into homogeneous subsets, or clusters, based on shared characteristics.

* **Similarity Metrics:** Clustering algorithms use metrics such as Euclidean distance or cosine similarity to measure the closeness between data points (see the sketch below).

* **No Predefined Clusters:** The number and nature of the clusters are not known beforehand and must be discovered by the algorithm.
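
A short sketch of the two similarity measures mentioned above, computed with scikit-learn; the vectors are arbitrary examples:

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity, euclidean_distances

# Two arbitrary example vectors (illustration only).
a = np.array([[1.0, 2.0, 0.0]])
b = np.array([[2.0, 4.0, 0.0]])

# Euclidean distance: sensitive to vector magnitude.
print("Euclidean distance:", euclidean_distances(a, b)[0, 0])

# Cosine similarity: depends only on direction, so these parallel vectors score 1.0.
print("Cosine similarity:", cosine_similarity(a, b)[0, 0])
```
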
### Common Challenges

* **Defining "Good" Clusters:** Evaluating the quality of the resulting clusters can be difficult, as the clusters may not always represent meaningful real-world relationships.

* **Curse of Dimensionality:** Clustering becomes challenging in high-dimensional data, where algorithms may struggle to find meaningful patterns (ECML PKDD 2017).

* **Scalability:** Many clustering algorithms are computationally expensive, with runtimes that increase significantly with the number of data points, making them impractical for large datasets.

* **Outliers and Noise:** The presence of outliers or noisy data can negatively impact the quality and structure of the clusters.

### Applications of Clustering

* **Customer Segmentation:** Grouping customers with similar purchasing behaviors for targeted marketing campaigns.

* **Image Segmentation:** Dividing an image into distinct regions or segments for object recognition.

* **Anomaly Detection:** Identifying unusual data points or outliers that deviate from the normal patterns of a cluster.

* **Document Analysis:** Clustering documents into thematic groups to organize and retrieve information.

---

## **Clustering Algorithms in Machine Learning**


Clustering algorithms are unsupervised machine learning techniques used to group
unlabeled data points into clusters based on their similarities. The goal is to identify
intrinsic groupings within the data, where data points within the same cluster are
more similar to each other than to those in other clusters.
Common types of clustering algorithms include:

* **Centroid-based Clustering (e.g., K-Means):** Each cluster is represented by a central vector (centroid). K-Means aims to partition data into K clusters, where each data point belongs to the cluster with the nearest centroid. The algorithm iteratively assigns data points to clusters and updates the centroids until convergence.

* **Connectivity-based Clustering (e.g., Hierarchical Clustering):** These algorithms build a hierarchy of clusters.

  * **Agglomerative Hierarchical Clustering:** Starts with each data point as a separate cluster and progressively merges the closest clusters until a single cluster remains or a stopping criterion is met.

  * **Divisive Hierarchical Clustering:** Begins with all data points in one large cluster and recursively divides it into smaller clusters.

* **Density-based Clustering (e.g., DBSCAN):** These algorithms define clusters as areas of high data-point density separated by areas of lower density. DBSCAN can discover clusters of arbitrary shapes and identify outliers as noise.

* **Distribution-based Clustering (e.g., Gaussian Mixture Models):** These algorithms assume that data points within a cluster are generated from a specific probability distribution (e.g., a Gaussian) and aim to find the parameters of these distributions that best fit the data.

* **Neural Network-based Clustering (e.g., Self-Organizing Maps, SOMs):** SOMs are a type of unsupervised neural network that maps high-dimensional data onto a lower-dimensional grid, preserving the topological relationships of the input data. Similar data points are mapped to neighboring nodes on the grid, forming clusters.

The choice of clustering algorithm depends on the characteristics of the data, the desired cluster shapes, and the specific application. A brief comparison sketch follows.
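
A brief comparison sketch of several of these algorithm families on the same synthetic 2-D data with scikit-learn; the data generator and parameter values are assumptions for illustration:

```python
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

# Synthetic 2-D data with three well-separated blobs (illustration only).
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=42)

# Centroid-based: K-Means with k = 3.
kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(X)

# Connectivity-based: agglomerative (bottom-up) hierarchical clustering.
agglo_labels = AgglomerativeClustering(n_clusters=3).fit_predict(X)

# Density-based: DBSCAN finds dense regions and labels sparse points as noise (-1).
dbscan_labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X)

# Distribution-based: Gaussian mixture with three components.
gmm_labels = GaussianMixture(n_components=3, random_state=42).fit_predict(X)

for name, labels in [("K-Means", kmeans_labels), ("Agglomerative", agglo_labels),
                     ("DBSCAN", dbscan_labels), ("GMM", gmm_labels)]:
    print(name, "found", len(set(labels) - {-1}), "clusters")
```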

---

## **Clustering of Textual Data in Machine Learning**


Clustering of textual data in machine learning is an unsupervised learning technique that groups similar text documents together based on their content. Unlike classification, it does not require pre-labeled data, making it useful for exploratory data analysis and for organizing large text collections.

### Process of Text Clustering

* **Text Preprocessing:** Raw text data is cleaned and transformed into a numerical representation. This often involves:

  * **Tokenization:** Breaking text into individual words or sub-word units.

  * **Stop Word Removal:** Eliminating common words (e.g., "the," "is") that carry little semantic meaning.

  * **Stemming or Lemmatization:** Reducing words to their root form (e.g., "running," "ran" to "run").

  * **Vectorization:** Converting text into numerical vectors using techniques like:

    * **TF-IDF (Term Frequency-Inverse Document Frequency):** Assigns weights to words based on their frequency within a document and across the entire corpus.

    * **Word Embeddings (e.g., Word2Vec, GloVe):** Represent words as dense vectors that capture semantic relationships.

    * **Contextual Embeddings (e.g., BERT, GPT embeddings):** Capture word meanings based on their context within a sentence or document, often derived from Large Language Models (LLMs).

* **Clustering Algorithm Application:** Once the text is vectorized, a clustering algorithm is applied to group similar documents:

  * **K-Means:** A centroid-based algorithm that partitions data into k clusters, where k is a pre-defined number. It iteratively assigns data points to the nearest centroid and updates the centroids.

  * **Hierarchical Clustering:** Builds a hierarchy of clusters, either by starting with individual data points and merging them (agglomerative) or by starting with one large cluster and dividing it (divisive).

  * **DBSCAN:** A density-based algorithm that groups closely packed data points and marks points that lie alone in low-density regions as outliers.

  * **Affinity Propagation:** Identifies exemplars (representative data points) and assigns the remaining data points to the clusters defined by these exemplars.

* **Evaluation and Interpretation:** The quality of the clusters is assessed using metrics like the silhouette score, and the clusters are interpreted to understand the themes or topics present in each group. Dimensionality-reduction techniques such as t-SNE or PCA can be used to visualize the clusters in lower dimensions (see the sketch below).
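
A sketch of this evaluation-and-visualization step: TF-IDF vectors are clustered, scored with the silhouette coefficient, and projected to 2-D with TruncatedSVD (a PCA-like method that works on sparse matrices). The corpus and cluster count are placeholder assumptions:

```python
from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import silhouette_score

# Placeholder corpus (illustration only).
docs = [
    "central bank raises interest rates",
    "markets react to inflation report",
    "new smartphone features a faster chip",
    "tech company unveils laptop lineup",
]

X = TfidfVectorizer(stop_words="english").fit_transform(docs)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Silhouette score summarizes cluster cohesion vs. separation (range -1 to 1).
print("Silhouette:", silhouette_score(X, labels))

# Project the sparse TF-IDF vectors to 2-D for plotting or inspection.
coords = TruncatedSVD(n_components=2, random_state=0).fit_transform(X)
for (x, y), label, doc in zip(coords, labels, docs):
    print(f"cluster {label}  ({x:+.2f}, {y:+.2f})  {doc}")
```
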
### Applications of Text Clustering

* **Document Organization:** Grouping articles, research papers, or legal documents by topic.

* **Topic Modeling:** Discovering hidden thematic structures within a collection of texts.

* **Information Retrieval:** Improving search results by clustering relevant documents.

* **Customer Feedback Analysis:** Grouping similar customer reviews or support tickets to identify common issues.

* **News Article Grouping:** Categorizing news articles based on their content.
