CHAPTER 3
REFLECTIONS
3.1 Solutions
The research-oriented internship at Samsung R&D Institute – Bangalore posed a number of
technical and operational challenges, many of which were addressed through structured
experimentation, mentorship support, and the application of best practices in AI development.
The following solutions were implemented to successfully complete the assigned tasks:
1. Custom Dataset Creation
Problem Addressed: Lack of existing training datasets specific to promotional offers.
Solution:
• Developed a custom dataset comprising over 13,000 smartphone promotional offers
by web scraping commercial sources.
• Ensured consistency in format, logical phrasing, numeric accuracy, and brand-
specific constraints using normalization and data-cleaning routines.
• Applied manual tagging and filtering to enhance contextual relevance and instruction-
following quality.
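As an illustration of the normalization and cleaning described above, the sketch below assumes the scraped offers have been collected into a pandas DataFrame with a hypothetical offer_text column; the exact cleaning rules used in the project are not reproduced here.

```python
import re
import pandas as pd

def normalize_offer(text: str) -> str:
    """Normalize whitespace and currency notation in a scraped offer string."""
    text = re.sub(r"\s+", " ", text).strip()              # collapse stray whitespace
    text = text.replace("Rs.", "₹").replace("INR", "₹")   # unify currency symbols
    return text

def clean_offers(df: pd.DataFrame) -> pd.DataFrame:
    """Drop duplicates and keep only offers that mention a concrete ₹ amount."""
    df = df.drop_duplicates(subset="offer_text").copy()
    df["offer_text"] = df["offer_text"].map(normalize_offer)
    df = df[df["offer_text"].str.contains(r"₹\s*\d", regex=True)]
    return df.reset_index(drop=True)
```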
2. Model-wise Evaluation and Selection
Problem Addressed: Difficulty in determining which fine-tuned model performed best.
Solution:
• Conducted structured evaluation of each model (Mistral-7B-instruct-v0.1/v0.2/v0.3,
LLaMA-3.1-8B, LLaMA-3.2-1B, LLaMA-3.2-3B) using consistent benchmarks.
• Metrics included: accuracy, perplexity, instruction adherence, context retention,
response time, and token efficiency.
• Selected Mistral-7B-instruct-v0.3 as the best-performing model based on its 93.2% accuracy and 94.2% instruction adherence.
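For illustration, perplexity (one of the metrics listed above) can be estimated with the Hugging Face transformers library as sketched below; the checkpoint name and sample text are placeholders rather than the exact evaluation setup used.

```python
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "mistralai/Mistral-7B-Instruct-v0.3"  # placeholder; substitute any fine-tuned checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.float16, device_map="auto")
model.eval()

def perplexity(text: str) -> float:
    """Perplexity = exp(mean negative log-likelihood of the tokens)."""
    enc = tokenizer(text, return_tensors="pt").to(model.device)
    with torch.no_grad():
        loss = model(**enc, labels=enc["input_ids"]).loss
    return math.exp(loss.item())

print(perplexity("Get ₹1,250 cashback on a minimum purchase of ₹30,000."))
```

Lower values indicate more confident token predictions, which is why the 7.5 perplexity of Mistral-7B-instruct-v0.3 reported later in Table 3.1 stands out.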
3. Infrastructure and Execution Strategy
Problem Addressed: Limited compute availability for training large models.
Solution:
• Used Samsung’s lab GPU server for high-memory, resource-heavy training tasks via
secure Tailscale access.
• Conducted evaluation and inference using Google Colab, which helped separate
training and testing workflows.
• Enabled 4-bit quantization with BitsAndBytes to reduce memory usage and speed up
inference.
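A minimal sketch of 4-bit loading with BitsAndBytes for inference is shown below; the checkpoint name, prompt, and generation settings are illustrative and not the exact configuration used on the lab server or in Colab.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit weights to cut GPU memory
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 for faster inference
    bnb_4bit_use_double_quant=True,
)

model_id = "mistralai/Mistral-7B-Instruct-v0.3"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map="auto")

prompt = "Generate a cashback offer for a smartphone with a minimum purchase of ₹30,000."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```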
4. Fine-Tuning Optimization with Hugging Face AutoTrain
Problem Addressed: Managing complex training configurations and multiple model
checkpoints.
Solution:
• Executed Hugging Face AutoTrain locally to allow full control over fine-tuning
parameters (epochs, batch size, warm-up ratio, learning rate scheduler, gradient
norm).
• Implemented parameter-efficient fine-tuning techniques such as:
o Low-Rank Adaptation (LoRA)
o Prefix-tuning
o Adapter layers
• Ensured model reproducibility with defined config files and logging.
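For illustration, the kind of LoRA configuration applied during parameter-efficient fine-tuning can be expressed with the peft library as sketched below; the rank, scaling factor, and target modules are illustrative defaults, not the exact values used in the AutoTrain runs.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.3",  # placeholder checkpoint
    torch_dtype="auto",
)

lora_config = LoraConfig(
    r=16,                                  # low-rank dimension (illustrative)
    lora_alpha=32,                         # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections commonly adapted
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()         # only the adapter weights are trainable
```

Only the low-rank adapter matrices are updated during training, which is what keeps fine-tuning a 7B-parameter model feasible on a single GPU.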
5. Prompt Engineering and Output Validation
Problem Addressed: Inconsistent model responses or failure to follow prompt
instructions.
Solution:
• Designed and iteratively refined structured prompts for each model during evaluation.
• Evaluated generated outputs against the original dataset for instruction compliance and semantic accuracy.
• Used custom formulas to validate generated offer values (e.g., Discount = (50 × Minimum Purchase Value) × K), enabling precise checks of rule adherence during testing.
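For illustration, a structured evaluation prompt of the kind described above could take the form sketched below; the wording and fields are hypothetical, as the exact prompts used during evaluation are not reproduced in this report.

```python
# Hypothetical evaluation-prompt template; the field names and wording are assumptions.
PROMPT_TEMPLATE = (
    "### Instruction:\n"
    "Generate a promotional offer for {product} with a minimum purchase value of ₹{min_purchase}.\n"
    "Follow the format: '₹X cashback on minimum purchase of ₹Y'.\n"
    "### Response:\n"
)

def build_prompt(product: str, min_purchase: int) -> str:
    """Fill the template with concrete values for one test case."""
    return PROMPT_TEMPLATE.format(product=product, min_purchase=f"{min_purchase:,}")

print(build_prompt("a Galaxy smartphone", 30000))
```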
These solutions collectively contributed to building a scalable, accurate, and instruction-
aligned system for personalized promotional content generation using fine-tuned LLMs.
3.2 Experimental Results and Model Evaluation Tables
Table 3.1 Model Comparison Matrix – Mistral vs. LLaMA (Fine-Tuned Variants)
Table 3.1 presents a comprehensive evaluation of six fine-tuned instruction-based LLMs,
including Mistral-7B-instruct variants (v0.1, v0.2, v0.3) and LLaMA models (LLaMA-3.1-8B,
LLaMA-3.2-1B, LLaMA-3.2-3B). The models are compared across eight key performance
metrics that reflect their suitability for real-time, instruction-aligned promotional offer generation
tasks.
Key Observations:
1. Accuracy:
• Mistral-7B-instruct-v0.3 achieved the highest accuracy (93.2%), indicating superior
ability to generate correct, structured outputs.
• LLaMA-3.2-1B recorded the lowest accuracy (88.7%), highlighting the trade-off made for its speed.
2. Perplexity:
• The lowest perplexity score (7.5) was observed in Mistral-7B-instruct-v0.3,
demonstrating confident and fluent token predictions.
• LLaMA-3.2-1B exhibited the highest perplexity (10.2), implying less confident token predictions.
3. Latency & Response Time:
• LLaMA-3.2-1B offered the lowest latency (89 ms) and the fastest response time (3.08 s), making it ideal for speed-sensitive scenarios.
• Mistral-7B-instruct-v0.1 showed the highest latency (120 ms) and the slowest response time (5.84 s).
4. Token Efficiency:
• Mistral-7B-instruct-v0.3 achieved the highest token efficiency (91%), indicating efficient use of input tokens for generating relevant content.
• LLaMA-3.2-1B had the lowest token efficiency (84%).
5. Throughput:
• LLaMA-3.2-1B led with the highest throughput (72 tokens/sec), making it suitable for batch-generation use cases.
• LLaMA-3.1-8B had the lowest throughput (48 tokens/sec), reflecting its computational overhead.
6. Memory & FLOPs:
• LLaMA-3.1-8B required the most memory (18 GB) and compute (5.4T FLOPs), which may limit scalability.
• LLaMA-3.2-1B was the most resource-efficient, using only 9 GB of memory and 2.1T FLOPs.
Table 3.2 Output Matching Evaluation of Fine-Tuned Models Against Dataset
Model            | Response (discount, min. purchase)   | Accuracy  | Response Time (s)
Llama-3.1-8B     | ₹250 (₹7,500 min)                    | Correct   | 4.91
Llama-3.2-1B     | ₹2,500 (₹1,25,000 min)               | Incorrect | 3.08
Llama-3.2-3B     | ₹2,500 (₹75,000 min)                 | Incorrect | 2.28
Mistral-7B-v0.1  | ₹1,200 (₹15,000 min)                 | Correct   | 5.84
Mistral-7B-v0.2  | ₹250 (₹1,25,000 min, 9M EMI)         | Correct   | 5.07
Mistral-7B-v0.3  | ₹1,250 (₹30,000 min, Non-EMI)        | Correct   | 4.76
Table 3.2 presents a qualitative and quantitative evaluation of how closely the generated outputs
from fine-tuned models aligned with real promotional content in the dataset. The aim was to
validate whether the models could replicate human-like, logically structured offers based on
minimal input prompts.
Objective:
The primary objective of this evaluation was to assess the ability of fine-tuned models—
specifically Mistral-7B and LLaMA variants—to generate promotional offers that matched
ground truth entries in terms of discount logic, numeric thresholds, and structural coherence.
Methodology:
• A subset of test prompts from the original dataset was provided to each fine-tuned model.
• Generated outputs were then compared to real promotional entries using two main
criteria:
1. Instruction Adherence: Did the model follow format rules like “Get ₹X off on Y”
or “₹X cashback on minimum purchase of ₹Y”?
2. Numerical Accuracy: Were the values for discounts, purchase limits, and bonus
conditions correctly derived from the rules?
• A formula was also used to calculate the offer value, e.g.:
Discount = (50 × Minimum Purchase Value) × K
where:
o K ≈ 2.5 for EMI transactions
o K ≈ 1.5 for Non-EMI transactions
o an additional ₹150–₹250 applies for long-term EMI (9+ months)
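A minimal sketch of the two checks described above is given below; the regular expressions are approximations of the stated format rules, and the discount formula and K values are taken verbatim from this section.

```python
import re

# Criterion 1: approximate patterns for the offer formats named above.
OFFER_PATTERNS = [
    re.compile(r"Get ₹[\d,]+ off on .+"),
    re.compile(r"₹[\d,]+ cashback on minimum purchase of ₹[\d,]+"),
]

def follows_format(offer_text: str) -> bool:
    """Criterion 1: does the generated offer match one of the expected templates?"""
    return any(p.search(offer_text) for p in OFFER_PATTERNS)

def expected_discount(min_purchase: float, emi: bool, long_term_emi: bool = False) -> float:
    """Criterion 2: expected offer value per the formula above,
    Discount = (50 × Minimum Purchase Value) × K."""
    k = 2.5 if emi else 1.5          # K values as stated in this section
    discount = 50 * min_purchase * k
    if long_term_emi:                # 9+ month EMI adds ₹150–₹250
        discount += 200              # midpoint of the stated range (assumption)
    return discount
```

A tolerance-based comparison of the ₹ amount parsed from a model's output against expected_discount could then yield Correct/Incorrect labels of the kind reported in Table 3.2; the exact tolerance applied during testing is not specified here.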
Analysis:
• Mistral-7B-instruct-v0.3 demonstrated both logical and contextual alignment, generating
a Non-EMI-based discount with precise calculation and condition adherence.
• Llama-3.1-8B showed reliable performance with correct formatting and value estimation.
• Llama-3.2-1B and Llama-3.2-3B generated disproportionately large discount values, indicating weak adherence to conditional logic.
• Mistral-7B-v0.1 and v0.2 produced correctly formatted and context-aware responses,
showing model maturity in understanding instructions and context.
3.3 Screenshots
Figure 3.1 AutoTrain Interface for LLM SFT (Supervised Fine-Tuning) on Hugging Face
Figure 3.1 shows the AutoTrain interface on Hugging Face used for supervised fine-tuning (SFT)
of large language models, highlighting model selection, parameter configuration, and dataset
mapping options.
Figure 3.2 Tokenized dataset preview used for model training, after applying data cleaning and formatting.
Figure 3.2 shows the tokenized dataset preview used for model training, after applying data
cleaning and formatting to structure prompt–response pairs effectively.
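As an illustration of how such a tokenized preview can be produced, the sketch below assembles hypothetical prompt-response pairs into a single text field and tokenizes them with the datasets and transformers libraries; the column names and template are assumptions, not the exact format shown in the figure.

```python
from datasets import Dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.3")  # placeholder checkpoint

# Hypothetical cleaned records with prompt/response columns.
records = [
    {"prompt": "Generate a cashback offer for a smartphone with a minimum purchase of ₹30,000.",
     "response": "₹1,250 cashback on minimum purchase of ₹30,000 (Non-EMI)."},
]

def to_text(example):
    """Concatenate prompt and response into the single text field used for SFT."""
    example["text"] = f"### Instruction:\n{example['prompt']}\n### Response:\n{example['response']}"
    return example

dataset = Dataset.from_list(records).map(to_text)
tokenized = dataset.map(lambda e: tokenizer(e["text"], truncation=True, max_length=512))
print(tokenized[0]["input_ids"][:10])  # preview of the first token IDs
```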
[Figure 3.3 panels, one inference screenshot per model: Mistral-7B v0.1, Mistral-7B v0.2, Mistral-7B v0.3, Llama-3.2-1B, Llama-3.2-3B, Llama-3.1-8B]
Figure 3.3 Google Colab inference results for prompt testing, highlighting model response quality and
formatting adherence.
Figure 3.3 shows Google Colab inference results for prompt testing, highlighting the response
quality, accuracy, and formatting adherence of the fine-tuned models.