
CHAPTER 2

TASK PERFORMED

2.1 Learning Experiences


The internship at Samsung Research Institute – Bangalore (SRI-B), conducted under the PRISM
(Preparing and Inspiring Student Minds) initiative, provided a rigorous, hands-on introduction to
the process of building, fine-tuning, and evaluating instruction-based Large Language Models
(LLMs). This experience enabled direct exposure to enterprise-grade model development
environments and workflows.
Initially, the learning curve was steep due to the technical complexity involved in fine-tuning
transformer-based LLMs. Without prior experience in training high-parameter models,
understanding the intricacies of model architecture, configuration parameters, and the impact of
different quantization techniques was particularly challenging. Furthermore, managing multiple
model variants such as Mistral-7B (versions v0.1, v0.2, and v0.3) and Meta’s LLaMA series
(3.1–8B, 3.2–1B, 3.2–3B) added the further challenge of comparing performance fairly
across these configurations.
A significant part of the learning involved gaining proficiency with Hugging Face’s AutoTrain
framework, which was executed locally rather than via the Hugging Face platform. This
involved configuring the training pipeline manually, including epochs, learning rate schedules,
quantization types (e.g., 4-bit low precision), gradient norms, warm-up ratios, and checkpoint
intervals, all while keeping memory usage within bounds.
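To illustrate the kind of configuration this entails, the following is a minimal sketch using the
Hugging Face Transformers and bitsandbytes APIs. All hyperparameter values are placeholders
chosen for illustration, not the settings actually used during the internship.

# Illustrative sketch only: hyperparameter values are placeholders,
# not the actual settings used during the internship.
import torch
from transformers import (AutoModelForCausalLM, BitsAndBytesConfig,
                          TrainingArguments)

# 4-bit low-precision loading via bitsandbytes
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.3",   # one of the variants studied
    quantization_config=bnb_config,
    device_map="auto",
)

# Manually configured training pipeline: epochs, LR schedule,
# warm-up ratio, gradient norm, and checkpoint interval
training_args = TrainingArguments(
    output_dir="offer-finetune",
    num_train_epochs=3,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    max_grad_norm=1.0,
    save_steps=500,                  # checkpoint interval
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,   # keeps peak memory usage low
)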
Additionally, the internship required secure remote access to Samsung’s internal training
infrastructure, which was achieved using Tailscale. Learning to operate within this environment
demanded a working knowledge of virtual environments, CUDA GPU resource allocation, and
distributed training using frameworks like DeepSpeed.
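As a rough sketch of how DeepSpeed plugs into this stack, the Hugging Face Trainer accepts a
DeepSpeed configuration directly; the ZeRO stage and values below are illustrative assumptions,
not Samsung’s internal settings.

# Illustrative only: a minimal DeepSpeed ZeRO stage-2 configuration.
# The actual SRI-B distributed-training setup is not reproduced here.
from transformers import TrainingArguments

ds_config = {
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
    "zero_optimization": {"stage": 2},  # shard optimizer state and gradients
}

args = TrainingArguments(
    output_dir="offer-finetune",
    deepspeed=ds_config,  # Trainer forwards this dict to DeepSpeed
)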
Despite these challenges, consistent guidance from Samsung researchers, structured
documentation, and weekly review sessions helped accelerate the learning process. The outcome
was a progressively deeper understanding of end-to-end model training and deployment
pipelines, grounded in both theory and application.


2.2 Knowledge Acquired


The internship provided a broad and in-depth understanding of the technical, operational, and
methodological aspects of LLM fine-tuning for real-world tasks. Key areas of knowledge
acquired include:
1. Fine-Tuning of Instruction-Based LLMs:
o Gained hands-on experience in fine-tuning models like Mistral-7B and LLaMA
using web-crawled datasets of over 13,000 real promotional offers.
o Implemented parameter-efficient tuning strategies such as Low-Rank Adaptation
(LoRA), prefix tuning, and adapter layers (a minimal LoRA sketch appears at the
end of this section).
o Understood and applied 4-bit quantization using BitsAndBytes to reduce memory
usage and enhance computational efficiency.
2. Dataset Engineering:
o Learned to collect raw promotional data via automated web crawling tools such as
Selenium and BeautifulSoup4.
o Applied data-cleaning techniques to remove noise, duplicates, and formatting
inconsistencies.
o Split datasets into training, validation, and testing subsets (80:10:10) for robust
model evaluation.
3. Model Evaluation and Benchmarking:
o Developed proficiency in evaluating model performance using metrics such as the
following (a perplexity sketch appears at the end of this section):
▪ Accuracy (% of correct outputs)
▪ Perplexity (confidence in next-token prediction)
▪ Response Time (in seconds)
▪ Instruction Adherence (%)
▪ Context Retention (%)
▪ Token Efficiency (% of useful tokens used)
▪ Latency and Throughput (tokens/sec)
o Compared model versions using structured evaluation matrices to determine the
best model for deployment. Mistral-7B-instruct-v0.3 emerged as the top-
performing model with 93.2% accuracy and 94.2% instruction adherence.


4. Infrastructure and Deployment:
o Understood the process of setting up local training environments using virtual
environments, PyTorch, Hugging Face Transformers, AutoTrain Advanced, and
CUDA.
o Leveraged Tailscale for remote access to secure Samsung R&D servers.
o Used Google Colab for final output validation and prompt testing due to its
flexible GPU runtime.
5. Research-Oriented Skills:
o Understood how to structure research workflows for reproducibility and
scalability.
o Learned to write structured training algorithms and code for fine-tuning
configurations.
o Acquired foundational knowledge in prompt engineering and its role in model
behavior optimization.
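As referenced in item 1 above, the following is a minimal LoRA sketch using the peft library;
the rank, alpha, and target modules are common illustrative defaults, not the exact values used
during the internship.

# Hypothetical LoRA configuration (illustrative defaults only)
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=16,                                 # low-rank adapter dimension
    lora_alpha=32,                        # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projection matrices
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)  # base_model: a loaded causal LM
model.print_trainable_parameters()               # only adapter weights are trained

Similarly, as referenced in item 3, perplexity can be derived from the model’s average
cross-entropy loss. This sketch assumes a Hugging Face causal LM and is not the internship’s
actual evaluation script.

# Perplexity = exp(mean cross-entropy loss) over a text sample
import math
import torch

def perplexity(model, tokenizer, text):
    enc = tokenizer(text, return_tensors="pt").to(model.device)
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])
    return math.exp(out.loss.item())  # lower = more confident next-token prediction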

2.3 Skills Learned


The internship under Samsung PRISM fostered a wide range of technical and analytical skills
essential for working in the domain of AI and Natural Language Processing (NLP). These
include:
1. AI Model Fine-Tuning with Hugging Face AutoTrain:
o Developed the capability to configure and fine-tune instruction-based large
language models (LLMs) such as Mistral-7B and LLaMA variants using Hugging
Face’s AutoTrain framework (executed locally).
o Gained experience in manually tuning hyperparameters such as learning rate,
number of epochs, warm-up ratio, quantization configuration (e.g., 4-bit), and
gradient clipping norms.
2. Data Engineering and Preprocessing:
o Learned to build a structured, custom dataset of over 13,000 smartphone-related
promotional offers from raw, unstructured web data using tools such as Selenium
and BeautifulSoup (a minimal crawling-and-splitting sketch appears at the end of
this section).
o Applied cleaning, deduplication, normalization, and tokenization strategies to
ensure dataset quality and consistency.
3. Model Evaluation and Metrics Benchmarking:
o Became proficient in evaluating models against key performance indicators such as:
▪ Accuracy
▪ Perplexity
▪ Instruction Adherence
▪ Response Time
▪ Token Efficiency
▪ Latency (ms)
▪ Throughput (tokens/sec)
▪ Context Retention
o Built structured comparison matrices to identify optimal models.
4. Infrastructure & Tooling:
o Used secure remote access tools like Tailscale to connect to Samsung's internal
GPU-enabled lab servers.
o Implemented models in virtualized Python environments with dependencies
including transformers, accelerate, bitsandbytes, and DeepSpeed.
5. Prompt Engineering:
o Acquired the ability to design and validate input prompts tailored for specific
outcomes, such as generating promotional offers with contextual and numerical
precision (an example template appears at the end of this section).
6. Collaboration and Research Communication:
o Learned to document training logs, evaluation results, and insights in a format
suitable for both technical presentations and academic papers.
o Engaged in weekly sync-ups with mentors for progress reviews and technical
guidance.
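As referenced in item 2, the sketch below illustrates the crawling, cleaning, and 80:10:10
splitting workflow; it uses requests in place of Selenium for brevity, and the URL and CSS
selector are hypothetical placeholders, not the sources actually crawled.

# Hypothetical crawling-and-splitting sketch; URL and selector are placeholders.
import random
import requests
from bs4 import BeautifulSoup

resp = requests.get("https://example.com/offers")       # placeholder URL
soup = BeautifulSoup(resp.text, "html.parser")
offers = [el.get_text(strip=True) for el in soup.select(".offer-card")]

# Basic cleaning: drop empty entries and duplicates, normalize whitespace
offers = sorted({" ".join(o.split()) for o in offers if o})

# 80:10:10 split into train / validation / test subsets
random.seed(42)
random.shuffle(offers)
n = len(offers)
train = offers[: int(0.8 * n)]
val = offers[int(0.8 * n): int(0.9 * n)]
test = offers[int(0.9 * n):]

And as referenced in item 5, an instruction prompt template of the kind used for offer
generation might look like the following; the wording and placeholders are illustrative, not the
actual internship prompt.

# Illustrative instruction prompt template for offer generation
OFFER_PROMPT = (
    "### Instruction:\n"
    "Generate a promotional offer for the product below. "
    "Keep all discount percentages and prices numerically consistent.\n"
    "### Input:\nProduct: {product}\nDiscount: {discount}%\n"
    "### Response:\n"
)
prompt = OFFER_PROMPT.format(product="Galaxy S24", discount=15)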


2.4 The Most Challenging Task Performed


The most demanding and technically intensive task during the internship was designing,
curating, and validating a completely novel, high-volume dataset for fine-tuning LLMs in the
domain of promotional offer generation.
Key Challenges:
• Dataset Design: The internship required creating a domain-specific dataset from scratch,
involving over 13,000 unique promotional offers across varied formats and product types.
The complexity lay in ensuring uniformity, relevance, and semantic richness across
entries.
• Model Compatibility: Adapting multiple models (e.g., Mistral and LLaMA series) to this
dataset proved difficult, as each model demanded specific input formatting, tokenization
styles, and batch size optimizations.
• Training Constraints: Operating in a constrained compute environment meant limited
access to high-memory GPUs. Training had to be memory-efficient, which required
integrating low-bit quantization (4-bit) and reducing batch sizes without compromising
performance.
• Evaluation Complexity: Comparing multiple models across several metrics, while
ensuring each metric was evaluated fairly and consistently, required scripting custom
evaluation routines and setting up structured validation pipelines (a simplified sketch
follows at the end of this section).
Despite these hurdles, the task resulted in the successful training and benchmarking of multiple
LLM variants, and the development of an efficient pipeline for dynamic, instruction-based offer
generation.
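A simplified sketch of such a custom evaluation routine is shown below; the metric functions,
the generate_offer method, and the candidate list are hypothetical stand-ins for the actual
validation pipeline.

# Simplified sketch of a structured evaluation routine; metric_fns and
# generate_offer are hypothetical stand-ins for the real pipeline.
def evaluate_models(candidates, test_set, metric_fns):
    """Build a model-by-metric comparison matrix."""
    matrix = {}
    for name, model in candidates.items():
        outputs = [model.generate_offer(example) for example in test_set]
        matrix[name] = {m: fn(outputs, test_set) for m, fn in metric_fns.items()}
    return matrix

# Example selection rule: pick the candidate with the highest accuracy
# best = max(matrix, key=lambda name: matrix[name]["accuracy"])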

2.5 Problems Identified


Throughout the internship, several key challenges and bottlenecks were encountered that
required technical resolution or strategic trade-offs:
1. Memory Constraints:
o High memory usage during model training and checkpoint storage was a recurring
issue. Efficient memory handling became necessary, especially when training
7B+ parameter models.

o Solution: Employed 4-bit quantization and gradient accumulation to reduce
memory load.
2. Model Compatibility Issues:
o Some models (e.g., specific LLaMA variants) were not fully compatible with
default AutoTrain configurations.
o Solution: Custom fine-tuning scripts and manual parameter overrides were
implemented to support these models.
3. Dataset Standardization:
o Maintaining a consistent format across thousands of offer templates proved
challenging.
o Solution: Implemented rigorous preprocessing steps for token alignment,
placeholder normalization, and semantic validation.
4. Inference Bottlenecks:
o Resource limitations on platforms like Google Colab caused execution
slowdowns during inference benchmarking, particularly for batch evaluations.
o Solution: Split inference jobs and ran serial evaluations to bypass batch size
limits (a serial-inference sketch appears after this list).
5. Instruction Adherence and Prompt Variability:
o Models occasionally ignored detailed instructions or provided structurally
incorrect outputs.
o Solution: Applied prompt engineering techniques and reinforced training on well-
structured examples to improve model alignment.
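As referenced in item 4, serial evaluation can be as simple as generating one prompt at a time
while recording per-prompt latency. The sketch below assumes a Hugging Face causal LM and
tokenizer and is illustrative only.

# Serial (one-at-a-time) inference sketch to sidestep batch-size limits;
# illustrative only, assuming a Hugging Face causal LM and tokenizer.
import time
import torch

def run_serial(model, tokenizer, prompts, max_new_tokens=128):
    results = []
    for prompt in prompts:
        inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
        start = time.perf_counter()
        with torch.no_grad():
            out = model.generate(**inputs, max_new_tokens=max_new_tokens)
        latency = time.perf_counter() - start  # per-prompt latency in seconds
        text = tokenizer.decode(out[0], skip_special_tokens=True)
        results.append((text, latency))
    return results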
