Prompting
Sunday, October 20, 2024 5:01 PM
- A prompt is a text string that a user issues to a language model to get the model to do something
useful. Thus, the prompt creates a context that guides LLMs to generate useful outputs to achieve
some user goal.
○ The process of finding effective prompts for a task is known as prompt engineering.
- With suitable additions to the context a single LLM can produce outputs appropriate for many
different tasks.
E.g. 1: Given a single review as input, suitable prompts can elicit:
○ A summary
○ Whether the review was truthful or likely to have been fabricated
○ A translation to another language
- Each template consists of an input text, designated as {input}, followed by a verbatim prompt to be
passed to an LLM. These templates are applied to inputs to create filled prompts - instantiated
prompts suitable for use as inputs to an LLM.
- Consider the following example:
Translate English to French:
Did not like the service that I was provided!
This prompt doesn't do a good job of constraining possible continuations. Instead of a French
translation, models given this prompt may simply generate another sentence in English that
extends the review.
=> Prompts need to be designed unambiguously, so that any reasonable continuation would
accomplish the desired task.
E.g.2:
- We can also prompt the system to break down a complex task, using methods like chain-of-
thought. Or we may want to restrict a summary to a particular length, or to have an answer
generated according to some kind of persona or role.
- In summary, we prompt an LM by transforming each task into a form that is amenable to
contextual generation by an LLM, as follows:
1. For a given task, develop a task-specific template that has a free parameter for the input
text
2. Given an input and the task-specific template, instantiate a filled prompt, which is then
passed to a pretrained language model
3. Autoregressive decoding is then used to generate a sequence of token outputs.
4. The output of the model can either be used directly as the desired output (as in the case of
naturally generative tasks such as translation or summarization), or a task-appropriate
answer can be extracted from the generated output (as in the case of classification).
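- To make these four steps concrete, here is a minimal Python sketch for a sentiment-classification task. The template text, the llm_generate callable, and the answer-extraction rule are illustrative assumptions, not any particular library's API:

    from typing import Callable

    # Step 1: a task-specific template with a free parameter for the input text.
    SENTIMENT_TEMPLATE = "{input}\nIn short, the sentiment of this review was "

    def fill_prompt(template: str, input_text: str) -> str:
        # Step 2: instantiate the template with the input to get a filled prompt.
        return template.format(input=input_text)

    def run_task(input_text: str, llm_generate: Callable[[str], str]) -> str:
        prompt = fill_prompt(SENTIMENT_TEMPLATE, input_text)
        # Step 3: autoregressive decoding generates a continuation of the prompt.
        output = llm_generate(prompt)
        # Step 4: for classification, extract a task-appropriate answer from the output;
        # for generative tasks, the output could be used directly.
        return "positive" if "positive" in output.lower() else "negative"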
A. Learning from Demonstrations: Few-Shot Prompting
- We can improve a prompt by including some labeled examples in the prompt template:
demonstrations
○ The technique of prompting with examples is sometimes called few-shot prompting, as
contrasted with zero-shot prompting, in which the prompt contains instructions but no
labeled examples.
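- As a sketch, demonstrations can be prepended to the same kind of template to build a few-shot prompt. The demonstration format below (filled template followed by its label, examples separated by blank lines) is an illustrative assumption:

    def build_few_shot_prompt(demonstrations, test_input, template):
        # demonstrations: a list of (input_text, label) pairs used as labeled examples.
        parts = []
        for demo_input, demo_label in demonstrations:
            # Each demonstration is a filled template followed by its correct answer.
            parts.append(template.format(input=demo_input) + demo_label)
        # The test input is filled into the same template but left unanswered.
        parts.append(template.format(input=test_input))
        return "\n\n".join(parts)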
- How many demonstrations?
The number of demonstrations doesn't have to be large. A small number of randomly selected
labeled examples used as demonstrations can be sufficient to improve performance over the
zero-shot setting. Indeed, the largest performance gains in few-shot prompting tend to come from
the first training example, with diminishing returns for subsequent demonstrations.
- Why isn't it useful to have more demonstrations?
The reason is that the primary benefit of demonstrations is to show the LLM the task to be
performed and the format of the sequence, not to provide relevant information about the right
answer for any particular question.
In fact, demonstrations that have incorrect answers can still improve a system. Adding too many
examples also seems to cause the model to overfit to details of the exact examples chosen and
generalize poorly.
- How to Select Demonstrations?
Generally, the best way to select demonstrations from the training set is programmatically:
choosing the set of demonstrations that most increases task performance of the prompt on a test
set. Task performance can be measured with accuracy for sentiment analysis or multiple-choice
question answering, with chrF for machine translation, and with ROUGE for summarization.
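- A minimal sketch of such programmatic selection, assuming a small labeled dev set, an accuracy metric, and stand-in llm_generate and extract_answer callables (build_few_shot_prompt is the helper sketched earlier):

    import random

    def score_demo_set(demos, dev_set, template, llm_generate, extract_answer):
        # Score a candidate demonstration set by the accuracy of the resulting prompt.
        correct = 0
        for dev_input, gold in dev_set:
            prompt = build_few_shot_prompt(demos, dev_input, template)
            correct += int(extract_answer(llm_generate(prompt)) == gold)
        return correct / len(dev_set)

    def select_demonstrations(train_set, dev_set, template, llm_generate,
                              extract_answer, num_demos=4, num_trials=20):
        # Try several randomly sampled demonstration sets and keep the best-scoring one.
        best_demos, best_score = None, -1.0
        for _ in range(num_trials):
            demos = random.sample(train_set, num_demos)
            score = score_demo_set(demos, dev_set, template, llm_generate, extract_answer)
            if score > best_score:
                best_demos, best_score = demos, score
        return best_demos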
B. In-Context Learning and Induction Heads
- Learning via pretraining means updating the model's parameters using gradient descent over
some loss function. By contrast, prompting with demonstrations can teach a model to do a new task
without any parameter updates.
- We use the term in-context learning to refer to the kinds of learning that language models do from
their prompts. In-context learning means language models learning to do new tasks, better predict
tokens, or generally reduce their loss, but without any gradient-based updates to the model's
parameters.
- How does in-context learning work?
○ One hypothesis is based on the idea of induction heads. The induction head circuit is part of
the attention computation in transformers, discovered by looking at mini language models
with only 1-2 attention heads.
○ The function of the induction head is to predict repeated sequences. For example, if it sees
the pattern AB...A in an input sequence, it predicts that B will follow, instantiating the
pattern completion rule AB...A → B.
It does this by having a prefix matching component of the attention computation that, when
looking at the current token A, searches back over the context to find a prior instance of A. If
it finds one, the induction head has a copying mechanism that “copies” the token B that
followed the earlier A, by increasing the probability that B will occur next.
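- As a toy illustration of the pattern-completion rule itself (plain token matching, not the actual attention-head computation):

    def induction_head_prediction(tokens):
        # If the current token A appeared earlier as part of "... A B ...", predict B.
        current = tokens[-1]
        # Prefix matching: search back over the context for a prior instance of the current token.
        for i in range(len(tokens) - 2, -1, -1):
            if tokens[i] == current:
                # Copying: predict the token that followed the earlier occurrence.
                return tokens[i + 1]
        return None

    # Having seen the pattern A B ... A, the rule completes it with B.
    print(induction_head_prediction(["A", "B", "C", "D", "A"]))  # -> "B"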
• Further Readings:
Post-training and Model Alignment
Saturday, October 26, 2024 10:07 PM
- There are limits to how much can be expected from a model whose sole training objective is to
predict the next word from large amounts of pretraining text.
- To see this, consider the following example:
○ Here, the LLM ignores the intent of the request and relies instead on its natural inclination
to autoregressively generate continuations consistent with its context.
○ In the first example, it outputs text somewhat similar to the original request, and in the
second it provides a continuation of the given input, ignoring the request to translate. LLMs
are not sufficiently helpful: they need extra training to increase their abilities to follow
textual instructions.
- A deeper problem is that LLMs can simultaneously be harmful. For example, they can
generate text that is false, including unsafe misinformation, or text that is toxic in many ways,
such as facilitating the spread of hate speech. Even completely non-toxic prompts can lead LLMs
to output hate speech and abuse their users.
=> In an attempt to address these two problems, LMs generally include two additional kinds of training
for model alignment: methods designed to adjust LLMs to better align them with human needs, i.e., to be
helpful and not harmful.
○ Instruction Tuning (SFT for supervised finetuning): models are finetuned on a corpus of
instructions and questions with their corresponding responses.
○ Preference Alignment: Reinforcement Learning from Human Feedback (RLHF)
- We'll use the term base model to mean a model that has been pretrained but hasn't yet been
aligned either by instruction tuning or RLHF. And we'll refer to these steps as post-training,
meaning that they apply after the model has been pretrained.
Model Alignment: Instruction Tuning
Sunday, October 27, 2024 9:38 AM
- Instruction Tuning: is a method for making an LLM better at following instructions. It involves
taking a base pretrained LLM and training it to follow instructions for a range of tasks, from
Machine Translation to Meal Planning.
○ The resulting model engages in a form of meta-learning - it improves its ability to follow
instructions generally.
- Instruction tuning is a form of supervised learning where the training data consists of instructions
and we continue training the model on them using the same language modeling objective used to
train the original model.
- Each instruction or question in the instruction tuning data has a supervised objective: a correct
answer to the question or a response to the instruction.
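- A minimal sketch of what one instruction-tuning example might look like once formatted as a single training sequence; the header strings and field names below are illustrative assumptions, since the exact format varies across datasets:

    def format_instruction_example(instruction: str, response: str) -> str:
        # The instruction and its correct response are concatenated into one sequence;
        # the model is then trained on this text with the usual next-token-prediction objective.
        return f"### Instruction:\n{instruction}\n\n### Response:\n{response}"

    example = format_instruction_example(
        "Translate the following sentence into French: I did not like the service.",
        "Je n'ai pas aimé le service.",
    )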
A. Instructions as Training Data
- The instruction-tuning datasets are created in 4 ways:
○ Write the instances directly, e.g., the Aya instruction finetuning corpus.
○ Make use of the copious amounts of supervised training data that have been curated over
the years for a wide range of natural language tasks, e.g., the SQuAD dataset. These data can be
automatically converted into sets of instruction prompts and input/output demonstration
pairs via simple templates (a sketch of such a conversion follows this list).
○ Because supervised NLP datasets are themselves often produced by crowdworkers based on
carefully written annotation guidelines, we can draw on these guidelines, which can include
detailed step-by-step instructions, pitfalls to avoid, formatting instructions, etc. These
annotation guidelines can be used directly as prompts to a language model to create
instruction-tuning training examples
○ Use language models to help at each stage. For example:
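- Returning to the template-based conversion of existing supervised data described above, here is a sketch of how a SQuAD-style (context, question, answer) example might be turned into instruction/response pairs; the template wordings and field names are illustrative assumptions:

    QA_TEMPLATES = [
        "Answer the question using the passage below.\n\nPassage: {context}\nQuestion: {question}",
        "Read the passage and answer the question.\n\n{context}\n\nQ: {question}\nA:",
    ]

    def squad_to_instructions(example: dict) -> list:
        # Each supervised example yields one instruction/response pair per template.
        pairs = []
        for template in QA_TEMPLATES:
            prompt = template.format(context=example["context"], question=example["question"])
            pairs.append({"instruction": prompt, "response": example["answer"]})
        return pairs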
B. Evaluation of Instruction-Tuned Models
- In assessing instruction-tuning methods, we need to measure how well an instruction-tuned model
performs on novel tasks for which it has not been given explicit instructions.
- Leave-one-out approach: instruction-tune a model on some large set of tasks and then assess it
on a withheld task.
○ Problem: The enormous number of tasks in instruction-tuning datasets often overlap.
- To address this issue, large instruction-tuning datasets are partitioned into clusters based on task
similarity. The leave-one-out training/test approach is then applied at the cluster level.
For example:
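A minimal sketch of the cluster-level leave-one-out split, assuming the instruction-tuning tasks have already been grouped into clusters (the cluster and task names below are invented for illustration):

    def leave_one_cluster_out(task_clusters: dict, held_out_cluster: str):
        # Train on every task from every other cluster; evaluate on the held-out cluster.
        train_tasks = [task for name, tasks in task_clusters.items()
                       if name != held_out_cluster for task in tasks]
        test_tasks = task_clusters[held_out_cluster]
        return train_tasks, test_tasks

    clusters = {
        "sentiment": ["sst2", "imdb"],
        "nli": ["mnli", "rte"],
        "summarization": ["xsum", "cnn_dailymail"],
    }
    train_tasks, test_tasks = leave_one_cluster_out(clusters, held_out_cluster="nli")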
Chain-of-Thought Prompting
Sunday, October 27, 2024 10:34 AM
- Goal: improve performance on difficult reasoning tasks that language models tend to fail on.
- Each of the demonstrations in the few-shot prompt is augmented with text spelling out the
reasoning steps. We want the model to output similar kinds of reasoning steps for the problem
being solved, and for the output of those reasoning steps to cause the system to generate the
correct answer.
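- As a sketch, one chain-of-thought demonstration inside a few-shot prompt template might look like the following (the worked arithmetic example and its reasoning text are purely illustrative):

    COT_PROMPT = (
        "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
        "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
        "A: Roger started with 5 balls. 2 cans of 3 balls each is 6 balls. "
        "5 + 6 = 11. The answer is 11.\n\n"
        "Q: {question}\n"
        "A:"
    )
    # The model is expected to imitate the reasoning steps before giving its final answer.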
Automatic Prompt Optimization
Sunday, October 27, 2024 10:38 AM
- Given a prompt for a task (human or computer generated), prompt optimization methods search
for prompts with improved performance. Most of these approaches can be viewed as a form of
iterative improvement search through a space of possible prompts for those that optimize
performance on a task.
- These approaches all share the following components:
○ A start state - An initial human or machine generated prompt or prompts suitable for some
task
○ A scoring metric - A method for assessing how well a given prompt performs on the task
○ An expansion method - A method for generating variations of a prompt.
- Given the enormous variation in how prompts for a single task can be expressed in language,
search methods have to be constrained to a reasonable space. Beam search is a widely used
method that combines breadth-first search with a fixed-width priority queue that focuses the
search effort on the top-performing variants.
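- A minimal sketch of that beam-search loop over prompt candidates; score_prompt and expand_prompt stand in for the scoring and expansion methods discussed below, and the beam width and iteration count are arbitrary:

    def beam_search_prompts(initial_prompts, score_prompt, expand_prompt,
                            beam_width=4, num_iterations=3):
        # Keep only the top-scoring prompts (the beam) at each step of the search.
        beam = sorted(initial_prompts, key=score_prompt, reverse=True)[:beam_width]
        for _ in range(num_iterations):
            candidates = list(beam)
            for prompt in beam:
                # Expansion: generate variants of each prompt currently on the beam.
                candidates.extend(expand_prompt(prompt))
            # Scoring: rank all candidates and keep the best beam_width of them.
            beam = sorted(set(candidates), key=score_prompt, reverse=True)[:beam_width]
        return beam[0]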
A. Candidate Scoring
- Candidate scoring methods assess the likely performance of potential prompts, both to identify
promising avenues of search and to prune those that are unlikely to be effective. Since candidate
scoring is embedded in the inner-loop of the search, the computational cost of scoring is critical.
- Given access to labeled training data, candidate prompts can be scored based on execution
accuracy. In this approach, candidate prompts are combined with inputs sampled from the
training data and passed to an LLM for decoding. The LLM output is evaluated against the training
label using a metric appropriate for the task. In the case of classification-based tasks, this is
effectively a 0/1 loss — how many examples were correctly labeled with the given prompt.
Generative applications such as summarization or translation use task-specific similarity scores
such as BERTScore, BLEU, or ROUGE.
- Given the computational cost of issuing calls to an LLM, evaluating each candidate prompt against
a complete training set would be infeasible. Instead, prompt performance is estimated from a
small sample of training data.
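- A sketch of execution-accuracy scoring on a small random sample of the training data; llm_generate, extract_answer, and the sample size are stand-ins:

    import random

    def execution_accuracy(prompt_template, train_set, llm_generate, extract_answer,
                           sample_size=20):
        # Estimate prompt quality from a small sample rather than the full training set.
        sample = random.sample(train_set, min(sample_size, len(train_set)))
        correct = 0
        for input_text, gold_label in sample:
            output = llm_generate(prompt_template.format(input=input_text))
            # 0/1 loss for classification-style tasks: count matches with the gold label.
            correct += int(extract_answer(output) == gold_label)
        return correct / len(sample)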
B. Prompt Expansion
- Prompt expansion generates variants of a given prompt to create an expanded set of neighboring
prompts that may improve performance over the original.
- A common method is to use the language model to create paraphrases:
- A variation of this method is to truncate the current prompt at a set of random locations,
generating a set of prompt prefixes. The paraphrasing LLM is then asked to continue each of the
prefixes to generate a complete prompt. => This is an uninformed search.
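- A sketch of these two expansion variants; paraphrase_prompt and continue_prefix stand in for calls to the paraphrasing LLM:

    import random

    def expand_by_paraphrase(prompt, paraphrase_prompt, num_variants=4):
        # Ask the LLM to reword the whole prompt several times.
        return [paraphrase_prompt(prompt) for _ in range(num_variants)]

    def expand_by_truncation(prompt, continue_prefix, num_variants=4):
        variants = []
        for _ in range(num_variants):
            # Truncate the current prompt at a random location to get a prefix ...
            cut = random.randint(1, max(1, len(prompt) - 1))
            prefix = prompt[:cut]
            # ... and ask the LLM to continue that prefix into a complete prompt.
            variants.append(prefix + continue_prefix(prefix))
        return variants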
- By contrast, Prasad et al. employ a candidate expansion technique that explicitly attempts to generate
superior prompts during the expansion process:
In this approach, the current candidate is first applied to a sample of training examples using the
execution accuracy approach. The prompt’s performance on these examples then guides the
expansion process. Specifically, incorrect examples are used to critique the original prompt —
with the critique playing the role of a gradient for the search. The method includes the following
steps:
1. Run the prompt on a sample of training examples
2. Identify examples where the prompt fails
3. Ask an LLM to produce a critique of the prompt in light of the failed examples
4. Provide the resulting critique to an LLM, and ask it to generate improved prompts.
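- A sketch of that critique-driven loop; the instruction strings passed to the LLM are illustrative, llm_generate stands in for a model call, and the candidate prompt is assumed to contain an {input} slot:

    def critique_and_improve(prompt, train_sample, llm_generate, extract_answer,
                             num_new_prompts=3):
        # Steps 1-2: run the prompt on sampled training examples and collect failures.
        failures = []
        for input_text, gold in train_sample:
            output = llm_generate(prompt.format(input=input_text))
            if extract_answer(output) != gold:
                failures.append((input_text, gold, output))
        if not failures:
            return [prompt]
        # Step 3: ask an LLM to critique the prompt in light of the failed examples.
        report = "\n".join(f"Input: {i}\nExpected: {g}\nGot: {o}" for i, g, o in failures[:5])
        critique = llm_generate(
            f"The prompt below made the following errors.\n\nPrompt: {prompt}\n\n"
            f"Errors:\n{report}\n\nExplain what is wrong with the prompt.")
        # Step 4: give the critique to an LLM and ask it to generate improved prompts.
        return [llm_generate(
            f"Prompt: {prompt}\n\nCritique: {critique}\n\n"
            f"Write an improved prompt that fixes these problems.")
            for _ in range(num_new_prompts)]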
Evaluating Prompted Language Models
Sunday, October 27, 2024 11:26 AM
Model Alignment with Human Preferences: RLHF and DPO
Sunday, October 27, 2024 11:28 AM