Assignment 4
Title: systematic literature search using the search log table and the detailed LR table
Topic: Large Language Models (LLMs)
Submitted by: Nader Eltayeb Yousif Eltayeb (016-614)
Purpose:
To explore the development, applications, and impacts of Large Language Models (LLMs) in
various fields such as natural language processing, artificial intelligence, and data science.
Objectives:
1. To identify key advancements in the development of LLMs.
2. To review the different applications of LLMs in industry and academia.
3. To evaluate the challenges and limitations associated with LLMs.
4. To understand the future trends and research directions in the field of LLMs.
Scope: From year 2018 to 2024.
Reference No. of Search No. of
Type Citations Results No. Included
No. Date/Time Keyword References
1 16-6-2024 "Large Journal 3200 1000 5
Language Article
Models"
2 16-6-2024 "GPT-3 Conference 2500 900 4
Applications" Paper
3 16-6-2024 "BERT Model" Journal 4500 800 3
Article
4 16-6-2024 "Transformer Technical 3000 750 2
Models" Report
5 16-6-2024 "AI Language Journal 4000 850 3
Models" Article
Future Your
No. Citation Title Type Cited Aim Method Findings Work Comment
by
No.
1 Brown et Language Journal 2500 To Experim GPT-3 Explore Included for
al. Models are Article investigate ental achieves the ethical comprehensive
(2020) Few-Shot the state-of-the-a implicatio analysis of
Learners capabilities rt results on ns of GPT-3
of GPT-3 many NLP LLMs
tasks
2 Devlin et BERT: Confere 1500 To present Experim BERT Optimize Included for its
al. Pre-training of nce BERT, a ental improves training foundational
(2019) Deep Paper new LLM performance efficiency role in LLM
Bidirectional on various development
Transformers for NLP
Language benchmarks
Understanding
3 Radford Language Technica 3200 To explore Experim GPT-2 Address Crucial for
et al. Models are l Report GPT-2's ental performs scalability understanding
(2019) Unsupervised multitasking multiple issues multitask
Multitask abilities tasks with learning
Learners without minimal
fine-tuning supervision
4 Vaswani Attention is All Confere 4500 To introduce Theoreti Transformers Improve Essential for
et al. You Need nce the cal outperform attention LLM
(2017) Paper Transformer RNNs on mechanis architecture
model translation ms
tasks
5 Raffel et Exploring the Journal 900 To propose T5 achieves Enhance Important for
al. Limits of Article T5, a unified Experime SOTA results model unified NLP
(2020) Transfer text-to-text on many efficiency models
Learning with a transformer NLP
Unified model benchmarks
Text-to-Text
Transformer
6 Liu et al. RoBERTa: A Technica 700 To optimize RoBERTa Further Key for
(2019) Robustly l Report BERT's Exper outperforms optimize understanding
Optimized BERT pretraining iment BERT on training optimization in
Pretraining process al GLUE and data LLMs
Approach RACE
benchmarks
7 Yang et XLNet: Confere 800 To introduce XLNet Integrate Relevant for
al. Generalized nce XLNet, Exper surpasses with other hybrid LLM
(2019) Autoregressive Paper combining iment BERT in models approaches
Pretraining for autoregressi al several
Language ve and benchmarks
Understanding autoencodin
g models
8 Lan et al. XLNet: Journal 600 To present a ALBERT Explore Significant for
(2020) Generalized Article lighter Exper reduces compressi efficient LLMs
Autoregressive version of iment parameters on
Pretraining for BERT al while techniques
Language maintaining
Understanding performance
9 Clark et ELECTRA: Confere 400 To propose ELECTRA Refine Important for
al. Pre-training Text nce ELECTRA Exper achieves discrimina efficient
(2020) Encoders as Paper for more iment comparable tor training
Discriminators efficient al results with training methods
Rather Than pretraining fewer
Generators resources
10 Radford Improving Technica 1200 To introduce GPT Expand Foundational
et al. Language l Report GPT and its Exper improves generative for generative
(2018) Understanding by generative iment performance capabilitie LLMs
Generative pre-training al on s
Pre-Training method downstream
tasks
11 Shoeybi Megatron-LM: Confere 500 To discuss Megatron-L Address Crucial for
et al. Training nce training Exper M effectively parallelis scalability of
(2019) Multi-Billion Paper large models iment trains very m LLMs
Parameter using model al large models challenges
Language parallelism
Models Using
Model
Parallelism
12 He et al. DeBERTa: Confere 300 To enhance DeBERTa Further Innovative
(2020) Decoding-enhanc nce BERT with Exper improves improve approach to
ed BERT with Paper disentangled iment upon BERT attention attention in
Disentangled attention al on various disentangl LLMs
Attention mechanisms tasks ement
13 Lewis et BART: Journal 700 To propose BART excels Enhance Important for
al. Denoising Article BART for Exper at text denoising sequence-to-seq
(2020) Sequence-to-Seq sequence-to- iment generation techniques uence learning
uence sequence al and
Pre-training for tasks comprehensi
Natural on
Language
Generation,
Translation, and
Comprehension
14 Yang et FNet: Mixing Journal 150 To introduce FNet shows Explore Relevant for
al. Tokens with Article FNet, an Exper promising more LLM efficiency
(2020) Fourier efficient iment results with efficient
Transforms alternative al reduced architectur
to computation es
transformers
15 Zaheer et Big Bird: Confere 250 To address Big Bird Improve Key for
al. Transformers for nce the handles long-sequ long-context
(2020) Longer Paper limitation of longer ence LLM
Sequences transformers contexts processing applications
with long efficiently
sequences