Generative AI Interview Flashcards

The document provides a series of flashcard-style questions and answers about generative AI, focusing on concepts such as GPT, Transformers, prompt engineering, and model evaluation. Key distinctions between techniques like fine-tuning and retrieval-augmented generation (RAG) are highlighted, along with methods for deploying AI models and reducing hallucinations. It also covers the differences between various model types, including diffusion models and GANs.
Q: What is GPT trained to do?
A: GPT is trained to predict the next token given the previous tokens. This sequential training allows it to generate coherent text.
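As a sketch of that idea, the toy model below samples text one token at a time from a hand-written bigram table. A real GPT learns such conditional distributions over tens of thousands of subword tokens and conditions on the entire prefix, not just the last token; the table values here are invented for illustration.

```python
import random

# Toy bigram "language model": P(next token | previous token) as a lookup
# table with made-up probabilities.
BIGRAM = {
    "<s>": {"the": 0.7, "a": 0.3},
    "the": {"cat": 0.6, "dog": 0.4},
    "a":   {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 1.0},
    "dog": {"ran": 1.0},
    "sat": {"</s>": 1.0},
    "ran": {"</s>": 1.0},
}

def generate(seed=0, max_tokens=10):
    """Autoregressive generation: sample the next token from the
    distribution conditioned on the current context, append, repeat."""
    rng = random.Random(seed)
    tokens = ["<s>"]
    for _ in range(max_tokens):
        dist = BIGRAM[tokens[-1]]
        choices, weights = zip(*dist.items())
        nxt = rng.choices(choices, weights=weights)[0]
        if nxt == "</s>":
            break
        tokens.append(nxt)
    return tokens[1:]

print(" ".join(generate()))
```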

Q: How does attention work in Transformers?
A: Attention computes weights over all tokens, letting the model focus on the most relevant words for each prediction. This captures long-range dependencies better than RNNs.
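A minimal sketch of scaled dot-product attention for a single query, in plain Python; real Transformers compute this in batched matrix form with learned query/key/value projections.

```python
import math

def softmax(xs):
    m = max(xs)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for one query:
    score_i = (q . k_i) / sqrt(d); output = sum_i softmax(score)_i * v_i."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    output = [sum(w * v[j] for w, v in zip(weights, values))
              for j in range(len(values[0]))]
    return output, weights

# The query aligns with the first key, so most weight goes to the first value.
out, w = attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]],
                   [[10.0, 0.0], [0.0, 10.0]])
```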

Q: Why are Transformers better than RNNs/LSTMs?
A: They parallelize training (process all tokens at once), handle long-range context with attention, and scale more efficiently to billions of parameters.

Q: What’s the difference between pre-training and fine-tuning?
A: Pre-training: the model learns general language patterns from massive text data. Fine-tuning: the model is adapted on a smaller, task-specific dataset (e.g., legal documents, customer support).

Q: What is prompt engineering?
A: Prompt engineering is designing inputs that guide the model toward the desired output, without changing the model weights.

Q: Fine-tuning vs. prompt engineering?
A: Prompting is quick and flexible, but limited. Fine-tuning permanently teaches the model domain knowledge or style, useful when prompts alone aren’t enough.

Q: What are few-shot and zero-shot prompting?
A: Zero-shot: ask the model directly, without examples. Few-shot: provide some examples in the prompt to guide the model’s style and format.
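As an illustration, a few-shot prompt is just a string with worked examples prepended; passing an empty example list degenerates to zero-shot. The sentiment-labeling task and format below are invented for this sketch.

```python
def few_shot_prompt(examples, query):
    """Assemble a few-shot prompt: worked examples first, then the new input.
    With examples=[] this reduces to a zero-shot prompt."""
    parts = [f"Review: {inp}\nSentiment: {out}" for inp, out in examples]
    parts.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(parts)

prompt = few_shot_prompt(
    [("Great movie!", "positive"), ("Waste of time.", "negative")],
    "I loved every minute.",
)
print(prompt)
```

The trailing "Sentiment:" cue nudges the model to complete the pattern established by the examples.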

Q: What is RAG?
A: Retrieval-Augmented Generation combines GPT with a retrieval system (like a vector DB) to
inject external knowledge at query time.
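A minimal RAG sketch under stated assumptions: `search` here is a stand-in retriever that ranks documents by word overlap rather than embeddings, the two-document corpus is invented, and the final LLM call is left as a comment since it would need an external API.

```python
# Hypothetical mini corpus; in practice, chunks from a document store.
DOCS = [
    "The warranty period is 12 months from purchase.",
    "Returns are accepted within 30 days of delivery.",
]

def search(query, docs, k=1):
    """Stand-in retriever: rank documents by word overlap with the query.
    A real system would embed both and query a vector database instead."""
    q_words = set(query.lower().replace("?", "").split())
    def overlap(doc):
        return len(q_words & set(doc.lower().split()))
    return sorted(docs, key=overlap, reverse=True)[:k]

def rag_answer(question):
    """Retrieve relevant context, then build a grounded prompt for the LLM."""
    context = "\n".join(search(question, DOCS))
    prompt = (f"Answer using only this context:\n{context}\n\n"
              f"Q: {question}\nA:")
    return prompt  # a real system would return llm_generate(prompt) here

print(rag_answer("How long is the warranty?"))
```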

Q: Why use a vector database?
A: Vector DBs store embeddings that capture semantic meaning, enabling similarity search. This allows GPT to retrieve relevant text even when the wording is different.
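A toy similarity search illustrating the idea. The 3-dimensional "embeddings" below are made-up values (real embeddings have hundreds of dimensions and come from a model), but the cosine-similarity ranking works the same way: "refund policy" and "money back" share no words, yet their vectors point in the same direction.

```python
import math

def cosine(a, b):
    """Cosine similarity: 1.0 for identical direction, near 0 for unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

# Toy 3-d "embeddings" with made-up values.
store = {
    "refund policy":  [0.9, 0.1, 0.0],
    "money back":     [0.8, 0.2, 0.1],
    "gpu benchmarks": [0.0, 0.1, 0.9],
}

def nearest(query_vec, k=1):
    """Brute-force similarity search; a vector DB does this at scale
    using approximate nearest-neighbor indexes."""
    return sorted(store, key=lambda t: cosine(query_vec, store[t]),
                  reverse=True)[:k]

# A query embedded near "refund" retrieves both refund-related texts first.
print(nearest([0.85, 0.15, 0.05], k=2))
```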

Q: Fine-tuning vs. RAG: which one for 10,000 PDFs?
A: RAG is better: it scales, is cheaper, and updates easily. Fine-tuning is costly and requires retraining for every update.

Q: How are embeddings generated?
A: Embeddings are numerical vectors generated by models like BERT, OpenAI’s text-embedding-ada-002, etc. They represent meaning, so similar text is close in vector space.

Q: How do diffusion models work?
A: They learn to denoise: start from random noise, remove noise step by step until an image emerges. Training teaches them how to reverse the noising process.
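The forward (noising) direction has a closed form, sketched below with a linear beta schedule on a single scalar "pixel"; the reverse (denoising) direction would require a trained network to predict the noise, which is omitted here. The schedule constants are typical DDPM-style values, not from any particular paper's exact configuration.

```python
import math
import random

rng = random.Random(0)

# Linear beta (noise) schedule over T steps.
T = 200
betas = [1e-4 + (0.02 - 1e-4) * t / (T - 1) for t in range(T)]

# alpha_bar_t = product of (1 - beta_s) for s <= t: the surviving signal fraction.
alpha_bar = []
prod = 1.0
for b in betas:
    prod *= 1.0 - b
    alpha_bar.append(prod)

def noised(x0, t):
    """Closed-form forward process:
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps, eps ~ N(0, 1).
    Training teaches a network to predict eps from (x_t, t); sampling
    then runs this corruption in reverse, one step at a time."""
    eps = rng.gauss(0.0, 1.0)
    return math.sqrt(alpha_bar[t]) * x0 + math.sqrt(1.0 - alpha_bar[t]) * eps

# By the final step most signal is gone (alpha_bar[-1] is small),
# so x_T is close to pure Gaussian noise regardless of x0.
print(alpha_bar[0], alpha_bar[-1])
```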

Q: Difference between GPT and diffusion models?
A: GPT generates sequentially (next-token prediction). Diffusion models generate iteratively (denoising over many steps).

Q: What are GANs vs. diffusion models?
A: GANs pit a generator against a discriminator in an adversarial game to create images. Diffusion models use a probabilistic denoising process. Diffusion tends to produce more stable, higher-quality results.

Q: How to evaluate a text generation model?
A: Automatic: perplexity, BLEU, ROUGE. Practical: human evaluation plus hallucination checks.

Q: How to evaluate an image generation model?
A: FID (Fréchet Inception Distance) for realism, IS (Inception Score) for quality/diversity, plus human evaluation.

Q: What is perplexity?
A: Perplexity measures how well a model predicts the next word. Lower perplexity means better
predictive performance.
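Concretely, perplexity is the exponential of the average negative log-likelihood the model assigns to the true tokens; a model that guesses uniformly over a vocabulary of V words has perplexity V.

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the average negative log-likelihood:
    PPL = exp(-(1/N) * sum_i log p_i). Lower is better; assigning
    probability 1/V to every token gives PPL = V."""
    n = len(token_probs)
    return math.exp(-sum(math.log(p) for p in token_probs) / n)

# A model that assigns probability 1/4 to every token has perplexity 4.
print(perplexity([0.25, 0.25, 0.25, 0.25]))
```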

Q: How to deploy a GPT-based chatbot?
A: Wrap it in an API (FastAPI/Flask), connect it to a vector DB for RAG, deploy on cloud (AWS/GCP), and add monitoring for hallucinations plus logging for feedback.

Q: How to reduce hallucinations?
A: Use RAG with trusted knowledge sources, add guardrails with prompt engineering, and apply reinforcement learning from human feedback (RLHF).

Q: How to optimize large models for deployment?
A: Techniques include model quantization, pruning, knowledge distillation, and efficient serving runtimes such as ONNX Runtime or TensorRT.
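As a sketch of the first technique, symmetric int8 quantization maps each float weight to an integer in [-127, 127] plus one shared float scale. This is a minimal illustration of the core idea; production toolchains (e.g., ONNX Runtime, TensorRT) add per-channel scales and calibration on real data.

```python
def quantize_int8(weights):
    """Symmetric post-training quantization: map floats onto int8
    values in [-127, 127] plus one shared float scale."""
    scale = (max(abs(w) for w in weights) / 127.0) or 1e-8  # avoid div by 0
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    """Recover approximate floats; error is at most scale / 2 per weight."""
    return [x * scale for x in q]

w = [0.52, -1.27, 0.003, 0.9]
q, s = quantize_int8(w)      # q == [52, -127, 0, 90], scale ~ 0.01
w_hat = dequantize(q, s)
```

Each weight now fits in one byte instead of four (plus the shared scale), which is where the memory and bandwidth savings come from.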
