Lecture 19

The document outlines the process of synthetic data generation using Large Language Models (LLMs) and the components involved in a Retrieval-Augmented Generation (RAG) architecture. It details the steps from data loading and indexing to retrieval and generation, emphasizing the importance of embedding models and vector databases for efficient information processing. Additionally, it highlights quality assurance measures like safety checks and post-processing techniques to enhance the final output generated by the LLM.

Generative AI Fundamentals

©2023 Databricks Inc. — All rights reserved


QUIZ 04

What is meant by Synthetic Data Generation, and how are LLMs aiding this technique?

Start: 3:40  End: 3:50


Infrastructural Components of a RAG (Retrieval-Augmented Generation) Architecture
Indexing
•The process begins with data loaders, which retrieve data from various sources: unstructured documents (e.g., PDFs, docs), semi-structured data (e.g., XML, JSON, CSV), and even structured data residing in SQL databases, accessed via data connectors.

•Document splitters organize the data and prepare it for efficient processing by the embedding model.

•They achieve this by segmenting the documents into logical units – sentences or paragraphs – based
on predefined rules. This segmentation ensures that information remains semantically intact while
preparing it for further processing.

•The tokenizer takes each logical unit (e.g., a paragraph) from the document splitter and breaks it into tokens, depending on the chosen embedding model and the desired level of granularity. Using a single tokenizer ensures consistency throughout the system (a minimal sketch of these steps follows below).
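To make the loading, splitting, and tokenizing stages concrete, here is a minimal Python sketch. It is not from the lecture: load_document, split_into_paragraphs, and the whitespace-based tokenize below are illustrative stand-ins for real format-specific loaders, rule-based splitters, and the embedding model's own tokenizer, and "example.txt" is a hypothetical input file.

```python
def load_document(path: str) -> str:
    """Data loader stand-in: reads raw text from a single file."""
    with open(path, encoding="utf-8") as f:
        return f.read()

def split_into_paragraphs(text: str) -> list[str]:
    """Document splitter: segments text into logical units (paragraphs)."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]

def tokenize(chunk: str) -> list[str]:
    """Tokenizer stand-in: one consistent tokenizer for the whole system."""
    return chunk.lower().split()

if __name__ == "__main__":
    doc = load_document("example.txt")          # hypothetical input file
    chunks = split_into_paragraphs(doc)
    tokens_per_chunk = [tokenize(c) for c in chunks]
    print(f"{len(chunks)} chunks; first has {len(tokens_per_chunk[0])} tokens")
```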
Indexing (continued)

•The embedding model converts each token into a numerical vector representation, capturing its semantic meaning within the context of the surrounding text.

•Pre-trained embedding models, either word embeddings or contextual embeddings, achieve this by
mapping the tokens into these vector representations.

•Finally, an indexing component takes over: it packages the generated embedding vectors along with any associated metadata (e.g., document source information) and sends them to a specialized embedding database – the vector database (vector DB) – for efficient storage.

•This database becomes the foundation for the retrieval stage, where the RAG architecture searches for relevant information based on user queries (a toy embedding-and-indexing sketch follows below).
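The sketch below continues the earlier one and is illustrative only: embed() is a fake, deterministic hashing stand-in for a pre-trained embedding model (and, for simplicity, it maps a whole chunk to one vector rather than each token), VectorDB is a toy in-memory stand-in for a real vector database, and DIM is an arbitrary dimensionality. Only the shape of the pipeline mirrors the slides.

```python
import hashlib
import numpy as np

DIM = 64  # embedding dimensionality (arbitrary for this sketch)

def embed(text: str) -> np.ndarray:
    """Stand-in for a pre-trained embedding model: hashes words into a unit vector."""
    vec = np.zeros(DIM)
    for word in text.lower().split():
        h = int(hashlib.md5(word.encode()).hexdigest(), 16)
        vec[h % DIM] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

class VectorDB:
    """Toy in-memory vector database: stores (vector, metadata) pairs."""
    def __init__(self) -> None:
        self.vectors: list[np.ndarray] = []
        self.metadata: list[dict] = []

    def index(self, chunk: str, meta: dict) -> None:
        """Indexing component: package the vector with its metadata and store both."""
        self.vectors.append(embed(chunk))
        self.metadata.append({**meta, "text": chunk})
```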
Retrieval
•The user submits a prompt that needs to be processed. The prompt is prepared to match the same structure (embeddings) used during the indexing phase.

•Safety, ethical, and quality checks are applied. These checks ensure the prompt aligns with guidelines and prevent misuse.

•The prompt is tokenized and converted to embeddings.

•The system searches a vector database for embeddings similar to the prompt’s vector. Retrieved data
chunks represent relevant content linked to the user’s query.

•A ranking service assigns scores to each chunk based on similarity to the prompt’s vector. The system prioritizes the most relevant chunks for the response (a retrieval-and-ranking sketch follows below).
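Continuing the toy sketch above (reusing embed(), VectorDB, and the chunks from the first sketch), this shows retrieval: the prompt is embedded with the same pipeline used at indexing time, every stored vector is scored by cosine similarity (a dot product, since the vectors are unit-norm), and the chunks are ranked. top_k is an arbitrary illustrative parameter.

```python
def search(db: VectorDB, prompt: str, top_k: int = 3) -> list[dict]:
    """Retriever + ranking service: score every chunk, return the best."""
    query_vec = embed(prompt)       # same embedding pipeline as indexing
    scores = [float(v @ query_vec) for v in db.vectors]
    ranked = sorted(zip(scores, db.metadata), key=lambda pair: pair[0], reverse=True)
    return [{"score": score, **meta} for score, meta in ranked[:top_k]]

# Example usage, building on the earlier sketches:
db = VectorDB()
for i, chunk in enumerate(chunks):  # chunks from the first sketch
    db.index(chunk, {"source": "example.txt", "chunk_id": i})
print(search(db, "What is a vector database?"))
```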
Generation
•The top-ranked chunks and the embedded prompt are passed to the LLM (Large Language Model). The LLM processes the information to generate a coherent, informative response.
•The LLM produces the final output, ensuring it is context-aware and aligned with user expectations, and the response is presented to the user.
The raw output from the LLM might undergo some post-processing steps to enhance its quality (a sketch follows below). This could involve tasks like:
Text Normalization: Ensuring consistency in formatting, such as converting all numbers to a standard format or handling special characters.
Spell Checking: Identifying and correcting any potential typos or spelling errors.
Grammar Correction: Refining the grammatical structure of the generated text for clarity and coherence.
Redundancy Removal: Eliminating unnecessary repetition or irrelevant information that may clutter the response.
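To round out the sketch (again reusing VectorDB and search() from above): a generate() function assembles the top-ranked chunks and the user prompt into an augmented prompt, calls call_llm() – a hypothetical placeholder, not a real API, which you would replace with your model's actual client – and then applies two of the post-processing steps named above, text normalization and redundancy removal.

```python
import re

def call_llm(prompt: str) -> str:
    """Hypothetical placeholder: substitute a real LLM API call here."""
    return f"(model answer conditioned on: {prompt[:60]}...)"

def post_process(text: str) -> str:
    """Post-processing: normalize whitespace and drop repeated sentences."""
    text = re.sub(r"\s+", " ", text).strip()      # text normalization
    seen, kept = set(), []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        if sentence.lower() not in seen:           # redundancy removal
            seen.add(sentence.lower())
            kept.append(sentence)
    return " ".join(kept)

def generate(db: VectorDB, user_prompt: str) -> str:
    """Generation stage: augment the prompt with retrieved context, then post-process."""
    context = "\n\n".join(c["text"] for c in search(db, user_prompt, top_k=3))
    augmented = f"Context:\n{context}\n\nQuestion: {user_prompt}\nAnswer:"
    return post_process(call_llm(augmented))
```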
