This project evaluates the performance of Meta's LLaMA 3.1 variants (8B, 70B, and 405B) at generating high-quality blogs from podcast transcripts. The evaluation uses an LLM-as-judge framework built on Google's Gemini-1.5-Flash, which scores the generated blogs on attributes such as clarity, grammar, and engagement.
The project focuses on converting podcast transcripts into coherent and engaging blogs using LLaMA 3.1 models. Key objectives include:
- Generating blogs from transcripts.
- Evaluating the output using an LLM-based judge.
- Analyzing the impact of model scaling on blog quality.
- Establishing a baseline with the 8B LLaMA variant.
- Blog Generation: Transforms podcast transcripts into blog posts using advanced LLaMA 3.1 models.
- LLM-Based Evaluation: Blogs are scored on clarity, grammar, tone, and engagement using Google's Gemini-1.5-Flash.
- Scalability Analysis: Compares the performance of models with 8B, 70B, and 405B parameters.
Dataset Preparation:
- Transcripts sourced from the Lex Fridman Podcast Dataset.
- Transcripts are segmented for processing within LLaMA's context window.
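The segmentation step can be sketched as simple word-based chunking with overlap, so each segment fits in the model's context window. The chunk size and overlap below are illustrative values, not the project's actual settings:

```python
def chunk_transcript(text, max_words=3000, overlap=200):
    """Split a long transcript into overlapping word-based segments.

    max_words and overlap are illustrative defaults; a ~23,000-word
    transcript would yield roughly eight segments at these settings.
    """
    words = text.split()
    chunks = []
    start = 0
    while start < len(words):
        end = min(start + max_words, len(words))
        chunks.append(" ".join(words[start:end]))
        if end == len(words):
            break
        # Overlap carries a little context across segment boundaries.
        start = end - overlap
    return chunks
```

Token-based chunking with the model's tokenizer would be more precise, but word counts are a reasonable proxy for a sketch.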
Blog Generation:
- Summaries generated for transcript segments.
- Summaries combined into a final blog post.
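The two-stage generation above (per-segment summaries, then a final compose step) can be sketched as a small map-reduce pipeline. The `summarize` and `compose` callables stand in for calls to a LLaMA 3.1 endpoint (e.g. on Groq or SambaNova); injecting them as parameters is an assumption for illustration, not the project's actual code:

```python
def generate_blog(transcript_chunks, summarize, compose):
    """Two-stage blog generation: summarize each transcript segment,
    then compose the segment summaries into one blog post.

    `summarize` and `compose` are placeholders for LLM calls
    (e.g. a LLaMA 3.1 chat-completion request per stage).
    """
    # Map: one summary per transcript segment.
    segment_summaries = [summarize(chunk) for chunk in transcript_chunks]
    # Reduce: merge the summaries into a single coherent post.
    return compose("\n\n".join(segment_summaries))
```

With stub callables, e.g. `generate_blog(["part one", "part two"], str.upper, lambda s: "Blog:\n" + s)`, the pipeline logic can be exercised without any API access.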
Evaluation:
- Blogs scored by Gemini-1.5-Flash on multiple attributes.
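The judging step might look roughly like the following. The prompt wording and the JSON response format are assumptions for illustration; the project's actual Gemini-1.5-Flash rubric prompt is not reproduced here, and the call to the Gemini API itself is omitted:

```python
import json

# The six attributes the blogs are assessed on.
ATTRIBUTES = ["clarity", "grammar", "tone", "sentence_flow",
              "engagement", "conciseness"]

def build_judge_prompt(blog_text):
    """Build a rubric prompt asking the judge model to score each
    attribute from 1 to 10. Wording is illustrative only."""
    rubric = ", ".join(ATTRIBUTES)
    return (
        "You are an impartial writing judge. Score the blog below on "
        f"each of these attributes ({rubric}) from 1 to 10. "
        "Respond with only a JSON object mapping attribute to score.\n\n"
        f"Blog:\n{blog_text}"
    )

def parse_judge_scores(response_text):
    """Parse the judge's JSON reply into a dict of float scores."""
    scores = json.loads(response_text)
    return {attr: float(scores[attr]) for attr in ATTRIBUTES}
```

Asking for structured JSON output keeps the scores machine-readable, so runs across the three model sizes can be aggregated directly.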
Comparison:
- Models compared using baseline scores.
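Comparison against the 8B baseline can be expressed as per-attribute score deltas. The model names and any numbers used with this helper are hypothetical, not results from the evaluation:

```python
def relative_to_baseline(scores, baseline_model="llama-3.1-8b"):
    """Express each model's per-attribute judge scores as deltas from
    the baseline model's scores.

    `scores` maps model name -> {attribute: score}; the default
    baseline name is an illustrative label, not an API identifier.
    """
    base = scores[baseline_model]
    return {
        model: {attr: round(s - base[attr], 2) for attr, s in attrs.items()}
        for model, attrs in scores.items()
        if model != baseline_model
    }
```

A positive delta means the larger model outscored the 8B baseline on that attribute; a negative one (e.g. on conciseness) means it fell behind.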
Dataset:
- Source: Hugging Face Lex Fridman Podcast Dataset.
- Category: Science and Technology.
- Average Transcript Length: ~23,000 words.
Blogs are assessed on:
- Clarity
- Grammar & Syntax
- Tone Appropriateness
- Sentence Structure & Flow
- Engagement
- Conciseness
- Models: Meta LLaMA 3.1 (8B, 70B, 405B)
- Evaluation Framework: Gemini-1.5-Flash
- Platforms:
- Google Colab
- Groq Cloud
  - SambaNova Cloud
- Libraries: Hugging Face Transformers
The 405B model showed the strongest overall performance but struggled with conciseness. The 8B model, despite its smaller size, scored competitively on some metrics, particularly conciseness.
- Dynamic Chunking: Enhance context management for long transcripts.
- Tone Understanding: Improve model handling of conversational tones, humor, and sarcasm.
- Engagement Optimization: Reduce redundancy for better blog readability.
Special thanks to:
- Professor Ndapa Nakashole for guidance and insights.
- Yu Miaopeng (TA) for metric definitions and resources.
- Cloud Providers: Groq Cloud, SambaNova, and Google Cloud AI for computational support.
Feel free to suggest any improvements or ask questions!