Fix education YouTube transcript + AI summary/quiz generation #5318
base: main
Conversation
👋 Hi @aaditya8979! This pull request needs a peer review before it can be merged. Please request a review from a team member who is not:
Once a valid peer review is submitted, this check will pass automatically. Thank you!
Walkthrough

Adds AI-driven educational video support: new models and migrations (EducationalVideo, VideoQuizQuestion, QuizAttempt), views to fetch YouTube transcripts and call OpenAI, templates and routes for listing/detail and quiz submission, Docker/dependency updates, and a CI step to run migrations.
Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant Web as Django App
    participant YT as YouTubeTranscriptApi
    participant OpenAI as OpenAI API
    participant DB as Database
    User->>Web: POST /education (video_title, youtube_url)
    Web->>Web: extract youtube_id, validate URL
    Web->>YT: fetch transcript for youtube_id
    YT-->>Web: transcript / error
    alt transcript available
        Web->>OpenAI: generate ai_summary & is_educational
        OpenAI-->>Web: ai_summary, is_educational
        Web->>OpenAI: generate quiz questions (JSON)
        OpenAI-->>Web: quiz payload
    end
    Web->>DB: create EducationalVideo (save ai_summary/is_verified)
    DB-->>Web: saved video
    opt quiz generated
        Web->>DB: create VideoQuizQuestion entries
        DB-->>Web: saved questions
    end
    Web-->>User: redirect/render education page
    User->>Web: GET /education/video/<pk>
    Web->>DB: fetch EducationalVideo + quiz questions + quiz history
    DB-->>Web: data
    Web-->>User: render video_detail
    User->>Web: POST /education/video/<id>/quiz/submit (answers)
    Web->>DB: fetch VideoQuizQuestion correct answers
    DB-->>Web: questions
    Web->>Web: compute score, create QuizAttempt
    Web->>DB: save QuizAttempt
    DB-->>Web: saved attempt
    Web-->>User: JSON {score, total, percentage, attempt_id}
```
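The "compute score, create QuizAttempt" step near the end of the diagram can be sketched in plain Python; the function name and dict shapes here are illustrative, not taken from the PR's actual `submit_quiz` implementation:

```python
def score_quiz(correct_answers, submitted_answers):
    """Compare submitted answers against the correct ones.

    correct_answers:   mapping of question_id -> correct option letter ("A".."D")
    submitted_answers: mapping of question_id -> submitted option letter

    Returns the payload shape the endpoint responds with (minus attempt_id,
    which would come from the saved QuizAttempt row).
    """
    total = len(correct_answers)
    score = sum(
        1
        for qid, correct in correct_answers.items()
        if submitted_answers.get(qid) == correct
    )
    percentage = round(score / total * 100, 1) if total else 0.0
    return {"score": score, "total": total, "percentage": percentage}
```

An unanswered question simply counts as incorrect here (`dict.get` returns `None`, which never equals a letter), matching the behavior of a form where unchecked radios are absent from the POST data.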
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
❌ Pre-commit checks failed

The pre-commit hooks found issues that need to be fixed. Please run the following commands locally to fix them:

```shell
# Install pre-commit if you haven't already
pip install pre-commit

# Run pre-commit on all files
pre-commit run --all-files

# Or run pre-commit on staged files only
pre-commit run
```

After running these commands, the pre-commit hooks will automatically fix most issues.

💡 Tip: You can set up pre-commit to run automatically on every commit by running `pre-commit install`. For more information, see the pre-commit documentation.
website/views/education.py (outdated)
```python
    context_object_name = "video"

    def get_context_data(self, **kwargs):
        context = super().get_context_data(**kwargs)
        video = self.object
        context["quiz_questions"] = VideoQuizQuestion.objects.filter(video=video)
        if self.request.user.is_authenticated:
            context["quiz_history"] = QuizAttempt.objects.filter(
                user=self.request.user, video=video
            )
        return context
```
Thanks for the suggestion!

In VideoDetailView.get_context_data the template receives quiz_questions, and each VideoQuizQuestion instance already exposes its options via option_a, option_b, option_c, and option_d. The template renders these fields directly for each question (e.g., question.option_a), so we don't need a separate options context variable.
Actionable comments posted: 14
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
website/views/education.py (1)
436-439: Remove unreachable dead code.

Lines 436-439 are unreachable because they appear after the submit_quiz function definition (lines 373-435). This code will never execute and should be removed.

```diff
-    featured_lectures = Lecture.objects.filter(section__isnull=True)
-    courses = Course.objects.all()
-    context = {"is_instructor": is_instructor, "featured_lectures": featured_lectures, "courses": courses}
-    return render(request, template, context)
```
🧹 Nitpick comments (5)
blt/urls.py (1)
20-21: Remove unused imports.

DetailView from django.views.generic is unused since VideoDetailView is imported directly from website.views.education. The model imports (EducationalVideo, VideoQuizQuestion, QuizAttempt) are also not used in this URL configuration file.

```diff
-from django.views.generic import DetailView
-from website.models import EducationalVideo, VideoQuizQuestion, QuizAttempt
```

website/migrations/0265_educationalvideo_ai_summary_and_more.py (1)
26-50: Consider adding an index for common query patterns.

The QuizAttempt model will likely be queried by the (user, video) combination frequently (e.g., to show quiz history). Consider adding a composite index, or a unique constraint if users should only have one attempt per video.

If multiple attempts per video are allowed, consider adding an index:

```python
options={
    "ordering": ["-completed_at"],
    "indexes": [
        models.Index(fields=["user", "video"]),
    ],
},
```

If only one attempt per video is intended, consider a unique constraint:

```python
options={
    "ordering": ["-completed_at"],
},
unique_together=[("user", "video")],
```

website/templates/education/video_detail.html (3)
17-17: Consider removing the deprecated frameborder attribute.

The frameborder attribute is deprecated in HTML5. Modern browsers apply border: none by default to iframes, and this can be controlled via CSS if needed.

Apply this diff:

```diff
-        frameborder="0"
```
77-79: Consider validating that all questions are answered before submission.

Currently, users can submit the quiz without selecting any answers. Consider adding client-side validation to ensure every question has a selected answer before allowing submission.

Add validation in the submit handler:

```diff
 const quizForm = document.getElementById('quizForm');
 if (quizForm) {
     quizForm.addEventListener('submit', async function(e) {
         e.preventDefault();
+        // Validate all questions are answered
+        const questions = document.querySelectorAll('.quiz-question');
+        for (const question of questions) {
+            const questionId = question.dataset.questionId;
+            const selected = document.querySelector(`input[name="question_${questionId}"]:checked`);
+            if (!selected) {
+                alert('Please answer all questions before submitting.');
+                return;
+            }
+        }
+
         const formData = new FormData(this);
```
83-115: Enhance modal accessibility with keyboard navigation and focus trapping.

The results modal should be dismissible with the ESC key and trap focus within the modal for keyboard users to meet accessibility standards (WCAG 2.1).

Add keyboard event handling:

```javascript
function showResults(score, total, percentage) {
    const modal = document.getElementById('resultsModal');
    // ... existing code ...
    modal.classList.remove('hidden');

    // Focus the first button in the modal
    modal.querySelector('button').focus();

    // Handle ESC key to close the modal
    const handleEscape = (e) => {
        if (e.key === 'Escape') {
            modal.classList.add('hidden');
            document.removeEventListener('keydown', handleEscape);
        }
    };
    document.addEventListener('keydown', handleEscape);
}
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (8)
- Dockerfile (1 hunks)
- blt/urls.py (3 hunks)
- website/migrations/0264_educationalvideo.py (1 hunks)
- website/migrations/0265_educationalvideo_ai_summary_and_more.py (1 hunks)
- website/models.py (1 hunks)
- website/templates/education/education.html (1 hunks)
- website/templates/education/video_detail.html (1 hunks)
- website/views/education.py (5 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
website/migrations/0265_educationalvideo_ai_summary_and_more.py (1)
- website/migrations/0264_educationalvideo.py (1)
  - Migration (6-28)

blt/urls.py (2)
- website/models.py (3)
  - EducationalVideo (3651-3673)
  - VideoQuizQuestion (3676-3688)
  - QuizAttempt (3691-3703)
- website/views/education.py (2)
  - submit_quiz (375-439)
  - VideoDetailView (976-989)
🪛 GitHub Check: CodeQL
website/views/education.py
[warning] 432-432: Information exposure through an exception
Stack trace information flows to this location and may be exposed to an external user.
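The usual remediation for this class of CodeQL finding is to log the full exception server-side and return only a generic message to the client. A framework-free sketch of that pattern (names and payload shape are illustrative, not the PR's actual `submit_quiz` code):

```python
import logging

logger = logging.getLogger(__name__)


def handle_submission(process, request_id):
    """Run `process` and convert any failure into a generic client payload.

    The real exception (including its stack trace) goes to the server log only;
    the client never sees str(e) or traceback details.
    """
    try:
        return {"status": 200, "data": process()}
    except Exception:
        # logger.exception records the active traceback for operators.
        logger.exception("Quiz submission failed (request %s)", request_id)
        return {"status": 500, "error": "Unable to submit quiz."}
```

In a Django view the same idea would mean replacing any `JsonResponse({"error": str(e)})` with a constant message plus `logger.exception(...)`.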
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Run Tests
- GitHub Check: docker-test
🔇 Additional comments (3)
website/templates/education/education.html (1)
151-256: LGTM - Well-structured video submission form and display.

The template properly implements:
- CSRF protection for the form
- Privacy-conscious YouTube embedding via youtube-nocookie.com
- Proper referrerpolicy attribute for security
- Conditional rendering for the optional description
- Responsive grid layout for video cards
website/migrations/0264_educationalvideo.py (1)
1-28: LGTM - Migration correctly creates the EducationalVideo model.

The migration properly defines the initial schema with appropriate field types and ordering.
blt/urls.py (1)
739-743: The submit_quiz endpoint in website/views/education.py has the @login_required decorator present on line 373, directly above the function definition. The endpoint is properly protected from unauthenticated access.

Likely an incorrect or invalid review comment.
```dockerfile
RUN pip install -r requirements.txt
RUN pip install youtube-transcript-api openai
```
🛠️ Refactor suggestion | 🟠 Major
Move dependency installation before ENTRYPOINT/CMD and before copying application code.
These RUN pip install commands are placed after ENTRYPOINT and CMD, which is unconventional and confusing. While Docker does execute them during build, this placement:
- Breaks layer caching efficiency - dependencies should be installed before copying application code (line 61)
- May cause confusion as it appears these run at container start (they don't)
Additionally, if requirements.txt already includes youtube-transcript-api and openai (as mentioned in PR objectives), line 73 is redundant.
Consider restructuring to:

```diff
 # Copy application code
 COPY . /blt

+# Install Python dependencies
+RUN pip install -r requirements.txt
+RUN pip install youtube-transcript-api openai
+
 # Convert line endings and set permissions
 RUN dos2unix Dockerfile docker-compose.yml scripts/entrypoint.sh ./blt/settings.py
 # Check if .env exists and run dos2unix on it, otherwise skip
 RUN if [ -f /blt/.env ]; then dos2unix /blt/.env; fi
 RUN chmod +x /blt/scripts/entrypoint.sh

 ENTRYPOINT ["/blt/scripts/entrypoint.sh"]
 CMD ["poetry", "run", "python", "manage.py", "runserver", "0.0.0.0:8000"]
-
-RUN pip install -r requirements.txt
-RUN pip install youtube-transcript-api openai
```

📝 Committable suggestion
‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```dockerfile
# Copy application code
COPY . /blt

# Install Python dependencies
RUN pip install -r requirements.txt
RUN pip install youtube-transcript-api openai

# Convert line endings and set permissions
RUN dos2unix Dockerfile docker-compose.yml scripts/entrypoint.sh ./blt/settings.py
# Check if .env exists and run dos2unix on it, otherwise skip
RUN if [ -f /blt/.env ]; then dos2unix /blt/.env; fi
RUN chmod +x /blt/scripts/entrypoint.sh

ENTRYPOINT ["/blt/scripts/entrypoint.sh"]
CMD ["poetry", "run", "python", "manage.py", "runserver", "0.0.0.0:8000"]
```
🤖 Prompt for AI Agents
In Dockerfile around lines 72-73, the pip install commands are after
ENTRYPOINT/CMD and after copying application code which harms layer caching and
is confusing; move the dependency installation to before the application COPY
(before line 61) so dependencies are installed on a stable base layer and
benefit from Docker cache, and remove the explicit "pip install
youtube-transcript-api openai" if those packages are already listed in
requirements.txt (or merge installs into a single RUN to reduce layers).
Thanks for the detailed suggestion!
You’re right that installing dependencies after the ENTRYPOINT/CMD is confusing and hurts layer caching. I’ll refactor the Dockerfile so that:
• Python dependencies are installed immediately after copying the application code, before the ENTRYPOINT/CMD.
• requirements.txt is the single source of truth for dependencies (including youtube-transcript-api and openai), so the separate pip install youtube-transcript-api openai line is no longer needed.
I’ll push an updated Dockerfile reflecting this structure.
@aaditya8979 Excellent! Your refactoring plan is spot on. Moving the dependency installation before ENTRYPOINT/CMD and consolidating everything into requirements.txt will both improve layer caching and make the build process clearer.
Feel free to tag me once you've pushed the updated Dockerfile if you'd like me to take another look! 🚀
```python
    def save(self, *args, **kwargs):
        import re
        youtube_regex = r'(?:youtube\.com\/watch\?v=|youtu\.be\/)([^&\n?#]+)'
        match = re.search(youtube_regex, self.youtube_url)
        if match:
            self.youtube_id = match.group(1)
        super().save(*args, **kwargs)
```
YouTube ID extraction may fail silently for some URL formats.
The regex doesn't handle all valid YouTube URL formats (e.g., /embed/, /v/, youtube-nocookie.com). If the regex doesn't match, youtube_id won't be set, but the field lacks blank=True, which could cause issues.
Consider expanding the regex or adding a fallback:
```diff
 def save(self, *args, **kwargs):
     import re
-    youtube_regex = r'(?:youtube\.com\/watch\?v=|youtu\.be\/)([^&\n?#]+)'
+    youtube_regex = r'(?:youtube(?:-nocookie)?\.com\/(?:watch\?v=|embed\/|v\/)|youtu\.be\/)([^&\n?#]+)'
     match = re.search(youtube_regex, self.youtube_url)
     if match:
         self.youtube_id = match.group(1)
+    elif not self.youtube_id:
+        # Fallback: try to extract from query params or path
+        from urllib.parse import urlparse, parse_qs
+        parsed = urlparse(self.youtube_url)
+        if 'v' in parse_qs(parsed.query):
+            self.youtube_id = parse_qs(parsed.query)['v'][0]
     super().save(*args, **kwargs)
```

🤖 Prompt for AI Agents
In website/models.py around lines 3667 to 3673, the YouTube ID extraction only
matches watch and youtu.be links and will miss /embed/, /v/,
youtube-nocookie.com and other valid formats causing youtube_id to remain unset;
update the extraction to handle additional patterns (include /embed/, /v/,
youtube-nocookie.com and also parse the "v" query parameter as a fallback) and
ensure a safe fallback by either setting youtube_id to None/'' when no match or
marking the youtube_id model field blank=True so saving never fails; implement
the broader regex or a small parser that checks query params, then normalize or
clear youtube_id before calling super().save().
```django
{% for option, label in options %}
    <label class="flex items-center p-4 bg-white dark:bg-gray-800 rounded-lg cursor-pointer hover:bg-gray-100 dark:hover:bg-gray-700 transition border border-gray-200 dark:border-gray-600">
        <input type="radio" name="question_{{ question.id }}" value="{{ label }}" class="w-5 h-5 text-[#e74c3c] cursor-pointer">
        <span class="ml-3 font-medium text-gray-700 dark:text-gray-300">{{ label }}. {{ option }}</span>
    </label>
{% endfor %}
```
Undefined template variable options will break quiz rendering.

The template iterates over an options variable that is not provided in the context by VideoDetailView. With Django's default settings an undefined variable resolves to an empty string, so the loop silently renders no answer choices for any question (and with a custom string_if_invalid it can fail outright).

The intended behavior appears to be iterating over the question's four options (A, B, C, D). You need to either:
Solution 1: Pass options from the view

In VideoDetailView.get_context_data (lines 981-989 in education.py):

```python
def get_context_data(self, **kwargs):
    context = super().get_context_data(**kwargs)
    video = self.object
    context["quiz_questions"] = VideoQuizQuestion.objects.filter(video=video)
    context["options"] = [
        ('option_a', 'A'),
        ('option_b', 'B'),
        ('option_c', 'C'),
        ('option_d', 'D')
    ]
    if self.request.user.is_authenticated:
        context["quiz_history"] = QuizAttempt.objects.filter(
            user=self.request.user, video=video
        )
    return context
```

Solution 2: Use a template tag to build options
Replace lines 66-71 with:

```django
{% for label, field in 'A,option_a B,option_b C,option_c D,option_d'|split %}
    {% with parts=label|split:',' %}
        <label class="flex items-center p-4 bg-white dark:bg-gray-800 rounded-lg cursor-pointer hover:bg-gray-100 dark:hover:bg-gray-700 transition border border-gray-200 dark:border-gray-600">
            <input type="radio" name="question_{{ question.id }}" value="{{ parts.0 }}" class="w-5 h-5 text-[#e74c3c] cursor-pointer">
            <span class="ml-3 font-medium text-gray-700 dark:text-gray-300">{{ parts.0 }}. {{ question|attr:parts.1 }}</span>
        </label>
    {% endwith %}
{% endfor %}
```

(Note that split and attr are not built-in Django filters; this variant requires custom template filters.)

Or more simply, hardcode the four options:
```django
<label class="flex items-center p-4 bg-white dark:bg-gray-800 rounded-lg cursor-pointer hover:bg-gray-100 dark:hover:bg-gray-700 transition border border-gray-200 dark:border-gray-600">
    <input type="radio" name="question_{{ question.id }}" value="A" class="w-5 h-5 text-[#e74c3c] cursor-pointer">
    <span class="ml-3 font-medium text-gray-700 dark:text-gray-300">A. {{ question.option_a }}</span>
</label>
<label class="flex items-center p-4 bg-white dark:bg-gray-800 rounded-lg cursor-pointer hover:bg-gray-100 dark:hover:bg-gray-700 transition border border-gray-200 dark:border-gray-600">
    <input type="radio" name="question_{{ question.id }}" value="B" class="w-5 h-5 text-[#e74c3c] cursor-pointer">
    <span class="ml-3 font-medium text-gray-700 dark:text-gray-300">B. {{ question.option_b }}</span>
</label>
<label class="flex items-center p-4 bg-white dark:bg-gray-800 rounded-lg cursor-pointer hover:bg-gray-100 dark:hover:bg-gray-700 transition border border-gray-200 dark:border-gray-600">
    <input type="radio" name="question_{{ question.id }}" value="C" class="w-5 h-5 text-[#e74c3c] cursor-pointer">
    <span class="ml-3 font-medium text-gray-700 dark:text-gray-300">C. {{ question.option_c }}</span>
</label>
<label class="flex items-center p-4 bg-white dark:bg-gray-800 rounded-lg cursor-pointer hover:bg-gray-100 dark:hover:bg-gray-700 transition border border-gray-200 dark:border-gray-600">
    <input type="radio" name="question_{{ question.id }}" value="D" class="w-5 h-5 text-[#e74c3c] cursor-pointer">
    <span class="ml-3 font-medium text-gray-700 dark:text-gray-300">D. {{ question.option_d }}</span>
</label>
```
In website/templates/education/video_detail.html around lines 66-71 the template
loops over an undefined variable "options" which will break rendering; fix by
either (A) updating VideoDetailView.get_context_data in education.py (around
lines 981-989) to add context["options"] =
[('option_a','A'),('option_b','B'),('option_c','C'),('option_d','D')] so the
template can iterate, or (B) change the template to stop using the non-existent
"options" variable and instead render the four option inputs explicitly (or via
a template tag) using question.option_a, question.option_b, question.option_c,
and question.option_d with values "A"/"B"/"C"/"D".
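If the view-side route is taken, note that the template must resolve each field name per question rather than printing the field name itself. The lookup the template would need can be illustrated framework-free with getattr (class and names here are illustrative stand-ins, not the PR's models):

```python
# Mirrors the proposed context["options"] value: (field_name, label) pairs.
OPTION_FIELDS = [
    ("option_a", "A"),
    ("option_b", "B"),
    ("option_c", "C"),
    ("option_d", "D"),
]


class Question:
    """Minimal stand-in for VideoQuizQuestion (illustrative only)."""

    def __init__(self, a, b, c, d):
        self.option_a, self.option_b, self.option_c, self.option_d = a, b, c, d


def render_options(question):
    # What the template effectively has to do for each question:
    # take the field *name* from the pair and look up its value on the object.
    return [(label, getattr(question, field)) for field, label in OPTION_FIELDS]
```

This is why the hardcoded-options variant is the simplest fix: Django's template language has no built-in dynamic attribute lookup by variable name.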
```django
{% for attempt in quiz_history %}
    <div class="flex items-center justify-between bg-gray-50 dark:bg-gray-700 p-4 rounded-lg border-l-4 {% if attempt.percentage >= 70 %}border-green-500{% else %}border-yellow-500{% endif %}">
        <div>
            <p class="font-semibold text-gray-800 dark:text-gray-100">{{ attempt.get_date_display }}</p>
```
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
# Search for get_date_display method definition in the codebase
rg -n 'def get_date_display' --type=py
# Also search for QuizAttempt model definition
rg -n 'class QuizAttempt' --type=py -A 5
```

Repository: OWASP-BLT/BLT
Length of output: 507
🏁 Script executed:

```shell
#!/bin/bash
# Read the full QuizAttempt model definition
sed -n '3691,3750p' website/models.py
# Also search for any method containing "date" in QuizAttempt area
rg -n 'def.*date' website/models.py | grep -A 2 -B 2 '3691\|3692\|3693\|3694\|3695'
```

Repository: OWASP-BLT/BLT
Length of output: 587
🏁 Script executed:

```shell
#!/bin/bash
# Read the template around line 138 to understand the context
sed -n '130,150p' website/templates/education/video_detail.html
```

Repository: OWASP-BLT/BLT
Length of output: 1287
attempt.get_date_display method does not exist on QuizAttempt model.
Line 138 calls attempt.get_date_display, but this method is not defined in the QuizAttempt model (website/models.py:3691). The model has a completed_at DateTimeField but no get_date_display method. This will raise an AttributeError at runtime when rendering quiz history. Line 139 correctly uses attempt.completed_at|date:"M d, Y H:i" — either add the get_date_display method to the model or update line 138 to use the same approach.
🤖 Prompt for AI Agents
In website/templates/education/video_detail.html around line 138, the template
calls attempt.get_date_display which doesn't exist on the QuizAttempt model and
will raise AttributeError; update the template to use the existing completed_at
field formatted with the date filter (e.g. the same format used on the next
line: use attempt.completed_at|date:"M d, Y H:i") or alternatively add a
get_date_display method on the QuizAttempt model that returns completed_at
formatted accordingly; prefer updating the template to use
attempt.completed_at|date:"M d, Y H:i" for consistency and to avoid touching the
model.
````python
def generate_ai_summary_and_verify(youtube_id, title, transcript):
    """
    Use OpenAI to generate summary and verify if content is educational.
    Returns (summary_text, is_educational_bool).
    """
    if not client:
        print("DEBUG: OpenAI client not initialized (OPENAI_API_KEY missing)")
        return None, False

    if not openai_api_key:
        print("DEBUG: OPENAI_API_KEY is missing")
        return None, False

    if not transcript:
        print(f"DEBUG: no transcript provided for video {youtube_id}")
        return None, False

    try:
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {
                    "role": "system",
                    "content": (
                        "You are an educational content expert. "
                        "Analyze the video transcript and determine if it's educational/security-related content. "
                        "Respond in JSON format with 'summary' and 'is_educational' fields."
                    ),
                },
                {
                    "role": "user",
                    "content": (
                        f"Video Title: {title}\n\nTranscript:\n{transcript}\n\n"
                        "Provide a brief (100-150 word) summary and determine if this is educational security content. "
                        "Respond ONLY with valid JSON using double quotes, like this:\n"
                        '{"summary": "...", "is_educational": true}'
                    ),
                },
            ],
            max_tokens=500,
            temperature=0.7,
        )

        content = response.choices[0].message.content
        print(f"DEBUG: raw OpenAI content for {youtube_id}: {content[:100]}...")  # First 100 chars

        # Strip code fences if model wraps JSON in triple backticks
        content_stripped = content.strip()
        if content_stripped.startswith("```"):
            lines = content_stripped.splitlines()
            # Remove first line (opening fence)
            if len(lines) > 1:
                lines = lines[1:]
            # Remove last line if it's a closing fence
            if lines and lines[-1].strip().startswith("```"):
                lines = lines[:-1]
            content_stripped = "\n".join(lines).strip()

        data = json.loads(content_stripped)
        summary = data.get("summary", "")
        is_educational = data.get("is_educational", False)

        print(
            f"DEBUG: parsed summary_present={bool(summary)}, "
            f"is_educational={is_educational}"
        )
        return summary, is_educational

    except json.JSONDecodeError as e:
        print(f"DEBUG: JSON decode error for {youtube_id}: {e}")
        print(f"DEBUG: content that failed to parse: {content_stripped[:200]}")
        return None, False
    except Exception as e:
        print(f"DEBUG: OpenAI error for {youtube_id}: {e}")
        return None, False
````
🛠️ Refactor suggestion | 🟠 Major
Replace print statements with logger and add timeout for OpenAI API calls.
The function uses print() for debugging throughout (lines 102, 106, 110, 140, 158-161, 165-166, 169) instead of the configured logger. Additionally, the OpenAI API call (line 114) lacks a timeout, which could cause the request to hang indefinitely if the API is slow or unresponsive.
Apply this diff:

````diff
 def generate_ai_summary_and_verify(youtube_id, title, transcript):
     """
     Use OpenAI to generate summary and verify if content is educational.
     Returns (summary_text, is_educational_bool).
     """
     if not client:
-        print("DEBUG: OpenAI client not initialized (OPENAI_API_KEY missing)")
+        logger.warning("OpenAI client not initialized (OPENAI_API_KEY missing)")
         return None, False
     if not openai_api_key:
-        print("DEBUG: OPENAI_API_KEY is missing")
+        logger.warning("OPENAI_API_KEY is missing")
         return None, False
     if not transcript:
-        print(f"DEBUG: no transcript provided for video {youtube_id}")
+        logger.warning(f"No transcript provided for video {youtube_id}")
         return None, False
     try:
         response = client.chat.completions.create(
             model="gpt-3.5-turbo",
             messages=[
                 {
                     "role": "system",
                     "content": (
                         "You are an educational content expert. "
                         "Analyze the video transcript and determine if it's educational/security-related content. "
                         "Respond in JSON format with 'summary' and 'is_educational' fields."
                     ),
                 },
                 {
                     "role": "user",
                     "content": (
                         f"Video Title: {title}\n\nTranscript:\n{transcript}\n\n"
                         "Provide a brief (100-150 word) summary and determine if this is educational security content. "
                         "Respond ONLY with valid JSON using double quotes, like this:\n"
                         '{"summary": "...", "is_educational": true}'
                     ),
                 },
             ],
             max_tokens=500,
-            temperature=0.7,
+            temperature=0.3,
+            timeout=30,
         )
         content = response.choices[0].message.content
-        print(f"DEBUG: raw OpenAI content for {youtube_id}: {content[:100]}...")  # First 100 chars
+        logger.debug(f"Raw OpenAI response for {youtube_id}: {content[:100]}...")
         # Strip code fences if model wraps JSON in triple backticks
         content_stripped = content.strip()
         if content_stripped.startswith("```"):
             lines = content_stripped.splitlines()
             # Remove first line (opening fence)
             if len(lines) > 1:
                 lines = lines[1:]
             # Remove last line if it's a closing fence
             if lines and lines[-1].strip().startswith("```"):
                 lines = lines[:-1]
             content_stripped = "\n".join(lines).strip()
         data = json.loads(content_stripped)
         summary = data.get("summary", "")
         is_educational = data.get("is_educational", False)
-        print(
-            f"DEBUG: parsed summary_present={bool(summary)}, "
-            f"is_educational={is_educational}"
-        )
+        logger.info(f"Generated summary for {youtube_id}: summary_present={bool(summary)}, is_educational={is_educational}")
         return summary, is_educational
     except json.JSONDecodeError as e:
-        print(f"DEBUG: JSON decode error for {youtube_id}: {e}")
-        print(f"DEBUG: content that failed to parse: {content_stripped[:200]}")
+        logger.error(f"JSON decode error for {youtube_id}: {e}")
+        logger.debug(f"Content that failed to parse: {content_stripped[:200]}")
         return None, False
     except Exception as e:
-        print(f"DEBUG: OpenAI error for {youtube_id}: {e}")
+        logger.error(f"OpenAI API error for {youtube_id}: {e}")
         return None, False
````

Note on temperature: changed from 0.7 to 0.3 for the verification task. A lower temperature makes the model more deterministic and consistent for classification tasks like is_educational. If creative summaries are preferred, consider using temperature 0.7 only for the summary portion.
🤖 Prompt for AI Agents
In website/views/education.py around lines 96 to 171, replace all print(...)
debug calls with the configured logger (use logger.info for normal successes
like the parsed summary, logger.error for JSON/OpenAI errors, and logger.debug
for raw/large debug payloads), add a timeout argument to the OpenAI API call
(e.g., timeout=15) so the request can't hang indefinitely, and set the chat
completion temperature to 0.3 for deterministic verification; keep the existing
code-fence stripping and JSON parsing but change the specific log calls per the
provided diff (info for generated summary, error/debug for parse failures, error
for API exceptions).
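Both helper functions repeat the same fence-stripping-plus-parse sequence before `json.loads`. That pattern can be factored into one small standalone function; a sketch (the function name is assumed for illustration, not part of the PR):

```python
import json


def parse_json_response(content):
    """Strip a Markdown code fence, if present, then parse the JSON inside.

    Returns the parsed object, or None if parsing fails, so callers keep
    a single error path instead of duplicating try/except blocks.
    """
    stripped = content.strip()
    fence = "`" * 3  # avoids a literal triple-backtick in this listing
    if stripped.startswith(fence):
        lines = stripped.splitlines()
        lines = lines[1:]  # drop the opening fence line (possibly "json"-tagged)
        if lines and lines[-1].strip().startswith(fence):
            lines = lines[:-1]  # drop the closing fence line
        stripped = "\n".join(lines).strip()
    try:
        return json.loads(stripped)
    except json.JSONDecodeError:
        return None
```

With this in place, both `generate_ai_summary_and_verify` and `generate_quiz_from_transcript` could reduce to a single call plus a `None` check, and the logging suggestions above would apply in one spot.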
````python
def generate_quiz_from_transcript(youtube_id, transcript, title):
    """
    Generate 5-10 quiz questions from transcript using OpenAI.
    Returns list of question dicts or empty list on error.
    """
    if not client:
        print(f"DEBUG: OpenAI client not initialized, skipping quiz for {youtube_id}")
        return []

    if not openai_api_key:
        print(f"DEBUG: OPENAI_API_KEY missing, skipping quiz for {youtube_id}")
        return []

    if not transcript:
        print(f"DEBUG: no transcript for quiz generation for {youtube_id}")
        return []

    try:
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {
                    "role": "system",
                    "content": (
                        "You are a quiz generator expert. Create educational multiple-choice questions "
                        "based on the video content. Respond ONLY with valid JSON array."
                    ),
                },
                {
                    "role": "user",
                    "content": (
                        f"Create 5 multiple-choice questions from this video on '{title}':\n\n{transcript}\n\n"
                        "Respond ONLY with valid JSON array in this exact format:\n"
                        "[\n"
                        " {\n"
                        ' "question": "What is...?",\n'
                        ' "option_a": "Answer A",\n'
                        ' "option_b": "Answer B",\n'
                        ' "option_c": "Answer C",\n'
                        ' "option_d": "Answer D",\n'
                        ' "correct_answer": "A",\n'
                        ' "explanation": "The correct answer is..."\n'
                        " }\n"
                        "]"
                    ),
                },
            ],
            max_tokens=2000,
            temperature=0.7,
        )

        content = response.choices[0].message.content
        print(f"DEBUG: raw quiz content for {youtube_id}: {content[:100]}...")

        # Strip code fences if present (triple backticks)
        content_stripped = content.strip()
        if content_stripped.startswith("```"):
            lines = content_stripped.splitlines()
            if len(lines) > 1:
                lines = lines[1:]
            if lines and lines[-1].strip().startswith("```"):
                lines = lines[:-1]
            content_stripped = "\n".join(lines).strip()

        questions = json.loads(content_stripped)
        print(f"DEBUG: parsed {len(questions)} quiz questions for {youtube_id}")
        return questions[:10]  # Limit to 10 questions

    except json.JSONDecodeError as e:
        print(f"DEBUG: quiz JSON decode error for {youtube_id}: {e}")
        return []
    except Exception as e:
        print(f"DEBUG: quiz generation error for {youtube_id}: {e}")
        return []
````
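The fence-stripping logic in this function is easy to exercise on its own. A minimal sketch; the helper name `strip_code_fences` is ours, extracted from the PR's inline logic (`FENCE` is built programmatically only to avoid nesting backticks in this snippet):

```python
import json

FENCE = "`" * 3  # the literal triple-backtick string, built programmatically

def strip_code_fences(content: str) -> str:
    # Mirrors the PR's inline logic: drop an opening fence line and a
    # trailing fence line, then strip surrounding whitespace.
    content_stripped = content.strip()
    if content_stripped.startswith(FENCE):
        lines = content_stripped.splitlines()
        if len(lines) > 1:
            lines = lines[1:]
        if lines and lines[-1].strip().startswith(FENCE):
            lines = lines[:-1]
        content_stripped = "\n".join(lines).strip()
    return content_stripped

# A typical fenced model response:
raw = FENCE + 'json\n[{"question": "What is...?", "correct_answer": "A"}]\n' + FENCE
questions = json.loads(strip_code_fences(raw))
```

Unfenced responses pass through unchanged, so the same parse path handles both shapes of model output.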
🛠️ Refactor suggestion | 🟠 Major
Replace print statements with logger and add timeout for OpenAI API calls.
Similar to generate_ai_summary_and_verify, this function uses print() statements (lines 179, 183, 187, 225, 238, 242, 245) instead of the logger and lacks a timeout on the OpenAI API call.
Apply this diff:
def generate_quiz_from_transcript(youtube_id, transcript, title):
"""
Generate 5-10 quiz questions from transcript using OpenAI.
Returns list of question dicts or empty list on error.
"""
if not client:
- print(f"DEBUG: OpenAI client not initialized, skipping quiz for {youtube_id}")
+ logger.warning(f"OpenAI client not initialized, skipping quiz for {youtube_id}")
return []
if not openai_api_key:
- print(f"DEBUG: OPENAI_API_KEY missing, skipping quiz for {youtube_id}")
+ logger.warning(f"OPENAI_API_KEY missing, skipping quiz for {youtube_id}")
return []
if not transcript:
- print(f"DEBUG: no transcript for quiz generation for {youtube_id}")
+ logger.warning(f"No transcript for quiz generation for {youtube_id}")
return []
try:
response = client.chat.completions.create(
model="gpt-3.5-turbo",
messages=[
{
"role": "system",
"content": (
"You are a quiz generator expert. Create educational multiple-choice questions "
"based on the video content. Respond ONLY with valid JSON array."
),
},
{
"role": "user",
"content": (
f"Create 5 multiple-choice questions from this video on '{title}':\n\n{transcript}\n\n"
"Respond ONLY with valid JSON array in this exact format:\n"
"[\n"
" {\n"
' "question": "What is...?",\n'
' "option_a": "Answer A",\n'
' "option_b": "Answer B",\n'
' "option_c": "Answer C",\n'
' "option_d": "Answer D",\n'
' "correct_answer": "A",\n'
' "explanation": "The correct answer is..."\n'
" }\n"
"]"
),
},
],
max_tokens=2000,
temperature=0.7,
+ timeout=30,
)
content = response.choices[0].message.content
- print(f"DEBUG: raw quiz content for {youtube_id}: {content[:100]}...")
+ logger.debug(f"Raw quiz response for {youtube_id}: {content[:100]}...")
# Strip code fences if present (triple backticks)
content_stripped = content.strip()
if content_stripped.startswith("```"):
lines = content_stripped.splitlines()
if len(lines) > 1:
lines = lines[1:]
if lines and lines[-1].strip().startswith("```"):
lines = lines[:-1]
content_stripped = "\n".join(lines).strip()
questions = json.loads(content_stripped)
- print(f"DEBUG: parsed {len(questions)} quiz questions for {youtube_id}")
+ logger.info(f"Generated {len(questions)} quiz questions for {youtube_id}")
return questions[:10] # Limit to 10 questions
except json.JSONDecodeError as e:
- print(f"DEBUG: quiz JSON decode error for {youtube_id}: {e}")
+ logger.error(f"Quiz JSON decode error for {youtube_id}: {e}")
return []
except Exception as e:
- print(f"DEBUG: quiz generation error for {youtube_id}: {e}")
+ logger.error(f"Quiz generation error for {youtube_id}: {e}")
        return []

🤖 Prompt for AI Agents
In website/views/education.py around lines 173-247, replace all print()
diagnostics with the module logger (use logger.debug for early checks and raw
content, logger.info for successful generation as shown in the diff, and
logger.error for JSON decode and generic exceptions) and add a timeout to the
OpenAI call (e.g., timeout=15) passed into client.chat.completions.create;
specifically change the prints at the top (client/openai/transcript missing),
the raw quiz content print, and the exception prints to appropriate logger calls
and include timeout parameter on the API invocation to prevent hanging.
```python
if request.method == "POST":
    print("DEBUG: education_home POST reached")
    youtube_url = request.POST.get("youtube_url", "").strip()
    title = request.POST.get("video_title", "").strip()
    description = request.POST.get("video_description", "").strip()

    print(f"DEBUG: raw POST values title={repr(title)}, url={repr(youtube_url)}")

    if youtube_url and title:
        print(f"DEBUG: got title and url {title} {youtube_url}")
        try:
            # Extract video ID from URL
            match = re.search(
                r'(?:youtube\.com\/watch\?v=|youtu\.be\/)([^&\n?#]+)',
                youtube_url
            )
            if not match:
                messages.error(request, "Invalid YouTube URL format.")
                return redirect("education")

            youtube_id = match.group(1)
            print(f"DEBUG: extracted youtube_id={youtube_id}")

            # Step 1: Get transcript
            print("DEBUG: calling get_youtube_transcript")
            transcript = get_youtube_transcript(youtube_id)
            print(f"DEBUG: transcript present? {bool(transcript)}")

            # Step 2: Generate summary and educational verification
            print("DEBUG: calling generate_ai_summary_and_verify")
            summary, is_verified = generate_ai_summary_and_verify(
                youtube_id, title, transcript
            )

            # Step 3: Create video record
            video = EducationalVideo.objects.create(
                title=title,
                youtube_url=youtube_url,
                youtube_id=youtube_id,
                description=description,
                ai_summary=summary or "",
                is_verified=is_verified,
            )
            print(f"DEBUG: created video record with id={video.id}")

            # Step 4: Generate quiz questions (only if transcript exists)
            if transcript:
                print("DEBUG: calling generate_quiz_from_transcript")
                quiz_questions = generate_quiz_from_transcript(
                    youtube_id, transcript, title
                )
                print(f"DEBUG: quiz_questions returned: {len(quiz_questions)} questions")

                for q_data in quiz_questions:
                    try:
                        VideoQuizQuestion.objects.create(
                            video=video,
                            question=q_data.get("question", ""),
                            option_a=q_data.get("option_a", ""),
                            option_b=q_data.get("option_b", ""),
                            option_c=q_data.get("option_c", ""),
                            option_d=q_data.get("option_d", ""),
                            correct_answer=q_data.get("correct_answer", "A"),
                            explanation=q_data.get("explanation", ""),
                        )
                        print(f"DEBUG: created quiz question for video {video.id}")
                    except Exception as q_err:
                        print(f"DEBUG: error creating quiz question: {q_err}")
            else:
                print(f"DEBUG: skipping quiz generation, no transcript for {youtube_id}")

            messages.success(
                request,
                "Video added successfully with AI-generated content!"
            )
            return redirect("education")

        except Exception as e:
            print(f"DEBUG: exception in POST handler: {e}")
            import traceback
            traceback.print_exc()
            messages.error(request, f"Error processing video: {str(e)}")
            return redirect("education")
```
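The ID-extraction regex used by this handler can be checked standalone. This sketch reuses the exact pattern from the code; the wrapper name `extract_youtube_id` is ours, not from the PR:

```python
import re

# The same pattern the POST handler uses to pull the video ID out of a URL.
YOUTUBE_ID_RE = re.compile(r'(?:youtube\.com\/watch\?v=|youtu\.be\/)([^&\n?#]+)')

def extract_youtube_id(url):
    # Returns the captured video ID, or None for unrecognized URLs.
    match = YOUTUBE_ID_RE.search(url)
    return match.group(1) if match else None
```

Note that the capture group stops at `&`, `?`, `#`, or a newline, so `watch?v=` URLs with extra query parameters and short `youtu.be` links with timestamps both yield a bare ID.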
🛠️ Refactor suggestion | 🟠 Major
Replace print statements with logger in POST handler.
The POST handler uses print() extensively (lines 268, 273, 276, 288, 291-293, 296, 310, 314, 318, 332, 336, 345-347) and imports traceback inline (line 346). Using the logger throughout provides better production observability.
Apply this diff to key sections:
# Handle YouTube video submission with AI processing
if request.method == "POST":
- print("DEBUG: education_home POST reached")
+ logger.info("Processing YouTube video submission")
youtube_url = request.POST.get("youtube_url", "").strip()
title = request.POST.get("video_title", "").strip()
description = request.POST.get("video_description", "").strip()
- print(f"DEBUG: raw POST values title={repr(title)}, url={repr(youtube_url)}")
+ logger.debug(f"Received submission: title={title}, url={youtube_url}")
if youtube_url and title:
- print(f"DEBUG: got title and url {title} {youtube_url}")
+ logger.info(f"Processing video: {title}")
try:
# Extract video ID from URL
match = re.search(
r'(?:youtube\.com\/watch\?v=|youtu\.be\/)([^&\n?#]+)',
youtube_url
)
if not match:
messages.error(request, "Invalid YouTube URL format.")
return redirect("education")
youtube_id = match.group(1)
- print(f"DEBUG: extracted youtube_id={youtube_id}")
+ logger.info(f"Extracted YouTube ID: {youtube_id}")
# Step 1: Get transcript
- print("DEBUG: calling get_youtube_transcript")
transcript = get_youtube_transcript(youtube_id)
- print(f"DEBUG: transcript present? {bool(transcript)}")
# Step 2: Generate summary and educational verification
- print("DEBUG: calling generate_ai_summary_and_verify")
summary, is_verified = generate_ai_summary_and_verify(
youtube_id, title, transcript
)
# Step 3: Create video record
video = EducationalVideo.objects.create(
title=title,
youtube_url=youtube_url,
youtube_id=youtube_id,
description=description,
ai_summary=summary or "",
is_verified=is_verified,
)
- print(f"DEBUG: created video record with id={video.id}")
+ logger.info(f"Created video record: {video.id}")
# Step 4: Generate quiz questions (only if transcript exists)
if transcript:
- print("DEBUG: calling generate_quiz_from_transcript")
quiz_questions = generate_quiz_from_transcript(
youtube_id, transcript, title
)
- print(f"DEBUG: quiz_questions returned: {len(quiz_questions)} questions")
+ logger.info(f"Generated {len(quiz_questions)} quiz questions for video {video.id}")
for q_data in quiz_questions:
try:
VideoQuizQuestion.objects.create(
video=video,
question=q_data.get("question", ""),
option_a=q_data.get("option_a", ""),
option_b=q_data.get("option_b", ""),
option_c=q_data.get("option_c", ""),
option_d=q_data.get("option_d", ""),
correct_answer=q_data.get("correct_answer", "A"),
explanation=q_data.get("explanation", ""),
)
- print(f"DEBUG: created quiz question for video {video.id}")
except Exception as q_err:
- print(f"DEBUG: error creating quiz question: {q_err}")
+ logger.error(f"Error creating quiz question for video {video.id}: {q_err}")
else:
- print(f"DEBUG: skipping quiz generation, no transcript for {youtube_id}")
+ logger.info(f"Skipping quiz generation for {youtube_id}: no transcript available")
messages.success(
request,
"Video added successfully with AI-generated content!"
)
return redirect("education")
except Exception as e:
- print(f"DEBUG: exception in POST handler: {e}")
- import traceback
- traceback.print_exc()
+ logger.exception(f"Error processing video submission: {e}")
messages.error(request, f"Error processing video: {str(e)}")
            return redirect("education")

Note: Using logger.exception() automatically includes the full stack trace without needing to import traceback.
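That behavior is easy to verify in isolation: `logger.exception` logs at ERROR level and appends the active exception's traceback, with no `traceback` import. A small self-contained sketch (logger name and message are illustrative):

```python
import io
import logging

logger = logging.getLogger("demo")
handler = logging.StreamHandler(io.StringIO())  # capture output in memory
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)

try:
    raise ValueError("bad video payload")
except ValueError:
    # Same call shape as the suggested diff; the stack trace is appended automatically.
    logger.exception("Error processing video submission")

log_output = handler.stream.getvalue()
```

The captured output contains the ERROR line followed by the full `Traceback (most recent call last):` block.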
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In website/views/education.py around lines 267 to 349, replace all debug print()
calls and the inline traceback import in the POST handler with structured
logging calls using the module logger (e.g., logger.debug(...)) and use
logger.exception(...) in the exception block to log the error and stack trace;
ensure you import or reference the module-level logger at the top of the file,
remove the inline import traceback, keep the existing messages.* and redirect
behavior unchanged, and preserve variable content in log messages (use
formatting, not prints) for the extracted youtube_id, transcript presence,
created video id, quiz counts, and any quiz creation errors.
website/views/education.py
Outdated
```python
class VideoDetailView(DetailView):
    model = EducationalVideo
    template_name = "education/video_detail.html"
    context_object_name = "video"

    def get_context_data(self, **kwargs):
        context = super().get_context_data(**kwargs)
        video = self.object
        context["quiz_questions"] = VideoQuizQuestion.objects.filter(video=video)
        if self.request.user.is_authenticated:
            context["quiz_history"] = QuizAttempt.objects.filter(
                user=self.request.user, video=video
            )
        return context
```
Add missing options context variable for quiz template.
The video_detail.html template iterates over an options variable (line 66), but VideoDetailView doesn't provide it in the context. This will cause a template rendering error.
Apply this diff:
```diff
 def get_context_data(self, **kwargs):
     context = super().get_context_data(**kwargs)
     video = self.object
     context["quiz_questions"] = VideoQuizQuestion.objects.filter(video=video)
+    context["options"] = [
+        (video.option_a, 'A'),
+        (video.option_b, 'B'),
+        (video.option_c, 'C'),
+        (video.option_d, 'D')
+    ]
     if self.request.user.is_authenticated:
         context["quiz_history"] = QuizAttempt.objects.filter(
             user=self.request.user, video=video
         )
     return context
```

Wait - correction: The options should be a list of tuples for the labels and field names, not pulling from the video object. Let me revise:
```diff
 def get_context_data(self, **kwargs):
     context = super().get_context_data(**kwargs)
     video = self.object
     context["quiz_questions"] = VideoQuizQuestion.objects.filter(video=video)
+    # Provide option labels for template iteration
+    context["options"] = [
+        ('option_a', 'A'),
+        ('option_b', 'B'),
+        ('option_c', 'C'),
+        ('option_d', 'D')
+    ]
     if self.request.user.is_authenticated:
         context["quiz_history"] = QuizAttempt.objects.filter(
             user=self.request.user, video=video
         )
     return context
```

This provides the mapping needed by the template, where it expects to iterate like `for option, label in options` and access `question.{{ option }}` (option_a, option_b, etc.) and display the label (A, B, C, D).
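How the template side consumes that mapping can be sketched without Django: `getattr` on a question object plays the role of the template's per-field lookup. The `Question` class here is a stand-in for the real `VideoQuizQuestion` model, and the sample text is invented:

```python
from dataclasses import dataclass

@dataclass
class Question:  # stand-in for VideoQuizQuestion
    question: str
    option_a: str
    option_b: str
    option_c: str
    option_d: str

# The (field_name, label) tuples added to the context above.
options = [("option_a", "A"), ("option_b", "B"), ("option_c", "C"), ("option_d", "D")]

q = Question("What is XSS?", "A bug class", "A framework", "A database", "A protocol")

# Template-side equivalent: iterate options and look up question fields by name.
rendered = [f"{label}. {getattr(q, field)}" for field, label in options]
```

Iterating field names keeps the template free of four near-duplicate blocks, one per option.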
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
class VideoDetailView(DetailView):
    model = EducationalVideo
    template_name = "education/video_detail.html"
    context_object_name = "video"

    def get_context_data(self, **kwargs):
        context = super().get_context_data(**kwargs)
        video = self.object
        context["quiz_questions"] = VideoQuizQuestion.objects.filter(video=video)
        # Provide option labels for template iteration
        context["options"] = [
            ('option_a', 'A'),
            ('option_b', 'B'),
            ('option_c', 'C'),
            ('option_d', 'D')
        ]
        if self.request.user.is_authenticated:
            context["quiz_history"] = QuizAttempt.objects.filter(
                user=self.request.user, video=video
            )
        return context
```
🤖 Prompt for AI Agents
In website/views/education.py around lines 976 to 989, the view builds
quiz_questions and quiz_history but does not add the options context variable
the template expects; add an options list of tuples mapping question field names
to labels (e.g. [('option_a','A'), ('option_b','B'), ('option_c','C'),
('option_d','D')]) and include it in the context returned by get_context_data so
the template can iterate with for option, label in options and access
question.<option>.
❌ Pre-commit checks failed

The pre-commit hooks found issues that need to be fixed. Please run the following commands locally to fix them:

```shell
# Install pre-commit if you haven't already
pip install pre-commit

# Run pre-commit on all files
pre-commit run --all-files

# Or run pre-commit on staged files only
pre-commit run
```

After running these commands, the pre-commit hooks will automatically fix most issues.

💡 Tip: You can set up pre-commit to run automatically on every commit by running: `pre-commit install`

For more information, see the pre-commit documentation.
```python
transcript_text = " ".join(snippet.text for snippet in snippets)
```

This comment was marked as outdated.
```python
featured_lectures = Lecture.objects.filter(section__isnull=True)
courses = Course.objects.all()
context = {"is_instructor": is_instructor, "featured_lectures": featured_lectures, "courses": courses}
```

This comment was marked as outdated.
Actionable comments posted: 3
♻️ Duplicate comments (2)
website/models.py (1)
2544-2549: Remove redundant local imports around EducationalVideo section

There's a second block of imports near the bottom:

```python
import re
from urllib.parse import parse_qs, urlparse
from django.core.exceptions import ValidationError
from django.db import models
```

and an extra `from django.db import models` right above `EducationalVideo`. These are all already imported at the top of the file, so this block is redundant and can be dropped.

Also applies to: 3648-3650
website/views/education.py (1)
48-61: Remove the second import block; it duplicates the header imports

Lines 48–61 re-import `json`, `os`, `re`, Django shortcuts/decorators, `OpenAI`, `YouTubeTranscriptApi`, and several models that are already imported at the top of the file.

Keeping a single, canonical import section at the top avoids confusion and potential merge conflicts; this entire block can be deleted safely.
🧹 Nitpick comments (1)
website/views/education.py (1)
62-68: Switch transcript/AI helpers from `print()` to the module logger

`get_youtube_transcript`, `generate_ai_summary_and_verify`, and `generate_quiz_from_transcript` all use `print()` for diagnostics and only do coarse-grained exception handling. This makes production observability harder and mixes debug noise into stdout.

Consider:

- Replacing all `print(...)` with `logger.debug/info/warning/error/exception(...)` as appropriate.
- Using the already-imported `NoTranscriptFound`/`TranscriptsDisabled` in `get_youtube_transcript` to log "no captions available" as a warning, while keeping a generic `Exception` handler for unexpected failures.
- Logging key context (video id, whether a transcript was found, whether JSON parsing succeeded, count of generated questions) at info level, and truncating large payloads at debug level.

Example sketch (not exhaustive):

```python
def get_youtube_transcript(youtube_id):
    try:
        logger.info("Fetching transcript for video %s", youtube_id)
        api = YouTubeTranscriptApi()
        transcript_list = list(api.list(youtube_id))
        snippets = []
        for transcript in transcript_list:
            for snippet in transcript.fetch():
                snippets.append(snippet)
        transcript_text = " ".join(snippet.text for snippet in snippets)
        if len(transcript_text) > 3000:
            transcript_text = transcript_text[:3000]
        logger.info("Transcript fetched for %s (length=%d)", youtube_id, len(transcript_text))
        return transcript_text
    except (TranscriptsDisabled, NoTranscriptFound) as e:
        logger.warning("No transcript available for %s: %s", youtube_id, e)
        return None
    except Exception as e:
        logger.error("Error fetching transcript for %s: %s", youtube_id, e)
        return None
```

Apply the same pattern to the two OpenAI helpers so all AI/transcript failures are consistently recorded in logs instead of printed.

Please double-check the logging levels and any exception types (`NoTranscriptFound`, `TranscriptsDisabled`) against the exact versions of `youtube-transcript-api` and `openai` you're using to ensure the signatures and error classes match your installed packages.

Also applies to: 70-96, 99-174, 176-249
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
⛔ Files ignored due to path filters (1)
`poetry.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (5)
- `.github/workflows/ci-cd.yml` (1 hunks)
- `blt/urls.py` (5 hunks)
- `pyproject.toml` (2 hunks)
- `website/models.py` (1 hunks)
- `website/views/education.py` (5 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
website/models.py (1)
website/utils.py (1)
- `ai_summary` (708-729)
blt/urls.py (2)
website/models.py (3)
- `EducationalVideo` (3652-3674)
- `QuizAttempt` (3692-3704)
- `VideoQuizQuestion` (3677-3689)

website/views/education.py (2)

- `VideoDetailView` (979-992)
- `submit_quiz` (378-442)
🪛 GitHub Check: CodeQL
.github/workflows/ci-cd.yml
[failure] 314-316: Cache Poisoning via execution of untrusted code
Potential cache poisoning in the context of the default branch due to privilege checkout of untrusted code. (pull_request_target).
Potential cache poisoning in the context of the default branch due to privilege checkout of untrusted code. (workflow_run).
website/views/education.py
[warning] 432-432: Information exposure through an exception
Stack trace information flows to this location and may be exposed to an external user.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Run Tests
- GitHub Check: docker-test
🔇 Additional comments (5)
pyproject.toml (1)
47-74: `youtube-transcript-api` dependency looks appropriate for new transcript features

Adding `youtube-transcript-api` here matches its usage in `website/views/education.py`. Just ensure the lockfile/Docker image are updated so the same version is installed in all environments.

Please confirm you've run your usual dependency update steps (e.g., `poetry lock && poetry install` and rebuilt any Docker images) so this package is actually available at runtime.

website/models.py (1)
3652-3704: EducationalVideo + quiz models look consistent with intended usage

The `EducationalVideo`, `VideoQuizQuestion`, and `QuizAttempt` models are structurally sound for the views/templates:

- FKs and `related_name`s line up with how they're queried.
- Ordering and `__str__` implementations are reasonable defaults.

No blocking issues from the model design itself.
blt/urls.py (1)
145-172: Video detail and quiz submission routes are wired correctly

The new imports and URL patterns for:

- `education/video/<pk>/` → `VideoDetailView`
- `education/video/<video_id>/quiz/submit/` → `submit_quiz`

are consistent with the view signatures and model design, and fit cleanly into the existing education URL namespace.
Also applies to: 729-742
website/views/education.py (2)
356-373: Context additions for videos and quiz history look good

The extra context for the main education page:

- `educational_videos = EducationalVideo.objects.all()`
- `user_quiz_history = QuizAttempt.objects.filter(user=user)` when authenticated

matches the new models and lets templates render video cards and quiz history without extra queries in the template layer.
979-992: VideoDetailView context matches the quiz models and template needs
`VideoDetailView`'s `get_context_data` populates:

- `quiz_questions` via `VideoQuizQuestion.objects.filter(video=video)`
- `quiz_history` for the authenticated user via `QuizAttempt.objects.filter(user=self.request.user, video=video)`

This is consistent with the model design and gives the template everything it needs to render both the current quiz and prior attempts.
```yaml
- name: Run migrations
  run: poetry run python manage.py migrate --noinput
```
Critical: Migration step exposes cache poisoning vulnerability when running on pull_request_target.
The migration step is functionally necessary for tests to pass (new EducationalVideo, VideoQuizQuestion, and QuizAttempt models require migrations). However, CodeQL correctly identifies a critical security vulnerability:
Vulnerability: When the workflow runs via pull_request_target trigger (line 14), the test job checks out untrusted PR code (line 281) and executes it. Running manage.py migrate executes Python code from migration files, which could be malicious in a forked PR. A malicious migration could:
- Poison shared caches affecting subsequent CI runs
- Execute arbitrary Python code with workflow privileges
- Corrupt the test database or inject backdoors
Note: This vulnerability pre-dates this PR (line 321 already runs manage.py test on untrusted code), but the migration step makes it more exploitable because migrations run earlier with database write access.
Suggested mitigations (choose one):
Option 1: Restrict test job to pull_request trigger (recommended)
Add an if condition to the test job to skip pull_request_target:
test:
name: Run Tests
+ if: github.event_name != 'pull_request_target'
needs: setup
    runs-on: ubuntu-latest

Then ensure tests run on pull_request instead by adding it to the workflow triggers:
on:
#merge_group:
+ pull_request:
+ types:
+ - opened
+ - synchronize
+ - reopened
+ - ready_for_review
  pull_request_target:

This removes write permissions for forked PRs, eliminating cache poisoning risk.
Option 2: Require manual approval for first-time contributors
Configure the repository to require approval before running workflows on PRs from first-time contributors:
- Go to Settings → Actions → General
- Under "Fork pull request workflows from outside collaborators", select "Require approval for first-time contributors"
This adds a manual review gate but requires maintainer action.
Option 3: Use separate cache keys for PRs vs main
Modify cache keys to isolate PR caches from main branch:
key: ${{ runner.os }}-poetry-${{ hashFiles('**/poetry.lock') }}-${{ hashFiles('**/pyproject.toml') }}
+ # Isolate PR caches from main branch to prevent poisoning
+ key: ${{ runner.os }}-poetry-${{ github.event_name }}-${{ github.event.pull_request.number || github.ref }}-${{ hashFiles('**/poetry.lock') }}-${{ hashFiles('**/pyproject.toml') }}

This reduces cache poisoning impact but doesn't eliminate arbitrary code execution risk.
Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 GitHub Check: CodeQL
[failure] 314-316: Cache Poisoning via execution of untrusted code
Potential cache poisoning in the context of the default branch due to privilege checkout of untrusted code. (pull_request_target).
Potential cache poisoning in the context of the default branch due to privilege checkout of untrusted code. (workflow_run).
🤖 Prompt for AI Agents
.github/workflows/ci-cd.yml lines 314-315: the "Run migrations" step executes
untrusted PR code when the workflow is triggered by pull_request_target,
creating a cache-poisoning/arbitrary execution risk; fix by restricting the test
job to not run for pull_request_target—add an if condition on the job level such
as checking github.event_name == 'pull_request' (or github.event_name !=
'pull_request_target') so migrations and tests only run on pull_request or other
safe events, and ensure the workflow triggers include pull_request;
alternatively, implement one of the suggested mitigations (manual approval for
first-time contributors or separate cache keys) if job-level restriction isn't
possible.
website/views/education.py
Outdated
```python
@login_required
@require_http_methods(["POST"])
def submit_quiz(request, video_id):
    """
    Handle quiz submission and score calculation.
    Returns JSON with score, total questions, and percentage.
    """
    try:
        video = EducationalVideo.objects.get(id=video_id)
        score = 0
        total_questions = 0

        # Get all questions for this video
        questions = VideoQuizQuestion.objects.filter(video=video)
        total_questions = questions.count()

        if total_questions == 0:
            return JsonResponse(
                {"error": "No questions for this video"},
                status=400
            )

        # Check each answer
        for question in questions:
            user_answer = request.POST.get(
                f"question_{question.id}", ""
            ).upper().strip()
            if user_answer == question.correct_answer.upper():
                score += 1

        percentage = (score / total_questions * 100) if total_questions > 0 else 0

        # Save quiz attempt
        attempt = QuizAttempt.objects.create(
            user=request.user,
            video=video,
            score=score,
            total_questions=total_questions,
            percentage=percentage,
        )

        return JsonResponse(
            {
                "success": True,
                "score": score,
                "total": total_questions,
                "percentage": round(percentage, 2),
                "attempt_id": attempt.id,
            }
        )

    except EducationalVideo.DoesNotExist:
        return JsonResponse(
            {"error": "Video not found"},
            status=404
        )
    except Exception as e:
        print(f"DEBUG: quiz submission error: {e}")
        return JsonResponse(
            {"error": str(e)},
            status=500
        )
```
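The scoring logic in this view reduces to one comparison per question. A standalone sketch of that core, using plain dicts as stand-ins for the model instances and `request.POST` (the helper name `score_quiz` is ours):

```python
def score_quiz(questions, post_data):
    # questions: list of dicts with "id" and "correct_answer"
    # post_data: mapping like request.POST, keyed "question_<id>"
    score = 0
    for question in questions:
        user_answer = post_data.get(f"question_{question['id']}", "").upper().strip()
        if user_answer == question["correct_answer"].upper():
            score += 1
    total = len(questions)
    percentage = (score / total * 100) if total > 0 else 0
    return score, total, percentage

# One correct answer (case- and whitespace-insensitive), one wrong.
score, total, pct = score_quiz(
    [{"id": 1, "correct_answer": "A"}, {"id": 2, "correct_answer": "C"}],
    {"question_1": " a ", "question_2": "B"},
)
```

Normalizing both sides with `.upper().strip()` means a stray lowercase or padded answer from the form still scores correctly, which the view relies on.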
Fix quiz submission error exposure and remove unreachable tail code
Two issues in `submit_quiz`:

1. Information exposure via JSON response

   The generic `except Exception as e` block:

   ```python
   print(f"DEBUG: quiz submission error: {e}")
   return JsonResponse({"error": str(e)}, status=500)
   ```

   both prints the exception and returns `str(e)` to the client. This is exactly what CodeQL flagged: it can leak internal error details, stack messages, or other implementation info.

   Replace with a logged exception plus a generic error message, e.g.:

   ```python
   except Exception as e:
       logger.exception("Quiz submission error for video %s", video_id)
       return JsonResponse(
           {"error": "An error occurred while processing your quiz submission"},
           status=500,
       )
   ```

2. Dead/unreachable code at end of function

   The block:

   ```python
   featured_lectures = Lecture.objects.filter(section__isnull=True)
   courses = Course.objects.all()
   context = {"is_instructor": is_instructor, "featured_lectures": featured_lectures, "courses": courses}
   return render(request, template, context)
   ```

   sits after all return paths inside `submit_quiz` and references `is_instructor`/`template`, which aren't defined in this scope. It's effectively dead code and could become a runtime error if someone later alters the control flow.

   This tail block should be removed entirely; the JSON response paths already fully define the behaviour of `submit_quiz`.

Also applies to: 439-442
🧰 Tools
🪛 GitHub Check: CodeQL
[warning] 432-432: Information exposure through an exception
Stack trace information flows to this location and may be exposed to an external user.
🤖 Prompt for AI Agents
In website/views/education.py around lines 376 to 437 (also applies to 439-442),
replace the broad except block that prints and returns str(e) with a logged
exception and a generic error JSON response (use logger.exception to record the
full error with context like video_id) so internal details are not exposed to
clients, and remove the unreachable tail code that references undefined
variables (featured_lectures/courses/is_instructor/template) since submit_quiz
returns JSON on all paths; ensure only the safe generic error message is
returned and delete the dead render block.
| {% for attempt in quiz_history %} | ||
| <div class="flex items-center justify-between bg-gray-50 dark:bg-gray-700 p-4 rounded-lg border-l-4 {% if attempt.percentage >= 70 %}border-green-500{% else %}border-yellow-500{% endif %}"> | ||
| <div> | ||
| <p class="font-semibold text-gray-800 dark:text-gray-100">{{ attempt.get_date_display }}</p> |
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
website/views/education.py (1)
373-376: Critical: Remove orphaned code that will crash on import.
Lines 373-376 are outside any function definition and reference undefined variables (is_instructor, template). This code will execute when the module is imported and raise a NameError, preventing the application from starting.
This appears to be leftover code from a previous refactoring that was inadvertently left in the file.
Remove these lines immediately

- featured_lectures = Lecture.objects.filter(section__isnull=True)
- courses = Course.objects.all()
- context = {"is_instructor": is_instructor, "featured_lectures": featured_lectures, "courses": courses}
- return render(request, template, context)

This code is completely orphaned - it's not part of the submit_quiz function above (which returns JSON) and it's not part of any other function. It must be removed to prevent a module-level NameError.
Based on sentry bot analysis flagging this as critical severity.
♻️ Duplicate comments (4)
website/views/education.py (4)
82-146: Add logging and timeout to OpenAI API call.
This function has no logging despite past review comments suggesting it. When the function returns early (lines 87-94) or encounters errors (lines 142-145), there's no trace in logs. This contradicts the PR objectives emphasizing "contextual information so Docker logs are actionable."
Additionally, the OpenAI API call (line 97) lacks a timeout, which could cause requests to hang indefinitely.
Add logging and timeout

  def generate_ai_summary_and_verify(youtube_id, title, transcript):
      """
      Use OpenAI to generate summary and verify if content is educational.
      Returns (summary_text, is_educational_bool).
      """
      if not client:
+         logger.warning(f"OpenAI client not initialized for video {youtube_id}")
          return None, False
      if not openai_api_key:
+         logger.warning("OPENAI_API_KEY is missing")
          return None, False
      if not transcript:
+         logger.info(f"No transcript provided for video {youtube_id}, skipping AI summary")
          return None, False
      try:
+         logger.info(f"Generating AI summary for video {youtube_id}")
          response = client.chat.completions.create(
              model="gpt-3.5-turbo",
              messages=[
                  {
                      "role": "system",
                      "content": (
                          "You are an educational content expert. "
                          "Analyze the video transcript and determine if it's educational/security-related content. "
                          "Respond in JSON format with 'summary' and 'is_educational' fields."
                      ),
                  },
                  {
                      "role": "user",
                      "content": (
                          f"Video Title: {title}\n\nTranscript:\n{transcript}\n\n"
                          "Provide a brief (100-150 word) summary and determine if this is educational security content. "
                          "Respond ONLY with valid JSON using double quotes, like this:\n"
                          '{"summary": "...", "is_educational": true}'
                      ),
                  },
              ],
              max_tokens=500,
              temperature=0.7,
+             timeout=30,
          )
          content = response.choices[0].message.content
          # Strip code fences if model wraps JSON in ``` (triple backticks)
          content_stripped = content.strip()
          if content_stripped.startswith("```"):
              lines = content_stripped.splitlines()
              # Remove first line (opening fence)
              if len(lines) > 1:
                  lines = lines[1:]
              # Remove last line if it's a closing fence
              if lines and lines[-1].strip().startswith("```"):
                  lines = lines[:-1]
              content_stripped = "\n".join(lines).strip()
          data = json.loads(content_stripped)
          summary = data.get("summary", "")
          is_educational = data.get("is_educational", False)
+         logger.info(f"Generated summary for {youtube_id}: is_educational={is_educational}")
          return summary, is_educational
      except json.JSONDecodeError as e:
+         logger.error(f"JSON decode error for video {youtube_id}: {e}")
          return None, False
      except Exception as e:
+         logger.error(f"OpenAI API error for video {youtube_id}: {e}")
          return None, False

Based on PR objectives emphasizing actionable Docker logs.
293-298: Fix exception handling to prevent information exposure.
The exception handler exposes internal error details to users via messages.error(request, f"Error processing video: {str(e)}") and uses traceback.print_exc() instead of proper logging. This was flagged in multiple past review comments but remains unaddressed.
Apply this fix

  except Exception as e:
-     import traceback
-
-     traceback.print_exc()
-     messages.error(request, f"Error processing video: {str(e)}")
+     logger.exception(f"Error processing video submission: title={title}, url={youtube_url}")
+     messages.error(request, "Error processing video. Please try again later.")
      return redirect("education")

Note: Using logger.exception() automatically includes the full stack trace without needing to import traceback. The generic user message prevents exposing internal implementation details.
Based on PR objectives emphasizing user-friendly error messages with detailed logs.
148-216: Add logging and timeout for quiz generation.
This function has the same logging and timeout issues as generate_ai_summary_and_verify. When quiz generation fails or is skipped, there's no trace in logs.
Add logging and timeout

  def generate_quiz_from_transcript(youtube_id, transcript, title):
      """
      Generate 5-10 quiz questions from transcript using OpenAI.
      Returns list of question dicts or empty list on error.
      """
      if not client:
+         logger.warning(f"OpenAI client not initialized, skipping quiz for {youtube_id}")
          return []
      if not openai_api_key:
+         logger.warning("OPENAI_API_KEY missing, skipping quiz generation")
          return []
      if not transcript:
+         logger.info(f"No transcript for quiz generation for {youtube_id}")
          return []
      try:
+         logger.info(f"Generating quiz questions for video {youtube_id}")
          response = client.chat.completions.create(
              model="gpt-3.5-turbo",
              messages=[
                  {
                      "role": "system",
                      "content": (
                          "You are a quiz generator expert. Create educational multiple-choice questions "
                          "based on the video content. Respond ONLY with valid JSON array."
                      ),
                  },
                  {
                      "role": "user",
                      "content": (
                          f"Create 5 multiple-choice questions from this video on '{title}':\n\n{transcript}\n\n"
                          "Respond ONLY with valid JSON array in this exact format:\n"
                          "[\n"
                          "  {\n"
                          '    "question": "What is...?",\n'
                          '    "option_a": "Answer A",\n'
                          '    "option_b": "Answer B",\n'
                          '    "option_c": "Answer C",\n'
                          '    "option_d": "Answer D",\n'
                          '    "correct_answer": "A",\n'
                          '    "explanation": "The correct answer is..."\n'
                          "  }\n"
                          "]"
                      ),
                  },
              ],
              max_tokens=2000,
              temperature=0.7,
+             timeout=30,
          )
          content = response.choices[0].message.content
          # Strip code fences if present (triple backticks)
          content_stripped = content.strip()
          if content_stripped.startswith("```"):
              lines = content_stripped.splitlines()
              if len(lines) > 1:
                  lines = lines[1:]
              if lines and lines[-1].strip().startswith("```"):
                  lines = lines[:-1]
              content_stripped = "\n".join(lines).strip()
          questions = json.loads(content_stripped)
+         logger.info(f"Generated {len(questions)} quiz questions for {youtube_id}")
          return questions[:10]  # Limit to 10 questions
      except json.JSONDecodeError as e:
+         logger.error(f"Quiz JSON decode error for {youtube_id}: {e}")
          return []
      except Exception as e:
+         logger.error(f"Quiz generation error for {youtube_id}: {e}")
          return []

Based on PR objectives emphasizing actionable Docker logs.
368-371: Critical: Fix information exposure through exception.
Line 371 returns str(e) in the JSON error response, which can expose sensitive stack trace information, internal paths, or implementation details. This is flagged by CodeQL as a security issue and was mentioned in past comments but remains unaddressed.
Apply this fix immediately

  except EducationalVideo.DoesNotExist:
      return JsonResponse({"error": "Video not found"}, status=404)
  except Exception as e:
-     return JsonResponse({"error": str(e)}, status=500)
+     logger.exception(f"Quiz submission error for video {video_id}")
+     return JsonResponse(
+         {"error": "An error occurred while processing your quiz submission"},
+         status=500
+     )

Based on CodeQL static analysis warning and security best practices.
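The pattern recommended here — log the full exception server-side, return only a generic payload to the client — can be sketched framework-free. This is an illustrative helper (`guarded_call` and the payload shape are not from the PR or Django's `JsonResponse`):

```python
import logging

logger = logging.getLogger(__name__)

GENERIC_ERROR = {"error": "An error occurred while processing your quiz submission"}

def guarded_call(handler, *args):
    """Run handler; on failure, log the traceback and return a safe (payload, status)."""
    try:
        return handler(*args), 200
    except Exception:
        logger.exception("Quiz submission error")  # full traceback stays in server logs
        return dict(GENERIC_ERROR), 500            # client never sees str(e)

def ok_handler(score):
    return {"score": score}

def bad_handler(score):
    raise RuntimeError("db connection details that must not leak")

print(guarded_call(ok_handler, 7))   # → ({'score': 7}, 200)
print(guarded_call(bad_handler, 7))  # generic payload with status 500
```

The key property is that the internal `RuntimeError` message never appears in the returned payload, only in the log.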
🧹 Nitpick comments (1)
website/views/education.py (1)
57-79: Add structured logging and improve error handling.
The function silently returns None on any exception without logging. This makes debugging transcript retrieval issues difficult in production. Based on the PR objectives emphasizing "actionable Docker logs," this function should log meaningful errors.
Additionally, the broad except Exception doesn't distinguish between different failure modes (e.g., no captions available vs. network errors vs. API issues).
Consider adding logging for better observability

  def get_youtube_transcript(youtube_id):
      """
      Fetch YouTube transcript as plain text (first ~3000 chars).
      Compatible with the current YouTubeTranscriptApi version.
      """
      try:
+         logger.info(f"Fetching transcript for YouTube video: {youtube_id}")
          api = YouTubeTranscriptApi()
          transcript_list = api.list(youtube_id)  # Returns TranscriptList
          transcript_list = list(transcript_list)  # Convert to [Transcript, ...]
          # Each Transcript object has .fetch() which returns FetchedTranscriptSnippet iterable
          snippets = []
          for transcript in transcript_list:
              for snippet in transcript.fetch():
                  snippets.append(snippet)
          # FetchedTranscriptSnippet has .text attribute (not ['text'] dict access)
          transcript_text = " ".join(snippet.text for snippet in snippets)
+         logger.info(f"Successfully retrieved transcript for {youtube_id}: {len(transcript_text)} characters")
          return transcript_text[:3000]  # Limit to 3000 chars for efficiency
      except Exception as e:
+         logger.warning(f"Failed to fetch transcript for {youtube_id}: {type(e).__name__}: {str(e)}")
          return None

Note: The PR objectives mention "logging meaningful errors when captions are unavailable or connections close," but this function doesn't distinguish between these cases yet.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (7)
- blt/urls.py (3 hunks)
- website/migrations/0264_educationalvideo.py (1 hunks)
- website/migrations/0265_educationalvideo_ai_summary_and_more.py (1 hunks)
- website/models.py (1 hunks)
- website/templates/education/education.html (1 hunks)
- website/templates/education/video_detail.html (1 hunks)
- website/views/education.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- website/templates/education/education.html
- website/templates/education/video_detail.html
- website/migrations/0264_educationalvideo.py
- blt/urls.py
🧰 Additional context used
🧬 Code graph analysis (2)
website/migrations/0265_educationalvideo_ai_summary_and_more.py (1)
website/migrations/0264_educationalvideo.py (1)
Migration(6-27)
website/views/education.py (1)
website/models.py (8)
- Course (2516-2528)
- EducationalVideo (3654-3677)
- Lecture (2551-2645)
- LectureStatus (2648-2658)
- Section (2531-2541)
- Tag (69-80)
- UserProfile (896-1109)
- VideoQuizQuestion (3680-3692)
🪛 GitHub Check: CodeQL
website/views/education.py
[warning] 371-371: Information exposure through an exception
Stack trace information flows to this location and may be exposed to an external user.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Run pre-commit
- GitHub Check: Run Tests
- GitHub Check: docker-test
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
website/views/education.py (1)
374-377: Remove orphaned code that will cause NameError on module import.
These lines are outside any function definition and reference undefined variables (is_instructor, template), which will cause a NameError when the module is imported, crashing the application on startup.
🔎 Remove this code block

- featured_lectures = Lecture.objects.filter(section__isnull=True)
- courses = Course.objects.all()
- context = {"is_instructor": is_instructor, "featured_lectures": featured_lectures, "courses": courses}
- return render(request, template, context)

This appears to be leftover code from refactoring the education_home view and should be completely removed.
♻️ Duplicate comments (2)
website/views/education.py (2)
148-215: Add logging and timeout to quiz generation.
Similar to generate_ai_summary_and_verify, this function lacks logging and timeout handling.
🔎 Apply these changes

  def generate_quiz_from_transcript(youtube_id, transcript, title):
      """
      Generate 5-10 quiz questions from transcript using OpenAI.
      Returns list of question dicts or empty list on error.
      """
      if not client:
+         logger.warning(f"OpenAI client not initialized, skipping quiz for {youtube_id}")
          return []
      if not openai_api_key:
+         logger.warning(f"OPENAI_API_KEY missing, skipping quiz for {youtube_id}")
          return []
      if not transcript:
+         logger.warning(f"No transcript for quiz generation for {youtube_id}")
          return []
      try:
          response = client.chat.completions.create(
              model="gpt-3.5-turbo",
              messages=[
                  {
                      "role": "system",
                      "content": (
                          "You are a quiz generator expert. Create educational multiple-choice questions "
                          "based on the video content. Respond ONLY with valid JSON array."
                      ),
                  },
                  {
                      "role": "user",
                      "content": (
                          f"Create 5 multiple-choice questions from this video on '{title}':\n\n{transcript}\n\n"
                          "Respond ONLY with valid JSON array in this exact format:\n"
                          "[\n"
                          "  {\n"
                          '    "question": "What is...?",\n'
                          '    "option_a": "Answer A",\n'
                          '    "option_b": "Answer B",\n'
                          '    "option_c": "Answer C",\n'
                          '    "option_d": "Answer D",\n'
                          '    "correct_answer": "A",\n'
                          '    "explanation": "The correct answer is..."\n'
                          "  }\n"
                          "]"
                      ),
                  },
              ],
              max_tokens=2000,
              temperature=0.7,
+             timeout=30,
          )
          content = response.choices[0].message.content
+         logger.debug(f"Raw quiz response for {youtube_id}: {content[:100]}...")
          # Strip code fences if present (triple backticks)
          content_stripped = content.strip()
          if content_stripped.startswith("```"):
              lines = content_stripped.splitlines()
              if len(lines) > 1:
                  lines = lines[1:]
              if lines and lines[-1].strip().startswith("```"):
                  lines = lines[:-1]
              content_stripped = "\n".join(lines).strip()
          questions = json.loads(content_stripped)
+         logger.info(f"Generated {len(questions)} quiz questions for {youtube_id}")
          return questions[:10]  # Limit to 10 questions
      except json.JSONDecodeError as e:
+         logger.error(f"Quiz JSON decode error for {youtube_id}: {e}")
          return []
      except Exception as e:
+         logger.error(f"Quiz generation error for {youtube_id}: {e}")
          return []
82-145: Add logging and timeout to OpenAI API call.
This function lacks logging and timeout handling, which contradicts the PR objectives to provide "actionable Docker logs" and prevent hanging on slow API responses.
🔎 Apply these changes

  def generate_ai_summary_and_verify(youtube_id, title, transcript):
      """
      Use OpenAI to generate summary and verify if content is educational.
      Returns (summary_text, is_educational_bool).
      """
      if not client:
+         logger.warning("OpenAI client not initialized (OPENAI_API_KEY missing)")
          return None, False
      if not openai_api_key:
+         logger.warning("OPENAI_API_KEY is missing")
          return None, False
      if not transcript:
+         logger.warning(f"No transcript provided for video {youtube_id}")
          return None, False
      try:
          response = client.chat.completions.create(
              model="gpt-3.5-turbo",
              messages=[
                  {
                      "role": "system",
                      "content": (
                          "You are an educational content expert. "
                          "Analyze the video transcript and determine if it's educational/security-related content. "
                          "Respond in JSON format with 'summary' and 'is_educational' fields."
                      ),
                  },
                  {
                      "role": "user",
                      "content": (
                          f"Video Title: {title}\n\nTranscript:\n{transcript}\n\n"
                          "Provide a brief (100-150 word) summary and determine if this is educational security content. "
                          "Respond ONLY with valid JSON using double quotes, like this:\n"
                          '{"summary": "...", "is_educational": true}'
                      ),
                  },
              ],
              max_tokens=500,
-             temperature=0.7,
+             temperature=0.3,
+             timeout=30,
          )
          content = response.choices[0].message.content
+         logger.debug(f"Raw OpenAI response for {youtube_id}: {content[:100]}...")
          # Strip code fences if model wraps JSON in ``` (triple backticks)
          content_stripped = content.strip()
          if content_stripped.startswith("```"):
              lines = content_stripped.splitlines()
              # Remove first line (opening fence)
              if len(lines) > 1:
                  lines = lines[1:]
              # Remove last line if it's a closing fence
              if lines and lines[-1].strip().startswith("```"):
                  lines = lines[:-1]
              content_stripped = "\n".join(lines).strip()
          data = json.loads(content_stripped)
          summary = data.get("summary", "")
          is_educational = data.get("is_educational", False)
+         logger.info(f"Generated summary for {youtube_id}: is_educational={is_educational}")
          return summary, is_educational
      except json.JSONDecodeError as e:
+         logger.error(f"JSON decode error for {youtube_id}: {e}")
+         logger.debug(f"Content that failed to parse: {content_stripped[:200]}")
          return None, False
      except Exception as e:
+         logger.error(f"OpenAI API error for {youtube_id}: {e}")
          return None, False
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (1)
website/views/education.py(5 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
website/views/education.py (1)
website/models.py (3)
- EducationalVideo (3654-3677)
- QuizAttempt (3695-3707)
- VideoQuizQuestion (3680-3692)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Run Tests
- GitHub Check: docker-test
🔇 Additional comments (1)
website/views/education.py (1)
322-372: LGTM! Exception handling is correct.
The submit_quiz function properly uses logger.exception() for server-side logging and returns a generic error message to clients, preventing information exposure.
| def get_youtube_transcript(youtube_id): | ||
| """ | ||
| Fetch YouTube transcript as plain text (first ~3000 chars). | ||
| Compatible with the current YouTubeTranscriptApi version. | ||
| """ | ||
| try: | ||
| api = YouTubeTranscriptApi() | ||
| transcript_list = api.list(youtube_id) # Returns TranscriptList | ||
| transcript_list = list(transcript_list) # Convert to [Transcript, ...] | ||
|
|
||
| # Each Transcript object has .fetch() which returns FetchedTranscriptSnippet iterable | ||
| snippets = [] | ||
| for transcript in transcript_list: | ||
| for snippet in transcript.fetch(): | ||
| snippets.append(snippet) | ||
|
|
||
| # FetchedTranscriptSnippet has .text attribute (not ['text'] dict access) | ||
| transcript_text = " ".join(snippet.text for snippet in snippets) | ||
|
|
||
| return transcript_text[:3000] # Limit to 3000 chars for efficiency | ||
|
|
||
| except Exception as e: | ||
| return None |
Add logging for transcript fetch operations.
The youtube-transcript-api documentation shows that FetchedTranscriptSnippet objects have a .text attribute that can be accessed directly, so the current implementation at line 74 is correct. However, the function currently has no logging despite the PR objectives requiring "logging meaningful errors when captions are unavailable."
Following Django best practices, add a logger at module level using import logging and logger = logging.getLogger(__name__). Add logger.info() calls to track transcript fetch operations and logger.error() when exceptions occur, replacing the silent return None.
Additionally, consider catching specific exceptions like TranscriptsDisabled and NoTranscriptFound instead of the generic Exception to provide more meaningful error context.
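The module-level logger pattern suggested here looks roughly like the following sketch. The wrapper name `fetch_transcript_safely` and the injected `fetcher` callable are illustrative stand-ins for the real `YouTubeTranscriptApi` call, not the project's actual code:

```python
import logging

logger = logging.getLogger(__name__)

def fetch_transcript_safely(youtube_id, fetcher):
    """Run fetcher(youtube_id), logging failures instead of failing silently.

    Returning None preserves the caller's "no transcript available" contract,
    but the reason for the failure now lands in the logs.
    """
    try:
        logger.info("Fetching transcript for YouTube video: %s", youtube_id)
        text = fetcher(youtube_id)
        logger.info("Retrieved transcript for %s: %d characters", youtube_id, len(text))
        return text[:3000]  # same 3000-char cap as the view helper
    except Exception as exc:
        logger.warning("Failed to fetch transcript for %s: %s: %s",
                       youtube_id, type(exc).__name__, exc)
        return None

def broken_fetcher(vid):
    raise ConnectionError("connection closed")

print(fetch_transcript_safely("abc123", lambda vid: "word " * 10))
print(fetch_transcript_safely("abc123", broken_fetcher))  # logs a warning, returns None
```

In the real view, catching `TranscriptsDisabled` and `NoTranscriptFound` separately (as the comment suggests) would let the warning message name the exact failure mode.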
| # Handle YouTube video submission with AI processing | ||
| if request.method == "POST": | ||
| youtube_url = request.POST.get("youtube_url", "").strip() | ||
| title = request.POST.get("video_title", "").strip() | ||
| description = request.POST.get("video_description", "").strip() | ||
|
|
||
| if youtube_url and title: | ||
| try: | ||
| # Extract video ID from URL | ||
| match = re.search(r"(?:youtube\.com\/watch\?v=|youtu\.be\/)([^&\n?#]+)", youtube_url) | ||
| if not match: | ||
| messages.error(request, "Invalid YouTube URL format.") | ||
| return redirect("education") | ||
|
|
||
| youtube_id = match.group(1) | ||
|
|
||
| # Step 1: Get transcript | ||
|
|
||
| transcript = get_youtube_transcript(youtube_id) | ||
|
|
||
| # Step 2: Generate summary and educational verification | ||
|
|
||
| summary, is_verified = generate_ai_summary_and_verify(youtube_id, title, transcript) | ||
|
|
||
| # Step 3: Create video record | ||
| video = EducationalVideo.objects.create( | ||
| title=title, | ||
| youtube_url=youtube_url, | ||
| youtube_id=youtube_id, | ||
| description=description, | ||
| ai_summary=summary or "", | ||
| is_verified=is_verified, | ||
| ) | ||
|
|
||
| # Step 4: Generate quiz questions (only if transcript exists) | ||
| if transcript: | ||
| quiz_questions = generate_quiz_from_transcript(youtube_id, transcript, title) | ||
|
|
||
| for q_data in quiz_questions: | ||
| try: | ||
| VideoQuizQuestion.objects.create( | ||
| video=video, | ||
| question=q_data.get("question", ""), | ||
| option_a=q_data.get("option_a", ""), | ||
| option_b=q_data.get("option_b", ""), | ||
| option_c=q_data.get("option_c", ""), | ||
| option_d=q_data.get("option_d", ""), | ||
| correct_answer=q_data.get("correct_answer", "A"), | ||
| explanation=q_data.get("explanation", ""), | ||
| ) | ||
|
|
||
| except Exception as q_err: | ||
| logger.warning("DEBUG: error creating quiz question: %s", q_err) | ||
| else: | ||
| logger.info("DEBUG: skipping quiz generation, no transcript for %s", youtube_id) | ||
| messages.success(request, "Video added successfully with AI-generated content!") | ||
| return redirect("education") | ||
|
|
||
| except Exception as e: | ||
| import traceback | ||
|
|
||
| traceback.print_exc() | ||
| messages.error(request, f"Error processing video: {str(e)}") | ||
| return redirect("education") | ||
| else: | ||
| messages.error(request, "Please provide both title and YouTube URL.") | ||
|
|
Fix exception exposure and remove debug artifacts.
The PR commit message states "Avoid exposing exception details in education view," but line 297 still exposes str(e) to users via messages.error(). Additionally, lines 294-296 use traceback.print_exc() instead of proper logging.
🔎 Apply this fix
except Exception as e:
- import traceback
-
- traceback.print_exc()
- messages.error(request, f"Error processing video: {str(e)}")
+ logger.exception(f"Error processing video submission: title={title}, url={youtube_url}")
+ messages.error(request, "Error processing video. Please try again later.")
return redirect("education")
else:
messages.error(request, "Please provide both title and YouTube URL.")Also fix the logging calls that use "DEBUG:" prefix:
except Exception as q_err:
- logger.warning("DEBUG: error creating quiz question: %s", q_err)
+ logger.error(f"Error creating quiz question for video {video.id}: {q_err}")
else:
- logger.info("DEBUG: skipping quiz generation, no transcript for %s", youtube_id)
+ logger.info(f"Skipping quiz generation for {youtube_id}: no transcript available")🤖 Prompt for AI Agents
In website/views/education.py around lines 235 to 301, the except block prints
the full traceback and exposes exception text to users; replace
traceback.print_exc() and messages.error(... str(e) ...) with a generic
user-facing error message (e.g., "An error occurred while processing the
video.") and log the full exception server-side using logger.exception(...) or
logger.error(..., exc_info=True) so the stacktrace remains in logs only. Also
remove the "DEBUG:" literal prefix from logger.warning and logger.info calls and
use clear contextual messages (e.g., logger.warning("error creating quiz
question: %s", q_err) and logger.info("skipping quiz generation, no transcript
for %s", youtube_id)).
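The URL-parsing step shown in the diff above (the `re.search` over `watch?v=` and `youtu.be/` URLs) can be exercised in isolation. This is a sketch using the same pattern; the helper name is not from the PR:

```python
import re

# Same pattern as the view's re.search call (Python does not need the escaped slashes)
YOUTUBE_ID_RE = re.compile(r"(?:youtube\.com/watch\?v=|youtu\.be/)([^&\n?#]+)")

def extract_youtube_id(url: str):
    """Return the video ID from a watch/short URL, or None if unrecognized."""
    match = YOUTUBE_ID_RE.search(url)
    return match.group(1) if match else None

print(extract_youtube_id("https://www.youtube.com/watch?v=dQw4w9WgXcQ"))  # → dQw4w9WgXcQ
print(extract_youtube_id("https://youtu.be/dQw4w9WgXcQ?t=42"))            # → dQw4w9WgXcQ
print(extract_youtube_id("https://example.com/video"))                    # → None
```

The character class `[^&\n?#]+` is what stops the match at query-string boundaries, so trailing `&list=` or `?t=` parameters are dropped from the captured ID.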
|
@aaditya8979 I cannot find any issue linked to this PR — please link one. If there is no related issue yet, please create one, get it verified by Donnie, and then resolve the PR based on it. Thanks for your effort. |
|
Hi @mdkaifansari04, |
|
Hi @DonnieBLT, |
|
Please resolve the comments by the bots here as well. |
Summary
This PR fixes and hardens the AI‑powered education experience for YouTube videos in the BLT platform. It ensures transcripts are reliably fetched, AI summaries are generated safely, and quizzes are created when possible, while keeping the UI stable even when external APIs fail.
Problem
The existing
/education/ flow had several issues: the installed youtube-transcript-api version caused transcript fetch failures.
1. Reliable transcript fetching
get_youtube_transcript to work with the project’s YouTubeTranscriptApi version: it uses .list(video_id) and .fetch() and builds the transcript from FetchedTranscriptSnippet.text.
Updated
generate_ai_summary_and_verify to:
…
- parse the {"summary": "...", "is_educational": true} response shape safely.
The education page now continues to work even if OpenAI returns errors (e.g., quota exceeded), without breaking the user experience.
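The "parse the JSON shape safely" step can be isolated into a small helper. This is a hedged sketch (the name `parse_model_json` is illustrative, not the PR's exact code): it tolerates replies wrapped in a ``` code fence and degrades to None instead of raising.

```python
import json

def parse_model_json(raw: str):
    """Parse JSON from an LLM reply, tolerating a ``` code-fence wrapper."""
    stripped = raw.strip()
    if stripped.startswith("```"):
        lines = stripped.splitlines()
        lines = lines[1:]  # drop opening fence (may carry a tag like ```json)
        if lines and lines[-1].strip().startswith("```"):
            lines = lines[:-1]  # drop closing fence
        stripped = "\n".join(lines).strip()
    try:
        return json.loads(stripped)
    except json.JSONDecodeError:
        return None  # caller treats None as "no usable AI output"

# Plain JSON and fenced JSON both parse; garbage degrades to None.
print(parse_model_json('{"summary": "x", "is_educational": true}'))
print(parse_model_json('```json\n{"summary": "y", "is_educational": false}\n```'))
print(parse_model_json("not json"))
```

Returning None rather than raising is what lets the education page keep rendering when the model replies with something unparseable.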
3. Robust quiz generation
Updated
generate_quiz_from_transcript to:
VideoQuizQuestion) are only created when valid question data is returned, avoiding half‑broken quiz states.4. Safer
education_homeflowImproved the
/education/POST handler to:youtube_id, transcript presence, and creation of theEducationalVideorecord.5. Development and dependency fixes
Added missing but required dependencies to
requirements.txtso a fresh local/dev environment matches the Docker image:dj-database-urldjango-annoyingdjango-simple-captchayoutube-transcript-apiThis prevents common
ModuleNotFoundErrorissues when contributors run checks locally.How this was tested
Ran the app via Docker and submitted multiple YouTube videos (with and without English subtitles) through the
/education/form.Verified in logs that:
Confirmed that:
/education/ and /education/video/<id>/ render successfully even when YouTube or OpenAI fail.
This PR makes the education feature significantly more reliable and contributor‑friendly:
Summary by CodeRabbit
New Features
Chores
Tests