@Nachiket-Roy Nachiket-Roy commented Dec 17, 2025

Closes #5226

Summary

This PR introduces a stored freshness score for projects and computes it using a Bumper-style activity decay model, based on repository update recency. The goal is to make project freshness queryable, filterable, and consistent across the platform.

Background: The Projects section has a “Freshness” filter that is meant to reflect how active a project's repositories are. However, freshness is not stored in the database, so filtering on it raises a FieldError.
Solution:
This PR fixes the issue by storing freshness on the Project model, so the filter can run against a real database field. Freshness is calculated with the same algorithm as blt-bumper. Each non-archived repository contributes points according to how recently it was updated:
Last update within 7 days → 1.0 point
Last update within 8–30 days → 0.6 points
Last update within 31–90 days → 0.3 points
The raw score is capped at 20 and then normalized to a 0–100 scale, so projects with many active repositories do not outweigh less active ones by a large margin.

The following management command populates freshness for all existing projects:

python manage.py update_project_freshness

Key Changes

  • Added a freshness field to the Project model (indexed, 0–100 scale)
  • Implemented calculate_freshness() using:
    • Repository activity as a proxy for the activity graph
    • Time-decayed weighting (7 / 30 / 90 days)
    • Archived repositories excluded from scoring
  • Added a management command to periodically recalculate freshness for all projects
  • Integrated the command into the existing scheduled runner
  • Exposed freshness via the API and enabled DB-level filtering
  • Updated tests to validate freshness behavior and filtering

Algorithm Notes
The freshness score is calculated using a Bumper-style time-decay activity model. Repositories are grouped into rolling 7-day, 30-day, and 90-day windows based on their most recent activity. Each window contributes with decreasing weights (1.0, 0.6, 0.3 respectively) to reflect diminishing relevance over time. The weighted score is normalized to a 0–100 range with an upper cap to prevent dominance by large projects. This approach mirrors Bumper’s activity graph logic by prioritizing recent activity while naturally discounting stale repositories.
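The scoring described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code in website/models.py: the function name `freshness_score`, the constant names, and the use of inclusive window boundaries are all hypothetical. Only the weights (1.0 / 0.6 / 0.3), the raw cap of 20, and the 0–100 normalization come from the PR description.

```python
from datetime import datetime, timedelta, timezone

# Weights and cap as described in the PR; names are illustrative only.
WEIGHT_7D, WEIGHT_30D, WEIGHT_90D = 1.0, 0.6, 0.3
MAX_RAW_SCORE = 20.0


def freshness_score(updated_timestamps, now):
    """Return a 0-100 score from the update times of non-archived repos."""
    raw = 0.0
    for updated_at in updated_timestamps:
        age = now - updated_at
        # Assumption: boundaries are inclusive (exactly 7 days counts as 7-day).
        if age <= timedelta(days=7):
            raw += WEIGHT_7D
        elif age <= timedelta(days=30):
            raw += WEIGHT_30D
        elif age <= timedelta(days=90):
            raw += WEIGHT_90D
        # Repos older than 90 days contribute nothing.
    # Cap the raw score, then scale it to the 0-100 range.
    normalized = min(raw, MAX_RAW_SCORE) / MAX_RAW_SCORE * 100
    return round(normalized, 2)


now = datetime(2025, 12, 17, tzinfo=timezone.utc)
# Two repos in the 7-day window, three in 8-30 days, one in 31-90 days.
repos = [now - timedelta(days=d) for d in (2, 5, 10, 11, 12, 45)]
print(freshness_score(repos, now))  # 20.5
```

With these inputs the raw score is 2 × 1.0 + 3 × 0.6 + 1 × 0.3 = 4.1, which normalizes to 20.5, the same figure the mixed-activity test in this thread expects.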

Why this approach
  • Enables efficient sorting and filtering by freshness at the database level
  • Keeps scope limited to existing data and infrastructure

Summary by CodeRabbit

  • New Features

    • Projects now store a freshness score (0–100); API supports filtering by minimum freshness.
  • Bug Fixes

    • Freshness filter enforces numeric 0–100 input and returns descriptive 400 errors for invalid values.
  • Chores

    • Added a command to recalculate and persist freshness; invoked by the daily runner.
    • Removed legacy composite filter endpoint.
  • Tests

    • Added unit and integration tests for freshness calculation, API filtering, and the update command.


@github-actions
Contributor

👋 Hi @Nachiket-Roy!

This pull request needs a peer review before it can be merged. Please request a review from a team member who is not:

  • The PR author
  • DonnieBLT
  • coderabbitai
  • copilot

Once a valid peer review is submitted, this check will pass automatically. Thank you!

@github-actions github-actions bot added the files-changed: 7, migrations, and needs-peer-review labels Dec 17, 2025

coderabbitai bot commented Dec 17, 2025

Walkthrough

Adds a stored Project.freshness field and a calculate_freshness() method with a migration, adds a management command that recalculates and persists freshness (invoked from the daily runner), adds API validation and filtering for a freshness query parameter, switches the serializer to expose the stored freshness value, removes the legacy ProjectViewSet.filter action, and updates tests.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Model & Migration: website/models.py, website/migrations/0264_project_freshness.py | Adds freshness DecimalField on Project (max_digits=5, decimal_places=2, default=0.0, db_index=True) and Project.calculate_freshness() computing weighted activity across 7/30/90-day windows, normalizing to 0–100, capping and rounding to 2 decimals. Migration adds the new field. |
| Management Commands: website/management/commands/update_project_freshness.py, website/management/commands/run_daily.py | New update_project_freshness command batches project IDs (500), locks each row, calls calculate_freshness(), saves freshness with update_fields=["freshness"], logs per-project errors and progress, and prints totals and elapsed time. run_daily.py now invokes this command and logs exceptions. |
| API & Serializer: website/api/views.py, website/serializers.py | ProjectViewSet.list now parses/validates freshness query param (float, 0–100 inclusive), returns 400 on invalid input, and applies freshness__gte filtering when provided. Removed legacy ProjectViewSet.filter action. ProjectSerializer.freshness changed from SerializerMethodField to DecimalField(..., read_only=True) and get_freshness() removed. |
| Tests: website/tests/test_project_aggregation.py, website/tests/test_api.py, website/tests/test_project_freshness.py, website/tests/test_update_project_freshness_command.py | Tests adjusted/added: integration test for calculate_freshness persistence, unit tests for freshness calculation edge cases and caps, API tests for freshness filtering (valid/invalid/decimal/combined filters), and comprehensive tests for the management command including per-project error handling and timing. Removed mocking of serializer freshness. |

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Client
    participant API as Django API (ProjectViewSet)
    participant DB as Database (Project, Repo)
    Note over API,DB: API request flow for filtering projects by stored freshness
    Client->>API: GET /api/v1/projects/?freshness=50
    API->>API: parse & validate freshness param (float, 0–100)
    alt invalid
        API-->>Client: 400 Bad Request (error message)
    else valid
        API->>DB: Query Projects with freshness__gte=50 (plus other filters)
        DB-->>API: matching Project rows (includes stored freshness)
        API-->>Client: 200 OK (serialized projects with read-only freshness)
    end
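The validation step in the diagram above can be sketched as a small standalone function. This is an illustration only: the name `parse_freshness_param`, the error wording, and the explicit rejection of NaN and infinity are assumptions, not details confirmed by website/api/views.py. Only the float parsing and the inclusive 0–100 range come from the review summary.

```python
import math

ERROR_MESSAGE = "freshness must be a number between 0 and 100"  # hypothetical wording


def parse_freshness_param(raw_value):
    """Return (threshold, error). Both are None when the parameter is absent."""
    if raw_value is None:
        return None, None  # parameter not supplied, so no filtering is applied
    try:
        value = float(raw_value)
    except (TypeError, ValueError):
        return None, ERROR_MESSAGE
    # float() accepts "nan" and "inf"; rejecting them here is an assumption.
    if not math.isfinite(value) or not 0 <= value <= 100:
        return None, ERROR_MESSAGE
    return value, None


print(parse_freshness_param("50"))   # (50.0, None)
print(parse_freshness_param("abc"))  # (None, 'freshness must be a number between 0 and 100')
```

In a view, a non-None error would map to the 400 response and a non-None threshold to the freshness__gte filter.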
sequenceDiagram
    autonumber
    participant Scheduler
    participant Cmd as update_project_freshness
    participant DB as Database (Project, Repo)
    participant Logger
    Note over Cmd,DB: Daily freshness update batch flow
    Scheduler->>Cmd: call_command("update_project_freshness")
    Cmd->>DB: fetch Project ids (batched)
    loop per project in batch
        Cmd->>DB: load project with row lock (select_for_update)
        Cmd->>Cmd: freshness = project.calculate_freshness()
        alt success
            Cmd->>DB: save(project, update_fields=["freshness"])
        else error
            Cmd->>Logger: log exception for project id
        end
    end
    Cmd-->>Scheduler: print summary (processed, errors, elapsed time)
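The batch flow above (fetch IDs in batches, process each project, isolate failures, report totals) can be sketched without Django. The helper names `chunked` and `recalculate_all` are hypothetical, and the row lock and database save are deliberately left out. Only the batch size of 500 and the per-project error handling come from the walkthrough.

```python
import logging

logger = logging.getLogger(__name__)


def chunked(ids, size):
    """Yield consecutive slices of ids with at most `size` items each."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def recalculate_all(project_ids, recalculate, batch_size=500):
    """Call recalculate(project_id) for every ID; return (processed, errors)."""
    processed = errors = 0
    for batch in chunked(project_ids, batch_size):
        for project_id in batch:
            try:
                recalculate(project_id)
                processed += 1
            except Exception:
                # One failing project must not abort the rest of the run.
                logger.exception("Freshness update failed for project %s", project_id)
                errors += 1
    return processed, errors


def fake_recalculate(project_id):
    # Stand-in for the real per-project work; fails for one ID on purpose.
    if project_id == 3:
        raise ValueError("Test error")


print(recalculate_all([1, 2, 3, 4], fake_recalculate, batch_size=2))  # (3, 1)
```

Counting successes and failures separately is what lets the summary line report both numbers, as the command tests in this thread expect.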

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Review focus:
    • website/models.py: time-window queries, weight constants, normalization, Decimal vs float rounding and DB field interactions.
    • website/management/commands/update_project_freshness.py: batching, transactions/locks (select_for_update), error handling, and potential N+1 or performance considerations.
    • website/api/views.py: freshness param parsing/validation messages and ensuring removal of legacy filter endpoint is safe.
    • Tests: time-dependent setups, deterministic expectations for rounding/capping, and command output assertions.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 61.76% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly summarizes the main change: adding a database-backed project freshness score feature, directly matching the PR's primary objective. |

@Nachiket-Roy Nachiket-Roy marked this pull request as draft December 17, 2025 13:04
@github-actions
Contributor

📊 Monthly Leaderboard

Hi @Nachiket-Roy! Here's how you rank for December 2025:

| Rank | User | PRs | Reviews | Comments | Total |
| --- | --- | --- | --- | --- | --- |
| 🥇 #1 | @Nachiket-Roy | 17 | 24 | 54 | 374 |
| #2 | @DonnieBLT | 9 | 25 | 29 | 248 |

Leaderboard based on contributions in December 2025. Keep up the great work! 🚀

@github-actions github-actions bot added the pre-commit: passed label Dec 17, 2025
@github-actions github-actions bot added the tests: passed label Dec 17, 2025
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between cd7312d and bbe2c76.

📒 Files selected for processing (7)
  • website/api/views.py (1 hunks)
  • website/management/commands/run_daily.py (1 hunks)
  • website/management/commands/update_project_freshness.py (1 hunks)
  • website/migrations/0264_project_freshness.py (1 hunks)
  • website/models.py (1 hunks)
  • website/serializers.py (1 hunks)
  • website/tests/test_project_aggregation.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
website/models.py (1)
website/api/views.py (2)
  • filter (375-414)
  • filter (844-922)
website/management/commands/update_project_freshness.py (1)
website/models.py (12)
  • Project (1366-1477)
  • calculate_freshness (1401-1440)
  • save (74-77)
  • save (277-289)
  • save (1442-1469)
  • save (1577-1580)
  • save (1770-1773)
  • save (1888-1902)
  • save (1993-2015)
  • save (2814-2817)
  • save (3346-3356)
  • save (3654-3663)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Run Tests
  • GitHub Check: docker-test
🔇 Additional comments (6)
website/management/commands/run_daily.py (1)

48-51: New daily freshness task wiring looks consistent

The new update_project_freshness invocation matches the existing error-handling pattern for other daily commands and won’t break the overall job on failures. No changes needed.

website/api/views.py (1)

856-869: Freshness filter validation and semantics are sound

Range checking, type validation, and freshness__gte behavior all look correct and consistent with existing stars/forks filters.
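To make the freshness__gte semantics concrete, here is a small in-memory sketch that uses the four fixture scores from the API tests proposed later in this thread. The helper `projects_at_or_above` is hypothetical and only mimics the "greater than or equal" comparison. It does not touch the Django ORM.

```python
# Scores copied from the API test fixtures in this thread.
PROJECT_FRESHNESS = {
    "High Freshness": 85.50,
    "Medium Freshness": 50.25,
    "Low Freshness": 15.75,
    "Zero Freshness": 0.0,
}


def projects_at_or_above(threshold):
    """Mimic freshness__gte: keep projects whose score is >= threshold."""
    return [name for name, score in PROJECT_FRESHNESS.items() if score >= threshold]


for threshold in (0, 50, 80, 100):
    print(threshold, projects_at_or_above(threshold))
```

A threshold of 0 keeps all four projects, 50 keeps two, 80 keeps one, and 100 keeps none, which are the counts the API tests assert.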

website/tests/test_project_aggregation.py (1)

16-24: Test setup now correctly relies on real freshness behavior

Keeping only the prefetch_related patch and dropping any freshness mocking is appropriate given the new DB-backed field. The updated comment still accurately describes the remaining patch.

website/serializers.py (1)

129-141: Serializer exposure of freshness matches the model and requirements

Using a read-only DecimalField(max_digits=5, decimal_places=2) on ProjectSerializer cleanly reflects the stored freshness value while preventing client writes. Looks good.
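As a side note on the field shape, a float score can be converted to a two-decimal Decimal before it is stored. This is a sketch under assumptions: the helper `to_stored_freshness` and the choice of ROUND_HALF_UP are illustrative and are not taken from the PR. It shows that the maximum score of 100.00 uses exactly five digits, which fits max_digits=5 with decimal_places=2.

```python
from decimal import Decimal, ROUND_HALF_UP


def to_stored_freshness(score):
    """Convert a float score to a Decimal with exactly two decimal places."""
    # Going through str() avoids carrying the binary float expansion
    # into the Decimal value.
    return Decimal(str(score)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(to_stored_freshness(20.5))   # 20.50
print(to_stored_freshness(100.0))  # 100.00
```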

website/migrations/0264_project_freshness.py (1)

1-17: Migration correctly adds indexed freshness field

The migration cleanly adds Project.freshness with the expected DecimalField shape and index, aligned with the model definition and PR intent.

website/management/commands/update_project_freshness.py (1)

1-40: Freshness recalculation command is correct and robust

The command correctly:

  • Iterates all projects,
  • Uses project.calculate_freshness(),
  • Saves only the freshness field,
  • Isolates per-project failures while continuing,
  • Reports progress and timing.

This is a solid implementation for a daily batch job.

@github-project-automation github-project-automation bot moved this from Backlog to Ready in 📌 OWASP BLT Project Board Dec 17, 2025
@Nachiket-Roy
Contributor Author

@coderabbitai help add tests for the changes made in this pr

coderabbitai bot commented Dec 17, 2025

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check the test directory structure
fd -t f "test.*\.py$" website/tests/ | head -20

# Look at existing test patterns for models
cat website/tests/test_models.py 2>/dev/null || echo "test_models.py not found"

# Check for API test files
fd -t f "test.*api.*\.py$" website/tests/ || fd -t f "test.*view.*\.py$" website/tests/

# Look at the actual calculate_freshness implementation
ast-grep --pattern $'def calculate_freshness($$$) {
  $$$
}'

# Check management command test patterns if they exist
fd -t f "test.*command.*\.py$" website/tests/

Length of output: 826


🏁 Script executed:

#!/bin/bash
# Get the calculate_freshness implementation from models.py
rg -A 40 "def calculate_freshness" website/models.py

# Check the management command implementation
cat website/management/commands/update_project_freshness.py

# Look at test_api.py to understand API test patterns
head -100 website/tests/test_api.py

# Check how Project model tests are structured
rg -C 5 "class.*Project.*Test" website/tests/

Length of output: 10233


🏁 Script executed:

#!/bin/bash
# Check the API filtering implementation for freshness
rg -B 5 -A 15 "def filter" website/api/views.py | head -80

# Understand Project and Repo relationship
rg -C 3 "class Project" website/models.py | head -50

# Check Repo model for updated_at field
rg -C 5 "class Repo" website/models.py | head -100

# Look at more API test examples
rg -A 20 "class.*Test.*API" website/tests/test_api.py | head -100

# Check test_github_commands to understand command testing patterns
head -80 website/tests/test_github_commands.py

Length of output: 7006


@Nachiket-Roy: I'll help you create comprehensive tests for the freshness scoring feature. Here are test files covering all the changes:

1. Model Tests - website/tests/test_project_freshness.py

"""
Tests for Project freshness calculation functionality.
"""
from datetime import timedelta

from django.test import TestCase
from django.utils import timezone

from website.models import Organization, Project, Repo


class ProjectFreshnessCalculationTestCase(TestCase):
    """Test cases for Project.calculate_freshness() method"""

    def setUp(self):
        """Set up test data"""
        self.org = Organization.objects.create(
            name="Test Organization",
            url="https://test.org"
        )
        self.project = Project.objects.create(
            name="Test Project",
            organization=self.org,
            url="https://github.com/test/project"
        )
        self.now = timezone.now()

    def test_freshness_no_repos(self):
        """Test freshness calculation when project has no repositories"""
        freshness = self.project.calculate_freshness()
        self.assertEqual(freshness, 0.0)

    def test_freshness_all_archived_repos(self):
        """Test freshness when all repos are archived"""
        Repo.objects.create(
            project=self.project,
            name="archived-repo",
            repo_url="https://github.com/test/archived",
            is_archived=True,
            updated_at=self.now - timedelta(days=1)
        )
        freshness = self.project.calculate_freshness()
        self.assertEqual(freshness, 0.0)

    def test_freshness_single_repo_7_days(self):
        """Test freshness with one repo active in last 7 days"""
        Repo.objects.create(
            project=self.project,
            name="active-repo",
            repo_url="https://github.com/test/active",
            is_archived=False,
            updated_at=self.now - timedelta(days=3)
        )
        freshness = self.project.calculate_freshness()
        # 1 repo * 1.0 weight = 1.0, normalized: (1/20)*100 = 5.0
        self.assertEqual(freshness, 5.0)

    def test_freshness_single_repo_30_days(self):
        """Test freshness with one repo active in 8-30 day window"""
        Repo.objects.create(
            project=self.project,
            name="recent-repo",
            repo_url="https://github.com/test/recent",
            is_archived=False,
            updated_at=self.now - timedelta(days=15)
        )
        freshness = self.project.calculate_freshness()
        # 1 repo * 0.6 weight = 0.6, normalized: (0.6/20)*100 = 3.0
        self.assertEqual(freshness, 3.0)

    def test_freshness_single_repo_90_days(self):
        """Test freshness with one repo active in 31-90 day window"""
        Repo.objects.create(
            project=self.project,
            name="older-repo",
            repo_url="https://github.com/test/older",
            is_archived=False,
            updated_at=self.now - timedelta(days=60)
        )
        freshness = self.project.calculate_freshness()
        # 1 repo * 0.3 weight = 0.3, normalized: (0.3/20)*100 = 1.5
        self.assertEqual(freshness, 1.5)

    def test_freshness_mixed_activity(self):
        """Test freshness with repos in different time windows"""
        # 2 repos in last 7 days
        Repo.objects.create(
            project=self.project,
            name="very-active-1",
            repo_url="https://github.com/test/active1",
            is_archived=False,
            updated_at=self.now - timedelta(days=2)
        )
        Repo.objects.create(
            project=self.project,
            name="very-active-2",
            repo_url="https://github.com/test/active2",
            is_archived=False,
            updated_at=self.now - timedelta(days=5)
        )
        # 3 repos in 8-30 day window
        for i in range(3):
            Repo.objects.create(
                project=self.project,
                name=f"recent-{i}",
                repo_url=f"https://github.com/test/recent{i}",
                is_archived=False,
                updated_at=self.now - timedelta(days=10 + i)
            )
        # 1 repo in 31-90 day window
        Repo.objects.create(
            project=self.project,
            name="older",
            repo_url="https://github.com/test/older",
            is_archived=False,
            updated_at=self.now - timedelta(days=45)
        )
        
        freshness = self.project.calculate_freshness()
        # raw_score = 2*1.0 + 3*0.6 + 1*0.3 = 2 + 1.8 + 0.3 = 4.1
        # normalized: (4.1/20)*100 = 20.5
        self.assertEqual(freshness, 20.5)

    def test_freshness_no_recent_activity(self):
        """Test freshness when all repos are inactive (>90 days)"""
        Repo.objects.create(
            project=self.project,
            name="stale-repo",
            repo_url="https://github.com/test/stale",
            is_archived=False,
            updated_at=self.now - timedelta(days=120)
        )
        freshness = self.project.calculate_freshness()
        self.assertEqual(freshness, 0.0)

    def test_freshness_max_score_cap(self):
        """Test that freshness is capped at 100"""
        # Create 25 repos active in last 7 days (25*1.0 = 25 > MAX_SCORE)
        for i in range(25):
            Repo.objects.create(
                project=self.project,
                name=f"active-repo-{i}",
                repo_url=f"https://github.com/test/repo{i}",
                is_archived=False,
                updated_at=self.now - timedelta(days=1)
            )
        
        freshness = self.project.calculate_freshness()
        # raw_score = 25, normalized: (25/20)*100 = 125, capped at 100
        self.assertEqual(freshness, 100.0)

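    def test_freshness_exact_boundaries(self):
        """Test repos exactly at time boundaries"""
        # Note: these timestamps are offsets from self.now captured in setUp();
        # if calculate_freshness() reads the clock itself, each repo is
        # marginally older than its boundary by the time the score is computed.
        # Repo exactly 7 days old (should be in 7-day window)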
        Repo.objects.create(
            project=self.project,
            name="boundary-7",
            repo_url="https://github.com/test/boundary7",
            is_archived=False,
            updated_at=self.now - timedelta(days=7)
        )
        # Repo exactly 30 days old (should be in 30-day window)
        Repo.objects.create(
            project=self.project,
            name="boundary-30",
            repo_url="https://github.com/test/boundary30",
            is_archived=False,
            updated_at=self.now - timedelta(days=30)
        )
        # Repo exactly 90 days old (should be in 90-day window)
        Repo.objects.create(
            project=self.project,
            name="boundary-90",
            repo_url="https://github.com/test/boundary90",
            is_archived=False,
            updated_at=self.now - timedelta(days=90)
        )
        
        freshness = self.project.calculate_freshness()
        # All three should be counted: 1*1.0 + 1*0.6 + 1*0.3 = 1.9
        # normalized: (1.9/20)*100 = 9.5
        self.assertEqual(freshness, 9.5)

    def test_freshness_mixed_archived_and_active(self):
        """Test that archived repos are excluded from calculation"""
        # Active repo
        Repo.objects.create(
            project=self.project,
            name="active",
            repo_url="https://github.com/test/active",
            is_archived=False,
            updated_at=self.now - timedelta(days=2)
        )
        # Archived repo (should be ignored)
        Repo.objects.create(
            project=self.project,
            name="archived",
            repo_url="https://github.com/test/archived",
            is_archived=True,
            updated_at=self.now - timedelta(days=1)
        )
        
        freshness = self.project.calculate_freshness()
        # Only 1 active repo counted: 1*1.0 = 1.0, (1/20)*100 = 5.0
        self.assertEqual(freshness, 5.0)

    def test_freshness_rounding(self):
        """Test that freshness is returned as a float rounded to 2 decimal places"""
        # 7 repos in the 7-day window give a raw score of 7
        for i in range(7):
            Repo.objects.create(
                project=self.project,
                name=f"repo-{i}",
                repo_url=f"https://github.com/test/repo{i}",
                is_archived=False,
                updated_at=self.now - timedelta(days=1)
            )

        freshness = self.project.calculate_freshness()
        # raw_score = 7, normalized: (7/20)*100 = 35.0
        self.assertEqual(freshness, 35.0)
        # Verify it's a float with at most 2 decimal places
        self.assertIsInstance(freshness, float)
        self.assertEqual(freshness, round(freshness, 2))

    def test_freshness_persistence(self):
        """Test that freshness can be saved to the database"""
        Repo.objects.create(
            project=self.project,
            name="test-repo",
            repo_url="https://github.com/test/repo",
            is_archived=False,
            updated_at=self.now - timedelta(days=5)
        )
        
        freshness = self.project.calculate_freshness()
        self.project.freshness = freshness
        self.project.save()
        
        # Reload from database
        reloaded = Project.objects.get(id=self.project.id)
        self.assertEqual(float(reloaded.freshness), freshness)
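
One caveat for the boundary test above: the fixtures are built from a timestamp captured in setUp(), while the score may be computed against a later clock reading. The sketch below uses fixed datetimes and a hypothetical `in_window` helper with an inclusive comparison. Whether calculate_freshness() reads the clock itself, and whether its comparisons are inclusive, is not shown in this thread.

```python
from datetime import datetime, timedelta, timezone


def in_window(updated_at, now, days):
    """Return True if updated_at is at most `days` days before now."""
    return now - updated_at <= timedelta(days=days)


setup_now = datetime(2025, 12, 17, 12, 0, 0, tzinfo=timezone.utc)
updated_at = setup_now - timedelta(days=7)

# Evaluated at the same instant, the repo sits exactly on the boundary.
print(in_window(updated_at, setup_now, 7))  # True

# Evaluated a few milliseconds later, it has already left the 7-day window.
later_now = setup_now + timedelta(milliseconds=5)
print(in_window(updated_at, later_now, 7))  # False
```

If the real method does read the clock, pinning the time during the test (for example by patching django.utils.timezone.now) or moving fixtures slightly inside each window would make the expected value deterministic.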

2. Management Command Tests - website/tests/test_update_project_freshness_command.py

"""
Tests for the update_project_freshness management command.
"""
from datetime import timedelta
from io import StringIO
from unittest.mock import patch

from django.core.management import call_command
from django.test import TestCase
from django.utils import timezone

from website.models import Organization, Project, Repo


class UpdateProjectFreshnessCommandTestCase(TestCase):
    """Test cases for update_project_freshness management command"""

    def setUp(self):
        """Set up test data"""
        self.org = Organization.objects.create(
            name="Test Org",
            url="https://test.org"
        )
        self.now = timezone.now()

    def test_command_updates_all_projects(self):
        """Test that command updates freshness for all projects"""
        # Create projects with different activity levels
        project1 = Project.objects.create(
            name="Active Project",
            organization=self.org,
            url="https://github.com/test/active"
        )
        Repo.objects.create(
            project=project1,
            name="active-repo",
            repo_url="https://github.com/test/active-repo",
            is_archived=False,
            updated_at=self.now - timedelta(days=2)
        )

        project2 = Project.objects.create(
            name="Inactive Project",
            organization=self.org,
            url="https://github.com/test/inactive"
        )
        Repo.objects.create(
            project=project2,
            name="old-repo",
            repo_url="https://github.com/test/old-repo",
            is_archived=False,
            updated_at=self.now - timedelta(days=100)
        )

        project3 = Project.objects.create(
            name="No Repos Project",
            organization=self.org,
            url="https://github.com/test/empty"
        )

        # Run command
        out = StringIO()
        call_command('update_project_freshness', stdout=out)

        # Verify all projects were updated
        project1.refresh_from_db()
        project2.refresh_from_db()
        project3.refresh_from_db()

        self.assertEqual(float(project1.freshness), 5.0)  # 1 repo * 1.0 weight
        self.assertEqual(float(project2.freshness), 0.0)  # No recent activity
        self.assertEqual(float(project3.freshness), 0.0)  # No repos

        # Check output
        output = out.getvalue()
        self.assertIn('Starting freshness update', output)
        self.assertIn('Processed: 3', output)
        self.assertIn('Errors: 0', output)
        self.assertIn('Freshness update completed', output)

    def test_command_progress_reporting(self):
        """Test that command reports progress every 100 projects"""
        # Create 250 projects
        for i in range(250):
            Project.objects.create(
                name=f"Project {i}",
                organization=self.org,
                url=f"https://github.com/test/project{i}"
            )

        out = StringIO()
        call_command('update_project_freshness', stdout=out)

        output = out.getvalue()
        # Should see progress at 100 and 200
        self.assertIn('Processed 100/250', output)
        self.assertIn('Processed 200/250', output)

    def test_command_handles_errors_gracefully(self):
        """Test that command handles individual project errors without stopping"""
        project1 = Project.objects.create(
            name="Good Project",
            organization=self.org,
            url="https://github.com/test/good"
        )
        Repo.objects.create(
            project=project1,
            name="good-repo",
            repo_url="https://github.com/test/good-repo",
            is_archived=False,
            updated_at=self.now - timedelta(days=5)
        )

        project2 = Project.objects.create(
            name="Error Project",
            organization=self.org,
            url="https://github.com/test/error"
        )

        out = StringIO()
        err = StringIO()

        # Mock calculate_freshness to raise error for one project
        original_calculate = Project.calculate_freshness
        def mock_calculate(self):
            if self.name == "Error Project":
                raise ValueError("Test error")
            return original_calculate(self)

        with patch.object(Project, 'calculate_freshness', mock_calculate):
            call_command('update_project_freshness', stdout=out, stderr=err)

        # Check that good project was updated
        project1.refresh_from_db()
        self.assertEqual(float(project1.freshness), 5.0)

        # Check error was logged
        error_output = err.getvalue()
        self.assertIn(f'[ERROR] Project ID {project2.id}', error_output)
        self.assertIn('Test error', error_output)

        # Check summary shows 1 error
        output = out.getvalue()
        self.assertIn('Processed: 1', output)
        self.assertIn('Errors: 1', output)

    def test_command_only_updates_freshness_field(self):
        """Test that command only updates the freshness field, not other fields"""
        project = Project.objects.create(
            name="Test Project",
            organization=self.org,
            url="https://github.com/test/project"
        )
        Repo.objects.create(
            project=project,
            name="repo",
            repo_url="https://github.com/test/repo",
            is_archived=False,
            updated_at=self.now - timedelta(days=3)
        )

        # Mock save to track what fields are updated
        with patch.object(Project, 'save') as mock_save:
            call_command('update_project_freshness', stdout=StringIO())
            
            # Verify save was called with update_fields=['freshness']
            self.assertTrue(mock_save.called)
            call_args = mock_save.call_args
            self.assertEqual(call_args[1].get('update_fields'), ['freshness'])

    def test_command_execution_time_reported(self):
        """Test that command reports execution time"""
        Project.objects.create(
            name="Test Project",
            organization=self.org,
            url="https://github.com/test/project"
        )

        out = StringIO()
        call_command('update_project_freshness', stdout=out)

        output = out.getvalue()
        self.assertIn('Execution time:', output)
        # Weak check: nearly any output contains an 's'. Match the unit next to
        # the value once the exact output format is confirmed.
        self.assertIn('s', output)

    def test_command_with_zero_projects(self):
        """Test command behavior when there are no projects"""
        out = StringIO()
        call_command('update_project_freshness', stdout=out)

        output = out.getvalue()
        self.assertIn('Starting freshness update for 0 projects', output)
        self.assertIn('Processed: 0', output)
        self.assertIn('Errors: 0', output)
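
The error-handling test above relies on replacing a method so that it fails for one object and delegates to the original for the rest. The same pattern can be shown in isolation with unittest.mock.patch.object. The `Job` class, `flaky_run`, and `run_all` are hypothetical stand-ins for Project, the patched calculate_freshness, and the command loop.

```python
from unittest.mock import patch


class Job:
    def __init__(self, name):
        self.name = name

    def run(self):
        return f"{self.name}: ok"


# Keep a reference to the unpatched function so the replacement can delegate.
original_run = Job.run


def flaky_run(job):
    """Fail for one specific job and fall back to the original otherwise."""
    if job.name == "bad":
        raise ValueError("Test error")
    return original_run(job)


def run_all(jobs):
    """Run every job, collecting results and failures separately."""
    results, failures = [], []
    for job in jobs:
        try:
            results.append(job.run())
        except ValueError as exc:
            failures.append((job.name, str(exc)))
    return results, failures


with patch.object(Job, "run", flaky_run):
    results, failures = run_all([Job("good"), Job("bad")])

print(results)   # ['good: ok']
print(failures)  # [('bad', 'Test error')]
```

Because the replacement is a plain function set on the class, it is bound like a normal method and receives the instance as its first argument. The original method is restored when the with block exits.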

3. API Filtering Tests - Add to website/tests/test_api.py

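# Add this test class to the existing test_api.py file.
# It needs these imports if the file does not already have them:
#   from rest_framework.test import APITestCase
#   from website.models import Organization, Project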

class ProjectFreshnessFilteringTestCase(APITestCase):
    """Test cases for Project API freshness filtering"""

    def setUp(self):
        """Set up test data"""
        self.org = Organization.objects.create(
            name="Test Organization",
            url="https://test.org"
        )
        
        # Create projects with different freshness scores
        self.high_freshness_project = Project.objects.create(
            name="High Freshness",
            organization=self.org,
            url="https://github.com/test/high",
            freshness=85.50
        )
        
        self.medium_freshness_project = Project.objects.create(
            name="Medium Freshness",
            organization=self.org,
            url="https://github.com/test/medium",
            freshness=50.25
        )
        
        self.low_freshness_project = Project.objects.create(
            name="Low Freshness",
            organization=self.org,
            url="https://github.com/test/low",
            freshness=15.75
        )
        
        self.zero_freshness_project = Project.objects.create(
            name="Zero Freshness",
            organization=self.org,
            url="https://github.com/test/zero",
            freshness=0.0
        )

    def test_filter_by_valid_freshness(self):
        """Test filtering projects by valid freshness threshold"""
        response = self.client.get('/api/v1/projects/filter/?freshness=50')
        
        self.assertEqual(response.status_code, 200)
        data = response.json()
        
        # Should return projects with freshness >= 50
        self.assertEqual(len(data['results']), 2)
        names = [p['name'] for p in data['results']]
        self.assertIn('High Freshness', names)
        self.assertIn('Medium Freshness', names)

    def test_filter_by_zero_freshness(self):
        """Test filtering with freshness=0 (should return all projects)"""
        response = self.client.get('/api/v1/projects/filter/?freshness=0')
        
        self.assertEqual(response.status_code, 200)
        data = response.json()
        
        # All projects have freshness >= 0
        self.assertEqual(len(data['results']), 4)

    def test_filter_by_high_freshness(self):
        """Test filtering with high freshness threshold"""
        response = self.client.get('/api/v1/projects/filter/?freshness=80')
        
        self.assertEqual(response.status_code, 200)
        data = response.json()
        
        # Only high freshness project should match
        self.assertEqual(len(data['results']), 1)
        self.assertEqual(data['results'][0]['name'], 'High Freshness')

    def test_filter_by_freshness_100(self):
        """Test filtering with freshness=100 (maximum valid value)"""
        response = self.client.get('/api/v1/projects/filter/?freshness=100')
        
        self.assertEqual(response.status_code, 200)
        data = response.json()
        
        # No projects have freshness >= 100
        self.assertEqual(len(data['results']), 0)

    def test_filter_freshness_invalid_negative(self):
        """Test that negative freshness values are rejected"""
        response = self.client.get('/api/v1/projects/filter/?freshness=-10')
        
        self.assertEqual(response.status_code, 400)
        self.assertIn('must be between 0 and 100', response.json()['error'])

    def test_filter_freshness_invalid_over_100(self):
        """Test that freshness values over 100 are rejected"""
        response = self.client.get('/api/v1/projects/filter/?freshness=150')
        
        self.assertEqual(response.status_code, 400)
        self.assertIn('must be between 0 and 100', response.json()['error'])

    def test_filter_freshness_invalid_non_numeric(self):
        """Test that non-numeric freshness values are rejected"""
        response = self.client.get('/api/v1/projects/filter/?freshness=invalid')
        
        self.assertEqual(response.status_code, 400)
        self.assertIn('must be a valid number', response.json()['error'])

    def test_filter_freshness_decimal_value(self):
        """Test filtering with decimal freshness value"""
        response = self.client.get('/api/v1/projects/filter/?freshness=50.5')
        
        self.assertEqual(response.status_code, 200)
        data = response.json()
        
        # Should return projects with freshness >= 50.5
        self.assertEqual(len(data['results']), 1)
        self.assertEqual(data['results'][0]['name'], 'High Freshness')

    def test_filter_freshness_combined_with_other_filters(self):
        """Test freshness filter combined with other filters"""
        # Add repos for star filtering
        Repo.objects.create(
            project=self.high_freshness_project,
            name="popular-repo",
            repo_url="https://github.com/test/popular",
            stars=1000,
            forks=100
        )
        Repo.objects.create(
            project=self.low_freshness_project,
            name="unpopular-repo",
            repo_url="https://github.com/test/unpopular",
            stars=10,
            forks=5
        )

        # Filter by both freshness and stars
        response = self.client.get('/api/v1/projects/filter/?freshness=50&stars=500')
        
        self.assertEqual(response.status_code, 200)
        data = response.json()
        
        # Should return only high freshness project with enough stars
        self.assertEqual(len(data['results']), 1)
        self.assertEqual(data['results'][0]['name'], 'High Freshness')

    def test_filter_without_freshness_parameter(self):
        """Test that filtering works when freshness parameter is not provided"""
        response = self.client.get('/api/v1/projects/filter/')
        
        self.assertEqual(response.status_code, 200)
        data = response.json()
        
        # Should return all projects
        self.assertEqual(len(data['results']), 4)

    def test_freshness_field_in_api_response(self):
        """Test that freshness field is included in API response"""
        response = self.client.get('/api/v1/projects/filter/')
        
        self.assertEqual(response.status_code, 200)
        data = response.json()
        
        # Check that freshness field exists in response
        for project in data['results']:
            self.assertIn('freshness', project)
            self.assertIsNotNone(project['freshness'])

4. Integration Test - Add to website/tests/test_project_aggregation.py

# Add this test method to the existing ProjectAggregationTestCase class

def test_freshness_calculation_integration(self):
    """Integration test for freshness calculation with real data flow"""
    from datetime import timedelta
    from django.utils import timezone
    
    # Create project with repos
    org = Organization.objects.create(name="Integration Org", url="https://int.org")
    project = Project.objects.create(
        name="Integration Project",
        organization=org,
        url="https://github.com/int/project"
    )
    
    now = timezone.now()
    
    # Add repos with different activity levels
    Repo.objects.create(
        project=project,
        name="very-active",
        repo_url="https://github.com/int/active",
        is_archived=False,
        updated_at=now - timedelta(days=2)
    )
    Repo.objects.create(
        project=project,
        name="somewhat-active",
        repo_url="https://github.com/int/somewhat",
        is_archived=False,
        updated_at=now - timedelta(days=20)
    )
    Repo.objects.create(
        project=project,
        name="old-active",
        repo_url="https://github.com/int/old",
        is_archived=False,
        updated_at=now - timedelta(days=60)
    )
    Repo.objects.create(
        project=project,
        name="archived",
        repo_url="https://github.com/int/archived",
        is_archived=True,
        updated_at=now - timedelta(days=1)  # Should be ignored
    )
    
    # Calculate freshness
    freshness = project.calculate_freshness()
    
    # Expected: 1*1.0 + 1*0.6 + 1*0.3 = 1.9, normalized: (1.9/20)*100 = 9.5
    self.assertEqual(freshness, 9.5)
    
    # Save and verify persistence
    project.freshness = freshness
    project.save()
    
    project.refresh_from_db()
    self.assertEqual(float(project.freshness), 9.5)

These tests cover:

  • ✅ All time window scenarios (7/30/90 days)
  • ✅ Edge cases (no repos, archived repos, boundaries)
  • ✅ Score calculation accuracy
  • ✅ Management command execution and error handling
  • ✅ API filtering validation (valid/invalid inputs)
  • ✅ Combined filtering with other parameters
  • ✅ Database persistence
  • ✅ Integration testing

You can run these tests with:

# Run all new tests
python manage.py test website.tests.test_project_freshness
python manage.py test website.tests.test_update_project_freshness_command
python manage.py test website.tests.test_api.ProjectFreshnessFilteringTestCase

# Or run all tests
python manage.py test website.tests

@github-actions bot added the labels changes-requested (PR has requested changes from a reviewer) and files-changed: 10 (PR changes 10 files), and removed the label files-changed: 7 (PR changes 7 files), Dec 17, 2025
@Nachiket-Roy Nachiket-Roy marked this pull request as ready for review December 17, 2025 13:59
@Nachiket-Roy Nachiket-Roy marked this pull request as draft December 17, 2025 13:59
@Nachiket-Roy Nachiket-Roy marked this pull request as ready for review December 17, 2025 13:59
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between bbe2c76 and 8777dcc.

📒 Files selected for processing (6)
  • website/api/views.py (1 hunks)
  • website/models.py (1 hunks)
  • website/tests/test_api.py (2 hunks)
  • website/tests/test_project_aggregation.py (3 hunks)
  • website/tests/test_project_freshness.py (1 hunks)
  • website/tests/test_update_project_freshness_command.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • website/models.py
🧰 Additional context used
🧬 Code graph analysis (3)
website/tests/test_project_aggregation.py (1)
website/models.py (3)
  • Project (1366-1479)
  • Repo (1948-2025)
  • calculate_freshness (1401-1442)
website/tests/test_api.py (1)
website/models.py (2)
  • Project (1366-1479)
  • Repo (1948-2025)
website/tests/test_project_freshness.py (1)
website/models.py (4)
  • Organization (181-289)
  • Project (1366-1479)
  • Repo (1948-2025)
  • calculate_freshness (1401-1442)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: auto-assign
  • GitHub Check: remove_last_active_label
  • GitHub Check: add_changes_requested_label
  • GitHub Check: check_peer_review
  • GitHub Check: remove_last_active_label
  • GitHub Check: Run Tests
🔇 Additional comments (4)
website/tests/test_project_aggregation.py (1)

160-215: Excellent integration test with correct freshness calculation.

The test properly validates the end-to-end freshness calculation and persistence flow. The expected value of 9.5 is correct:

  • very-active (2 days ago) → active_7 = 1 → weight 1.0
  • somewhat-active (20 days ago) → active_30 = 1 → weight 0.6
  • old-active (60 days ago) → active_90 = 1 → weight 0.3
  • archived (1 day ago) → correctly excluded
  • raw_score = 1.9, normalized = (1.9/20)*100 = 9.5
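The arithmetic above can be reproduced with a small standalone sketch. This is not the PR's calculate_freshness() implementation: the function name freshness_score, the (updated_at, is_archived) tuple shape, and the inclusive window boundaries are assumptions made for illustration. Only the weights (1.0 / 0.6 / 0.3), the cap of 20, and the 0-100 normalization are taken from the PR description.

```python
from datetime import datetime, timedelta, timezone

# Weights and cap as stated in the PR description; everything else is assumed.
WINDOWS = ((7, 1.0), (30, 0.6), (90, 0.3))  # (max age in days, weight)
MAX_RAW_SCORE = 20.0


def freshness_score(repos, now=None):
    """Return a 0-100 freshness score for (updated_at, is_archived) pairs."""
    now = now or datetime.now(timezone.utc)
    raw = 0.0
    for updated_at, is_archived in repos:
        if is_archived or updated_at is None:
            continue  # archived repos never contribute
        age = now - updated_at
        for max_days, weight in WINDOWS:
            # Each repo counts once, in the most recent window it fits.
            if age <= timedelta(days=max_days):
                raw += weight
                break
    capped = min(raw, MAX_RAW_SCORE)
    return round(capped / MAX_RAW_SCORE * 100, 2)


now = datetime(2025, 12, 17, tzinfo=timezone.utc)
repos = [
    (now - timedelta(days=2), False),   # 7-day window  -> 1.0
    (now - timedelta(days=20), False),  # 30-day window -> 0.6
    (now - timedelta(days=60), False),  # 90-day window -> 0.3
    (now - timedelta(days=1), True),    # archived      -> ignored
]
print(freshness_score(repos, now=now))  # prints 9.5
```

Rounding to two decimals keeps floating-point noise from the weighted sum out of the comparison, which matters if a test asserts exact equality with 9.5.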
website/api/views.py (1)

760-775: LGTM! Clean implementation of freshness filtering.

The validation logic is consistent with the existing stars/forks parameter handling, uses appropriate >= semantics with freshness__gte, and provides clear error messages for invalid input.
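As a rough illustration of the validation behavior the API tests expect, here is a framework-free sketch. The helper name parse_freshness is hypothetical and this is not the code in website/api/views.py; only the two error phrases and the 0-100 range are taken from the tests in this thread.

```python
def parse_freshness(raw):
    """Validate a ?freshness= query value; return a float or raise ValueError."""
    try:
        value = float(raw)
    except (TypeError, ValueError):
        raise ValueError("freshness must be a valid number") from None
    # NaN fails every comparison, so it is rejected here along with
    # values outside the 0-100 range.
    if not 0 <= value <= 100:
        raise ValueError("freshness must be between 0 and 100")
    return value


for candidate in ("50", "50.5", "-10", "150", "invalid"):
    try:
        print(candidate, "->", parse_freshness(candidate))
    except ValueError as exc:
        print(candidate, "-> 400:", exc)
```

In a view, the ValueError would presumably be turned into a 400 response, and a valid value passed to a filter such as freshness__gte.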

website/tests/test_update_project_freshness_command.py (1)

15-136: LGTM! Comprehensive command test coverage.

The test suite thoroughly exercises the management command with good coverage of:

  • Batch updates across multiple projects with varying activity levels
  • Graceful per-project error handling using proper mocking
  • Execution time reporting
  • Zero-projects edge case

The expected freshness value of 5.0 at Line 105 is correctly calculated (1 repo active 5 days ago → raw_score 1.0 → 5.0 freshness).

website/tests/test_api.py (1)

299-427: LGTM! Comprehensive API filtering test coverage.

The test suite thoroughly validates freshness filtering including:

  • Threshold filtering with correct >= semantics
  • Input validation for negative, >100, and non-numeric values
  • Decimal freshness values
  • Combined filtering with other parameters (stars)
  • Freshness field presence in API responses

The use of hardcoded freshness values in setUp (Lines 307-321) is appropriate for API-level testing. The actual freshness calculation logic is validated by the integration test in test_project_aggregation.py (lines 160-215).

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
website/management/commands/update_project_freshness.py (1)

25-29: Add row-level locking to prevent concurrent update races.

As noted in the PR comments, concurrent executions of this command (scheduled run overlapping with a manual trigger, or a long-running job still active when the next scheduled run starts) can cause race conditions: both instances read and update the same projects, and the last write wins. The "losing" update is wasted work.

Recommended fix: Use select_for_update() to lock rows

Refactor to select and lock the batch inside the transaction:

     for offset in range(0, total, BATCH_SIZE):
-        batch = qs[offset : offset + BATCH_SIZE]
-
         with transaction.atomic():
+            batch = list(qs.select_for_update()[offset : offset + BATCH_SIZE])
             for project in batch:

Note: The diff above keeps the batch-level transaction, so select_for_update() holds locks on every row in the batch (up to 500) until the batch commits, which is acceptable but reduces concurrency. If you move to per-project transactions (from the previous comment), you'd need a different approach, e.g., select_for_update(skip_locked=True) to skip locked rows, or a redesigned locking strategy.

Based on learnings from PR comments: "Review identified concurrency/race concerns and lack of transaction/locking around updates. Recommendation: wrap updates in transactions and consider select_for_update()."

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 2471406 and 3829d7f.

📒 Files selected for processing (1)
  • website/management/commands/update_project_freshness.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
website/management/commands/update_project_freshness.py (1)
website/models.py (2)
  • Project (1366-1481)
  • calculate_freshness (1401-1444)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Run pre-commit
  • GitHub Check: Run Tests
  • GitHub Check: docker-test

@github-actions bot added the label tests: failed (Django tests failed) and removed the label tests: passed (Django tests passed), Dec 18, 2025
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
website/management/commands/update_project_freshness.py (1)

26-32: Batch-level transaction causes entire batch rollback on single failure.

The current implementation wraps 500 projects in a single transaction.atomic() block. If any project's calculate_freshness() or save() fails (e.g., due to an IntegrityError, database deadlock, or calculation error), Django marks the transaction as aborted and rolls back all 500 projects in that batch — even those that were processed successfully before the failure.

Impact: While select_for_update() prevents concurrent modifications (addressing the "concurrency fixed" commit), it doesn't provide isolation from rollbacks within the batch. A single failing project can cause 499 successful updates to be lost.

🔎 Alternative: Per-project transactions for better isolation

As recommended in the previous review, move transaction.atomic() inside the per-project loop to isolate failures:

         for offset in range(0, total, BATCH_SIZE):
-            try:
-                with transaction.atomic():
-                    batch = list(qs.select_for_update()[offset : offset + BATCH_SIZE])
-                    for project in batch:
-                        project.freshness = project.calculate_freshness()
-                        project.save(update_fields=["freshness"])
-                        processed += 1
-            except Exception as e:
-                errors += 1
-                self.stderr.write(f"[ERROR] Project ID {project.id}: {str(e)}")
+            batch = list(qs[offset : offset + BATCH_SIZE])
+            for project in batch:
+                try:
+                    with transaction.atomic():
+                        freshness = project.calculate_freshness()
+                        project.freshness = freshness
+                        project.save(update_fields=["freshness"])
+                    processed += 1
+                except Exception as e:
+                    errors += 1
+                    self.stderr.write(f"[ERROR] Project ID {project.id}: {str(e)}")

Trade-offs:

  • ✅ One failure doesn't affect other projects
  • ✅ Accurate error reporting
  • ❌ Removes batch-level select_for_update() (loses row-level locking for the batch)
  • ❌ Slightly higher transaction overhead (500 small transactions vs. 1 large)

Alternative hybrid approach: Use select_for_update(skip_locked=True) with per-project transactions to skip locked rows instead of waiting, or add retry logic for transient deadlocks.
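The failure-isolation property of per-project transactions can be demonstrated without Django. The sketch below uses the standard-library sqlite3 module as a stand-in for the ORM; the table layout, the update_freshness helper, and the flaky_compute scorer are all hypothetical, and it does not model row locking (select_for_update has no equivalent here).

```python
import sqlite3


def update_freshness(conn, project_ids, compute):
    """Update each project in its own transaction; return (processed, errors)."""
    processed = errors = 0
    for project_id in project_ids:
        try:
            # `with conn` commits on success and rolls back if the body raises,
            # so a failure only affects this one project's write.
            with conn:
                conn.execute(
                    "UPDATE project SET freshness = ? WHERE id = ?",
                    (compute(project_id), project_id),
                )
            processed += 1  # counted only after the commit succeeded
        except Exception as exc:
            errors += 1
            print(f"[ERROR] Project ID {project_id}: {exc}")
    return processed, errors


def flaky_compute(project_id):
    """Hypothetical scorer that fails for project 2."""
    if project_id == 2:
        raise RuntimeError("simulated failure")
    return 5.0


conn = sqlite3.connect(":memory:")
with conn:
    conn.execute("CREATE TABLE project (id INTEGER PRIMARY KEY, freshness REAL DEFAULT 0.0)")
    conn.executemany("INSERT INTO project (id) VALUES (?)", [(1,), (2,), (3,)])

result = update_freshness(conn, [1, 2, 3], flaky_compute)
rows = conn.execute("SELECT id, freshness FROM project ORDER BY id").fetchall()
print(result, rows)
```

Projects 1 and 3 keep their new scores even though project 2 fails, which is the behavior a single batch-wide transaction would not give.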

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 08b37bf and c934c8c.

📒 Files selected for processing (1)
  • website/management/commands/update_project_freshness.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
website/management/commands/update_project_freshness.py (3)
website/models.py (12)
  • Project (1366-1481)
  • calculate_freshness (1401-1444)
  • save (74-77)
  • save (277-289)
  • save (1446-1473)
  • save (1581-1584)
  • save (1774-1777)
  • save (1892-1906)
  • save (1997-2019)
  • save (2818-2821)
  • save (3350-3360)
  • save (3658-3667)
website/management/commands/run_daily.py (2)
  • Command (10-54)
  • handle (13-54)
website/static/js/debug-panel.js (1)
  • options (399-406)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Run Tests
  • GitHub Check: docker-test

@Nachiket-Roy Nachiket-Roy marked this pull request as draft December 18, 2025 21:29
@github-actions bot added the label tests: passed (Django tests passed) and removed the label tests: failed (Django tests failed), Dec 18, 2025
@Nachiket-Roy Nachiket-Roy marked this pull request as ready for review December 18, 2025 21:38
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
website/management/commands/update_project_freshness.py (1)

41-41: Minor: Progress message could be clearer.

The progress message reports "Processed X/Y projects..." but this count includes both successful and failed updates (it's actually "attempted"). The final summary (lines 45-48) correctly separates processed vs. errors, so the information is available at the end.

Optional: Make progress message more precise

Consider rephrasing to reflect that this is the attempted count:

-            self.stdout.write(f"Processed {min(offset + BATCH_SIZE, total)}/{total} projects...")
+            self.stdout.write(f"Attempted {min(offset + BATCH_SIZE, total)}/{total} projects...")

Or, for more detail, include the current processed/error counts in the progress message:

-            self.stdout.write(f"Processed {min(offset + BATCH_SIZE, total)}/{total} projects...")
+            self.stdout.write(
+                f"Progress: {min(offset + BATCH_SIZE, total)}/{total} attempted "
+                f"({processed} successful, {errors} errors)"
+            )
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between c934c8c and 6e3ebb2.

📒 Files selected for processing (1)
  • website/management/commands/update_project_freshness.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
website/management/commands/update_project_freshness.py (1)
website/models.py (12)
  • Project (1366-1481)
  • calculate_freshness (1401-1444)
  • save (74-77)
  • save (277-289)
  • save (1446-1473)
  • save (1581-1584)
  • save (1774-1777)
  • save (1892-1906)
  • save (1997-2019)
  • save (2818-2821)
  • save (3350-3360)
  • save (3658-3667)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Run Tests
  • GitHub Check: docker-test
🔇 Additional comments (4)
website/management/commands/update_project_freshness.py (4)

1-8: LGTM: Imports and batch size configuration are appropriate.

The imports are necessary for the command functionality, and a batch size of 500 provides a good balance between memory usage and the number of ID-fetching queries.


11-23: LGTM: Command initialization is well-structured.

The use of only("id") optimizes memory usage by fetching only the necessary field. The counter initialization and progress messaging provide good visibility into the command's execution.


25-39: Excellent fix: Per-project transactions correctly address all past critical issues.

The refactored implementation successfully resolves the previously identified problems:

  • Per-project isolation (line 30): Each project's update is wrapped in its own transaction.atomic(), so one failure no longer rolls back the entire batch.
  • No UnboundLocalError (line 39): The exception handler references project_id (always defined in the loop) rather than project.id.
  • Accurate metrics (line 35): The processed counter increments only after a successful save, ensuring counts reflect actual persisted updates.

The row-level lock via select_for_update() (line 31) is correctly placed inside the transaction, preventing race conditions during the freshness calculation and save.

Trade-off note: Holding the lock during calculate_freshness() (line 33) ensures consistency but increases lock duration since the method queries related repos. This is acceptable for correctness, though it does mean slightly longer lock hold times per project.


43-48: LGTM: Summary output is clear and informative.

The final summary correctly reports the processed count, errors, and execution time separately, providing clear visibility into the command's results.

@Nachiket-Roy
Contributor Author

Hey man, great work! Just two small issues and you should be good to go:
1) The current implementation is vulnerable to race conditions if multiple tasks update freshness concurrently.
2) The update_project_freshness command doesn't explicitly wrap batch updates in transactions, which could lead to data inconsistencies during failures.

Good catch! I have resolved the issues; feel free to recheck.

@github-actions bot added the label tests: failed (Django tests failed) and removed the label tests: passed (Django tests passed), Dec 18, 2025
@Nachiket-Roy Nachiket-Roy marked this pull request as draft December 18, 2025 21:50
@Nachiket-Roy Nachiket-Roy marked this pull request as ready for review December 18, 2025 22:11
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
website/management/commands/update_project_freshness.py (1)

25-44: Excellent fix! Previous critical issues resolved.

The per-project transaction approach correctly addresses all the concerns raised in previous reviews:

  1. UnboundLocalError fixed: Line 39 now references project_id (always defined) instead of project (which might be unassigned if an exception occurs during get()).

  2. Transaction isolation: Each project is processed in its own atomic() block (line 30), so one failure won't roll back others in the batch.

  3. Accurate metrics: The processed counter is incremented inside the atomic block (line 35), ensuring it only counts successfully committed updates.

  4. Correct locking: select_for_update() (line 31) prevents concurrent updates to the same project.

The two-level loop structure (batching IDs for memory management, then per-project transactions) strikes a good balance between efficiency and robustness.
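The outer half of that two-level structure, splitting a stream of IDs into fixed-size batches, might look like the following sketch. The iter_batches helper is hypothetical and is not the command's actual code; only the batch size of 500 comes from the review.

```python
from itertools import islice

BATCH_SIZE = 500  # batch size mentioned in the review


def iter_batches(ids, size=BATCH_SIZE):
    """Yield lists of at most `size` IDs without materializing all of them."""
    iterator = iter(ids)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Outer loop: batches of IDs. Inner loop: one unit of work per ID.
for batch in iter_batches(range(1, 6), size=2):
    for project_id in batch:
        pass  # a per-project transaction would go here
    print(f"Attempted batch {batch}")
```

Only one batch of IDs is held in memory at a time, while each project is still handled individually inside the inner loop.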

Optional: Consider making select_for_update() non-blocking

If two instances of this command run concurrently, one will block on row locks held by the other until they are released (or until the database's lock timeout, if one is configured). You could fail fast instead:

project = Project.objects.select_for_update(nowait=True).get(pk=project_id)

or

project = Project.objects.select_for_update(skip_locked=True).get(pk=project_id)

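nowait=True raises a DatabaseError immediately if the row is locked. skip_locked=True excludes locked rows from the queryset, so with .get() a locked project would raise Project.DoesNotExist, which the command would need to handle as a skip rather than an error. This is a nice-to-have enhancement, not required for this PR.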

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 6e3ebb2 and ecd3df7.

📒 Files selected for processing (1)
  • website/management/commands/update_project_freshness.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
website/management/commands/update_project_freshness.py (1)
website/models.py (12)
  • Project (1366-1481)
  • calculate_freshness (1401-1444)
  • save (74-77)
  • save (277-289)
  • save (1446-1473)
  • save (1581-1584)
  • save (1774-1777)
  • save (1892-1906)
  • save (1997-2019)
  • save (2818-2821)
  • save (3350-3360)
  • save (3658-3667)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Run Tests
  • GitHub Check: docker-test
🔇 Additional comments (3)
website/management/commands/update_project_freshness.py (3)

1-8: LGTM! Clean imports and reasonable batch size.

The imports are appropriate for a management command that processes records in batched transactions. BATCH_SIZE of 500 provides good balance between memory usage and database round-trips.


14-24: LGTM! Efficient initialization.

Using only("id") when building the queryset is a good optimization that minimizes memory overhead when fetching batch IDs. The initialization logic is clean and includes helpful user feedback.


46-51: LGTM! Comprehensive reporting.

The final summary provides all the key metrics (processed count, error count, execution time) that operators need to assess the command's success. Using self.style.SUCCESS follows Django management command conventions.

@Nachiket-Roy Nachiket-Roy marked this pull request as draft December 18, 2025 22:19
@Nachiket-Roy Nachiket-Roy marked this pull request as ready for review December 18, 2025 22:19
@github-actions bot added the label tests: passed (Django tests passed) and removed the label tests: failed (Django tests failed), Dec 18, 2025
Contributor

@Jayant2908 Jayant2908 left a comment

LGTM!!

@github-actions bot added the label last-active: 0d (PR last updated 0 days ago), Dec 19, 2025
Labels

  • changes-requested (PR has requested changes from a reviewer)
  • files-changed: 10 (PR changes 10 files)
  • last-active: 0d (PR last updated 0 days ago)
  • migrations (PR contains database migration files)
  • needs-peer-review (PR needs peer review)
  • pre-commit: passed (Pre-commit checks passed)
  • quality: high
  • tests: passed (Django tests passed)

Projects

Status: Ready

Development

Successfully merging this pull request may close these issues.

freshness field architecture problem

3 participants