Copilot AI commented Oct 13, 2025

Overview

This PR adds UUID (Universally Unique Identifier) support to all candidate data in the repository, resolving issue #[issue_number]. Previously, candidate records had no unique identifiers, making it difficult to reference specific candidates across different parts of the application.

Changes Made

1. Created UUID Update Script

Added scripts/add_uuid_to_candidates.py - a standalone, executable script that:

  • Automatically discovers all candidate JSON files in the data directory
  • Generates a unique UUID for each candidate record
  • Is idempotent - safely handles running multiple times without creating duplicates
  • Provides clear progress reporting and statistics

Example usage:

python3 scripts/add_uuid_to_candidates.py
# or
./scripts/add_uuid_to_candidates.py
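
The script itself is not shown in this conversation. As a rough sketch of the behaviour described above (discovery, UUID generation, idempotency, progress reporting), something like the following would work. It assumes each JSON file holds a list of candidate objects under a `data` directory; the function names `with_uuid`, `process_file`, and `main` are hypothetical, not taken from the actual script.

```python
import json
import uuid
from pathlib import Path


def with_uuid(record):
    """Return the record unchanged if it already has a UUID, else a copy with one first."""
    if record.get("uuid"):
        return record
    # Drop any empty "uuid" key so the fresh value is not overwritten.
    rest = {key: value for key, value in record.items() if key != "uuid"}
    return {"uuid": str(uuid.uuid4()), **rest}


def process_file(path):
    """Update one JSON file in place; assumes it contains a list of candidate objects."""
    records = json.loads(path.read_text(encoding="utf-8"))
    updated = [with_uuid(record) for record in records]
    added = sum(1 for old, new in zip(records, updated) if old is not new)
    if added:  # Rewrite the file only when something changed.
        text = json.dumps(updated, indent=2, ensure_ascii=False) + "\n"
        path.write_text(text, encoding="utf-8")
    return added, len(updated)


def main(data_dir="data"):
    """Walk the (assumed) data directory and report per-file and total counts."""
    total_added = 0
    for path in sorted(Path(data_dir).rglob("*.json")):
        added, count = process_file(path)
        total_added += added
        print(f"{path}: added {added} UUIDs to {count} records")
    print(f"Total UUIDs added: {total_added}")
```

Idempotency here comes from skipping any record that already carries a non-empty `uuid`, so a second run adds nothing and leaves the files untouched.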

2. Updated Data Models

Modified app/models/candidate.py to include UUID fields:

  • Added optional uuid field to LokSabhaCandidate model
  • Added optional uuid field to AssemblyCandidate model
  • Fields are optional to maintain backward compatibility

Before:

class AssemblyCandidate(BaseModel):
    constituency_code: str = Field(alias="Constituency Code")
    name: str = Field(alias="Name")
    party: str = Field(alias="Party")
    # ...

After:

class AssemblyCandidate(BaseModel):
    uuid: Optional[str] = None
    constituency_code: str = Field(alias="Constituency Code")
    name: str = Field(alias="Name")
    party: str = Field(alias="Party")
    # ...

3. Updated Scrapers

Modified both scrapers to automatically generate UUIDs for newly scraped candidates:

  • app/scrapers/lok_sabha.py - generates UUIDs when scraping Lok Sabha data
  • app/scrapers/vidhan_sabha.py - generates UUIDs when scraping Vidhan Sabha data

This ensures all future scraped data will include unique identifiers.
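
The scraper changes are not reproduced here, but the idea can be sketched in a few lines. The helper name `build_candidate` and the shape of `row` are assumptions for illustration; the real scrapers may attach the identifier differently.

```python
import uuid


def build_candidate(row):
    """Return a candidate dict with a fresh UUID4 placed ahead of the scraped fields.

    Assumes `row` is a dict of scraped fields that does not already contain "uuid".
    """
    return {"uuid": str(uuid.uuid4()), **row}


candidate = build_candidate({"Constituency Code": "DL-1", "Name": "RAJ KARAN KHATRI"})
print(candidate["uuid"], candidate["Name"])
```

Because UUID4 values are random, scraping the same candidate twice with a helper like this yields two different identifiers, so a re-scrape would need to merge against existing records to keep identifiers stable.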

4. Updated Existing Data

Ran the UUID update script to add identifiers to all existing candidate records:

  • 769 Delhi Assembly 2025 candidates
  • 4,424 Maharashtra Assembly 2024 candidates
  • Total: 5,193 candidates now have unique UUIDs

All UUIDs follow the standard UUID4 format (e.g., baa14e3d-00e4-49de-8c56-75e06a069eb2).
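
One way to check that a stored value is a canonical UUID4 string is to parse it with the standard library `uuid` module. This is a minimal sketch; the function name `is_uuid4` is hypothetical and not part of the PR.

```python
import uuid


def is_uuid4(value):
    """Return True only for canonical, hyphenated, version-4 UUID strings."""
    try:
        parsed = uuid.UUID(value)
    except (ValueError, AttributeError, TypeError):
        return False
    # uuid.UUID also accepts braces, "urn:uuid:" prefixes, and unhyphenated hex,
    # so compare against the canonical string form as well.
    return parsed.version == 4 and str(parsed) == value.lower()


print(is_uuid4("baa14e3d-00e4-49de-8c56-75e06a069eb2"))
print(is_uuid4("not-a-uuid"))
```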

5. Added Documentation

Created scripts/README.md with comprehensive documentation including:

  • Usage instructions
  • Feature descriptions
  • Example output
  • When to use the script

Sample Data Structure

Before:

{
  "Constituency Code": "DL-1",
  "Name": "RAJ KARAN KHATRI",
  "Party": "Bharatiya Janata Party",
  "Status": "WON",
  "Votes": "87215",
  "Margin": "8596"
}

After:

{
  "uuid": "baa14e3d-00e4-49de-8c56-75e06a069eb2",
  "Constituency Code": "DL-1",
  "Name": "RAJ KARAN KHATRI",
  "Party": "Bharatiya Janata Party",
  "Status": "WON",
  "Votes": "87215",
  "Margin": "8596"
}

Benefits

  • Unique Identification: Each candidate can now be uniquely referenced across the system
  • API Improvements: Enables better RESTful API design with stable candidate URLs
  • Data Integrity: Easier to track and deduplicate candidate records
  • Future-Proof: New scraped data automatically includes UUIDs
  • Non-Breaking: Optional fields maintain backward compatibility

Testing

The following checks were run against the updated data:

  • ✅ Verified all 5,193 candidates have UUIDs
  • ✅ Confirmed all UUIDs are unique (no duplicates)
  • ✅ Validated UUID format (standard UUID4)
  • ✅ Tested script idempotency (safe to run multiple times)
  • ✅ Verified model compatibility with updated data
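
The test code is not included in this conversation. The coverage and uniqueness checks could be reproduced with a small audit along these lines, assuming the records have been loaded as a list of dicts; `audit_uuids` is a hypothetical name.

```python
from collections import Counter


def audit_uuids(records):
    """Return (missing_count, duplicated_values) for a list of candidate dicts."""
    values = [record.get("uuid") for record in records]
    missing = sum(1 for value in values if not value)
    counts = Counter(value for value in values if value)
    duplicates = sorted(value for value, n in counts.items() if n > 1)
    return missing, duplicates


sample = [
    {"uuid": "a", "Name": "One"},
    {"uuid": "a", "Name": "Two"},
    {"Name": "Three"},
]
print(audit_uuids(sample))
```

A clean dataset returns `(0, [])`; anything else points at records that still need an identifier or that share one.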

Notes

The Lok Sabha 2024 candidates file is currently empty, so no UUIDs were added to it. When data is scraped for this election, the scraper will automatically generate UUIDs.

Original prompt

This section details the original issue you should resolve

<issue_title>Create Script to Update Candidate Data with UUID</issue_title>
<issue_description>Currently stored Candidate data doesn't have any UUID attached to it. Let's update it</issue_description>

Comments on the Issue (you are @copilot in this section)

Fixes #3


Copilot AI changed the title from "[WIP] Add script to update candidate data with UUID" to "Add UUID support to candidate data for unique identification" Oct 13, 2025
Copilot AI requested a review from imsks October 13, 2025 08:42
Copilot finished work on behalf of imsks October 13, 2025 08:42
- Replaced `requests` with `httpx` to enable HTTP/2 connections, as required by the ECI website.
- Updated the `requirements.in` and `requirements.txt` files to reflect the change in dependencies.
- Enhanced the scraper's request handling with a shared `httpx` client to maintain cookies and improve performance.
- Introduced a new `normalize_base_url` function to standardize URLs in the scrapers.
- Updated the Lok Sabha and Vidhan Sabha scrapers to utilize the new URL normalization function.
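
The commit does not show what `normalize_base_url` actually does. Purely as an illustration of the kind of standardization such a helper might perform, here is a sketch built on the standard library; every rule in it (defaulting to `https`, lowercasing the host, dropping query and fragment, forcing a single trailing slash) is an assumption, not the behaviour of the function in this repository.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_base_url(url):
    """Hypothetical normalization; the rules below are assumptions for illustration."""
    url = url.strip()
    if "://" not in url:
        url = "https://" + url  # Assume https when no scheme is given.
    parts = urlsplit(url)
    path = parts.path.rstrip("/") + "/"  # Exactly one trailing slash.
    # Lowercase scheme and host, and drop any query string or fragment.
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, "", ""))


print(normalize_base_url("Example.COM/results"))
```

Normalizing once at the scraper boundary means relative paths can be joined onto the base without checking for doubled or missing slashes at each call site.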
@imsks imsks marked this pull request as ready for review October 14, 2025 17:28
@imsks imsks merged commit eaea536 into main Oct 14, 2025
1 of 5 checks passed
@cursor cursor bot left a comment

This PR is being reviewed by Cursor Bugbot

"margin": "27862"
}
]
Bug: Missing UUIDs in Candidate Data

The candidates.json file, added in this commit, is missing the uuid field for all candidate records. This creates a data inconsistency, as UUIDs are now generated by scrapers and expected by models, which may lead to downstream issues.

@imsks imsks deleted the copilot/update-candidate-data-uuid branch November 12, 2025 05:27