LeetCode Scraper is a Python-based tool designed to fetch and store details from LeetCode study plans into a PostgreSQL database. This tool leverages Docker for easy setup and environment management.
- Fetches LeetCode problems and study plans
- Stores data in a PostgreSQL/Supabase database
- Provides caching to reduce redundant requests
- Handles rate limiting with retry mechanisms
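The caching and retry features listed above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `fetch_with_retry`, `RateLimitedError`, and the exponential backoff schedule are all hypothetical names and choices made for this example.

```python
import time
from typing import Callable, Dict


class RateLimitedError(Exception):
    """Hypothetical error standing in for a rate-limit (HTTP 429) response."""


def fetch_with_retry(
    fetch: Callable[[str], str],
    key: str,
    cache: Dict[str, str],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return cached data when available; otherwise fetch with backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    # Caching: a key that was fetched before is never requested again.
    if key in cache:
        return cache[key]

    for attempt in range(1, max_attempts + 1):
        try:
            result = fetch(key)
        except RateLimitedError:
            # Give up after the last attempt and surface the error.
            if attempt == max_attempts:
                raise
            # Assumed schedule: wait 1s, 2s, 4s, ... between attempts.
            sleep(base_delay * 2 ** (attempt - 1))
            continue
        cache[key] = result
        return result
```

Passing `sleep` as a parameter keeps the sketch easy to test without real delays; a persistent cache (for example, a table in the database) would replace the in-memory dictionary in practice.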
Before you begin, ensure you have met the following requirements:
- Docker and Docker Compose installed on your machine.
- Python 3.9 or higher.
- PostgreSQL database.
Clone the Repository
git clone --recurse-submodules https://github.com/daily-coding-problem/leetcode-scraper.git
cd leetcode-scraper
Setup Python Environment
Use the following commands to set up the Python environment if you do not want to use Docker:
python -m venv .venv
source .venv/bin/activate
pip install poetry
poetry install --no-root
Setup Docker
If you would like to use Docker, ensure Docker and Docker Compose are installed on your machine. If not, follow the installation guides for Docker and Docker Compose.
Build Docker Images
docker compose build
Create the Network
docker network create dcp
Environment Variables
Create a .env file in the project root with the following content:
# LeetCode credentials
CSRF_TOKEN=your_csrf_token
LEETCODE_SESSION=your_leetcode_session
# PostgreSQL credentials
POSTGRES_USER=your_db_user
POSTGRES_PASSWORD=your_db_password
POSTGRES_DB=your_db_name
POSTGRES_PORT=5432
Run the scraper with the specified plans:
docker compose run leetcode-scraper --plans leetcode-75 top-interview-150
Or without Docker:
poetry run python main.py --plans leetcode-75 top-interview-150
Run the scraper with the specified company and timeframe:
docker compose run leetcode-scraper --company google --timeframe 3m
Or without Docker:
poetry run python main.py --company google --timeframe 3m
This will fetch the most asked questions at Google in the last 3 months.
The options for --timeframe are: 30d, 3m, or 6m.
- If no timeframe is specified, the default is 6m.
- If the timeframe is invalid, the default will be used.
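The option handling described above can be sketched with `argparse`. This is only an illustration under stated assumptions: the flag names and the 6m fallback come from this README, while `parse_args`, `normalize_timeframe`, and the two constants are hypothetical names, and the real `main.py` may be structured differently.

```python
import argparse
from typing import List, Optional

# Values documented in this README.
VALID_TIMEFRAMES = ("30d", "3m", "6m")
DEFAULT_TIMEFRAME = "6m"


def normalize_timeframe(value: Optional[str]) -> str:
    """Return a documented timeframe, falling back to the default."""
    if value in VALID_TIMEFRAMES:
        return value
    # Missing or invalid values both resolve to the default.
    return DEFAULT_TIMEFRAME


def parse_args(argv: List[str]) -> argparse.Namespace:
    """Parse the flags shown in the usage examples (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--plans", nargs="+", default=[])
    parser.add_argument("--company")
    parser.add_argument("--timeframe")
    args = parser.parse_args(argv)
    args.timeframe = normalize_timeframe(args.timeframe)
    return args
```

Normalizing after parsing, rather than using `choices=`, matches the documented behavior of silently falling back to the default instead of exiting with an error on an invalid value.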
Run the tests with the following command:
poetry run pytest
This project is licensed under the MIT License - see the LICENSE file for details.