Automates the TaskRabbit booking flow to extract taskers for multiple categories (e.g., Furniture Assembly, Plumbing, Electrical Help) and saves the results as CSV files in the Taskers/ folder. Supports multi-page scraping and both CLI and interactive modes.
- Multi-category support via CATEGORIES in taskrabbit_parser.py
- Automated navigation through category-specific booking flows
- Pagination handling to capture taskers across multiple pages
- Rich extraction: name, hourly rate, ratings, review counts, and more
- CSV output per category in Taskers/ with timestamped filenames
- Headless or visible Chrome operation
- Interactive mode when no CLI args are provided
- Python 3.8+ (the pinned Selenium 4.15.2 release does not support Python 3.7)
- Google Chrome installed
- Selenium 4.15+ (driver managed by Selenium Manager; no manual ChromeDriver setup needed)
Install dependencies:
pip install -r requirements.txt
requirements.txt:
selenium==4.15.2
webdriver-manager==4.0.1
Note: The script uses Selenium 4’s Selenium Manager to resolve ChromeDriver automatically. No explicit use of webdriver-manager is required, but it remains listed for compatibility.
From the project root (TaskRabbitScanner/):
# Interactive mode (prompts for a category or All) if no args are provided
python taskrabbit_parser.py
# Run a specific category by key
python taskrabbit_parser.py furniture_assembly
python taskrabbit_parser.py plumbing
python taskrabbit_parser.py electrical
# Run all configured categories
python taskrabbit_parser.py all
Programmatic helpers in taskrabbit_parser.py:
from taskrabbit_parser import TaskRabbitParser, run_parser_for_category, run_all_categories
# Single category
parser = TaskRabbitParser(category='plumbing', headless=False, max_pages=5)
parser.run()
# Helper functions
csv_file = run_parser_for_category('furniture_assembly', headless=True, max_pages=None)
results = run_all_categories(headless=False, max_pages=3)
Categories are defined in taskrabbit_parser.py under CATEGORIES. Example keys currently included:
- furniture_assembly → Furniture Assembly
- plumbing → Plumbing
- electrical → Electrical Help
- door_repair → Door, Cabinet & Furniture Repair
- sealing_caulking → Sealing and caulking
- appliance_installation → Appliance Installation
- flooring_tiling → Flooring & Tiling Help
- wall_repair → Wall Repair
- window_blinds_repair → Window & Blinds Repair
- smart_home → Smart Home Installation
- interior_painting → Interior Painting
Each category has its own URL and option flow in CATEGORIES[<key>]['options']. The parser navigates accordingly (e.g., furniture type, size, task details, vehicle requirements).
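As a rough illustration of that layout, an entry in CATEGORIES might look like the sketch below. Only the CATEGORIES name, the category keys, the display names, and the 'options' key come from this README. The 'name' and 'url' field names, the placeholder URLs, and the option values are assumptions, so check taskrabbit_parser.py for the real structure.

```python
# Hypothetical shape of the CATEGORIES mapping. Field names other than
# 'options', and every value below, are illustrative assumptions.
CATEGORIES = {
    "furniture_assembly": {
        "name": "Furniture Assembly",
        "url": "https://example.com/furniture-assembly",  # placeholder URL
        "options": ["furniture type", "size", "task details"],
    },
    "plumbing": {
        "name": "Plumbing",
        "url": "https://example.com/plumbing",  # placeholder URL
        "options": ["task details", "vehicle requirements"],
    },
}


def describe_flow(key):
    """Return a readable summary of the option flow for one category key."""
    category = CATEGORIES[key]
    steps = " -> ".join(category["options"])
    return f"{category['name']}: {steps}"


print(describe_flow("furniture_assembly"))
```

Keeping the flow as data means a new category can be added by editing the mapping rather than the navigation code.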
Configuration lives in taskrabbit_parser.py:
- Headless mode: pass headless=True to TaskRabbitParser(...)
- Page limit: set max_pages in TaskRabbitParser(...) or the MAX_PAGES_FOR_TESTING constant
- Output directory: CSVs are saved to Taskers/ automatically (created if missing)
- Timing controls: adjust the SLEEP_* constants for waits and page loads
Example constructor:
TaskRabbitParser(category='furniture_assembly', headless=False, max_pages=None)
CSV files are saved as Taskers/<category_name>_<YYYYMMDD_HHMMSS>.csv. Columns include:
- name
- hourly_rate
- review_rating
- review_count
- furniture_tasks
- overall_tasks
- two_hour_minimum
- elite_status
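The naming scheme and column layout above can be sketched with the standard library alone. This is a minimal illustration, not the script's actual code: build_csv_path and write_taskers are hypothetical helper names, and filling missing columns with an empty string is an assumption.

```python
import csv
from datetime import datetime
from pathlib import Path

# Column names as listed in this README.
COLUMNS = [
    "name", "hourly_rate", "review_rating", "review_count",
    "furniture_tasks", "overall_tasks", "two_hour_minimum", "elite_status",
]


def build_csv_path(category_name, now=None, out_dir="Taskers"):
    """Build Taskers/<category_name>_<YYYYMMDD_HHMMSS>.csv (hypothetical helper)."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return Path(out_dir) / f"{category_name}_{stamp}.csv"


def write_taskers(rows, path):
    """Write tasker dicts to CSV, creating the output folder if it is missing."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="", encoding="utf-8") as handle:
        # restval="" leaves a blank cell for any column a row does not provide.
        writer = csv.DictWriter(handle, fieldnames=COLUMNS, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return path
```

A call such as write_taskers(rows, build_csv_path("plumbing")) would then produce one timestamped file per run, so earlier results are never overwritten.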
- Navigates directly to each category page
- Closes overlays/popups defensively
- Enters address 6619 10th Ave, brooklyn, 11219, NY
- Selects category-specific options from CATEGORIES
- Extracts tasker cards, paginates, and writes CSV
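The extract-and-paginate step can be pictured as a simple loop. The sketch below leaves the browser out entirely: collect_taskers and fetch_page are hypothetical names, and it assumes that max_pages=None means no page limit and that an empty page marks the end of the results.

```python
def collect_taskers(fetch_page, max_pages=None):
    """Collect tasker records page by page.

    fetch_page(page_number) is a hypothetical callable that returns the list
    of tasker dicts on that page, or an empty list once the pages run out.
    max_pages=None is assumed to mean "no limit".
    """
    taskers = []
    page = 1
    while max_pages is None or page <= max_pages:
        cards = fetch_page(page)
        if not cards:
            break  # no more results
        taskers.extend(cards)
        page += 1
    return taskers


# Stand-in for a real browser session: three fake pages of results.
FAKE_PAGES = {
    1: [{"name": "Tasker A"}, {"name": "Tasker B"}],
    2: [{"name": "Tasker C"}],
    3: [{"name": "Tasker D"}],
}


def fake_fetch(page):
    """Return the fake results for a page, or an empty list past the end."""
    return FAKE_PAGES.get(page, [])


print(len(collect_taskers(fake_fetch)))               # all pages
print(len(collect_taskers(fake_fetch, max_pages=2)))  # first two pages only
```

Passing the page fetcher in as a callable keeps the loop testable without launching Chrome.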
- Built with selenium and Chrome; keep Chrome up to date
- Waits and overlay handling are tuned for dynamic content
- When run without arguments, an interactive selector is shown
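One way the choice between CLI and interactive mode could be made is sketched below. It is an assumption about the approach, not the script's real code: resolve_selection, KNOWN_KEYS, and the prompt wording are hypothetical, and only three of the category keys are listed to keep it short.

```python
# Subset of the category keys listed in this README, for illustration only.
KNOWN_KEYS = ["furniture_assembly", "plumbing", "electrical"]


def resolve_selection(argv, known_keys=KNOWN_KEYS, ask=input):
    """Turn CLI arguments (or an interactive answer) into a list of keys."""
    if len(argv) > 1:
        choice = argv[1]
    else:
        # No arguments: fall back to a prompt (hypothetical wording).
        choice = ask(f"Category ({', '.join(known_keys)}) or all: ")
    choice = choice.strip().lower()
    if choice == "all":
        return list(known_keys)
    if choice in known_keys:
        return [choice]
    raise ValueError(f"Unknown category: {choice!r}")


# Equivalent to: python taskrabbit_parser.py plumbing
print(resolve_selection(["taskrabbit_parser.py", "plumbing"]))
```

The ask parameter defaults to the built-in input, so the prompt can be replaced with a stub when testing.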
TaskRabbitScanner/
├── Taskers/ # CSV outputs (created automatically)
├── taskrabbit_parser.py # Main script (multi-category)
├── README.md # This file
├── README_MultiCategory.md # Legacy write-up about multi-category
└── requirements.txt # Python dependencies