A Python-based web scraper built with Selenium to collect public Twitter data for academic and research purposes.
- Collect tweets by keyword, date range, or user.
- Extract text content, likes, retweets, timestamps, and author info.
- Store results in structured formats: SQLite and JSON.
- Designed for non-commercial, ethical research only.
- This tool does not collect private or sensitive user data.
- Respects `robots.txt` and avoids aggressive scraping.
- Uses delays between requests to prevent rate-limiting.
- Not intended for commercial use or mass data harvesting.
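
The `robots.txt` check and the delay between requests could look roughly like the sketch below. It is not taken from this project's code: the function names `is_allowed` and `polite_delay`, the `research-bot` user agent, and the 2 to 5 second delay window are all assumptions chosen for illustration. It relies only on the standard library (`urllib.robotparser`, `random`, `time`).

```python
import random
import time
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def polite_delay(min_seconds: float = 2.0, max_seconds: float = 5.0,
                 sleep=time.sleep) -> float:
    """Pause for a random interval (hypothetical default: 2 to 5 seconds).

    The sleep function is injectable so the helper can be tested without waiting.
    """
    delay = random.uniform(min_seconds, max_seconds)
    sleep(delay)
    return delay


if __name__ == "__main__":
    # Hypothetical rules and URLs, used only to demonstrate the calls.
    rules = "User-agent: *\nDisallow: /private/\n"
    for url in ("https://example.com/public/page",
                "https://example.com/private/page"):
        if is_allowed(rules, "research-bot", url):
            print("fetching", url)
            polite_delay(0.1, 0.2)  # short pause for the demo
        else:
            print("skipping", url)
```

In a Selenium-based scraper, a call like `polite_delay()` would typically sit between page loads, and the rules text would be fetched once from the target site rather than hard-coded.
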
- Python
- Selenium
- BeautifulSoup (optional)
- SQLite / JSON
- Pandas (for analysis)
{
  "tweet_id": "123456789",
  "text": "Great day at the university!",
  "likes": 12,
  "retweets": 3,
  "timestamp": "2024-07-15T14:30:00Z",
  "author": "@user123"
}
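
A record shaped like the one above could be written to SQLite and JSON along these lines. This is a minimal sketch, not the project's actual storage code: the `tweets` table schema, the `save_tweets` and `to_json` helpers, and the choice to skip duplicate `tweet_id` values are assumptions. Only the field names come from the sample record.

```python
import json
import sqlite3

# Assumed schema: one column per field in the sample record.
SCHEMA = """
CREATE TABLE IF NOT EXISTS tweets (
    tweet_id  TEXT PRIMARY KEY,
    text      TEXT NOT NULL,
    likes     INTEGER,
    retweets  INTEGER,
    timestamp TEXT,
    author    TEXT
)
"""

INSERT = """
INSERT OR IGNORE INTO tweets (tweet_id, text, likes, retweets, timestamp, author)
VALUES (:tweet_id, :text, :likes, :retweets, :timestamp, :author)
"""


def save_tweets(conn: sqlite3.Connection, tweets: list) -> int:
    """Insert tweet dicts, skipping duplicate tweet_ids; return the row count."""
    conn.execute(SCHEMA)
    conn.executemany(INSERT, tweets)  # named placeholders map to dict keys
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0]


def to_json(tweets: list) -> str:
    """Serialize tweet dicts to a JSON array, keeping non-ASCII text readable."""
    return json.dumps(tweets, indent=2, ensure_ascii=False)


sample = {
    "tweet_id": "123456789",
    "text": "Great day at the university!",
    "likes": 12,
    "retweets": 3,
    "timestamp": "2024-07-15T14:30:00Z",
    "author": "@user123",
}

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # use a file path to persist results
    print(save_tweets(conn, [sample]), "row(s) stored")
    print(to_json([sample]))
```

Using `tweet_id` as the primary key with `INSERT OR IGNORE` means re-running a collection over the same date range does not create duplicate rows.
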