Creators: Bar I. | Omer C. | Sahar G.
Main project purpose:
Create an easy-to-read db of restaurants in chosen cities.
The db will contain 5 tables: cities, restaurants, cuisines, reviews, awards.
See the ERD below for the table contents.
Using tripadvisor_scraper.py, you pass a list of cities and the number of pages to scrape per city (at most 30 restaurants per page), and the script inserts the scraped data into the db tables.
The arguments of tripadvisor_scraper.py are as follows:
- cities - names of the desired cities: `-c "city_1" "city_2"`
- pages - number of restaurant pages to scrape per city: `-p #num`
- API - optional - perform scraping using the Travel Advisor API (RapidAPI): `--API`
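
As a rough illustration of the arguments listed above, here is a minimal `argparse` sketch. It is not taken from tripadvisor_scraper.py: the short flags `-c`, `-p` and `--API` come from this README, while the long names `--cities` and `--pages`, the default of one page, and the helper `max_restaurants` are assumptions made for the example.

```python
import argparse

RESTAURANTS_PER_PAGE = 30  # upper bound per page, as stated in this README


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented flags (hypothetical layout)."""
    parser = argparse.ArgumentParser(description="Scrape restaurants per city")
    # One or more city names, e.g. -c "city_1" "city_2"
    parser.add_argument("-c", "--cities", nargs="+", required=True,
                        help="names of the desired cities")
    # Number of result pages per city; the default of 1 is an assumption
    parser.add_argument("-p", "--pages", type=int, default=1,
                        help="number of restaurant pages to scrape per city")
    # Optional switch: use the Travel Advisor API instead of web scraping
    parser.add_argument("--API", action="store_true",
                        help="scrape via the Travel Advisor API (RapidAPI)")
    return parser


def max_restaurants(num_cities: int, pages: int) -> int:
    """Upper bound on restaurants collected in one run."""
    return num_cities * pages * RESTAURANTS_PER_PAGE


if __name__ == "__main__":
    args = build_parser().parse_args(["-c", "city_1", "city_2", "-p", "3"])
    print(args.cities, args.pages, args.API)
    print(max_restaurants(len(args.cities), args.pages))
```

With two cities and three pages per city, a run collects at most 2 x 3 x 30 = 180 restaurants.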
####Initial Configuration:
- Make sure you have the Google Chrome browser installed (relevant for web scraping, not for the API)
- Install the requirements: `pip install -r requirements.txt`
- Edit `USERNAME` and `PASSWORD` in db_config.py with your local MySQL configuration
- Edit `HEADERS` in config.py for the Travel Advisor API based on your personal account: https://rapidapi.com/apidojo/api/travel-advisor/

Run tripadvisor_scraper.py -c "city_1" "city_2" etc. -p #num
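
The two configuration edits above might look roughly like the sketch below. Only the names `USERNAME`, `PASSWORD` and `HEADERS` come from this README; the placeholder values, the helper `build_headers`, and the exact header contents are assumptions. The header names follow the usual RapidAPI convention, and the host value should be checked against the code snippet shown on your RapidAPI account page.

```python
# --- db_config.py (sketch) ---
# Replace the placeholders with your local MySQL credentials.
USERNAME = "your_mysql_user"
PASSWORD = "your_mysql_password"


# --- config.py (sketch) ---
def build_headers(api_key: str) -> dict:
    """Hypothetical helper: assemble the RapidAPI request headers."""
    return {
        "X-RapidAPI-Key": api_key,  # personal key from your RapidAPI account
        "X-RapidAPI-Host": "travel-advisor.p.rapidapi.com",  # assumed host
    }


HEADERS = build_headers("your_rapidapi_key")
```

Keeping the real key out of version control (for example, reading it from an environment variable) avoids leaking personal credentials.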
####Data which can be retrieved only via API:
- `cities` table - `num_restaurants`, `timezone`, `num_reviews`, `latitude`, `longitude`
- `restaurants` table - `latitude`, `longitude`
- `awards` table
- `reviews` table - the API is limited to 3 reviews per restaurant; the web scraper is limited to 10.
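
One way to picture the difference between the two modes is the sketch below, which fills the API-only columns with `None` for rows that came from web scraping and caps the review list per source. The column names and the limits are taken from the list above; the helper names, the `"api"`/`"web"` labels, and the `name` column in the example row are hypothetical and may not match the project's code.

```python
from typing import Any, Dict, List

# Columns that, per this README, are only available through the API
API_ONLY_COLUMNS: Dict[str, List[str]] = {
    "cities": ["num_restaurants", "timezone", "num_reviews",
               "latitude", "longitude"],
    "restaurants": ["latitude", "longitude"],
}

# Review limits per restaurant, per this README
MAX_REVIEWS = {"api": 3, "web": 10}


def fill_missing_columns(table: str, row: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of row with absent API-only columns set to None."""
    filled = dict(row)
    for column in API_ONLY_COLUMNS.get(table, []):
        filled.setdefault(column, None)
    return filled


def cap_reviews(reviews: List[Any], source: str) -> List[Any]:
    """Keep at most the number of reviews the given source provides."""
    return reviews[:MAX_REVIEWS[source]]


if __name__ == "__main__":
    web_row = {"name": "city_1"}  # "name" is an assumed column
    print(fill_missing_columns("cities", web_row))
    print(len(cap_reviews(list(range(20)), "web")))
```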