A spider for Instacart!
The spider captures the products of the first store in the account.
To solve the reCAPTCHA, the spider uses the 2Captcha service, so you need to set its API key as an environment variable.
First of all, create a .env file in the root of the project and set all the environment variables.
See the .env-example file:
1 - Auth credentials for the Instacart site
AUTH_USER=
AUTH_PASSWORD=
2 - The 2Captcha API key
2CAPTCHA_API_KEY=
2CAPTCHA_URL=https://2captcha.com/in.php
3 - Save products to the DB (Elasticsearch)
SAVE_DB_ITEM=True
Create a virtualenv and install dependencies:
make setup
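As a rough illustration of how the variables above might be consumed, here is a minimal sketch in plain Python. It is not the project's actual code: the helper names (`parse_env`, `load_settings`, `build_captcha_request`) are hypothetical, treating `SAVE_DB_ITEM` as optional is an assumption, and the form fields follow 2Captcha's documented parameters for reCAPTCHA v2 tasks, which may differ from what the spider really sends.

```python
from pathlib import Path

# Variables the README asks you to set; treating these four as mandatory is an assumption.
REQUIRED_KEYS = ("AUTH_USER", "AUTH_PASSWORD", "2CAPTCHA_API_KEY", "2CAPTCHA_URL")


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines, # comments and lines without '='."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def load_settings(path=".env"):
    """Read the .env file and fail early if a required variable is missing or empty."""
    values = parse_env(Path(path).read_text(encoding="utf-8"))
    missing = [key for key in REQUIRED_KEYS if not values.get(key)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # Assumption: any value other than the literal "True" disables saving to the DB.
    values["SAVE_DB_ITEM"] = values.get("SAVE_DB_ITEM", "False") == "True"
    return values


def build_captcha_request(settings, site_key, page_url):
    """Build the form fields for submitting a reCAPTCHA v2 task to 2CAPTCHA_URL."""
    return {
        "key": settings["2CAPTCHA_API_KEY"],
        "method": "userrecaptcha",
        "googlekey": site_key,  # the reCAPTCHA site key found on the login page
        "pageurl": page_url,    # the URL of the page showing the reCAPTCHA
        "json": 1,              # ask 2Captcha to answer in JSON
    }
```

Calling `load_settings()` from the project root would then raise a clear error if, for example, `2CAPTCHA_API_KEY` was left empty, instead of failing later in the middle of a crawl.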
To run, use:
make run
To run using Docker, you can use:
make run-docker
Your server will be running at: http://0.0.0.0:8080
Just access: http://0.0.0.0:8080/instacart
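If you prefer to call the endpoint from a script instead of a browser, a minimal sketch using only the standard library could look like the following. The function names and the timeout value are hypothetical, and the response format is not documented here, so the body is returned as plain text.

```python
import urllib.request

BASE_URL = "http://0.0.0.0:8080"  # address given above


def build_url(base_url=BASE_URL, path="/instacart"):
    """Join the server address and the spider endpoint path."""
    return base_url.rstrip("/") + path


def fetch_instacart(url=None, timeout=300):
    """Request the endpoint and return (HTTP status, response body as text)."""
    # Assumption: scraping may take a while, hence the generous timeout.
    with urllib.request.urlopen(url or build_url(), timeout=timeout) as response:
        return response.status, response.read().decode("utf-8", errors="replace")
```

With the server running, `fetch_instacart()` returns the status code and whatever the endpoint sends back.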
If you set SAVE_DB_ITEM=True and ran make run-docker, you can see all the products in Kibana here: http://localhost:5601/app/discover#/
To do:
1 - Create a dashboard that shows the scraping progress in real time
2 - Unit tests
3 - Handle all exceptions