heloysapelizon/CPA_t1

Web Scraping with Python

This repository contains Project 1 for the course Collection, Preparation, and Data Analysis at the Pontifical Catholic University of Rio Grande do Sul.

It consists of two web scraping activities: one in a desktop application (the 'paises.ipynb' notebook) and one in a real environment (the 'imdb.ipynb' notebook).

Instructions

Both parts of this project use Beautiful Soup to parse the HTML files.
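As a rough illustration of what parsing HTML with Beautiful Soup looks like, here is a minimal sketch. The markup, class names, and values are invented for the example and are not taken from either notebook; it assumes the `beautifulsoup4` package is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the real pages scraped by the notebooks will differ.
html = """
<table>
  <tr><td class="name">Brazil</td><td class="capital">Brasília</td></tr>
  <tr><td class="name">Portugal</td><td class="capital">Lisbon</td></tr>
</table>
"""

# "html.parser" is Python's built-in parser, so no extra parser package is needed.
soup = BeautifulSoup(html, "html.parser")

# Collect one (name, capital) tuple per table row.
rows = [
    (
        row.find("td", class_="name").get_text(strip=True),
        row.find("td", class_="capital").get_text(strip=True),
    )
    for row in soup.find_all("tr")
]
print(rows)
```

The same pattern (find a repeating container, then pull fields out of each one) usually carries over to larger pages.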

  • For the IMDb notebook:

This notebook runs a web scraping routine on the IMDb movie review website. It relies on the Selenium package for Python, which must be installed in your kernel's environment. On some operating systems the WebDriver cannot run Chrome, so you may need to change it to Firefox.
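One possible way to handle the Chrome/Firefox difference is to try each browser in turn. The sketch below is not the notebook's code; `open_browser` and its `factories` parameter are hypothetical names introduced for illustration, and it assumes Selenium and at least one of the two browsers are installed.

```python
def open_browser(factories=None):
    """Return the first browser driver that starts, trying Chrome and then Firefox.

    `factories` is a hypothetical hook: a list of zero-argument callables that
    each return a driver. It defaults to Selenium's Chrome and Firefox drivers.
    """
    if factories is None:
        # Imported here so the function can be defined even without Selenium installed.
        from selenium import webdriver
        factories = [webdriver.Chrome, webdriver.Firefox]

    errors = []
    for factory in factories:
        try:
            return factory()
        except Exception as exc:  # e.g. Selenium's WebDriverException when a browser cannot start
            errors.append(f"{getattr(factory, '__name__', factory)}: {exc}")

    raise RuntimeError("No browser could be started: " + "; ".join(errors))
```

In a notebook cell this could be used as `driver = open_browser()`, followed by `driver.quit()` once scraping is finished. If you already know which browser works on your system, calling `webdriver.Firefox()` directly is simpler.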

Part of the code locates page elements by their text, so the website must be displayed in English.
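To see why the page language matters, consider a lookup by visible text. This is only a sketch: `find_section`, the label "User reviews", and both HTML snippets are made up for the example and do not reflect IMDb's actual markup or the notebook's code.

```python
from bs4 import BeautifulSoup


def find_section(html, label):
    """Return the tag whose text is exactly `label`, or None if it is absent."""
    soup = BeautifulSoup(html, "html.parser")
    # Match a text node by its stripped content, then step up to the enclosing tag.
    text_node = soup.find(string=lambda s: s is not None and s.strip() == label)
    return text_node.parent if text_node is not None else None


# Hypothetical snippets of the same page served in two languages.
english = "<section><h3>User reviews</h3><p>Great film.</p></section>"
portuguese = "<section><h3>Avaliações de usuários</h3><p>Ótimo filme.</p></section>"

print(find_section(english, "User reviews"))
print(find_section(portuguese, "User reviews"))
```

The English page yields the heading tag, while the Portuguese page yields `None` because the searched string never appears. Selecting elements by tag, class, or attribute instead of by text would avoid this dependency, at the cost of relying on the site's markup staying stable.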
