files related to gathering Pitchfork music review data.

rgreasons/pytch

Pytch - tools for Pitchfork web scraping.

This repo contains tools and files related to my work scraping Pitchfork music reviews in Python.

The September 2018 update of this repository is largely a refactor of my previous work, updated to follow the suggestions of David Eads from his blog post for the NPR Tech Blog.

Below are the steps to replicate the scraping yourself. I recommend working in a Python virtual environment: these scripts use common packages (csvkit, requests), and installing them into your system-wide Python could cause version conflicts with other projects.

  1. Run python reviewlist.py, which creates a CSV file named reviewlist.csv in the current folder.
  2. Run csvcut -c 1 reviewlist.csv | parallel ./PitchforkReview.py {} > pitchforkreviews.csv, which creates a CSV file named pitchforkreviews.csv in the same folder.
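
If csvcut or parallel are not available, the pipeline in step 2 can be approximated in plain Python. The sketch below is not part of this repo: the helper names (first_column, run_for_each, scrape_all) are hypothetical, and it assumes that reviewlist.csv has a header row and that PitchforkReview.py takes one value from the first column as its only argument and prints CSV rows to standard output.

```python
import csv
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def first_column(csv_path, skip_header=True):
    """Return the first-column values of a CSV file (roughly `csvcut -c 1`)."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        rows = [row for row in csv.reader(handle) if row]
    if skip_header:
        # Assumption: the first row is a header, not data.
        rows = rows[1:]
    return [row[0] for row in rows]


def run_for_each(values, command, max_workers=4):
    """Run `command + [value]` once per value; return each stdout in input order."""
    def run_one(value):
        result = subprocess.run(
            command + [value], capture_output=True, text=True, check=True
        )
        return result.stdout

    # Threads are enough here: each worker just waits on a child process.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, values))


def scrape_all(list_path="reviewlist.csv", out_path="pitchforkreviews.csv"):
    """Hypothetical stand-in for step 2: one PitchforkReview.py run per row."""
    values = first_column(list_path)
    outputs = run_for_each(values, [sys.executable, "PitchforkReview.py"])
    with open(out_path, "w", encoding="utf-8") as out:
        out.writelines(outputs)
```

Calling scrape_all() from the repo folder after step 1 would then play the role of step 2. Unlike parallel, this version collects all output in memory before writing it, which keeps the rows in input order.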
