The project aims to understand signs of suicide risk through social media analysis, crawling data from Twitter and Reddit through Python wrappers for their respective APIs (Tweepy and PRAW).
This is the crawling code base for a social analytics project that was part of the IS434: Social Analytics and Applications course at Singapore Management University (fall 2019).
NOTE: This application is not made or designed for commercial use. The scripts need modification, as well as input data, before they can be executed; a brief description of how they are used follows.
This application is used by executing different scripts in sequence. First, modify reddit_main.py or twitter_main.py with your own API credentials.
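If you would rather not paste secrets directly into the scripts, one option is to read them from environment variables. The snippet below is a minimal sketch of that idea, not how reddit_main.py or twitter_main.py currently work; the `load_credentials` helper and the `REDDIT_*` variable names are hypothetical and chosen only for illustration.

```python
import os

# Hypothetical environment variable names; the project's scripts do not define these.
REDDIT_ENV_VARS = {
    "client_id": "REDDIT_CLIENT_ID",
    "client_secret": "REDDIT_CLIENT_SECRET",
    "user_agent": "REDDIT_USER_AGENT",
}


def load_credentials(env_vars, environ=None):
    """Return a dict of credentials read from the environment.

    Raises KeyError naming every missing variable, so a misconfigured
    shell fails before any API request is made.
    """
    # Allow a plain dict to be passed in for testing; default to the real environment.
    environ = os.environ if environ is None else environ
    missing = [name for name in env_vars.values() if not environ.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {key: environ[name] for key, name in env_vars.items()}
```

The keys of the returned dict match keyword arguments that `praw.Reddit` accepts (`client_id`, `client_secret`, `user_agent`), so the result could be passed as `praw.Reddit(**load_credentials(REDDIT_ENV_VARS))`. A similar mapping could be written for the Twitter keys.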
When this is done, you need to crawl your data from either Reddit or Twitter:
# Reddit
$ python3 reddit_main.py <insert subreddit> # e.g. 'askreddit'
# Twitter
# Twitter script will need a DataFrame of desired keywords to look through
$ python3 twitter_main.py
Second, once the data has been crawled and saved to your CSV files, you will want to analyse it, scoring each submission or tweet to estimate how prone its author appears to be to suicide. This is done by applying our analysis scripts to the recently crawled data:
# Reddit
$ python3 reddit_a.py <name of one of the subreddits crawled> # e.g. 'depression'
# Twitter
# This script has a 'keyword' variable with hardcoded values that it
# iterates over. Change these to whatever you want before use.
$ python3 twitter_a.py
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Please make sure to update tests as appropriate.
TBD