This repository contains scripts to download spam emails from the http://untroubled.org/spam/ archive and convert them into a CSV suitable for training ML models.
- Install Ruby
- Run `bundle install`
- Check if you have `7z` installed. Otherwise install it
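
If you are unsure whether `7z` is available, a quick check like the following may help. This is only a sketch and not part of the repository: the `find_executable` helper is a hypothetical name, and it assumes the binary is called `7z` and is expected on your `PATH`.

```ruby
# Hypothetical helper (not part of this repository): return the full path
# of an executable found on PATH, or nil if it is not there.
def find_executable(name)
  # PATHEXT is only set on Windows (e.g. ".EXE;.BAT"); elsewhere use no extension.
  exts = ENV.fetch("PATHEXT", "").split(";")
  exts = [""] if exts.empty?

  ENV.fetch("PATH", "").split(File::PATH_SEPARATOR).each do |dir|
    exts.each do |ext|
      candidate = File.join(dir, "#{name}#{ext}")
      return candidate if File.executable?(candidate) && !File.directory?(candidate)
    end
  end
  nil
end

# Assumption: the archive tool is installed under the name "7z".
path = find_executable("7z")
puts path ? "7z found at #{path}" : "7z not found on PATH; install it before running the scripts"
```

On some systems the tool is packaged under a different name, in which case the name passed to the helper needs adjusting.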
There are two actions:
- `ruby actions/fetch_data.rb` to download the latest data. Data is cached in the `data/` folder
- `ruby actions/export_jsonl.rb` to export the downloaded data as JSONL files in the `output/` folder
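
To sanity-check an export, something like the sketch below could be used to count the records in each file. It is not part of the repository and rests on assumptions: that the exported files use the `.jsonl` extension and that each non-empty line holds one JSON document. The `read_jsonl` helper is a hypothetical name, and no particular field names are assumed.

```ruby
require "json"

# Hypothetical helper (not part of this repository): parse a JSONL file,
# treating every non-empty line as one JSON document.
def read_jsonl(path)
  File.foreach(path)
      .map(&:strip)
      .reject(&:empty?)
      .map { |line| JSON.parse(line) }
end

# Assumption: exported files live in output/ and end in .jsonl.
Dir.glob(File.join("output", "*.jsonl")).sort.each do |path|
  records = read_jsonl(path)
  puts "#{path}: #{records.size} records"
end
```

This loads each file fully into memory, which is fine for a quick check; for very large exports, iterating line by line with `File.foreach` and a block would be the lighter option.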