4tv-tumblrbot was a collaborative project I embarked on with my close friend Dima, who goes by @smoqueen on Tumblr. The aim was straightforward yet silly: to build a Tumblr bot powered by a machine-learning model. The bot is trained on the posts of a particular Tumblr blog, or a selected set of blogs, so that it mimics the style, tone, and themes of the original posts.
Download and Install TumblThree:
- Visit the official TumblThree GitHub page and download the latest version of the application.
- Extract the downloaded ZIP file and run the TumblThree.exe file to launch the application.
Add Tumblr Blogs:
- Copy the URLs of the Tumblr blogs you want to download.
- In TumblThree, paste each blog URL into the field marked Enter URL, then press Enter.
- The blogs will be added to the list of blogs in the main interface.
Configure Download Settings:
- Click on a blog in the list to view its settings on the right panel.
- Choose which content types you want to download, including text posts, answers, and more.
Start Downloading:
- Click the Download button (represented by a download icon) to begin downloading the selected blogs.
- The application will download all available posts from the blogs based on your configuration.
- Organize your text post files by moving them to the ./data directory within the project. Ensure that all your post files are stored here for easy access by the software.
- Rename each text post file following the format blogname_typeofpost.txt. For example, if the blog name is "myblog" and the post type is "texts", rename the file to myblog_texts.txt.
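The renaming step above can also be scripted. The following is a minimal sketch, not part of the project itself: the function name `stage_post_file` and its parameters are hypothetical, and it assumes you already know where TumblThree saved each exported text file.

```python
import shutil
from pathlib import Path


def stage_post_file(source: Path, blog_name: str, post_type: str,
                    data_dir: Path = Path("data")) -> Path:
    """Copy one exported post file into data_dir as blogname_typeofpost.txt.

    Hypothetical helper: the project does not ship this function.
    """
    # Create the data directory if it does not exist yet.
    data_dir.mkdir(parents=True, exist_ok=True)
    # Build the target name in the blogname_typeofpost.txt format.
    target = data_dir / f"{blog_name}_{post_type}.txt"
    # Copy (rather than move) so the original export is left untouched.
    shutil.copy2(source, target)
    return target
```

For example, `stage_post_file(Path("path/to/export/texts.txt"), "myblog", "texts")` would produce `data/myblog_texts.txt`; the source path here is a placeholder. Swap `shutil.copy2` for `shutil.move` if you would rather move the files, as the instructions describe.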
For Windows:
- Download Python 3.11 from the official Python website.
- Run the installer and select the "Add Python to PATH" checkbox during installation.
- Complete the installation process by following the prompts.
For Linux:
- Open the terminal.
- Run the following commands:
    sudo apt update
    sudo apt install python3.11
For Windows:
- Open Command Prompt.
- Run the following command to install pip:
    python -m ensurepip --upgrade
For Linux:
- Open the terminal.
- Run the following command:
    sudo apt install python3-pip
For Windows:
- Open Command Prompt.
- Run the following command to install virtualenv:
    pip install virtualenv
For Linux:
- Open the terminal.
- Run the following command:
    sudo pip install virtualenv
For Windows:
- Navigate to the project directory in Command Prompt.
- Run the following command:
    setup.bat
For Linux:
- Open the terminal and navigate to the project directory.
- Run the following command:
    ./setup.sh
For Windows:
- Activate the virtual environment by running:
    .\venv\Scripts\activate
- Once activated, run any of the Python programs with:
    python {program}.py
For Linux:
- Activate the virtual environment by running:
    source venv/bin/activate
- Once activated, run any of the Python programs with:
    python3 {program}.py
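Before running any of the programs, it can help to confirm that the interpreter you are using is the one from the activated virtual environment. The sketch below is an illustration only: the function names are hypothetical, and treating 3.11 as a minimum version is an assumption based on the install step above, not a documented requirement of the project.

```python
import sys


def python_is_new_enough(version_info=sys.version_info, minimum=(3, 11)) -> bool:
    """Return True if the (major, minor) version is at least `minimum`.

    The 3.11 default is an assumption taken from the install instructions.
    """
    return tuple(version_info[:2]) >= tuple(minimum)


def in_virtual_environment() -> bool:
    """Return True when running inside a venv or a recent virtualenv.

    Inside an environment, sys.prefix points at the environment while
    sys.base_prefix points at the base Python installation.
    """
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("Python version OK:", python_is_new_enough())
    print("Virtual environment active:", in_virtual_environment())
```

If the second line reports `False`, the activate command most likely was not run in the current terminal session.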
Configure the Settings:
- Open the config file and fill in the required options.
- Leave the model field blank for now; it will be filled in after model training is complete.
Prepare Training Data:
- Place your correctly formatted post files into the ./data/ directory.
- Run the script create_training_data.py to generate the necessary training data.
Upload Training Data:
- Once the training files are generated, locate them in the ./output/ folder.
- Upload these files to OpenAI's fine-tuning web portal.
- Ensure you select the most recent version of the gpt-4o-mini model for the fine-tuning process.
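A quick sanity check before uploading can save a failed fine-tuning job. The sketch below assumes the generated files are JSONL in OpenAI's chat fine-tuning layout, where each line is a JSON object with a `messages` list of `role`/`content` entries; whether create_training_data.py produces exactly this layout is an assumption, and the function name `find_jsonl_problems` is hypothetical. It performs only a loose structural check, not the full validation OpenAI applies.

```python
import json
from pathlib import Path


def find_jsonl_problems(path: Path) -> list[str]:
    """Return human-readable problems found in a chat-format JSONL file.

    Loose check only: valid JSON per line, a non-empty "messages" list,
    and a "role" and "content" key on every message.
    """
    problems = []
    with path.open(encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as error:
                problems.append(f"line {number}: invalid JSON ({error.msg})")
                continue
            messages = record.get("messages") if isinstance(record, dict) else None
            if not isinstance(messages, list) or not messages:
                problems.append(f"line {number}: missing non-empty 'messages' list")
                continue
            for message in messages:
                if (not isinstance(message, dict)
                        or "role" not in message
                        or "content" not in message):
                    problems.append(f"line {number}: message lacks 'role' or 'content'")
                    break
    return problems
```

Calling `find_jsonl_problems(Path("output/training.jsonl"))` (the file name is a placeholder) returns an empty list when nothing suspicious is found.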
Update the Config File:
- After the fine-tuning process is complete, copy the model identifier from OpenAI’s web portal.
- Paste the model identifier into the model field of your config file.
Test the Model:
- Run the script 4tv_tumblrbot.py to generate posts. These posts will be saved in your Tumblr drafts.
- Review the output in your drafts. If needed, repeat the process with different blogs or adjust the training data to improve the model.
Place Your Python Script and Virtual Environment:
- Ensure 4tv_tumblrbot.py and the virtual environment (.venv/) are located in /home/user/4tv-tumblrbot-1.0/.
Create a Shell Script to Run the Python Script:
- Create a shell script named run_4tv_tumblrbot.sh in the same directory:
    #!/bin/bash
    cd /home/user/4tv-tumblrbot-1.0/
    source .venv/bin/activate
    python 4tv_tumblrbot.py
    deactivate
Make the Shell Script Executable:
- Run the following command to make the shell script executable:
    chmod +x /home/user/4tv-tumblrbot-1.0/run_4tv_tumblrbot.sh
Schedule the Script Using Cron:
- Open the cron table:
    crontab -e
- Add the following line to schedule the script to run daily at 2 AM:
    0 2 * * * /home/user/4tv-tumblrbot-1.0/run_4tv_tumblrbot.sh >> /home/user/4tv-tumblrbot-1.0/output.log 2>&1
- This will log the output to output.log in the script's directory.
Verify the Cron Job:
- List your cron jobs to confirm the entry was added correctly:
    crontab -l
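If the schedule needs adjusting, it helps to know what each position in the cron line means. The sketch below splits a standard five-field crontab entry into named parts; the function and constant names are hypothetical, and it deliberately does not handle shortcuts such as @daily or environment-variable lines.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_crontab_line(line: str) -> dict[str, str]:
    """Split a five-field crontab entry into named fields plus the command."""
    # Split on whitespace at most five times so the command keeps its spaces.
    parts = line.strip().split(None, 5)
    if len(parts) < 6:
        raise ValueError("expected five schedule fields followed by a command")
    entry = dict(zip(CRON_FIELDS, parts[:5]))
    entry["command"] = parts[5]
    return entry
```

Applied to the entry above, the `minute` field is `0` and the `hour` field is `2`, with `*` in the remaining three fields meaning "every day of the month, every month, every day of the week". Changing the hour field to `14`, for instance, would move the run to 2 PM.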
Place Your Python Script and Virtual Environment:
- Ensure 4tv_tumblrbot.py and the virtual environment (.venv/) are located in C:\Users\user\4tv-tumblrbot-1.0\.
Create a Batch File to Run the Python Script:
- Create a batch file named run_4tv_tumblrbot.bat in the same directory:
    @echo off
    cd C:\Users\user\4tv-tumblrbot-1.0\
    call .venv\Scripts\activate
    python 4tv_tumblrbot.py
    call .venv\Scripts\deactivate
Schedule the Script Using Task Scheduler:
- Open Task Scheduler by searching for it in the Start Menu.
- Click Create Basic Task and name your task (e.g., “Run 4tv Tumblrbot Daily”).
- Set the trigger to "Daily" and select the time (e.g., 2:00 AM).
- Choose "Start a program" as the action.
- In the Program/script field, enter the path to the batch file (C:\Users\user\4tv-tumblrbot-1.0\run_4tv_tumblrbot.bat).
- Click Finish to create the task.
Verify the Task:
- Check the Task Scheduler Library to confirm your task is listed.
- Optionally, run the task manually to ensure it works by right-clicking the task and selecting Run.
- Make sure Python and the virtual environment are properly set up on both systems.
- Adjust paths and timings according to your preferences and system configurations.
- Ensure the script and virtual environment have appropriate permissions and configurations to run correctly.