TradeBot

Version 1.2 - AMD RX 7800 XT ROCm support

A proof-of-concept implementation demonstrating that a small LLM can learn to predict trading actions (BUY/SELL/HOLD) from tokenized market data sequences.

🆕 What's New in v1.1

✅ Fixed: NaN handling errors in indicator calculations
✅ Fixed: Deprecated transformers API compatibility
✅ Added: Accelerate library support for modern training
✅ Added: Comprehensive .gitignore for model files
✅ Optimized: AMD RX 7800 XT support with ROCm and FP16 mixed precision
✅ Improved: Better error messages and debugging

See CHANGELOG.md for complete details.

Project Goal

Demonstrate technical feasibility of using an LLM to learn trading patterns by:

Converting historical price data into a token-based trading language
Fine-tuning a small language model (distilgpt2) on these sequences
Testing whether the model can predict trading actions better than random chance

This is a research/educational project, not investment advice.

Architecture Overview

Historical Price Data (SPY via configured market data provider)
    ↓
Token Generator (converts OHLCV → trading language tokens)
    ↓
Training Sequences (e.g., "<SYM_SPY> <TF_DAILY> ST_UpTrend Hi_Volume BUY")
    ↓
Fine-tuned distilgpt2 model
    ↓
Predictions (next token = BUY/SELL/HOLD)

Trading Language Definition

Each sequence is composed of discrete tokens representing market state:

Symbol: <SYM_SPY>, <SYM_QQQ>, etc.
Timeframe: <TF_DAILY>, <TF_WEEKLY>, <TF_60MIN>
State Trend: ST_UpTrend, ST_DownTrend, ST_Flat
Volume: Hi_Volume, Lo_Volume, Avg_Volume
Indicators: HA_UpCross, HA_DownCross, STO_Cross, STO_NoCross
Actions: BUY, SELL, HOLD

Example sequence:

<SYM_SPY> <TF_DAILY> ST_DownTrend Hi_Volume HA_UpCross STO_Cross BUY
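The token categories above can be sketched as a small vocabulary with a builder that enforces slot order. This is a minimal illustration, not the contents of utils/token_definitions.py: the names `VOCAB`, `ORDER`, `OPTIONAL`, and `build_sequence` are hypothetical, and treating the indicator slots as optional is an assumption based on the shorter example in the Architecture Overview.

```python
# Hypothetical vocabulary mirroring the token categories listed above.
VOCAB = {
    "symbol": {"<SYM_SPY>", "<SYM_QQQ>"},
    "timeframe": {"<TF_DAILY>", "<TF_WEEKLY>", "<TF_60MIN>"},
    "trend": {"ST_UpTrend", "ST_DownTrend", "ST_Flat"},
    "volume": {"Hi_Volume", "Lo_Volume", "Avg_Volume"},
    "ha": {"HA_UpCross", "HA_DownCross"},
    "sto": {"STO_Cross", "STO_NoCross"},
    "action": {"BUY", "SELL", "HOLD"},
}

# Slot order taken from the example sequence; indicator slots assumed optional.
ORDER = ["symbol", "timeframe", "trend", "volume", "ha", "sto", "action"]
OPTIONAL = {"ha", "sto"}


def build_sequence(**fields):
    """Join one token per slot, in order, rejecting unknown tokens."""
    tokens = []
    for slot in ORDER:
        token = fields.get(slot)
        if token is None:
            if slot in OPTIONAL:
                continue
            raise ValueError(f"missing required slot: {slot}")
        if token not in VOCAB[slot]:
            raise ValueError(f"{token!r} is not a valid {slot} token")
        tokens.append(token)
    return " ".join(tokens)


print(build_sequence(
    symbol="<SYM_SPY>", timeframe="<TF_DAILY>", trend="ST_DownTrend",
    volume="Hi_Volume", ha="HA_UpCross", sto="STO_Cross", action="BUY",
))
```

Validating at build time keeps malformed sequences out of the training set, where they would otherwise be learned as if they were valid grammar.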

Installation

Quick Start (AMD RX 7800 XT with ROCm)

# 1. Install PyTorch with ROCm
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.7

# 2. Install dependencies
pip install -r requirements.txt

# 3. Verify GPU (ROCm builds of PyTorch report the GPU through the torch.cuda API)
python -c "import torch; print(f'ROCm: {torch.cuda.is_available()}')"

See RX7800XT_SETUP.md for a detailed setup guide.

Docker

# Build the image
docker build -t tradebot .

# Run the Streamlit app (mount your trained model)
docker run -p 8501:8501 \
  -v /path/to/your/models:/app/models \
  tradebot

# Run with live market data (set API keys via -e)
docker run -p 8501:8501 \
  -v /path/to/your/models:/app/models \
  -e ALPHA_VANTAGE_API_KEY=your_key \
  tradebot

# Run a different command (train, data gen, test, etc.)
docker run --rm \
  -v /path/to/your/models:/app/models \
  -v /path/to/your/data:/app/data \
  tradebot \
  python src/02_train_model.py

Note: The image does not bundle model or data files; mount them as volumes, as shown above.

Usage

Step 1: Generate Training Data

python src/01_generate_training_data.py

This downloads SPY historical data and generates token sequences. Set ALPHA_VANTAGE_API_KEY or STOOQ_API_KEY to use live data; otherwise the generator falls back to the included sample data.
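To make the "generates token sequences" step concrete, here is a rough sketch of how closing prices and volumes might be mapped to trend and volume tokens. The window length, the thresholds, and the function names (`trend_token`, `volume_token`, `market_state`) are all assumptions for illustration; the actual rules in utils/data_generator.py and utils/indicators.py may differ.

```python
from statistics import mean


def _baseline(values, window):
    """Mean of the `window` values preceding the latest one."""
    if len(values) < window + 1:
        raise ValueError(f"need at least {window + 1} values, got {len(values)}")
    return mean(values[-window - 1:-1])


def trend_token(closes, window=5, flat_band=0.005):
    """Classify the latest close against the recent average (assumed rule)."""
    base = _baseline(closes, window)
    change = (closes[-1] - base) / base
    if change > flat_band:
        return "ST_UpTrend"
    if change < -flat_band:
        return "ST_DownTrend"
    return "ST_Flat"


def volume_token(volumes, window=5, band=0.2):
    """Classify the latest volume against the recent average (assumed rule)."""
    ratio = volumes[-1] / _baseline(volumes, window)
    if ratio > 1 + band:
        return "Hi_Volume"
    if ratio < 1 - band:
        return "Lo_Volume"
    return "Avg_Volume"


def market_state(symbol, timeframe, closes, volumes):
    """Assemble the state portion of a sequence (everything before the action)."""
    return (f"<SYM_{symbol}> <TF_{timeframe}> "
            f"{trend_token(closes)} {volume_token(volumes)}")


closes = [100, 100, 100, 100, 100, 103]
volumes = [1000, 1000, 1000, 1000, 1000, 1500]
print(market_state("SPY", "DAILY", closes, volumes))
```

Both classifiers use only values up to and including the latest bar, so the state tokens themselves do not look into the future.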

For the Streamlit demo, add a live data key in .env:

cp .env.example .env
# Edit .env and set ALPHA_VANTAGE_API_KEY
.venv/bin/streamlit run app.py

Step 2: Train the Model

python src/02_train_model.py

Fine-tunes distilgpt2 on the trading sequences.
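One detail that fine-tuning on a custom token language usually involves is vocabulary registration: left alone, GPT-2's byte-pair tokenizer would typically split a token such as ST_UpTrend into several subword pieces. The sketch below shows one way to register the trading tokens as whole vocabulary entries. Whether src/02_train_model.py does this is not stated here; `TRADING_TOKENS` and `register_trading_tokens` are hypothetical names, while `add_tokens` and `resize_token_embeddings` are the standard Hugging Face transformers methods.

```python
# Hypothetical token list; extend it with every symbol and timeframe in use.
TRADING_TOKENS = [
    "<SYM_SPY>", "<SYM_QQQ>",
    "<TF_DAILY>", "<TF_WEEKLY>", "<TF_60MIN>",
    "ST_UpTrend", "ST_DownTrend", "ST_Flat",
    "Hi_Volume", "Lo_Volume", "Avg_Volume",
    "HA_UpCross", "HA_DownCross", "STO_Cross", "STO_NoCross",
    "BUY", "SELL", "HOLD",
]


def register_trading_tokens(tokenizer, model, tokens=TRADING_TOKENS):
    """Add each trading token as one vocabulary entry and resize embeddings.

    Works with any tokenizer/model pair exposing the transformers methods
    `add_tokens`, `__len__`, and `resize_token_embeddings`.
    """
    added = tokenizer.add_tokens(list(tokens))
    # GPT-2 style tokenizers ship without a pad token; reuse EOS for batching.
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token
    model.resize_token_embeddings(len(tokenizer))
    return added


# Usage with transformers (downloads distilgpt2 on first run):
#   from transformers import AutoModelForCausalLM, AutoTokenizer
#   tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
#   model = AutoModelForCausalLM.from_pretrained("distilgpt2")
#   register_trading_tokens(tokenizer, model)
```

With each action mapped to a single token, "next token = BUY/SELL/HOLD" becomes a one-step prediction rather than a multi-token generation.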

Step 3: Test the Model

python src/03_test_model.py

Evaluates model accuracy on held-out test sequences.
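As a sketch of what the evaluation might compute, the snippet below measures action accuracy and compares it with a majority-class baseline. The function names are hypothetical and the labels are made-up illustrative values, not results from this project. The majority baseline matters because if one action (often HOLD) dominates the labels, always predicting it can score well above the 33% uniform-random figure.

```python
from collections import Counter


def action_accuracy(predicted, actual):
    """Fraction of positions where the predicted action matches the label."""
    if not actual or len(predicted) != len(actual):
        raise ValueError("predicted and actual must be non-empty and equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def majority_baseline(actual):
    """Accuracy of always predicting the most frequent label."""
    if not actual:
        raise ValueError("actual must be non-empty")
    return Counter(actual).most_common(1)[0][1] / len(actual)


# Illustrative labels only.
actual = ["BUY", "HOLD", "HOLD", "SELL", "HOLD"]
predicted = ["BUY", "HOLD", "SELL", "SELL", "HOLD"]
print(f"accuracy: {action_accuracy(predicted, actual):.2f}")
print(f"majority baseline: {majority_baseline(actual):.2f}")
```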

Step 4: Interactive Inference

python src/04_interactive_inference.py

Test the model with custom market state inputs.
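A language model can in principle emit any vocabulary token as the next token, so inference code often restricts the choice to the three actions. The following is a minimal sketch of that step, assuming the next-token scores are already available as a mapping from token to logit; `choose_action` is a hypothetical helper, not the repository's function.

```python
import math

ACTIONS = ("BUY", "SELL", "HOLD")


def choose_action(next_token_scores):
    """Softmax over the action tokens only, ignoring the rest of the vocabulary.

    `next_token_scores` maps token strings to raw logits (assumed input shape).
    Returns the most likely action and the renormalized probabilities.
    """
    scores = {action: next_token_scores[action] for action in ACTIONS}
    peak = max(scores.values())  # subtract the max for numerical stability
    weights = {a: math.exp(s - peak) for a, s in scores.items()}
    total = sum(weights.values())
    probs = {a: w / total for a, w in weights.items()}
    best = max(probs, key=probs.get)
    return best, probs


# Illustrative logits; a non-action token has the highest raw score.
scores = {"BUY": 2.0, "SELL": 0.0, "HOLD": 1.0, "ST_Flat": 5.0}
action, probs = choose_action(scores)
print(action, {a: round(p, 3) for a, p in probs.items()})
```

The renormalized probabilities also give a rough confidence signal, which is useful when a custom input falls far from anything in the training data.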

Project Structure

tradebot/
├── README.md
├── requirements.txt
├── data/
│   ├── raw/              # Downloaded price data
│   ├── processed/        # Token sequences
│   └── train_test_split/ # Training/test datasets
├── models/
│   └── tradebot/         # Saved model checkpoints
├── src/
│   ├── 01_generate_training_data.py
│   ├── 02_train_model.py
│   ├── 03_test_model.py
│   └── 04_interactive_inference.py
└── utils/
    ├── data_generator.py
    ├── token_definitions.py
    └── indicators.py

Success Metrics (Phase 1)

Primary: Model predicts the correct action token more than 50% of the time (uniform random guessing over three actions gives about 33%)
Secondary: Model learns symbol/timeframe context (its prediction for the same pattern changes with the symbol or timeframe)
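The secondary metric can be probed by holding the pattern fixed and varying only the symbol. The sketch below assumes a `predict` callable that maps a prompt string to an action; that callable, the function name `context_sensitivity`, and the stand-in model used in the demonstration are all hypothetical.

```python
def context_sensitivity(predict, patterns, symbols, timeframe="<TF_DAILY>"):
    """Fraction of patterns whose predicted action differs across symbols."""
    if not patterns:
        raise ValueError("patterns must be non-empty")
    varied = 0
    for pattern in patterns:
        actions = {predict(f"{symbol} {timeframe} {pattern}") for symbol in symbols}
        if len(actions) > 1:
            varied += 1
    return varied / len(patterns)


def toy_predict(prompt):
    """Stand-in for the fine-tuned model, used only to exercise the metric."""
    if "<SYM_SPY>" in prompt and "ST_UpTrend" in prompt:
        return "BUY"
    return "HOLD"


patterns = ["ST_UpTrend Hi_Volume", "ST_Flat Lo_Volume"]
symbols = ["<SYM_SPY>", "<SYM_QQQ>"]
print(context_sensitivity(toy_predict, patterns, symbols))
```

A nonzero score shows only that the model conditions on the symbol token; it does not show that the symbol-specific predictions are more accurate, so this check complements the primary accuracy metric rather than replacing it.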

Limitations & Risks

Correlation ≠ Causation: Model learns statistical patterns, not market dynamics
Overfitting: May memorize historical sequences without generalizing
No Risk Management: Predictions are a single discrete action (BUY/SELL/HOLD), with no position sizing or stop losses
Lookahead Bias: Training data can contain information "from the future" relative to the prediction point, which inflates measured accuracy
No Transaction Costs: Ignores fees, slippage, and execution delays
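One common mitigation for the lookahead and overfitting risks above is to split train and test data by time rather than at random, so every test sequence is later than every training sequence. The sketch below assumes each row is a `(date, sequence)` pair with ISO-formatted date strings; the function name and row layout are hypothetical, and how the repository builds data/train_test_split/ is not stated here.

```python
def chronological_split(rows, test_fraction=0.2):
    """Split (date, sequence) rows so the test set is strictly the latest slice.

    ISO dates ("YYYY-MM-DD") sort correctly as plain strings.
    """
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    ordered = sorted(rows, key=lambda row: row[0])
    n_test = max(1, round(len(ordered) * test_fraction))
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]


# Illustrative rows only, deliberately out of order.
rows = [
    ("2024-01-05", "<SYM_SPY> <TF_DAILY> ST_Flat Avg_Volume HOLD"),
    ("2024-01-02", "<SYM_SPY> <TF_DAILY> ST_UpTrend Hi_Volume BUY"),
    ("2024-01-04", "<SYM_SPY> <TF_DAILY> ST_DownTrend Hi_Volume SELL"),
    ("2024-01-03", "<SYM_SPY> <TF_DAILY> ST_UpTrend Lo_Volume HOLD"),
    ("2024-01-08", "<SYM_SPY> <TF_DAILY> ST_UpTrend Hi_Volume BUY"),
]
train, test = chronological_split(rows)
print(len(train), len(test), test[0][0])
```

This addresses only the ordering of the split. If the action labels themselves are derived from later prices, the label construction needs its own check.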

Next Steps (Beyond Hello World)

Add more sophisticated indicators (RSI, MACD, Bollinger Bands)
Implement proper backtesting with transaction costs
Test on multiple symbols and timeframes
Paper trading with live data
Explore reinforcement learning approach (reward = profit)
Adapt to prediction markets (Polymarket)

Timeline

Data Generation: 30 minutes
Model Training: 1-2 hours (CPU) / 15-30 minutes (GPU)
Testing & Evaluation: 30 minutes
Total: ~2-3 hours (CPU) / ~1.5 hours (GPU) for the complete Hello World, excluding environment setup

License

Educational/Research use only. Not financial advice.
