Project to predict the outcome of Pokémon battles from the early phases of a match. Raw data (JSONL files) are loaded into a SQLite DB, preprocessed, and used to train and evaluate various ML models.
- Creates a DB connection and registers a new Dataset (Train or Test).
- For each match in the JSONL:
  - Inserts match metadata via `insert_battle`.
  - Inserts team Pokémon with `load_pokemon` and `load_team`.
  - Inserts turns with `insert_turn`, which in turn calls `insert_state_move` to save states and moves.
- Maintains consistency with `INSERT OR IGNORE` for reference tables (type, status, moves).
- Final commit of the data into the DB.
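The steps above can be sketched as follows. This is a minimal illustration, not the project's actual loader: the function name `load_matches`, the table names, and the JSONL field names are assumptions; the real schema lives in `analisi/create_db.sql`.

```python
import json
import sqlite3

def load_matches(db_path: str, jsonl_path: str, dataset: str) -> None:
    """Load battles from a JSONL file into SQLite (illustrative schema)."""
    con = sqlite3.connect(db_path)
    cur = con.cursor()
    cur.executescript("""
        CREATE TABLE IF NOT EXISTS type   (name TEXT PRIMARY KEY);
        CREATE TABLE IF NOT EXISTS battle (id TEXT PRIMARY KEY, dataset TEXT, winner TEXT);
        CREATE TABLE IF NOT EXISTS pokemon(battle_id TEXT, name TEXT, type TEXT);
    """)
    with open(jsonl_path) as fh:
        for line in fh:
            match = json.loads(line)
            cur.execute("INSERT OR IGNORE INTO battle VALUES (?, ?, ?)",
                        (match["id"], dataset, match.get("winner")))
            for mon in match["team"]:
                # INSERT OR IGNORE keeps the reference table consistent even
                # when the same type appears in many matches.
                cur.execute("INSERT OR IGNORE INTO type VALUES (?)", (mon["type"],))
                cur.execute("INSERT INTO pokemon VALUES (?, ?, ?)",
                            (match["id"], mon["name"], mon["type"]))
    con.commit()  # single final commit, as in the pipeline above
    con.close()
```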
The data_analyzer/ folder contains code to extract, preprocess, select models and run experiments:
- `data_analyzer/__init__.py` — exports the main functions from `lib.py`.
- `data_analyzer/lib.py` — utilities and the preprocessing / I/O pipeline:
  - `get_datapoints` — builds the dataset with all normalized features (as described in the report).
  - `load_datapoints` — reads the preprocessed tables (`Input`, `Output`, `TestInput`, `TestOutput`).
  - `create_submission` — generates submission CSVs from predictions.
  - `load_best_model` — reconstructs a model from the information in `models.json`.
- `data_analyzer/model_selection.py` — classes and helpers for hyperparameter search and validation:
  - `ModelTrainer` and its implementations (`LogisticRegressionTrainer`, `RandomForestClassifierTrainer`, `XGBClassifierTrainer`, etc.) for cross-validation and hyperparameter search.
  - `plot_history` — saves validation plots.
- `data_analyzer/main.py` — CLI for analysis operations (save dataset, PCA, training, ensemble, etc.). Runs routines that use functions from `lib.py` and `model_selection.py`.
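A CLI with subcommands like those of `data_analyzer/main.py` can be sketched with `argparse`; the subcommand names and flags below are illustrative, not the project's actual interface.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Illustrative CLI with subcommands for the analysis operations."""
    parser = argparse.ArgumentParser(prog="data_analyzer")
    sub = parser.add_subparsers(dest="command", required=True)

    save = sub.add_parser("save-dataset", help="preprocess and store datapoints")
    save.add_argument("--db", default="battles.db")

    train = sub.add_parser("train", help="run hyperparameter search")
    train.add_argument("--model", choices=["logreg", "rf", "xgb"], default="logreg")

    sub.add_parser("pca", help="run PCA on the feature matrix")
    return parser

# e.g. `python -m data_analyzer train --model xgb`
args = build_parser().parse_args(["train", "--model", "xgb"])
```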
- main.py contains the entire pipeline to reproduce the results; it was later converted into a notebook for the Kaggle challenge.
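For the Kaggle submission step, a CSV writer in the spirit of `create_submission` could look like the sketch below; the column names and the function signature are assumptions, not the project's actual API.

```python
import csv

def create_submission(ids, predictions, path="submission.csv"):
    """Write a Kaggle-style submission CSV (illustrative column names)."""
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["battle_id", "player_won"])
        for match_id, pred in zip(ids, predictions):
            writer.writerow([match_id, int(pred)])
```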
- Main script: `main.py`
- DB creation: `analisi/create_db.sql`
- LaTeX report template: `Latex/`
- Saved models info: `models.json`
- PCA importance/features: `pca.json`
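The cross-validation and hyperparameter search performed by the trainer classes in `model_selection.py` follows a standard pattern; here is a minimal sketch with scikit-learn's `GridSearchCV` on synthetic data. The class name matches `LogisticRegressionTrainer` from above, but the parameter grid and method names are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

class LogisticRegressionTrainer:
    """Illustrative trainer: cross-validated hyperparameter search."""
    param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}  # assumed grid

    def search(self, X, y, cv=5):
        grid = GridSearchCV(LogisticRegression(max_iter=1000),
                            self.param_grid, cv=cv, scoring="accuracy")
        grid.fit(X, y)
        return grid.best_params_, grid.best_score_

# Synthetic stand-in for the preprocessed battle features.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
best_params, best_score = LogisticRegressionTrainer().search(X, y)
```

The real trainers presumably share a common `ModelTrainer` interface so that each estimator only has to declare its own grid.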