Releases: lRomul/argus
Maintenance release, new guides and updated docs
Fix
- Fix `AverageMeter` for n > 1 cases.
Breaking Changes
- Delete the batch object after the iteration completes.
- Don't store the data loader in the engine state.
New Features
- Return metrics from the `fit` method the same way as from `validate` (sketch after this list).
- Use the constructor from `BuildModel` so the user can pass `build_order`.
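A minimal sketch of the change, assuming `fit` now returns the same metrics dictionary that `validate` returns; `model`, `train_loader`, and `val_loader` are placeholders:

```python
# Assumption: fit() now returns a metrics dict, just like validate()
val_metrics = model.validate(val_loader)
fit_metrics = model.fit(train_loader, val_loader=val_loader, num_epochs=10)
```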
Docs
New guides on:
- Custom metrics.
- Partial weights loading and manipulation.
- Model export.
- Custom callbacks.
- LR schedulers.
Other improvements:
- Add new competition solutions to the examples.
- Improve docstrings in many places.
Chore
- Use `pyproject.toml`.
- Update GitHub Actions versions.
- Update dependencies.
- Use ruff linter.
Full Changelog: v1.0.0...v1.1.0
Argus 1.0.0
New Features
- Add `mode` argument to `argus.Model.train` (like in torch).
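A hedged usage sketch, assuming the new `mode` argument behaves like `torch.nn.Module.train(mode=True)`; `model` is a placeholder `argus.Model` instance:

```python
# Assumption: mode toggles training/eval like torch.nn.Module.train
model.train(mode=True)   # put the nn_module in training mode
model.train(mode=False)  # equivalent to model.eval()
```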
Docs
- Add guides that provide an in-depth overview of how the framework works (link).
- Fix minor typos in docstrings.
Examples
- New example with sequential LR scheduler (link).
- Transition from `torch.distributed.launch` to `torchrun` in the cifar_advanced example.
Chore
- Add `__all__` for all modules.
- Update CUDA to 11.3.1.
- Update PyTorch to 1.10.0.
Logo, pydata sphinx theme, custom state loading, share train and val states
New Features
- Share train and val states between phases with the `phase_states` attribute of state.

```python
import argus

@argus.callbacks.on_epoch_complete
def some_validation_callback(state: argus.engine.State):
    train_step_output = state.phase_states['train'].step_output
    ...
```
- Option to use a custom state load function for `argus.load_model`.

```python
import pathlib
import torch
from argus import load_model

def state_load_from_dir(dir_path):
    file_path = pathlib.Path(dir_path) / 'some_model_name.pth'
    return torch.load(file_path)

model = load_model(path_to_dir_with_model, state_load_func=state_load_from_dir)
```
Docs
- Argus logo!
- Migrate to pydata-sphinx-theme.
Fix
- Fix sdist package installation by adding `MANIFEST.in` with `requirements.txt`.
Examples
- Use `torch.cuda.amp` instead of Apex in the advanced CIFAR example.
- Add an example solution for the RANZCR CLiP - Catheter and Line Position Challenge.
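For reference, the native `torch.cuda.amp` pattern the example moved to looks roughly like this; the toy model, optimizer, loss, and data are placeholders and not part of the argus example:

```python
import torch
import torch.nn as nn

# Toy setup for illustration only
model = nn.Linear(10, 2).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()
scaler = torch.cuda.amp.GradScaler()

for _ in range(3):
    batch = torch.randn(8, 10, device='cuda')
    target = torch.randint(0, 2, (8,), device='cuda')
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():  # forward pass in mixed precision
        loss = loss_fn(model(batch), target)
    scaler.scale(loss).backward()    # backward on the scaled loss
    scaler.step(optimizer)           # unscale gradients, then optimizer step
    scaler.update()
```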
Chore
- `setup.cfg` with pytest and flake8 settings.
- CI checks code style with flake8.
- Run tests on macOS and Windows.
- Update Dockerfile and tests to PyTorch 1.8.0.
- Update Dockerfile to CUDA 11.1.
Save optimizer state, improve docs and typing
New Features
- Add saving of optimizer state for `argus.Model` and checkpoint callbacks.

```python
from argus.callbacks import Checkpoint

model.save('models/model.pth', optimizer_state=True)
checkpoint = Checkpoint(dir_path='models/', optimizer_state=True)
```
- Add `get_device` method to `argus.Model`.
- Add typing and fix most `mypy` errors.
Fix
- Remove `torch.optim._multi_tensor` optimizers from defaults (torch >= 1.7.0).
Docs
- Section `argus.engine`.
- Section `argus.metrics`.
- Section `argus.utils` with deep conversions.
- Add docs for decorator callbacks.
- Add docs for `argus.Model` methods: `__init__`, `set_device`, `get_device`, `get_nn_module`.
- Update examples section.
- Proofread and improve docs. Many small docstring fixes.
Internal changes
- Use abstract container classes from `collections.abc`.
- Now `Engine` and `State` only work with `argus.Model` methods as a `step_method`. The phase name is taken from the method name.
- Simplify default logging.
Breaking Changes
- Change optimizer state handling in `argus.load_model`. Now `change_state_dict_func` takes two arguments, `nn_state_dict` and `optimizer_state_dict` (example; see the sketch after this list).
- Remove `handler_kwargs_dict` from the attach method of `argus.callbacks.Callback`.
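A minimal sketch of the new signature; the layer names, the returned tuple, and `model_path` are illustrative assumptions, not taken from the argus docs:

```python
from argus import load_model

def change_state_dict_func(nn_state_dict, optimizer_state_dict):
    # Hypothetical example: drop the final classifier weights before loading
    nn_state_dict.pop('fc.weight', None)
    nn_state_dict.pop('fc.bias', None)
    return nn_state_dict, optimizer_state_dict

model = load_model(model_path, change_state_dict_func=change_state_dict_func)
```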
Tests, replace params while model loading, custom events
New Features
- Tests, 100% coverage (codecov).
- Mechanism of `params` replacement while model loading (example).

```python
from argus import load_model

# Change optimizer params
model = load_model(model_path, optimizer=('AdamW', {'lr': 0.001}))
# Load model without optimizer and loss
model = load_model(model_path, optimizer=None, loss=None)
```
- Custom events for callbacks (example).
```python
import argus
from argus.engine import EventEnum


class CustomEvents(EventEnum):
    BACKWARD_START = 'backward_start'
    BACKWARD_COMPLETE = 'backward_complete'


@argus.callbacks.on_event(CustomEvents.BACKWARD_START)
def before_backward(state):
    ...


class CustomEventModel(argus.Model):
    ...

    def train_step(self, batch, state):
        ...
        state.engine.raise_event(CustomEvents.BACKWARD_START)
        loss.backward()
        state.engine.raise_event(CustomEvents.BACKWARD_COMPLETE)
        ...
```
- Typing.
- Raise exceptions instead of asserts.
- Set up a unique logger for each instance of `argus.Model`.
- Check that `params` is picklable at model construction.
- `create_dir` parameter for `argus.callbacks.logging.LoggingToCSV` (sketch after this list).
- Use an instance of `argus.utils.Identity` as the default for `prediction_transform` instead of `lambda x: x`.
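A hedged usage sketch, assuming `create_dir` makes the callback create missing parent directories for the log file; the path, loaders, and `model` are placeholders:

```python
from argus.callbacks import LoggingToCSV

# Assumption: create_dir=True creates 'logs/experiment_1/' if it does not exist
csv_logger = LoggingToCSV('logs/experiment_1/log.csv', create_dir=True)
model.fit(train_loader, val_loader=val_loader, num_epochs=10, callbacks=[csv_logger])
```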
Fix
- Correctly save checkpoints with the `save_after_exception` argument for `argus.callbacks.checkpoints`.
Breaking Changes
- Change the default `append` argument value to `False` for `argus.callbacks.logging.LoggingToFile` (see the sketch after this list).
- Rename attribute `_scheduler` of `argus.callbacks.lr_schedulers.LRScheduler` to `scheduler`.
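To keep the previous behavior after this change, pass `append` explicitly; the file name is a placeholder:

```python
from argus.callbacks import LoggingToFile

# append=True restores the old default of appending to an existing log file
file_logger = LoggingToFile('train.log', append=True)
```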
Custom build methods, more examples
Features
- New mechanics for building model attributes. It allows customizing the creation of model parts. Example here (a sketch also follows this list).
- CIFAR example with Distributed Data Parallel, mixed precision, and gradient accumulation (cifar_advanced.py).
- Add `save_model` method to `argus.callbacks.checkpoints`. It allows customizing checkpoint saving.
- Add logging of time and LR to `argus.callbacks.logging.LoggingToCSV`.
- `argus.utils.deep_chunk`, similar to the scatter function in PyTorch DataParallel.
- Dockerfile and Makefile for development.
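A sketch of customizing one model part by overriding a build method; the hook name `build_optimizer`, its signature, and the placeholder network are assumptions for illustration, not the documented argus API:

```python
import torch
import torch.nn as nn
import argus


class MyModel(argus.Model):
    nn_module = nn.Linear  # placeholder nn.Module class, configured via params

    # Assumed hook: override a build_* method to control how a part is created
    def build_optimizer(self, optimizer_params):
        # For example, build the optimizer by hand instead of from params
        return torch.optim.SGD(self.nn_module.parameters(), lr=0.01, momentum=0.9)
```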
Breaking Changes
- Use the `argus.utils.deep_to` function instead of the method `argus.Model.prepare_batch`. `argus.Model.prepare_batch` is removed, so if you use a custom `val_step` or `train_step` you should replace

```python
input, target = self.prepare_batch(batch, self.device)
```

with

```python
input, target = deep_to(batch, self.device, non_blocking=True)
```
- Rename `max_epochs` to `num_epochs` in the `argus.Model.fit` method.

```python
model.fit(train_loader, val_loader=val_loader, num_epochs=1000)
```
- Remove `copy_last` parameter from `argus.callbacks.checkpoints`.
- Remove `period` parameter from `argus.callbacks.checkpoints.MonitorCheckpoint`.
Documentation, LR scheduler step on iteration, new LR schedulers
New Features
- Documentation https://pytorch-argus.readthedocs.io
- Add step on iteration option for LR schedulers.
```python
from argus.callbacks import CosineAnnealingLR

CosineAnnealingLR(10000, step_on_iteration=True)
```
- New LR schedulers (usage sketch after this list):
  - `argus.callbacks.lr_schedulers.MultiplicativeLR`: multiply learning rate by the factor given in the specified function.
  - `argus.callbacks.lr_schedulers.OneCycleLR`: One Cycle learning rate policy.
- Make the LR scheduler step on epoch complete instead of epoch start.
- Compute metric score under `torch.no_grad`.
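A hedged usage sketch for the new schedulers, assuming the argus callbacks mirror the arguments of the corresponding `torch.optim.lr_scheduler` classes; the values are placeholders:

```python
from argus.callbacks.lr_schedulers import MultiplicativeLR, OneCycleLR

# Assumption: arguments match torch MultiplicativeLR / OneCycleLR (minus the optimizer)
mult_lr = MultiplicativeLR(lr_lambda=lambda epoch: 0.95)
# step_on_iteration added in this release; assumed to apply to all LR scheduler callbacks
one_cycle_lr = OneCycleLR(max_lr=0.01, total_steps=10000, step_on_iteration=True)
```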
Fix
- Fix LR logging with several parameter groups in the optimizer.
- Fix key error in the metric redefinition warning.
Breaking Changes
- PyTorch requirement: `torch>=1.1.0`.
New LR schedulers, csv logger, state in step functions
New Features
- `CyclicLR` and `CosineAnnealingWarmRestarts` LR schedulers.
  - `argus.callbacks.lr_schedulers.CyclicLR`: support for Cyclical Learning Rate and Momentum.
  - `argus.callbacks.lr_schedulers.CosineAnnealingWarmRestarts`: Stochastic Gradient Descent with Warm Restarts.
- `argus.callbacks.logging.LoggingToCSV`: add CSV logger callback.

```python
from argus.callbacks import LoggingToCSV

LoggingToCSV('path/to/log.csv', separator=',', append=False)
```
- Add `train` and `eval` mode methods to `argus.Model`.
  - `model.train()` sets the `nn_module` in training mode.
  - `model.eval()` sets the `nn_module` in evaluation mode.
- Set `step_output` of `State` to `None` after each iteration to save GPU memory.
Breaking Changes
- Pass state to train and val step functions.

Before:

```python
def train_step(self, batch):
    ...
```

Now:

```python
def train_step(self, batch, state: State):
    print(state.epoch)
    ...
```
- Scheduler step on epoch start; train epochs go from 0 to `max_epochs - 1`. The scheduler callback uses the epoch param of a scheduler step function, so it now works like in 20124.
- Remove deprecated `to_device` and `detach_tensors` utils functions.
Data parallel
Data parallel for multi-GPU training.

Select a GPU with device indexing:

```python
from argus import load_model

model = load_model(model_path, device="cuda:1")
model.set_device("cuda:0")
```
For multi-GPU you can use a list of devices:

```python
params = {
    ...,
    'device': ['cuda:0', 'cuda:1']
}
model = CnnFinetune(params)

model = load_model(model_path, device=["cuda:1", "cuda:0"])
model.set_device(["cuda:0", "cuda:1"])
```
Batch tensors will be scattered on dim 0. The first device in the list is the location of the output.
By default, device "cuda" means single-GPU training on `torch.cuda.current_device`.