Releases: ranamihir/pytorch_common
v1.5.6
v1.5.5
v1.5.4
- [Breaking] `train_utils.perform_one_epoch()` now returns a dictionary instead of a list.
- Model evaluation / prediction methods now accept `return_keys` as an argument to pre-specify which items are to be returned.
  - This results in huge memory savings by not having to store unnecessary objects.
- Added the option to perform training only, without any evaluation, by simply not providing any validation dataloader / logger arguments.
  - [Breaking] As part of this change, for the sake of simplicity, the `ReduceLROnPlateau` scheduler is no longer supported (it requires the validation loss in order to take each step).
- Added support for sample weighting during training and evaluation (see the sketch after this list).
- Added several unit tests in accordance with the aforementioned features.
- Changed default early stopping criterion to accuracy (instead of f1).
- Several other time, memory, and logging improvements.
- In sync with c3809cf7
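Sample weighting itself is a generic technique. Below is a minimal sketch of one common way to apply per-sample weights to a classification loss; it is illustrative only and does not show `pytorch_common`'s actual API.

```python
import torch
import torch.nn as nn

# Hypothetical illustration (not pytorch_common's actual API): keep the
# unreduced per-sample losses and take a weighted mean over the batch.
criterion = nn.CrossEntropyLoss(reduction="none")  # per-sample losses

def weighted_loss(logits, targets, sample_weights):
    per_sample = criterion(logits, targets)                 # shape: (batch_size,)
    return (per_sample * sample_weights).sum() / sample_weights.sum()

logits = torch.randn(4, 3, requires_grad=True)   # dummy batch: 4 samples, 3 classes
targets = torch.tensor([0, 2, 1, 2])
weights = torch.tensor([1.0, 0.5, 2.0, 1.0])     # per-sample weights
loss = weighted_loss(logits, targets, weights)
loss.backward()                                  # backpropagate as in a normal step
```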
v1.5.3
- [Breaking] `train_model()` in `train_utils.py` now supports a `checkpoint_file` argument instead of `start_epoch` (which is now inferred) for resuming training.
  - The trained model (located at `checkpoint_file`) is now loaded inside the function, rather than having to be loaded separately first.
- Major improvement in the computation of top-k accuracy scores.
  - Instead of computing it separately for each `k`, the computation is shared under the hood across all `k`s as much as possible, which greatly reduces computation time, especially for problems with a large number of classes (see the sketch after this list).
- Added `create_dataloader()` to `datasets.py` for creating a DataLoader from a Dataset.
- Using `time.perf_counter()` instead of `time.time()` for measuring function execution time.
- Other minor improvements and bug fixes.
- In sync with 1a95403b
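A minimal sketch of the sharing idea for top-k accuracy (assumed behavior, not the library's exact implementation): rank predictions once using the largest `k`, then read off every smaller `k` from the same ranking.

```python
import torch

# Assumed sketch: a single top-k pass for the largest k is reused for all smaller ks.
def topk_accuracies(probs: torch.Tensor, targets: torch.Tensor, ks=(1, 3, 5)):
    max_k = max(ks)
    _, top_preds = probs.topk(max_k, dim=1)        # (n_samples, max_k), ranked predictions
    correct = top_preds.eq(targets.unsqueeze(1))   # (n_samples, max_k) boolean matches
    # Top-k accuracy for each k: "was the target among the first k ranked predictions?"
    return {k: correct[:, :k].any(dim=1).float().mean().item() for k in ks}

probs = torch.softmax(torch.randn(8, 10), dim=1)   # 8 samples, 10 classes
targets = torch.randint(0, 10, (8,))
print(topk_accuracies(probs, targets))
```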
v1.5.2
- Updated to `pytorch=1.8.0` and `cudatoolkit=10.1`
- Overhauled metric computation.
  - Much cleaner code
  - Drastic reduction in metric computation time, since preprocessing is now shared across many metrics, e.g. getting the max-probability class for accuracy / precision / recall / f1, etc.
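A simplified sketch of this sharing idea (assuming a multi-class setup and scikit-learn metrics; not the library's actual code): compute the max-probability class once and reuse it for every metric that needs it.

```python
import torch
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Simplified, assumed sketch: derive the hard predictions once, then compute
# all prediction-based metrics from that single preprocessing step.
def compute_classification_metrics(probs: torch.Tensor, targets: torch.Tensor) -> dict:
    y_true = targets.cpu().numpy()
    y_pred = probs.argmax(dim=1).cpu().numpy()     # shared preprocessing step
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="macro", zero_division=0
    )
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```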
v1.5.1
v1.5
v1.4
This version primarily adds type annotations and makes aesthetic changes to conform to `black` and `isort` code quality guidelines, but it breaks backward compatibility in a few important places.
Breaking changes:
- `config.py`: Deprecated support for `batch_size`. Only per-GPU batch size is supported now. It may be specified as follows:
  - `batch_size_per_gpu` (as before), which will use the same batch size for all modes
  - `{mode}_batch_size_per_gpu` (mode = `train`/`eval`/`test`) for specifying different batch sizes for each mode
- `datasets_dl.py`:
  - Renamed `print()` -> `print_dataset()` in `BasePyTorchDataset`
  - `oversample_class()` now takes `oversampling_factor` as an argument instead of setting it as a class attribute as before (similarly for `undersample_class()`)
    - It additionally takes `column` as an argument to specify the column on which to perform sampling (which defaults to `self.target_col` to imitate existing behavior)
  - Added `sample_class()` as a generic function for both over-/under-sampling
- `models_dl.py`: Renamed `print()` -> `print_model()` in `BasePyTorchModel`
- `train_utils.py`: `save_model()`:
  - Arguments `optimizer`, `train_logger`, and `val_logger` are all optional now, to allow saving just the model and config
  - Takes arguments in a different order
- `utils.py`: `get_string_from_dict()` now sorts `config_info_dict` first before generating a unique string, to ensure the same string is obtained regardless of the order of the keys (see the sketch below)
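A simplified stand-in for the idea (not the actual implementation), showing why sorting makes the generated string independent of key order:

```python
# Simplified, assumed sketch: sorting the items first makes the output
# identical regardless of the dict's insertion order.
def get_string_from_dict(config_info_dict: dict) -> str:
    return "-".join(f"{k}_{v}" for k, v in sorted(config_info_dict.items()))

assert get_string_from_dict({"lr": 0.1, "epochs": 5}) == get_string_from_dict({"epochs": 5, "lr": 0.1})
```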
Other changes:
- Added type annotations everywhere
- Switched to double quotes everywhere to conform to PEP 8/257 guidelines
- Sticking to `black` and `isort` code quality standards
- Switched to a max. line length of 119
- Added more tests. Updated existing ones to work with the aforementioned changes.
- `utils.py`: Moved `get_trainable_params()` here (which is directly called in `BasePyTorchModel`) to allow support for non-`BasePyTorchModel`-type models
- `types.py`: Additional file to define common (predefined and custom) data types
- `pre-push.sh` now assumes the appropriate environment is already enabled (instead of forcibly enabling one named `pytorch_common`, which may not be available)
- Minor performance improvements + code cleanup + bug fixes
- Upgraded `transformers` version to `3.0.2` and `pytorch_common` to `1.4`
To run the code formatters, run the following commands from the main project directory:
```bash
black --line-length 119 --target-version py37 .
isort -rc -l 119 -tc -m 3 .
```
v1.3
- Added unit tests for all files in the package
- The tests mostly revolve around ensuring correct setup of the config and making sure training/saving/loading/evaluating all models and (compatible) datasets with all (compatible) metrics for regression/classification works
- Added/fixed code for simple regression datasets and models (regression wasn't used / tested too much before)
- Added several util functions (mostly used only in unit testing for now)
- Stricter and better asserts in `config.py`
- Renamed `datasets.py` to `datasets_dl.py` and created `datasets.py` only for loading different datasets (for consistency with `models.py` and `models_dl.py`)
- Added a pre-push hook for automatically running tests before each push (pre-commit would've been too frequent and slowed down development)
- Minor code cleanup + docstring improvements + bug fixes + Readme updates
v1.2.1
- Upgraded to `transformers==2.9.0`, which has many performance improvements + bug fixes
- Using a common loop for training/evaluation/testing to remove duplicate code
- Added support for specifying a decoupling function in `train_model()` (and `get_all_predictions()`) to define how to extract the inputs (and targets) from a batch (see the sketch at the end of these notes)
  - This may be useful in case this process deviates from the typical behavior but the training paradigm is otherwise the same, and hence `train_model()` can still be used
- Removed dependency on `vrdscommon`
  - The `timing` decorator was being imported from `vrdscommon`; now one is defined in the package itself
  - As a result of this, added support for defining decorators
- Added/improved util functions:
  - `get_total_grad_norm()`
  - `compare_tensors_or_arrays()`
  - `is_batch_on_gpu()`
  - `is_model_on_gpu()`
  - `is_model_parallelized()`
  - `remove_dir()`
  - `save_object()` now also supports saving YAML files
- Minor cleanup and linting
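A hypothetical example of such a decoupling function, assuming the batch is a dict; the dict keys and the `decouple_fn` argument name below are illustrative assumptions, not the library's documented API.

```python
# Hypothetical decoupling function: tells the training loop how to pull
# inputs and targets out of a batch when the batch is, e.g., a dict rather
# than a plain (inputs, targets) tuple.
def decouple_dict_batch(batch):
    inputs = (batch["input_ids"], batch["attention_mask"])
    targets = batch["labels"]
    return inputs, targets

# Hypothetical usage (argument name is an assumption):
# train_model(model, config, train_loader, optimizer, decouple_fn=decouple_dict_batch, ...)
```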