This is the implementation of our paper "Dual-Domain Division Multiplexer for General Continual Learning: A Pseudo Causal Intervention Strategy", IEEE Transactions on Image Processing, vol. 34, pp. 1966-1979, 2025. Please cite our paper if you use the code.
@article{Wu2025dual,
author = {Wu, Jailu and Wang, Shaofan and Sun, Yanfeng and Yin, Baocai and Huang, Qingming},
title = {Dual-Domain Division Multiplexer for General Continual Learning: A Pseudo Causal Intervention Strategy},
journal = {IEEE Transactions on Image Processing},
volume = {34},
pages = {1966-1979},
year = {2025},
}
TL;DR: A causal intervention method for general continual learning
Abstract: As a continual learning paradigm where non-stationary data arrive in the form of streams and training occurs whenever a small batch of samples is accumulated, general continual learning (GCL) suffers from both inter-task bias and intra-task bias. Existing GCL methods can hardly handle both issues simultaneously, since doing so requires models to avoid falling into the spurious correlation trap of GCL. From a causal perspective, we formalize a structural causality model of GCL and conclude that spurious correlation exists not only between confounders and input, but also within multiple causal variables. Inspired by frequency transformation techniques which harbor intricate patterns of image comprehension, we propose a plug-and-play module: the Dual-Domain Division Multiplexer (D3M) unit, which intervenes on confounders and multiple causal factors over frequency and spatial domains with a two-stage pseudo causal intervention strategy. Specifically, D3M consists of a frequency division multiplexer (FDM) module and a spatial division multiplexer (SDM) module, each of which prioritizes target-relevant causal features by dividing and multiplexing features over the frequency domain and the spatial domain, respectively. As a lightweight and model-agnostic unit, D3M can be seamlessly integrated into most current GCL methods. Extensive experiments on four popular datasets demonstrate that D3M significantly enhances accuracy and diminishes catastrophic forgetting compared to current methods.
- torch>=2.1.0
- numpy
- torchvision
- kornia>=0.7.0
- Pillow
- timm==0.9.8
- tqdm
- onedrivedownloader
- ftfy
- regex
- pyyaml
For many experiments, it is straightforward to insert the D3M module (or a single FDM or SDM module) after a convolutional layer of the method's original backbone while keeping the default settings of the source code. We provide a working example in the example.py file; a minimal sketch of the insertion pattern is shown below. To reproduce a general continual learning experiment, we also provide the source code of the baseline DER++ method (derpp), which can be found in the Mammoth folder.
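The snippet below is only a hypothetical sketch of that insertion pattern, not the actual D3M implementation: the D3M class here is an identity placeholder that mirrors the expected input/output contract (a feature map goes in and a feature map of the same shape comes out), and BlockWithD3M shows how such a feature-preserving module can be attached after a conv block of a standard ResNet-18 backbone. Replace the placeholder with the module provided in this repository (see example.py).

```python
import torch
import torch.nn as nn
import torchvision


class D3M(nn.Module):
    """Identity placeholder standing in for the real D3M / FDM / SDM module."""

    def __init__(self, channels: int):
        super().__init__()
        self.channels = channels

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The real module divides and multiplexes the features here;
        # this placeholder simply passes them through unchanged.
        return x


class BlockWithD3M(nn.Module):
    """Wraps an existing conv block and applies the plug-in to its output."""

    def __init__(self, block: nn.Module, channels: int):
        super().__init__()
        self.block = block
        self.d3m = D3M(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.d3m(self.block(x))


# Example: attach the plug-in after layer3 of a ResNet-18 backbone
# (layer3 outputs 256 channels in ResNet-18).
backbone = torchvision.models.resnet18(num_classes=100)
backbone.layer3 = BlockWithD3M(backbone.layer3, channels=256)

logits = backbone(torch.randn(2, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 100])
```

To train, first install the dependencies: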
pip install -r requirements.txt
Then the following command runs the derpp model on the seq-cifar100 dataset with a buffer of 500 samples; the --load_best_args argument loads the best hyperparameters:
python main.py --model derpp --dataset seq-cifar100 --buffer_size 500 --load_best_args
New models can be added to the models/ folder. New datasets can be added to the datasets/ folder.
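If you add a model to the Mammoth folder, the skeleton below is a rough sketch of what a new entry in models/ can look like. It assumes the bundled Mammoth version exposes the usual ContinualModel interface (NAME, COMPATIBILITY, an __init__ taking backbone, loss, args, transform, and an observe training step); check models/derpp.py for the exact signature expected by this version. The file name models/my_model.py and the class name MyModel are placeholders.

```python
# models/my_model.py -- hypothetical file name for a new Mammoth model.
from models.utils.continual_model import ContinualModel


class MyModel(ContinualModel):
    NAME = 'my_model'
    COMPATIBILITY = ['class-il', 'domain-il', 'task-il', 'general-continual']

    def __init__(self, backbone, loss, args, transform):
        super().__init__(backbone, loss, args, transform)

    def observe(self, inputs, labels, not_aug_inputs):
        # Plain fine-tuning step; a real method would add its continual-learning
        # machinery (e.g. replay, regularization, or a D3M-equipped backbone) here.
        self.opt.zero_grad()
        outputs = self.net(inputs)
        loss = self.loss(outputs, labels)
        loss.backward()
        self.opt.step()
        return loss.item()

# Depending on the Mammoth version, a get_parser() (module-level or
# staticmethod) may also be required to register model-specific arguments;
# mirror what models/derpp.py does in this repository.
```

Such a model could then be launched with python main.py --model my_model --dataset seq-cifar100 plus any method-specific arguments it defines.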