- ICCV2023: SPANet: Frequency-balancing Token Mixer using Spectral Pooling Aggregation Modulation
SPANet is a backbone network that balances the high- and low-frequency components of its features to obtain better feature representations.
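To make the idea of "balancing" frequency components concrete, here is a minimal, standard-library-only sketch that splits a 1D signal into a low-frequency and a high-frequency band with a discrete Fourier transform and then recombines them with adjustable weights. It illustrates only the general notion; it is not the SPANet token mixer. The function names (`split_bands`, `rebalance`), the `cutoff` parameter, and the two weights are all hypothetical and chosen for this example.

```python
import cmath


def dft(x):
    """Naive O(n^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse DFT; returns complex samples."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def split_bands(x, cutoff):
    """Split a real signal into (low, high) bands that sum back to x.

    Frequency bin k counts as "low" when its distance from the DC bin,
    min(k, n - k), is at most `cutoff`. The mask is symmetric, so both
    bands stay real for real input.
    """
    n = len(x)
    spectrum = dft(x)
    low_spec = [spectrum[k] if min(k, n - k) <= cutoff else 0 for k in range(n)]
    high_spec = [spectrum[k] - low_spec[k] for k in range(n)]
    low = [v.real for v in idft(low_spec)]
    high = [v.real for v in idft(high_spec)]
    return low, high


def rebalance(x, cutoff, low_weight, high_weight):
    """Recombine the two bands with hypothetical per-band weights."""
    low, high = split_bands(x, cutoff)
    return [low_weight * l + high_weight * h for l, h in zip(low, high)]


signal = [1.0, 2.0, 0.0, -1.0, 3.0, 1.0, 0.5, 2.5]
smoother = rebalance(signal, cutoff=1, low_weight=1.0, high_weight=0.5)
sharper = rebalance(signal, cutoff=1, low_weight=1.0, high_weight=1.5)
```

With both weights set to 1.0 the signal is reproduced up to floating-point error. A high-band weight below 1 smooths it, and a weight above 1 emphasizes fine detail. For how SPANet itself performs this balancing, refer to the paper and the code in this repository.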
Please see image_classification for more details.
| Model | Pretrain | Resolution | Top-1 | #Param. | FLOPs |
|---|---|---|---|---|---|
| SPANet-S | ImageNet-1K | 224x224 | 83.1 | 28.7M | 4.6G |
| SPANet-M | ImageNet-1K | 224x224 | 83.5 | 41.8M | 6.8G |
| SPANet-MX | ImageNet-1K | 224x224 | 83.8 | 54.9M | 9.0G |
| SPANet-B | ImageNet-1K | 224x224 | 84.0 | 75.9M | 12.0G |
| SPANet-BX | ImageNet-1K | 224x224 | 84.4 | 99.8M | 15.8G |

Please see object_detection for more details.
| Backbone | Lr Schd | box mAP | #params |
|---|---|---|---|
| SPANet-S | 1x | 43.3 | 38M |
| SPANet-M | 1x | 44.0 | 51M |

| Backbone | Lr Schd | box mAP | mask mAP | #params |
|---|---|---|---|---|
| SPANet-S | 1x | 44.7 | 40.6 | 48M |
| SPANet-M | 1x | 45.2 | 41.0 | 61M |

Please see semantic_segmentation for more details.
| Backbone | Lr Schd | mIoU | #params | FLOPs |
|---|---|---|---|---|
| SPANet-S | 80K | 45.4 | 32M | 46G |
| SPANet-M | 80K | 46.2 | 45M | 57G |

If you find this repository useful, please consider starring it and citing the paper with the following BibTeX entry.
@inproceedings{yun2023spanet,
title={SPANet: Frequency-balancing Token Mixer using Spectral Pooling Aggregation Modulation},
author={Yun, Guhnoo and Yoo, Juhan and Kim, Kijung and Lee, Jeongho and Kim, Dong Hwan},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={6113--6124},
year={2023}
}

This project is released under the MIT license. Please see the LICENSE file for more information.