This is the official repository for the manuscript "D-Net: Dynamic Large Kernel with Dynamic Feature Fusion for Volumetric Medical Image Segmentation" (arXiv).

Abstract

Hierarchical Vision Transformers (ViTs) have achieved significant success in medical image segmentation due to their large receptive field and ability to leverage long-range contextual information. Convolutional neural networks (CNNs) may also deliver a large receptive field by using large convolutional kernels. However, because they use fixed-sized kernels, CNNs with large kernels remain limited in their ability to adaptively capture multi-scale features from organs that vary greatly in shape and size. They are also unable to utilize global contextual information efficiently. To address these limitations, we propose lightweight Dynamic Large Kernel (DLK) and Dynamic Feature Fusion (DFF) modules. The DLK employs multiple large kernels with varying kernel sizes and dilation rates to capture multi-scale features. Subsequently, DLK utilizes a dynamic selection mechanism to adaptively highlight the most important channel and spatial features based on global information. The DFF is proposed to adaptively fuse multi-scale local feature maps based on their global information. We incorporated DLK and DFF into a hierarchical ViT architecture to leverage their scaling behavior, but they struggle to extract low-level features effectively due to feature embedding constraints in ViT architectures. To tackle this limitation, we propose a Salience layer to extract low-level features from images at their original dimensions without feature embedding. This Salience layer employs a Channel Mixer to capture global representations effectively. We further incorporated the Salience layer into the hierarchical ViT architecture to develop a novel network, termed D-Net. D-Net effectively utilizes a multi-scale large receptive field and adaptively harnesses global contextual information. Extensive experimental results demonstrate its superior segmentation performance compared to state-of-the-art models, with comparatively lower computational complexity.
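
As a rough illustration of how combining kernel sizes and dilation rates yields receptive fields of different scales, the sketch below computes the receptive field along one axis for stride-1 convolutions. The kernel sizes and dilation rates used here are illustrative assumptions, not values taken from this repository, and the helper names `effective_kernel` and `cascaded_receptive_field` are hypothetical.

```python
def effective_kernel(kernel_size: int, dilation: int) -> int:
    """Extent covered by one dilated kernel along a single axis."""
    return dilation * (kernel_size - 1) + 1


def cascaded_receptive_field(layers) -> int:
    """Receptive field of stride-1 convolutions applied one after another.

    `layers` is a sequence of (kernel_size, dilation) pairs.
    """
    rf = 1
    for kernel_size, dilation in layers:
        # Each stride-1 layer widens the field by (effective kernel - 1).
        rf += effective_kernel(kernel_size, dilation) - 1
    return rf


# Assumed, illustrative branch configurations (not from the repository).
branches = {
    "single kernel": [(5, 1)],
    "kernel followed by dilated kernel": [(5, 1), (7, 3)],
}

for name, layers in branches.items():
    print(f"{name}: receptive field = {cascaded_receptive_field(layers)}")
```

Under these assumptions, a branch with a second, dilated kernel covers a much wider context than the single-kernel branch at a modest parameter cost, which is the kind of multi-scale coverage that a selection mechanism can then weight adaptively.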

Methods

Citation

If you use D-Net in your research, please consider citing our work.

@article{YANG2026108837,
title = {D-Net: Dynamic large kernel with dynamic feature fusion for volumetric medical image segmentation},
journal = {Biomedical Signal Processing and Control},
volume = {113},
pages = {108837},
year = {2026},
issn = {1746-8094},
doi = {https://doi.org/10.1016/j.bspc.2025.108837},
url = {https://www.sciencedirect.com/science/article/pii/S1746809425013485},
author = {Jin Yang and Peijie Qiu and Yichi Zhang and Daniel S. Marcus and Aristeidis Sotiras},
keywords = {Large convolutional kernel, Dynamic convolution, Channel mixer, Vision transformer, Medical image segmentation}
}

Question

If you have any questions about our work, please contact us via email ([email protected]).
