PhD Student | Multimodal AI · Generative AI · Representation Learning
CVSSP, University of Surrey
- Guildford, Surrey, UK
- in/tony-alex-203b5a114
- https://scholar.google.com/citations?user=CWFgnLoAAAAJ&hl=en
SSLAM Public
[ICLR 2025] Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes
-
Awesome-Audio-LLM Public
Forked from AudioLLMs/Awesome-Audio-LLM
Audio Large Language Models
Python · Updated Jul 4, 2025
MaxAST Public
[ICASSP 2024] Max-AST: Combining Convolution, Local and Global Self-Attentions for Audio Event Classification
-
DTFAT Public
[AAAI 2024] DTF-AT: Decoupled Time-Frequency Audio Transformer for Event Classification
-
Compare the performance of MADGRAD against Adam and SGD for segmentation model training.