anton-jeran

Organizations

@GAMMA-UMD


Sound event localization, detection, and tracking of multiple overlapping and moving sources in 2D spherical space using convolutional recurrent neural network

Python 376 70 Updated Nov 21, 2022

Code for voicing silent speech from EMG. Official repository for the papers "Digital Voicing of Silent Speech" at EMNLP 2020 and "An Improved Model for Voicing Silent Speech" at ACL 2021. Also incl…

Python 151 70 Updated Apr 30, 2024

Amazon Nova Act is an AWS service for building and deploying highly reliable AI agents that automate UI-based workflows at scale.

Python 879 140 Updated Jan 7, 2026

This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture.

Python 51 6 Updated Mar 17, 2025

A Python Room Spatial Impulse Response Ray-Tracing Toolkit

C++ 75 10 Updated Dec 25, 2025

When given different views of an object as input, it can tell us if that specific object is present in a larger picture or not.

Python 6 Updated Jan 20, 2019

SonicSim: A customizable simulation platform for speech processing in moving sound source scenarios

Python 260 23 Updated Jan 22, 2025

Fully open reproduction of DeepSeek-R1

Python 25,822 2,410 Updated Nov 24, 2025

A framework for few-shot evaluation of language models.

Python 11,200 2,966 Updated Jan 16, 2026

One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks

Python 3,562 487 Updated Jan 16, 2026

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

Python 20,155 1,692 Updated Nov 26, 2025

AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension

Python 126 6 Updated Dec 9, 2024

Training and evaluation pipeline for MEG and EEG brain signal encoding and decoding using deep learning. Code for our paper "Decoding speech perception from non-invasive brain recordings" published…

Python 456 71 Updated Mar 12, 2024

Official implementation of NeurIPS 2024 paper "DiffusionPDE: Generative PDE-Solving Under Partial Observation"

Python 160 23 Updated Apr 29, 2025

Impulse Response measurement tool for MATLAB

MATLAB 38 9 Updated Sep 27, 2020

This is the official implementation of our mesh-based neural network (MESH2IR) to generate acoustic impulse responses (IRs) for indoor 3D scenes represented using a mesh.

Python 102 13 Updated Jul 24, 2024

This is the official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.

Python 174 32 Updated Jul 24, 2024

This is the official implementation of a reverberant-speech-to-room-impulse-response estimator.

Python 39 5 Updated Aug 7, 2024

Expressive Anechoic Recordings of Speech (EARS)

Python 206 13 Updated Jun 25, 2024

Official code for SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound

Python 136 10 Updated Mar 28, 2025

Official release of the Eyeful Tower dataset, a high-fidelity multi-view capture of 11 real-world scenes, from the paper “VR-NeRF High-Fidelity Virtualized Walkable Spaces” (Xu et al., SIGGRAPH Asi…

179 6 Updated Oct 9, 2024

This is the official implementation of our end-to-end binaural audio rendering approach (Listen2Scene) for virtual reality (VR) and augmented reality (AR) applications.

Python 5 3 Updated May 5, 2024

A Differentiable Room Acoustics Simulator

Python 4 1 Updated Jan 5, 2026

PyTorch Implementation of FastDiff (IJCAI'22)

Python 415 60 Updated Jun 20, 2024

Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.

Python 6,091 525 Updated Jul 1, 2025

Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark

59 1 Updated Aug 29, 2024

JavaScript 1 Updated Jul 22, 2024

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python 19,659 2,109 Updated Oct 21, 2025

Temporary anonymous version

Python 22 1 Updated Mar 20, 2024

PyTorch code and models for V-JEPA self-supervised learning from video.

Python 3,455 345 Updated Feb 27, 2025