Steven Wang wxy1988

5 followers · 20 following

Starred repositories

zai-org / GLM-ASR

GLM-ASR-Nano: A robust, open-source speech recognition model with 1.5B parameters

Python 584 51 Updated Dec 12, 2025

HKUDS / Paper2Slides

"Paper2Slides: From Paper to Presentation in One Click"

Python 2,387 325 Updated Dec 19, 2025

OpenMOSS / MOSS-Speech

MOSS-Speech is a true speech-to-speech large language model without text guidance.

Python 113 5 Updated Dec 4, 2025

OpenMOSS / MOSS-TTSD

MOSS-TTSD is a spoken dialogue generation model that enables expressive dialogue speech synthesis in both Chinese and English, supporting zero-shot multi-speaker voice cloning, and long-form speech…

Python 1,060 95 Updated Dec 8, 2025

yxduir / m2m-70

Python 14 1 Updated Dec 6, 2025

wenet-e2e / west

We Speech Toolkit, LLM based Speech Toolkit for Speech Understanding, Generation, and Interaction

Python 167 11 Updated Dec 16, 2025

haven-jeon / PyKoSpacing

Automatic Korean word spacing with Python

Python 424 115 Updated Jul 4, 2024

ihmily / DouyinLiveRecorder

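Live-stream recording software that supports continuous unattended monitoring and recording of multiple streamers. It records from 40+ platforms, including Douyin, TikTok, YouTube, Kuaishou, Huya, Douyu, Bilibili, Xiaohongshu, pandatv, sooplive, flextv, popkontv, twitcasting, winktv, Baidu, Weibo, Kugou, 17Live, Twitch, Acfun, CHZZK, and shopee.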

Python 8,935 1,176 Updated Nov 3, 2025

DataoceanAI / Dolphin

Dolphin is a multilingual, multitask ASR model jointly trained by DataoceanAI and Tsinghua University.

Python 680 60 Updated Nov 27, 2025

XinJingHao / DRL-Pytorch

Clean, Robust, and Unified PyTorch implementation of popular Deep Reinforcement Learning (DRL) algorithms (Q-learning, Duel DDQN, PER, C51, Noisy DQN, PPO, DDPG, TD3, SAC, ASL)

Python 3,175 382 Updated Jun 11, 2025

Farama-Foundation / Gymnasium

An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)

Python 10,933 1,218 Updated Dec 20, 2025

boyu-ai / Hands-on-RL

https://hrl.boyuai.com/

Jupyter Notebook 4,314 775 Updated Nov 22, 2022

dariokonopatzki / Deep-Learning-Foundations-and-Concepts-Solutions

My solutions to DLFC - Deep Learning: Foundations and Concepts

94 17 Updated Mar 30, 2025

jishengpeng / WavTokenizer

[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling

Python 1,246 104 Updated Mar 2, 2025

HeCheng0625 / Diffusion-Speech-Tokenizer

This repository contains a series of works on diffusion-based speech tokenizers, including the official implementation of the paper: "TaDiCodec: Text-aware Diffusion Speech Tokenizer for Speech Lan…

Python 196 13 Updated Sep 21, 2025