ishine

speech asr/speech-recognition tts/text-to-speech vc/voice-conversion ac/accent-conversion

159 followers · 244 following

gerzz.inc
shanghai
dubbing-ai.com dubbingai.io

Stars

The-Swarm-Corporation / SpikeMamba

SpikeMamba presents a novel integration of spiking neural networks (SNNs) with the Mamba state space model architecture, investigating the potential for biologically-inspired temporal dynamics in l…

Python 3 Updated Sep 9, 2025

Mistrymm7 / AEC-Design-Technologist

Resources to develop programming and software development skills

HTML 28 11 Updated Sep 21, 2023

asgeirtj / system_prompts_leaks

Collection of extracted System Prompts from popular chatbots like ChatGPT, Claude & Gemini

JavaScript 23,079 3,534 Updated Oct 22, 2025

KeisukeImoto / LEAD_dataset

10 Updated Dec 4, 2024

Byaidu / PDFMathTranslate

[EMNLP 2025 Demo] PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents with the layout fully preserved; supports services such as Google/DeepL/Ollama/OpenAI and provides CLI/GUI/MCP/Docker/Zotero

Python 29,224 2,582 Updated Oct 20, 2025

colaudiolab / AudioSet-R

Official implementation: "AudioSet-R: A Refined AudioSet with Multi-Stage LLM Label Reannotation"

Python 10 1 Updated Oct 9, 2025

liuhuadai / OmniAudio

[ICML 2025] PyTorch Implementation of "OmniAudio: Generating Spatial Audio from 360-Degree Video"

Python 330 9 Updated Jun 27, 2025

FreedomIntelligence / FusionAudio

Towards Fine-grained Audio Captioning with Multimodal Contextual Cues

Python 81 5 Updated Sep 29, 2025

microsoft / PhiCookBook

This is a book on the Phi family of SLMs for getting started with Phi models. Phi is a family of open-source AI models developed by Microsoft. Phi models are the most capable and cost-effective small langua…

Jupyter Notebook 3,557 461 Updated Oct 17, 2025

jjunak-yun / FLowHigh_code

[ICASSP 2025] "FLowHigh: Towards efficient and high-quality audio super-resolution with single-step flow matching"

Python 80 10 Updated Jan 17, 2025

CoreaSpeech / sourcecode

Python 6 1 Updated May 30, 2025

yukiar / OTAlign

Repository of ACL2023 paper: Unbalanced Optimal Transport for Unbalanced Word Alignment

Python 38 5 Updated Sep 13, 2023

line / LibriTTS-P

LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning

152 3 Updated Jun 13, 2024

policy-gradient / GRPO-Zero

Implementing DeepSeek R1's GRPO algorithm from scratch

Python 1,630 74 Updated Apr 18, 2025

SkyworkAI / SkyReels-V2

SkyReels-V2: Infinite-length Film Generative model

Python 4,790 669 Updated Aug 11, 2025

mmacosha / skills-introduction-to-github

My clone repository

1 Updated Sep 2, 2025

RoyJames / room-impulse-responses

A list of publicly available room impulse response datasets and scripts to download them.

Shell 514 46 Updated Oct 11, 2025

sarulab-speech / ml-audiocaps

Multi-lingual AudioCaps

11 Updated Nov 20, 2023

simplescaling / s1

s1: Simple test-time scaling

Python 6,581 766 Updated Jun 25, 2025

HCI-LAB-UGSPEECHDATA / speech_data_ghana_ug

The dataset comprises a 5000-hour speech corpus in Akan, Ewe, Dagbani, Daagare, and Ikposo. Each language includes 1000 hours of audio speech from indigenous speakers of the language. Of which 10…

HTML 9 4 Updated May 2, 2025

zhenglinpan / AnitaDataset

A free, licensed, and industrial animation dataset

69 5 Updated Jun 26, 2024

microsoft / markitdown

Python tool for converting files and office documents to Markdown.

Python 82,032 4,591 Updated Oct 20, 2025

zwhe99 / MAPS-mt

[TACL 2024] MAPS enables LLMs🤖 to mimic the human😁 translation process.

Python 144 7 Updated Jun 7, 2024

ydqmkkx / Respiro-en

Official implementation of paper: Frame-Wise Breath Detection with Self-Training: An Exploration of Enhancing Breath Naturalness in Text-to-Speech

Python 30 4 Updated Sep 18, 2024

yangchris11 / samurai

Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"

Python 6,973 476 Updated Mar 18, 2025

py2many / py2many

Transpiler of Python to many other languages

Python 1,041 67 Updated Sep 9, 2025

huutuongtu / skd-ctc

Guiding Frame-Level CTC Alignments Using Self-knowledge Distillation

Python 7 1 Updated Sep 25, 2024

lijin0120 / CELSDS

A Chinese Expressive Long-dialogue Speech Dataset with Scripts

Python 20 3 Updated Nov 11, 2024

Hayeonbang / PIAST

A piano music dataset with Audio, Symbolic and Text labels

Python 33 Updated Mar 6, 2025

google / sequence-layers

A neural network layer API and library for sequence modeling, designed for easy creation of sequence models that can be executed layerwise (training) and stepwise (sampling).

Python 44 7 Updated Aug 1, 2025
