- Tokyo Metropolitan University
- Tokyo
- https://portfolio.ayutaso.com/about
- @aya172957
Stars
Noise suppression using deep filtering
State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
Implementation of Qformer from BLIP2 in Zeta Lego blocks.
Llama-Mimi is a speech language model that uses a unified tokenizer (Mimi) and a single Transformer decoder (Llama) to jointly model sequences of interleaved semantic and acoustic tokens.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
A little web app that helps you copy+paste syntax-highlighted code into slide decks.
Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) proposed in CosyVoice
PyTorch implementation of Audio Flamingo: Series of Advanced Audio Understanding Language Models
Official Implementation of GLAP - General Language Audio Pretraining
Minimum Bayes Risk Decoding for Hugging Face Transformers
Long-form streaming TTS system for multi-speaker dialogue generation
Dependency injection framework for Python
OpenAPI Generator allows generation of API client libraries (SDK generation), server stubs, documentation and configuration automatically given an OpenAPI Spec (v2, v3)
orval generates clients with appropriate type signatures (TypeScript) from any valid OpenAPI v3 or Swagger v2 specification, in either YAML or JSON format. 🍺
A framework for managing and maintaining multi-language pre-commit hooks.
Google Cloud Storage emulator & testing library.
A lightning-fast search engine API bringing AI-powered hybrid search to your sites and applications.
🤗 smolagents: a barebones library for agents that think in code.