- Inria-ALMAnaCH
- Paris-France
- https://wissamantoun.github.io/
- @wissam_antoun
Stars
[NeurIPS 2025 D&B] Open-source Multi-agent Poster Generation from Papers
Sync links from Hacker News upvotes, Reddit Saves and more to Karakeep/Hoarder for centralized bookmark management
Dockerfile for WhisperX: Automatic Speech Recognition with Word-Level Timestamps and Speaker Diarization (Dockerfile, CI image build and test)
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
"RAG-Anything: All-in-One RAG Framework"
LanguageTechnologiesUnit's repository for tokenizer training and evaluation scripts.
Tooling for exact and MinHash deduplication of large-scale text datasets
🌩 Self-hosted file management and sharing system, supports multiple storage providers
Collection of handy online tools for developers, with great UX.
Personal website and blog written in FastHTML
🎨 NeMo Data Designer: A general library for generating high-quality synthetic data from scratch or based on seed data.
CLEANANERCorp, a corrected version of the classic Arabic NER benchmark ANERcorp with updated and more consistent NER labels
A Multilingual Keyboard Layout-Based Typo Generator
A Site Reliability Engineer AI agent that can monitor application and infrastructure logs, diagnose issues, and report on diagnostics.
A library for efficient patching and automatic circuit discovery.
Tensorlake is a Document Ingestion API and a serverless platform for building data processing and orchestration APIs
Tunneling Internet traffic over WhatsApp
An open-source tool for LLM prompt optimization.
Parallax is a distributed model serving framework that lets you build your own AI cluster anywhere
A Rust implementation of Token-Oriented Object Notation (TOON), a compact, human-readable format designed for passing structured data to Large Language Models with significantly reduced token usage…