View sweta20's full-sized avatar

Cell2Sentence: Teaching Large Language Models the Language of Biology

Jupyter Notebook 620 87 Updated Oct 15, 2025
Python 283 22 Updated Jul 15, 2024

A versatile toolkit for applying Logit Lens to modern large language models (LLMs). Currently supports Llama-3.1-8B and Qwen-2.5-7B, enabling layer-wise analysis of hidden states and predictions.

Jupyter Notebook 120 10 Updated Aug 14, 2025
Jupyter Notebook 6 Updated Nov 5, 2022

LLM-based QAG framework for MT Evaluation

Python 3 1 Updated May 13, 2025

YSDA course in Natural Language Processing

Jupyter Notebook 10,345 2,712 Updated Oct 23, 2025

A framework for evaluating Machine Translation models.

Python 10 4 Updated May 26, 2025

The most accurate natural language detection library for Python, suitable for short text and mixed-language text

Python 1,525 51 Updated Oct 24, 2025

Toolkit used to collect translations from various online providers and LLMs

Python 10 3 Updated Sep 16, 2025

🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).

2,054 250 Updated Jun 6, 2024
JavaScript 61 7 Updated Jul 19, 2022

A benchmark with locally sourced multilingual questions for 31 languages.

Python 14 3 Updated Sep 2, 2025

Example competitions for the CodaLab project.

Python 25 26 Updated Sep 19, 2025
Python 5 Updated Jun 9, 2025

Examples and guides for using the Gemini API

Jupyter Notebook 15,248 2,197 Updated Oct 20, 2025

Benchmark for evaluating open-ended generation

Python 50 7 Updated Nov 6, 2024

A curated list of research papers and resources on code-switching

327 39 Updated Dec 18, 2024

Quantifying Language Confusion in LLMs.

Jupyter Notebook 2 Updated Oct 17, 2024

Minimal setup to render text as images.

Python 1 Updated Sep 2, 2024

[NeurIPS 2025 D&B Track] Evaluation Code Repo for Paper "PolyMath: Evaluating Mathematical Reasoning in Multilingual Contexts"

Python 32 4 Updated May 22, 2025
Python 7 Updated Jan 23, 2025

Paper list for open-ended language generation

193 19 Updated Nov 17, 2022

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python 43,121 5,713 Updated Aug 16, 2024

Generate text images for training deep learning OCR models

Python 1,450 388 Updated Jan 17, 2022

Render documents on a virtual paper with folds and other types of damage using blender geometry nodes.

Python 23 2 Updated Aug 14, 2023

A large-scale dataset for training and evaluating a model's ability on dense text image generation

Python 81 Updated Sep 27, 2025

Font rendering, atlas generation and text shaping library written in C++

C++ 25 1 Updated Jul 27, 2025

Cross-platform single header text rendering library for OpenGL

C 192 16 Updated May 30, 2023

[EMNLP 2025 Demo] PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents with layout fully preserved; supports services such as Google/DeepL/Ollama/OpenAI and provides CLI/GUI/MCP/Docker/Zotero

Python 29,245 2,585 Updated Oct 20, 2025