goodmike31

Clean APIs for data cleaning. Python implementation of R package Janitor

Python 1,477 180 Updated Jan 26, 2026

Anthropic's Interactive Prompt Engineering Tutorial

Jupyter Notebook 29,182 2,851 Updated Jul 11, 2024

A Gym for Agentic LLMs

Python 437 29 Updated Jan 21, 2026

TrustJudge is a probabilistic evaluation framework that reduces score-comparison and pairwise transitivity inconsistencies in LLM-as-a-judge systems.

Python 38 1 Updated Sep 27, 2025

A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.

Python 2,973 228 Updated Jun 19, 2025

AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension

Python 127 6 Updated Dec 9, 2024

AudioBench: A Universal Benchmark for Audio Large Language Models

Python 293 14 Updated Jun 17, 2025

A toolkit for processing speech data and creating speech datasets

Python 196 42 Updated Sep 29, 2025

AudioStory: Generating Long-Form Narrative Audio with Large Language Models

Jupyter Notebook 295 20 Updated Sep 21, 2025

MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

Python 434 49 Updated Oct 7, 2025

GUI for ChatGPT API and many LLMs. Supports agents, file-based QA, GPT finetuning and query with web search. All with a neat UI.

Python 15,401 2,264 Updated Aug 15, 2025

Accessibility engine for automated Web UI testing

JavaScript 6,832 858 Updated Jan 23, 2026

Delightful JavaScript Testing.

TypeScript 45,276 6,626 Updated Jan 22, 2026

Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.

TypeScript 81,690 5,047 Updated Jan 26, 2026

☁️ 🚀 📊 📈 Evaluating state of the art in AI

Python 1,988 986 Updated Jan 26, 2026

Inference-time scaling for LLMs-as-a-judge.

Jupyter Notebook 327 24 Updated Nov 5, 2025
Jupyter Notebook 267 260 Updated Jan 8, 2026

Langtrace 🔍 is an open-source, Open Telemetry based end-to-end observability tool for LLM applications, providing real-time tracing, evaluations and metrics for popular LLMs, LLM frameworks, vector…

TypeScript 1,171 119 Updated Nov 17, 2025

Building AI agents, atomically

Python 5,525 456 Updated Jan 3, 2026

Generate an infinite and diverse dataset using an agentic system!

Python 1 Updated Jun 13, 2025

Collection of awesome LLM apps with AI Agents and RAG using OpenAI, Anthropic, Gemini and opensource models.

Python 89,548 12,912 Updated Jan 24, 2026

Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthr…

Python 34,511 5,475 Updated Jan 26, 2026

A collection of sample agents built with Agent Development Kit (ADK)

Python 8,225 2,205 Updated Jan 26, 2026

A Medical / Clinical Note Taking Demo Application using Deepgram Voice Agent API

TypeScript 12 13 Updated Jul 9, 2025

DataComp for Language Models

HTML 1,409 129 Updated Sep 9, 2025

Evaluation and Tracking for LLM Experiments and AI Agents

Python 3,057 246 Updated Jan 24, 2026

AI Observability & Evaluation

Jupyter Notebook 8,373 695 Updated Jan 26, 2026

Extensible toolkit for comprehensive evaluation of ASR systems. Currently supports benchmarking 29 system-model combinations for Polish using BIGOS datasets.

Python 11 2 Updated Mar 2, 2025