llm-evaluation

Here are 333 public repositories matching this topic...

Agentic Workflow Evaluation: Text Summarization Agent. This project implements an AI-agent evaluation workflow around a text summarization model, built with the OpenAI API and the Transformers library. It follows an iterative approach: generate summaries, analyze metrics, adjust parameters, and retest to refine the agent for accuracy, readability, and performance.

  • Updated Feb 23, 2025
  • Python
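
To make the loop above concrete, here is a minimal sketch of one generate-score-adjust-retest round, assuming a chat model via the OpenAI SDK and ROUGE-L from the rouge_score package as the metric; the model name, metric choice, and stopping rule are illustrative assumptions, not the project's actual configuration.

```python
# Hypothetical sketch of the generate -> score -> adjust -> retest loop.
from openai import OpenAI
from rouge_score import rouge_scorer

client = OpenAI()  # reads OPENAI_API_KEY from the environment
scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)

def summarize(article: str, temperature: float) -> str:
    """One generation pass with the current decoding parameters."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model, not the repo's choice
        temperature=temperature,
        messages=[{"role": "user",
                   "content": f"Summarize in 2 sentences:\n\n{article}"}],
    )
    return resp.choices[0].message.content

def refine(article: str, reference: str, rounds: int = 3) -> str:
    """Iterate: generate, measure ROUGE-L against a reference summary,
    lower the temperature when the score stalls, keep the best output."""
    best, best_score, temp = "", 0.0, 0.9
    for _ in range(rounds):
        candidate = summarize(article, temp)
        score = scorer.score(reference, candidate)["rougeL"].fmeasure
        if score > best_score:
            best, best_score = candidate, score
        else:
            temp = max(0.1, temp - 0.3)  # adjust parameters and retest
    return best
```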
celebrity-death-bot

Cloudflare Workers app that watches Wikipedia for newly reported notable deaths, LLM‑filters and de‑duplicates them, then publishes concise memorial posts (Telegram + X) via a lightweight public JSON API. Automates detection, verification, and multi‑platform distribution with low latency and minimal ops overhead.

  • Updated Sep 7, 2025
  • TypeScript
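
The bot itself is TypeScript on Cloudflare Workers; purely for illustration, and to keep one language across the sketches on this page, here is a Python outline of the filter-and-de-duplicate stage. The notability prompt, model name, and hash-based seen-key scheme are assumptions, not the repo's actual logic.

```python
# Illustrative sketch only: the real Worker would persist seen keys in
# KV storage rather than an in-memory set.
import hashlib
from openai import OpenAI

client = OpenAI()
seen: set[str] = set()

def dedupe_key(name: str) -> str:
    """Stable key so the same death is never posted twice."""
    return hashlib.sha256(name.strip().lower().encode()).hexdigest()

def is_notable(entry: str) -> bool:
    """LLM filter: keep only genuinely notable, newly reported deaths."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model
        messages=[{"role": "user",
                   "content": "Answer yes or no: does this Wikipedia entry "
                              f"report a notable person's death?\n{entry}"}],
    )
    return resp.choices[0].message.content.strip().lower().startswith("yes")

def process(entries: list[str]) -> list[str]:
    """Filter and de-duplicate; downstream code would post to Telegram + X."""
    posts = []
    for entry in entries:
        key = dedupe_key(entry)
        if key not in seen and is_notable(entry):
            seen.add(key)
            posts.append(entry)
    return posts
```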

LLMBuilder is a production-ready framework for training and fine-tuning Large Language Models (LLMs), not a model itself. Designed for developers, researchers, and AI engineers, LLMBuilder provides a full pipeline to go from raw text data to deployable, optimized LLMs, all running locally on CPUs or GPUs.

  • Updated Sep 2, 2025
  • Python
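
LLMBuilder's own API is not shown here; as a rough stand-in, this sketch walks the same raw-text-to-deployable-checkpoint pipeline using Hugging Face Transformers. The base model, file path, and hyperparameters are illustrative assumptions.

```python
# Stand-in pipeline sketch: raw text file -> tokenized dataset ->
# fine-tuned causal LM -> saved checkpoint. Not LLMBuilder's actual API.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # assumed base model
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Raw text in, token sequences out ("corpus.txt" is a placeholder path).
raw = load_dataset("text", data_files={"train": "corpus.txt"})
tokenized = raw.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()                   # runs locally on CPU or GPU
trainer.save_model("out/final")   # deployable checkpoint
```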

Comparative study of 23 LLMs for Brazilian Portuguese sentiment analysis via in-context learning. Evaluates multilingual vs Portuguese-specialized models across 12 datasets. Code and data included.

  • Updated Jun 30, 2025
  • Jupyter Notebook
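
As a minimal sketch of the in-context-learning setup described above: a few-shot Portuguese sentiment prompt plus a per-model accuracy harness. The example prompts and the generate() callable are assumptions, not the study's actual materials; in the study this harness would be run once per model-dataset pair.

```python
# Hypothetical few-shot sentiment classification harness.
from typing import Callable

FEW_SHOT = (
    "Classifique o sentimento como positivo ou negativo.\n"
    "Texto: Adorei o produto, chegou rápido. Sentimento: positivo\n"
    "Texto: Péssima experiência, não recomendo. Sentimento: negativo\n"
)

def classify(generate: Callable[[str], str], text: str) -> str:
    """Append the test instance to the few-shot prompt, read the label."""
    out = generate(f"{FEW_SHOT}Texto: {text} Sentimento:").strip().lower()
    return out.split()[0] if out else ""

def accuracy(generate: Callable[[str], str],
             dataset: list[tuple[str, str]]) -> float:
    """Score one model on (text, gold_label) pairs."""
    hits = sum(classify(generate, t) == g for t, g in dataset)
    return hits / len(dataset)
```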
