Agenta

Technology, Information and Internet

The open-source LLMOps platform for AI teams to ship reliable AI apps

About us

Agenta is the open-source LLMOps platform that lets developers quickly build production-ready LLM-powered applications. Use Agenta to build, iterate on, and evaluate your LLM apps. Whether you use LangChain or LlamaIndex, GPT-4 or Falcon, Agenta integrates with your code without limiting your choice of framework, library, or model.

Website
https://www.agenta.ai
Industry
Technology, Information and Internet
Company size
2-10 employees
Headquarters
Berlin
Type
Privately Held
Founded
2023

Updates

  • Agenta reposted this

    Annotation Queues are now live in Agenta.

    The hardest part of building AI applications is creating evals. Annotation queues make that simpler. Two ways to use them:

    1. Send traces to a queue. Your team reviews them. Then turn them into test sets and evals.
    2. Send test cases to a queue. Your team adds ground truth or rubrics. Then use those for evals.

    Check out the video below.

  • Agenta reposted this

    Sharing my talk from AI Engineer Europe. Most LLM judges used in production are either generic (e.g. hallucination) or vibe prompted. Both are worse than no evals. They make us feel confident when we should not be. In this talk, I go through how to build LLM judges that are useful (and correlate with human annotations). I do that through prompt optimization using GEPA. Check out the link in the comments.

  • Agenta reposted this

    Webhooks and GitHub automations are live in Agenta! You can now connect your GitHub repo (or any webhook) to Agenta so that an automation runs each time a new prompt version is saved. In this video, I show how to sync prompts between GitHub and Agenta. Each time the PM changes the prompt in Agenta, the engineers get a PR in the repo. Docs in the comments.

  • Agenta reposted this

    150+ tool integrations are live in the Agenta Playground. You can connect Gmail, Slack, Notion, Google Sheets, GitHub, and other services directly to your prompts. Authenticate with OAuth, pick the actions you want, and test everything from the playground.

    Some examples of what you can build without writing integration code:

    - Use a Google Sheet as a data source and build RAG from the UI
    - Have your prompt draft and send emails through Gmail
    - Post to Slack channels based on LLM output
    - Create GitHub issues from structured extraction prompts

    Check it out 👇🏼

  • Agenta reposted this

    Are you still writing your prompts by hand? AI is much better at writing prompts. When we talked to our users, we discovered that a lot of them copy-pasted the prompt into ChatGPT, edited it there, and then went back. So we improved the flow. Now you click the wand icon, type what you want to change, and get back your refined version. The prompt refiner itself was built with Agenta, and it includes all the best practices for writing clear prompts that work well. Since we started using it internally, my flow is just talking through what I want using Wispr Flow and hitting refine. Prompt iteration got way faster. Try it in the Playground: https://cloud.agenta.ai

  • Agenta reposted this

    Agenta has new enterprise features:

    - SSO with any OIDC provider: Okta, Azure AD, Auth0, OneLogin, Google Workspace. You can enforce SSO-only for an org and disable password login.
    - Domain verification: Verify your domain once and anyone who signs up with a matching email joins your org automatically.
    - A new US region for customers who need their data to stay in the US.

    More details below 👇🏼

  • Agenta reposted this

    Why prompt engineering will never die:

    Every few months, someone declares prompt engineering dead. First it was "context engineering" that replaced it. Then "agent harnesses." Now some argue models will get so smart you won't need prompts at all. I think this is wrong, and I think the reason is that people misunderstand what prompt engineering actually is.

    When people hear "prompt engineering," they picture alchemy. Special tricks to make the LLM behave. Phrases like "solve this or I'll die" that somehow improve outputs. With older models, some of that worked. New models don't need any of it. You can write plain English with typos and they'll understand you fine. So if that's your definition, then sure, it's dead.

    But that was never what prompt engineering was really about. An LLM, even a very powerful one, is like a genius on their first day at a new job. Brilliant, maybe smarter than everyone in the room. But they know nothing about your company, your product, or how you want things done. Your job is to explain all of that. What counts as a technical question versus a billing question. What tone to use. What your brand guidelines are. What coding conventions you follow. What's off-limits. Writing all of this down, iterating until the AI behaves the way you want, is prompt engineering.

    If you've ever written a PRD, this should sound familiar. You're describing how a system should behave, what the flows look like, what the edge cases are. Except the system is an AI.

    Context engineering still matters. But deciding what the AI should do with that context is the harder problem, and that's prompt engineering. Prompt engineering isn't dead. We just had the wrong definition.

    • Prompt Engineering Will Never Die
  • Agenta reposted this

    Comparing evaluations across time is broken if your test data changed. We added test set versioning to Agenta. Every edit creates a new version. Evaluations link to the version they used. Now you know if your prompt got worse or if someone just added harder test cases. Also rebuilt the test set UI. It handles 100K+ rows. Editing chat messages and JSON is also much easier.

  • Agenta reposted this

    Three quality-of-life improvements to the Agenta Playground.

    Provider costs upfront. You can now see the cost per million tokens directly in the model selection dropdown.

    Evaluations from the Playground. Start an evaluation run without leaving the Playground. Click evaluate, and your current prompt configuration goes straight to evaluation. Keeps you in flow when iterating.

    Collapsible test cases. When working with large test sets, collapse completed test cases to focus on what you're working on. You still see a preview of each one for context.
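
As a rough illustration of how a cost-per-million-tokens figure like the one in the model dropdown translates into a per-request cost, here is a minimal Python sketch. The `ModelPrice` class, the `request_cost` function, and the prices used are hypothetical and chosen only for the arithmetic; they are not Agenta's API or any provider's actual pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """Hypothetical price entry: US$ per one million tokens."""
    input_per_million: float
    output_per_million: float


def request_cost(price: ModelPrice, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts."""
    total = (
        input_tokens * price.input_per_million
        + output_tokens * price.output_per_million
    )
    return total / 1_000_000


# Made-up prices for illustration only.
example_price = ModelPrice(input_per_million=2.0, output_per_million=8.0)

# 1,500 prompt tokens and 500 completion tokens.
cost = request_cost(example_price, input_tokens=1500, output_tokens=500)
print(f"Estimated cost: ${cost:.4f}")
```

Input and output tokens are priced separately here because many providers quote them at different rates; a single blended rate would reduce this to one multiplication.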

Funding

Agenta 2 total rounds

Last Round

Seed

US$ 1.1M