Frags is a Go library that allows you to define and execute complex, multi-step data extraction plans using AI. It works by fragmenting a large JSON Schema into smaller, sequential "sessions" and "phases." This enables you to build robust, iterative interactions with AI models, making it ideal for tasks like structured data extraction from documents, progressive data collection, or any scenario where you need to manage a complex, stateful conversation with an AI.
When interacting with AI APIs that support structured JSON output, you may encounter scenarios where:
- The complete output schema is too large, potentially hitting output token limits in a single response
- You want to process information incrementally without maintaining multiple separate schemas
- You want to define multiple, distinct AI interactions that operate on the same overall data model, without polluting a single global context and diluting the model's focus
- Sequential data collection makes more sense than requesting everything at once
Frags allows you to:
- Define a single, large schema that can be broken down into smaller, purpose-oriented chunks
- Define multiple, distinct AI sessions that operate on the same overall data model, each one with its own context
- Define sub-objectives within each session (phases), which are translated into sequential AI interactions, to reduce the size of each output and ensure JSON Schema compliance
- Define dependencies between sessions, allowing you to build workflows that run in sequence or in parallel
- Define custom components that can be reused across sessions
- Define custom AI clients that support file uploads and JSON schema responses
- Load resources from the file system or any other source
- Run multiple sessions in parallel to improve performance
And much more
```shell
go get github.com/theirish81/frags
```

The plan is the central piece of the puzzle. It defines the overall data model, the AI interactions, and the dependencies between them. You can easily do this in code, but it's far more readable to define it in a YAML file.
The schema defines the overall data model. It's a standard JSON Schema, with the addition of a few tags that allow you to define the structure of the AI interactions. Example:
```yaml
schema:
  type: object
  required:
    - name
    - email
    - street_address
    - city
    - country
    - product_name
    - rating
    - review_summary
    - full_review_text
  properties:
    # Session 'user_profile'
    name:
      type: string
      description: "The user's full name."
      x-session: user_profile
      x-phase: 0
    email:
      type: string
      description: "The user's email address."
      x-session: user_profile
      x-phase: 0
    street_address:
      type: string
      description: "The user's street address."
      x-session: user_profile
      x-phase: 1
    city:
      type: string
      description: "The city."
      x-session: user_profile
      x-phase: 1
    country:
      type: string
      description: "The country."
      x-session: user_profile
      x-phase: 1
    # Session 'product_review'
    product_name:
      type: string
      description: "The name of the product being reviewed."
      x-session: product_review
      x-phase: 0
    rating:
      type: number
      description: "The user's rating, from 1 to 5."
      minimum: 1
      maximum: 5
      x-session: product_review
      x-phase: 0
    review_summary:
      type: string
      description: "A one-sentence summary of the review."
      x-session: product_review
      x-phase: 1
    full_review_text:
      type: string
      description: "The full text of the product review."
      x-session: product_review
      x-phase: 2
```

Please notice:
- The `x-session` tag defines the name of the session.
- The `x-phase` tag defines the index of the phase within the session.
The sessions define the AI interactions. As before, you can define them in code, but it's much more readable to do it in YAML. Since both the schema and the sessions unmarshal into the same Go struct (`SessionManager`), you can keep them in the same YAML file.
Example:
```yaml
sessions:
  user_profile:
    prompt: "Extract the user's primary details from the provided document"
    nextPhasePrompt: "Also these secondary details"
    resources:
      - identifier: user_text.txt
  product_review:
    prompt: "Extract the required information from the provided document"
    nextPhasePrompt: "Also extract these items"
    resources:
      - identifier: product_details.pdf
```

Please notice:
- The `prompt` field defines the prompt that will be sent to the AI.
- The `nextPhasePrompt` field defines the prompt that will be sent to the AI after each phase.
- The `resources` field defines the files that will be uploaded to the AI.
- The name of the session matches the `x-session` tag value.
- This example does not include features like dependencies or context carry-over.
The Runner is the highest-level abstraction. It automates the entire workflow:
- Loads a `SessionManager`.
- Takes a `ResourceLoader` to load the files/data required by each session.
- Takes an `Ai` interface implementation to make the actual calls to your AI.
- Runs each session, automatically iterating through its local phases. It uses the session's `prompt` for the first phase and the `nextPhasePrompt` for all subsequent phases.
- Unmarshals the structured JSON results from the AI into a final Go struct.
Example:
```go
func main() {
	sessionData, err := os.ReadFile("test_data/sessions.yaml")
	if err != nil {
		log.Fatal(err)
	}
	mgr := NewSessionManager()
	if err := mgr.FromYAML(sessionData); err != nil {
		log.Fatal(err)
	}
	ai := NewDummyAi()
	runner := NewRunner[T](mgr, NewDummyResourceLoader(), ai)
	out, err := runner.Run(nil)
	if err != nil {
		log.Fatal(err)
	}
	// do something with the output
	_ = out
}
```

Session Parallelism
Sessions can run in parallel, which can significantly improve performance. By default, however, the runner runs sessions sequentially. To enable parallelism, use the `WithSessionsWorkers(int)` option when creating the runner.
Reusability
The same instance of a runner can be reused multiple times; however, it can only work on one task at a time and will return an error if you call `Run` before the previous task has completed. Each instance of `Runner` is fully independent of the others and thread-safe.
In a plan you can define reusable components. These components can be used in multiple sessions.
The components section in the plan YAML file defines prompts and schemas.
Example:
```yaml
components:
  prompts:
    user_profile_prompt: "Extract the user information"
  schemas:
    user_profile_schema:
      type: object
      required:
        - name
        - email
      properties:
        name:
          type: string
        email:
          type: string
```

Prompts can be referenced using Go template notation, while schemas can be referenced using the `$ref` notation.
Ideally, a Frags plan does not solve a single problem, but rather a set of related problems. To make it easier to reuse the same plan across different scenarios, you can use parameters and templating to customize the plan for each one.
The scope is a map of key-value pairs. Its root level consists of:
- `context`: the accumulated data from all the previous sessions
- `params`: the parameters passed to the runner
- `vars`: local variables defined in the session
- `components`: reusable components defined in the plan

You can reference these, and any sub-property, using dot notation.
All prompts support Go template syntax, allowing for dynamic prompts. In combination with the scope, you can express prompts such as `Find info about the car {{ .params.car_model }}` or `Obtain reviews for the product {{ .context.product_name }}`.
Again with reusability in mind, you can define session-bound variables. This is particularly useful in combination with reusable prompts.
Example:
```yaml
components:
  prompts:
    user_profile_prompt: "Extract the user information for the user {{ .vars.user_id }}"
sessions:
  user_profile:
    prompt: "{{ .components.Prompts.user_profile_prompt }}"
    vars:
      user_id: 12345
```

Important: notice the capital P in `Prompts`.
When running a plan, you can pass parameters to the runner. These parameters are available in the scope as params.
```go
runner := NewRunner[T](mgr, NewDummyResourceLoader(), ai)
runner.Run(map[string]interface{}{
	"car_model": "BMW",
})
```

and then refer to them in the prompt: `Find info about the car {{ .params.car_model }}`
You can define dependencies between sessions using the dependsOn property. This allows you to create workflows where
a session will only run after other sessions have successfully completed. dependsOn is a list of dependencies, and
all dependencies must be met for the session to run.
A dependency can be on another session, or it can be a conditional expression.
- Session dependency: to make a session wait for another, specify the session name.
- Conditional expression: you can add an `expression` to a dependency. This expression is evaluated against the data extracted so far, and the session will only run if the expression evaluates to `true`. Expressions are written in a simple expression language (see antonmedv/expr).
A session may depend on another for a few main reasons:
- Whether it executes at all depends on the outcome of the previous session.
- The data extracted by the previous session may be required by the current session.
- Context carry-over.
In this example, `end_session` depends on `session_one` and `session_two`. It will only run after both are complete, and only if the `animal1` field has been extracted in `session_one`.
```yaml
sessions:
  session_one:
    prompt: answer the question
  session_two:
    prompt: answer the question
  end_session:
    context: true
    dependsOn:
      - session: session_one
        expression: len(context.animal1) > 0
      - session: session_two
    prompt: answer the question based on the answers to the previous questions
```

When a session depends on others, it can be useful to access the data extracted by its dependencies. By setting `context: true` on a session, you make the LLM aware of the already extracted data.
By default, a phase will be attempted once. You can add resilience to your workflow by specifying the `attempts` property on a session. This will cause each phase within that session to be retried up to the specified number of times if the AI call fails or returns invalid data.
```yaml
sessions:
  robust_session:
    prompt: "This is an important prompt."
    attempts: 3
```

Each session can optionally define a `prePrompt`. This is a special prompt that runs before the first phase of the
session and its purpose is to enrich and prepare the context for the subsequent AI interactions. Unlike regular session
prompts, the prePrompt does not contribute to the structured output of the session. It's particularly useful for
guiding LLMs that may struggle with combining tool calling and structured output by priming them with necessary context
or instructions.
For example, you can use a prePrompt to instruct the LLM on its persona or specific rules it should follow during the
session without those instructions being part of the final data extraction.
```yaml
sessions:
  my_session:
    prePrompt: "You are an expert data extraction agent. Always respond concisely."
    prompt: "Extract the user's name and email."
```

Notice: pre-prompts are also useful to overcome some LLM limitations. For example, some LLMs are incapable of calling certain tools and providing a structured output in the same interaction.
Frags allows LLMs that support tool calling to be used in your plan. Simply define the tools in the plan and reference them in the prompts. Tools must be enabled on a per-session basis.
Example:
```yaml
sessions:
  my_session:
    prompt: search for "frags" in the database
    tools:
      - name: db_search
        type: function
```

In this specific example, we're enabling a custom tool called `db_search` that was implemented and injected into the AI client.

But there are other options.
Search is a special tool type and does not require a coded implementation. Instead, it relies on the LLM's built-in search capabilities (if available). The Gemini AI client supports this out of the box. To enable it, simply add the internet search tool like this:
```yaml
sessions:
  my_session:
    prePrompt: search for "frags" on the internet
    prompt: adapt the discovered information to the schema
    tools:
      - type: internet_search
```

NOTICE: in this example I explicitly placed the internet search tool in the `prePrompt` section, because Gemini does not work well with search and structured output in the same interaction. If your model does not have this problem, you can place it in the `prompt` section instead.
Frags also supports connecting to MCP (Model Context Protocol) servers. The implementation is still quite naive, but it should work for many use cases. To enable MCP servers, create an `mcp.json` file in the same directory as the `.env` file. The format is similar, if not identical, to the one proposed by Anthropic.
Example:
```json
{
  "mcpServers": {
    "bigQuery": {
      "command": "toolbox",
      "args": ["--prebuilt", "bigquery", "--stdio"],
      "env": {
        "BIGQUERY_PROJECT": "foobar-436917",
        "GOOGLE_APPLICATION_CREDENTIALS": "bigQuery.json"
      }
    },
    "lmx": {
      "url": "http://localhost:8080/sse"
    }
  }
}
```

The supported modes are stdio and SSE. You will then need to enable the MCP servers you wish to use in each session.
Example:
```yaml
sessions:
  my_session:
    prePrompt: search for "frags" in bigQuery, table "services"
    prompt: adapt the discovered information to the schema
    tools:
      - type: mcp
        serverName: bigQuery
## continues with schema...
```

Not a tool, but a way to employ tools: pre-calls allow you to instruct the Runner directly to call tools at the start of a session. The invocation is programmatic, requires the plan writer to know exactly what they're doing, and does not involve the LLM. The output of the pre-calls is included in the context.
Example:
```yaml
sessions:
  my_session:
    preCalls:
      - name: list_files
        description: "File list of the test_data directory"
        args:
          dir: "./test_data/"
    prompt: What file types do we have?
## continues with schema...
```

Notice:
- Out of the box, there's no `list_files` tool; this is just an example.
- Pre-calls can access any available tool, regardless of whether it has been enabled for the specific session or not.
Loading resources can take various forms depending on your needs. The system ships with a simple `FileResourceLoader`, but you can implement your own to load resources from any source.
See the FileResourceLoader implementation for an example.
Frags is LLM-agnostic. You can implement your own AI client by implementing the `Ai` interface. There are currently two implementations, distributed as separate packages:
Frags is also distributed with a CLI tool.