Description
--FEEDBACK WELCOME--
DocumentAgent will be a document-based agent that can ingest documents and other sources of information and make that knowledge available for its given task.
Examples of use-cases:
- Document classification
- Document/Page summarisation
- Question Answering
- Identify missing information
- Invoice handling
The objective for this Phase is to provide a quick-start agent that developers can incorporate easily.
DocumentAgent will include RAG capabilities and will be built progressively. This Phase 1 implementation contains basic RAG capabilities, such as ingesting documents and embedding them into a vector database. Future implementations will add more advanced RAG capabilities and engines, as well as further capabilities for document transformation.
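To make the "ingest, then embed into a vector database" step concrete, here is a minimal sketch in plain Python. It is not the planned implementation: `InMemoryVectorStore`, `embed`, and `cosine` are hypothetical names, the word-count "embedding" is a stand-in for a real embedding model, and the in-memory list is a stand-in for a real vector database.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (assumption: stands in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database."""

    def __init__(self) -> None:
        self._rows: list[tuple[str, Counter]] = []

    def ingest(self, text: str, chunk_size: int = 40) -> int:
        """Split text into fixed-size word chunks, embed each chunk, and store it."""
        words = text.split()
        chunks = [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]
        self._rows.extend((chunk, embed(chunk)) for chunk in chunks)
        return len(chunks)

    def query(self, question: str, top_k: int = 1) -> list[str]:
        """Return the top_k stored chunks most similar to the question."""
        q = embed(question)
        ranked = sorted(self._rows, key=lambda row: cosine(q, row[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:top_k]]


store = InMemoryVectorStore()
store.ingest("Invoices are due within thirty days of the issue date.")
store.ingest("The office is closed on public holidays.")
best = store.query("When are invoices due?")
```

The retrieved chunks in `best` are what a RAG agent would place in the prompt before asking the LLM to answer.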
Capabilities include:
- Input: Read one or more TXT, CSV, PDF, HTML, Markdown, PPTX, or JSON sources
- Extract and store data, including into an intermediate format (such as Docling's DoclingDocument format)
- Developer-determined handling (put in prompt, use vector database, use third-party query engine)
- Query data, including support for third-party querying
- Support for Structured Outputs to control output format
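As an illustration of the input capability, the sketch below normalises a `sources` argument (a single string or a list, as in the examples) and classifies each entry by URL scheme or file extension. `SourceKind`, `detect_source_kind`, and `normalise_sources` are hypothetical names chosen for this sketch and are not part of the proposed API.

```python
from enum import Enum
from pathlib import PurePosixPath
from urllib.parse import urlparse


class SourceKind(Enum):
    """Hypothetical enum covering the input types listed above."""
    TEXT = "txt"
    CSV = "csv"
    PDF = "pdf"
    HTML = "html"
    MARKDOWN = "md"
    PPTX = "pptx"
    JSON = "json"
    URL = "url"


_BY_SUFFIX = {
    ".txt": SourceKind.TEXT, ".csv": SourceKind.CSV, ".pdf": SourceKind.PDF,
    ".html": SourceKind.HTML, ".htm": SourceKind.HTML, ".md": SourceKind.MARKDOWN,
    ".pptx": SourceKind.PPTX, ".json": SourceKind.JSON,
}


def detect_source_kind(source: str) -> SourceKind:
    """Classify a source as a URL or as a file type based on its extension."""
    if urlparse(source).scheme in ("http", "https"):
        return SourceKind.URL
    suffix = PurePosixPath(source).suffix.lower()
    if suffix not in _BY_SUFFIX:
        raise ValueError(f"Unsupported source type: {source!r}")
    return _BY_SUFFIX[suffix]


def normalise_sources(sources) -> list[tuple[str, SourceKind]]:
    """Accept one source or a list of sources and return (source, kind) pairs."""
    if isinstance(sources, str):
        sources = [sources]
    return [(s, detect_source_kind(s)) for s in sources]


pairs = normalise_sources(["docs/report.PDF", "https://my.url.com"])
```

Classifying by extension is only one possible design; content sniffing or an explicit developer-supplied type would be reasonable alternatives.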
Example code (not final API):
# Most basic
my_document_agent = DocumentAgent(
    name="docagent",
    llm_config=...,
    sources="my_file.txt",
)
# Multiple sources, supporting different types
my_document_agent = DocumentAgent(
    name="docagent",
    llm_config=...,
    sources=[my_file_name_with_path, "https://my.url.com"],
)
# Storage and Retrieval from a Vector database
my_document_agent = DocumentAgent(
    name="docagent",
    llm_config=...,
    sources=[my_file_name_with_path, my_file_name_with_path],
    handling_config=DocumentHandlingConfig(document_types=[DocType.Text, DocType.XLSX], storage=DocumentStore.Weaviate, settings={...}),
)
# 3rd-party query engine (or this could be an agent built on DocumentAgent, e.g. DocumentAgentAgentQL)
my_document_agent = DocumentAgent(
    name="docagent",
    llm_config=None,
    sources="https://my.url.com",
    handling_config=None,
    query_config=DocumentQueryConfig(document_types=[DocType.URL], provider=DocQueryProvider.AgentQL, settings={...}),
)
Internal agent workflow:
- Load/convert the document through the handling configuration (defaulted for ease of use)
- Use the query configuration to respond to queries (e.g. inject the full source into the system message, query the vector store and inject the results into the system message, or run an external provider)
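The two workflow steps above could be sketched roughly as follows. This is an assumption-laden illustration and not the agent's design: `QueryMode`, `WorkflowSketch`, and `keyword_retriever` are hypothetical names, and the keyword-overlap retriever merely stands in for a vector-store lookup.

```python
import re
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Optional


class QueryMode(Enum):
    FULL_SOURCE = "full_source"    # inject every loaded document into the system message
    VECTOR_STORE = "vector_store"  # inject only the retrieved documents
    EXTERNAL = "external"          # delegate the query to a third-party provider


def keyword_retriever(question: str, documents: list[str]) -> list[str]:
    """Stand-in for a vector-store query: keep documents sharing a word with the question."""
    terms = set(re.findall(r"[a-z0-9]+", question.lower()))
    return [d for d in documents if terms & set(re.findall(r"[a-z0-9]+", d.lower()))]


@dataclass
class WorkflowSketch:
    mode: QueryMode = QueryMode.FULL_SOURCE
    retriever: Optional[Callable[[str, list[str]], list[str]]] = None
    external_provider: Optional[Callable[[str], str]] = None
    documents: list[str] = field(default_factory=list)

    def load(self, text: str) -> None:
        """Step 1: load (already converted) document text."""
        self.documents.append(text)

    def context_for(self, question: str) -> str:
        """Step 2: pick the context according to the query mode."""
        if self.mode is QueryMode.FULL_SOURCE:
            return "\n\n".join(self.documents)
        if self.mode is QueryMode.VECTOR_STORE:
            if self.retriever is None:
                raise ValueError("VECTOR_STORE mode needs a retriever")
            return "\n\n".join(self.retriever(question, self.documents))
        if self.external_provider is None:
            raise ValueError("EXTERNAL mode needs an external_provider")
        return self.external_provider(question)

    def system_message(self, question: str) -> str:
        """Wrap the selected context in a system message for the LLM."""
        return "Answer using only this context:\n" + self.context_for(question)


wf = WorkflowSketch(mode=QueryMode.VECTOR_STORE, retriever=keyword_retriever)
wf.load("Invoices are due in thirty days.")
wf.load("The office is closed on holidays.")
msg = wf.system_message("When are invoices due?")
```

With the default `FULL_SOURCE` mode no retriever is needed, which mirrors the idea of sensible defaults for ease of use.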
Notes:
- A common intermediate format may be important, for example using Docling for document parsing and its DoclingDocument format for local storage. This could provide a good basis for standardised tools for this agent.
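To show why a common intermediate format helps, the sketch below converts TXT, CSV, and JSON content into one shared structure so that downstream tools only handle a single shape. `IntermediateDocument` and the `from_*` helpers are hypothetical and deliberately simple; they do not reproduce Docling's DoclingDocument schema.

```python
import csv
import io
import json
from dataclasses import dataclass, field


@dataclass
class IntermediateDocument:
    """Hypothetical common format: a source name plus a list of text blocks."""
    source: str
    blocks: list[str] = field(default_factory=list)

    def to_text(self) -> str:
        """Flatten the blocks into one string, e.g. for prompt injection or embedding."""
        return "\n".join(self.blocks)


def from_txt(source: str, raw: str) -> IntermediateDocument:
    """One block per blank-line-separated paragraph."""
    blocks = [p.strip() for p in raw.split("\n\n") if p.strip()]
    return IntermediateDocument(source, blocks)


def from_csv(source: str, raw: str) -> IntermediateDocument:
    """One block per row, rendered as 'column: value' pairs."""
    rows = csv.DictReader(io.StringIO(raw))
    blocks = ["; ".join(f"{k}: {v}" for k, v in row.items()) for row in rows]
    return IntermediateDocument(source, blocks)


def from_json(source: str, raw: str) -> IntermediateDocument:
    """One block per top-level item (a non-list value becomes a single block)."""
    data = json.loads(raw)
    items = data if isinstance(data, list) else [data]
    return IntermediateDocument(source, [json.dumps(item, sort_keys=True) for item in items])


invoice_rows = from_csv("invoices.csv", "id,total\n1,20\n2,35\n")
notes = from_txt("notes.txt", "First paragraph.\n\nSecond paragraph.\n")
```

Because every converter returns the same type, storage, embedding, and querying code can be written once against `IntermediateDocument`.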
Deliverables:
- DocumentAgent code
- Documentation
- Blog
- Notebook
- Video script