Open
Labels
enhancement: New feature or request
Description
Is your feature request related to a problem? Please describe.
The current query engine implementation (see docling_query_engine.py) leverages ChromaDB by wrapping its collection into a LlamaIndex ChromaVectorStore for indexing. Meanwhile, the VectorDBFactory class provides a mechanism to create vector database storage with various backends. To improve flexibility and meet our RAG objectives outlined in [Feature Request]: Docling data ingestion to RAG (#688), we need to extend this functionality.
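For orientation, the wrapping described above might look roughly like the following. This is a minimal sketch, not the code in docling_query_engine.py: the function name, the default path, and the collection name are hypothetical, and it assumes the `chromadb` and `llama-index-vector-stores-chroma` packages are installed.

```python
def build_chroma_storage_context(db_path: str = "./chroma", collection_name: str = "docling"):
    """Wrap a ChromaDB collection in a LlamaIndex ChromaVectorStore (illustrative only)."""
    # Imports are kept local so the sketch can be read and loaded without the
    # optional dependencies; they are only needed when the function is called.
    import chromadb
    from llama_index.core import StorageContext
    from llama_index.vector_stores.chroma import ChromaVectorStore

    # Open (or create) a persistent ChromaDB database and collection.
    client = chromadb.PersistentClient(path=db_path)
    collection = client.get_or_create_collection(name=collection_name)

    # Hand the raw collection to LlamaIndex through its Chroma adapter.
    vector_store = ChromaVectorStore(chroma_collection=collection)
    return StorageContext.from_defaults(vector_store=vector_store)
```

The returned storage context is what an index would be built on, so swapping the backend mostly comes down to swapping the vector store that goes into it.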
Describe the solution you'd like
- Review Existing Implementation:
- Examine the current ChromaDB-based query engine implementation.
- Understand how the VectorDBFactory maps the user-selected vector DB to a corresponding LlamaIndex VectorStore.
- Implement Additional Support:
- Develop wrappers or integration logic for alternative vector databases, specifically PGVector, MongoDB, and Qdrant.
- Ensure that these new wrappers map configuration options correctly to the LlamaIndex-supported VectorStore interfaces.
- Integration & Testing:
- Integrate the new wrappers with the existing query engine interface.
- Test functionality within the context of the DocumentAgent (Phase 1, #438) and ensure compatibility with RAG capabilities.
- Update documentation and examples to reflect the extended support.
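One possible shape for the mapping step is a small registry that resolves the user-selected backend name to a builder function. The sketch below uses only the standard library. `FakeVectorStore`, `register`, `create_vector_store`, and all configuration keys are hypothetical placeholders, not the existing VectorDBFactory API. In a real integration each builder would presumably return the matching LlamaIndex vector store instead of the stand-in.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class FakeVectorStore:
    """Hypothetical stand-in for a LlamaIndex vector store (illustration only)."""
    backend: str
    options: Dict[str, Any] = field(default_factory=dict)


Builder = Callable[[Dict[str, Any]], FakeVectorStore]
_BUILDERS: Dict[str, Builder] = {}


def register(name: str) -> Callable[[Builder], Builder]:
    """Register a builder under a case-insensitive backend name."""
    def decorator(builder: Builder) -> Builder:
        _BUILDERS[name.lower()] = builder
        return builder
    return decorator


@register("chroma")
def _build_chroma(cfg: Dict[str, Any]) -> FakeVectorStore:
    return FakeVectorStore("chroma", {"collection_name": cfg.get("collection_name", "docling")})


@register("pgvector")
def _build_pgvector(cfg: Dict[str, Any]) -> FakeVectorStore:
    # A required key raises KeyError early instead of failing at query time.
    return FakeVectorStore("pgvector", {
        "connection_string": cfg["connection_string"],
        "table_name": cfg.get("table_name", "docling"),
    })


@register("mongodb")
def _build_mongodb(cfg: Dict[str, Any]) -> FakeVectorStore:
    return FakeVectorStore("mongodb", {
        "uri": cfg["uri"],
        "collection_name": cfg.get("collection_name", "docling"),
    })


@register("qdrant")
def _build_qdrant(cfg: Dict[str, Any]) -> FakeVectorStore:
    return FakeVectorStore("qdrant", {
        "url": cfg.get("url", "http://localhost:6333"),
        "collection_name": cfg.get("collection_name", "docling"),
    })


def create_vector_store(db_type: str, **config: Any) -> FakeVectorStore:
    """Resolve the user-selected backend name to its registered builder."""
    try:
        builder = _BUILDERS[db_type.lower()]
    except KeyError:
        raise ValueError(
            f"Unsupported vector DB {db_type!r}; choose from {sorted(_BUILDERS)}"
        ) from None
    return builder(config)


store = create_vector_store("Qdrant", collection_name="papers")
print(store.backend, store.options["collection_name"])
```

With a registry like this, adding a backend means adding one builder, and the query engine interface only ever sees the returned store.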
Additional context
This enhancement is part of our ongoing effort to make the agent more versatile and not limited to a single vector DB. It builds on recent work (e.g., the merged ChromaDB implementation) and aligns with upcoming changes in retrieve_user_proxy_agent.py to support multiple query engines.
wgong
Status
Waiting for merge