[Feature Request]: Enhance Docling Query Engine: Add PGVector, MongoDB, and Qdrant Support via VectorDBFactory Wrapper #950

@sitloboi2012

Description

Is your feature request related to a problem? Please describe.

The current query engine implementation (see docling_query_engine.py) uses ChromaDB, wrapping its collection in a LlamaIndex ChromaVectorStore for indexing. Meanwhile, the VectorDBFactory class provides a mechanism to create vector database storage with various backends. To improve flexibility and meet the RAG objectives outlined in [Feature Request]: Docling data ingestion to RAG (#688), we need to extend the query engine beyond ChromaDB.

Describe the solution you'd like

  1. Review Existing Implementation:
  • Examine the current ChromaDB-based query engine implementation.
  • Understand how the VectorDBFactory maps the user-selected vector DB to a corresponding LlamaIndex VectorStore.
  2. Implement Additional Support:
  • Develop wrappers or integration logic for alternative vector databases, specifically PGVector, MongoDB, and Qdrant.
  • Ensure that these new wrappers map configuration options correctly to the LlamaIndex-supported VectorStore interfaces.
  3. Integration & Testing:
  • Integrate the new wrappers with the existing query engine interface.
  • Test functionality within the context of the DocumentAgent (Phase 1, #438) and ensure compatibility with RAG capabilities.
  • Update documentation and examples to reflect the extended support.

Additional context

This enhancement is part of our ongoing effort to make the agent more versatile and not limited to a single vector DB. It builds on recent work (e.g., the merged ChromaDB implementation) and aligns with upcoming changes in retrieve_user_proxy_agent.py to support multiple query engines.

Metadata

Labels: enhancement (New feature or request)

Status: Waiting for merge
