Bug Description
The hierarchical retriever's recursive search path sends L2 (detail/content) abstracts to the reranker. L2 abstracts store full file content with no size limit. When the total input (query + all documents) exceeds the reranker model's batch size, the rerank call fails and falls back to vector scores, degrading retrieval quality.
The first rerank call (_prepare_initial_candidates) correctly filters out L2 before reranking. The second rerank call (_recursive_search) does not.
Steps to Reproduce
Have memory files with content > ~200 chars
Run a search query that triggers recursive search with mode=RetrieverMode.THINKING
Check reranker logs for: input (X tokens) is too large to process. increase the physical batch size (current batch size: 512)
Reranker falls back to vector scores
Expected Behavior
Reranker should only receive L0/L1 abstracts (which have size limits: 256/4000 chars). L2 content should be excluded from reranking, matching the behavior of the first rerank call.
Actual Behavior
Recursive search sends all levels (L0, L1, L2) to the reranker. L2 abstracts = full file content (observed 1,000–5,000+ chars). Total input exceeds batch size → rerank fails → falls back to vector scores.
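The size mismatch can be illustrated with a rough pre-flight estimate. This is only a sketch: the 4-characters-per-token ratio is an assumed heuristic, not the reranker's actual tokenizer, and `estimate_tokens` / `estimated_rerank_input` are hypothetical helpers that do not exist in OpenViking.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; chars_per_token is an assumed heuristic."""
    return int(len(text) / chars_per_token) + 1


def estimated_rerank_input(query: str, documents: list[str]) -> int:
    """Estimated size of the total input (query + all documents)."""
    return estimate_tokens(query) + sum(estimate_tokens(d) for d in documents)


BATCH_SIZE = 512  # "current batch size: 512" from the error log

query = "how does the retriever rank results?"
short_abstracts = ["x" * 256] * 3          # abstracts at the 256-char limit
with_l2 = short_abstracts + ["y" * 5000]   # plus one L2-sized abstract

print(estimated_rerank_input(query, short_abstracts) <= BATCH_SIZE)
print(estimated_rerank_input(query, with_l2) <= BATCH_SIZE)
```

Under these assumptions, a single 5,000-character L2 abstract is enough to push the estimate past the 512-token limit on its own.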
Minimal Reproducible Example
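# In hierarchical_retriever.py, line 479:
documents = [str(r.get("abstract", "")) for r in results]  # Includes L2
# Should be:
documents = [str(r.get("abstract", "")) for r in results if r.get("level", 2) != 2]
# Caveat: if rerank scores are mapped back to results by index,
# results must be filtered the same way (as call 1 does).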
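A minimal sketch of applying the filter to the result list itself, so that the documents sent to the reranker stay index-aligned with the results they came from. `split_rerankable` and `build_documents` are hypothetical helpers for illustration. Whether the skipped L2 results should keep their vector scores or be dropped depends on how `_recursive_search` uses them, which this report does not establish.

```python
from typing import Any

L2_LEVEL = 2  # detail/content level, per this report


def split_rerankable(
    results: list[dict[str, Any]],
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Split results into (rerankable L0/L1, skipped L2), preserving order.

    A missing "level" is treated as L2, matching r.get("level", 2) above.
    """
    rerankable = [r for r in results if r.get("level", L2_LEVEL) != L2_LEVEL]
    skipped = [r for r in results if r.get("level", L2_LEVEL) == L2_LEVEL]
    return rerankable, skipped


def build_documents(results: list[dict[str, Any]]) -> list[str]:
    """Build the reranker input; one document per result, same order."""
    return [str(r.get("abstract", "")) for r in results]


results = [
    {"level": 0, "abstract": "dir summary"},
    {"level": 2, "abstract": "full file content " * 300},
    {"level": 1, "abstract": "dir overview"},
]
rerankable, skipped = split_rerankable(results)
documents = build_documents(rerankable)
print(len(documents), len(rerankable), len(skipped))
```

Because `documents[i]` always corresponds to `rerankable[i]`, scores returned by the reranker can be assigned back by index without shifting.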
Error Logs
OpenAI API error: Error code: 500 - {'error': {'code': 500, 'message': 'input (7744 tokens) is too large to process. increase the physical batch size (current batch size: 512)', 'type': 'server_error'}}
Observed token ranges in failures: 515–7,744 tokens.
OpenViking Version
0.4.4
Python Version
3.12.0
Operating System
Linux
Model Backend
Other
Additional Context
Call 1 (_prepare_initial_candidates, line 312) — correctly filters L2:
global_results = [r for r in global_results if r.get("level", 2) != 2] # ✅ Filters L2
docs = [str(r.get("abstract", "")) for r in global_results]
Call 2 (_recursive_search, line 479) — does NOT filter L2:
documents = [str(r.get("abstract", "")) for r in results] # ❌ Includes L2
Additionally, L2 abstracts have no size limit. In memory_updater.py:1072:
abstract = LinkRenderer.strip_all_links(mf.content or "") # Full file content, no truncation
The _enforce_size_limits() in semantic_processor.py only applies to directory-level abstracts (L0/L1), not file-level abstracts (L2).
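As a complementary safeguard, file-level abstracts could be capped before they are stored or reranked. The sketch below is not OpenViking code: `cap_abstract` is a hypothetical helper, and the 4000-character default is simply borrowed from the larger of the L0/L1 limits cited in this report as an assumption.

```python
def cap_abstract(text: str, max_chars: int = 4000) -> str:
    """Truncate text to at most max_chars, cutting at the last space if any."""
    if len(text) <= max_chars:
        return text
    cut = text[:max_chars]
    # Prefer a word boundary; fall back to a hard cut when there is none.
    head, _, _ = cut.rpartition(" ")
    return head if head else cut


print(len(cap_abstract("x" * 5000)))
print(cap_abstract("hello world foo", max_chars=12))
```

This bounds the size of each abstract but does not by itself guarantee that the combined reranker input fits the batch size, so it would complement the L2 filter, not replace it.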