- create venv: `python3 -m venv .venv`
- activate: `source .venv/bin/activate`
- install deps: `python3 -m pip install -r requirements.txt`
- run: `python3 __main__.py` or `./__main__.py`
- non-interactive usage: `./__main__.py --prompt "how to create a list"`
Python documentation helper for questions like "how do I split a string by spaces?"
- smolagents[^1] for agentic RAG plus light formatting of the answer
- Real Python docs (retrieved based on your `python3 --version`)
- Embed each `.txt` file in `docs/` with an embedding model and store the vectors (maybe add a cache); see the embedding sketch after this list
- Query the documentation through the agent -> find relevant docs -> repeat -> output a concise answer (agent sketch below)
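A minimal sketch of the embed-and-cache step, assuming `sentence-transformers` and NumPy; the model name, the flat `docs/` layout, and the cache path are placeholders, not settled choices:

```python
# Sketch: embed every .txt file under docs/ and cache the vectors on disk.
from pathlib import Path

import numpy as np
from sentence_transformers import SentenceTransformer

DOCS_DIR = Path("docs")
CACHE_FILE = Path(".embeddings.npz")

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model

if CACHE_FILE.exists():
    cache = np.load(CACHE_FILE, allow_pickle=True)
    names, vectors = list(cache["names"]), cache["vectors"]
else:
    files = sorted(DOCS_DIR.glob("*.txt"))
    names = [f.name for f in files]
    vectors = model.encode(
        [f.read_text() for f in files], normalize_embeddings=True
    )
    np.savez(CACHE_FILE, names=np.array(names), vectors=vectors)
```

The agent loop could then look roughly like this with smolagents, reusing `model`, `names`, and `vectors` from the sketch above; the tool body and the Ollama model id are assumptions:

```python
# Sketch: expose retrieval as a smolagents tool and let the agent iterate.
from smolagents import CodeAgent, LiteLLMModel, tool

@tool
def retrieve_docs(query: str) -> str:
    """Return the documentation snippets most relevant to the query.

    Args:
        query: A natural-language question about Python.
    """
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = vectors @ q  # cosine similarity (vectors are normalized)
    top = scores.argsort()[-3:][::-1]
    return "\n\n".join((DOCS_DIR / names[i]).read_text() for i in top)

# Placeholder local model served by Ollama via LiteLLM.
llm = LiteLLMModel(model_id="ollama_chat/llama3.2", api_base="http://localhost:11434")
agent = CodeAgent(tools=[retrieve_docs], model=llm)
print(agent.run("how do I split a string by spaces?"))
```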
- ease of use: interactive chat mode with syntax highlighting, plus a non-interactive CLI that prints raw text
- speed: it has to be faster than googling and waiting for Gemini to generate an explanation
- quality: it has to carry the knowledge needed to assist with LeetCode/Advent of Code style programs; 100% coverage is not required, but built-in standard-library knowledge is
- local first / offline mode: fall back to Ollama (maybe focus solely on it if that does not sacrifice too much quality and speed); see the fallback sketch below
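One possible shape for the local-first fallback, probing Ollama's default local endpoint; purely a sketch, with placeholder model ids:

```python
# Sketch: prefer a local Ollama server; fall back to a remote model otherwise.
import urllib.request

def pick_model_id() -> str:
    try:
        # Ollama answers plain HTTP on its default port when running.
        urllib.request.urlopen("http://localhost:11434", timeout=1)
        return "ollama_chat/llama3.2"      # placeholder local model
    except OSError:
        return "gemini/gemini-1.5-flash"   # placeholder remote fallback
```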
- actually read references 3-6
- Compare the complexity of the task to the LIMIT[^2] dataset
- Explore how HyDE[^3] increases quality; see the HyDE sketch after this list
- Explore how instruction-trained retrievers[^4] increase quality
- Compare with multi-vector[^5][^6] and lexical search[^7] (BM25 sketch below)
- Formalize the task, find cherry-picked examples, and assemble a dataset
- Multiple languages: query in Russian with documents in English vs. query and documents both in English; assess Russian embedding models
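For the HyDE item: the idea is to embed a hypothetical answer instead of the raw query. A minimal sketch, where `generate()` stands in for whatever LLM backend ends up being used:

```python
# HyDE sketch: embed a hypothetical document rather than the raw query.
# generate() is a hypothetical helper, not a real API.
def hyde_vector(query: str):
    hypothetical = generate(f"Write a short, documentation-style answer to: {query}")
    return model.encode([hypothetical], normalize_embeddings=True)[0]
```

And the lexical baseline can be tiny, assuming the `rank_bm25` package and the `names`/`DOCS_DIR` from the embedding sketch:

```python
# BM25 sketch: lexical retrieval baseline over the same .txt corpus.
from rank_bm25 import BM25Okapi

corpus = [(DOCS_DIR / n).read_text() for n in names]
bm25 = BM25Okapi([doc.lower().split() for doc in corpus])
scores = bm25.get_scores("how to split a string by spaces".lower().split())
best = corpus[scores.argmax()]  # top lexical hit
```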
@misc{weller2025theoreticallimitationsembeddingbasedretrieval,
title={On the Theoretical Limitations of Embedding-Based Retrieval},
author={Orion Weller and Michael Boratko and Iftekhar Naim and Jinhyuk Lee},
year={2025},
eprint={2508.21038},
archivePrefix={arXiv},
primaryClass={cs.IR},
url={https://arxiv.org/abs/2508.21038},
}

@misc{gao2022precisezeroshotdenseretrieval,
title={Precise Zero-Shot Dense Retrieval without Relevance Labels},
author={Luyu Gao and Xueguang Ma and Jimmy Lin and Jamie Callan},
year={2022},
eprint={2212.10496},
archivePrefix={arXiv},
primaryClass={cs.IR},
url={https://arxiv.org/abs/2212.10496},
}

@misc{weller2024promptrieverinstructiontrainedretrieversprompted,
title={Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models},
author={Orion Weller and Benjamin Van Durme and Dawn Lawrie and Ashwin Paranjape and Yuhao Zhang and Jack Hessel},
year={2024},
eprint={2409.11136},
archivePrefix={arXiv},
primaryClass={cs.IR},
url={https://arxiv.org/abs/2409.11136},
}

@article{robertson1995okapi,
title={{Okapi at TREC-3}},
author={Robertson, Stephen E and Walker, Stephen and Hancock-Beaulieu, Micheline M and Gatford, Mark},
journal={Proceedings of the Third Text REtrieval Conference (TREC-3)},
year={1995},
pages={109--122}
}

@misc{modernbert,
title={Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference},
author={Benjamin Warner and Antoine Chaffin and Benjamin Clavié and Orion Weller and Oskar Hallström and Said Taghadouini and Alexis Gallagher and Raja Biswas and Faisal Ladhak and Tom Aarsen and Nathan Cooper and Griffin Adams and Jeremy Howard and Iacopo Poli},
year={2024},
eprint={2412.13663},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2412.13663},
}
@misc{nussbaum2024nomic,
title={Nomic Embed: Training a Reproducible Long Context Text Embedder},
author={Zach Nussbaum and John X. Morris and Brandon Duderstadt and Andriy Mulyar},
year={2024},
eprint={2402.01613},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2402.01613},
}

@misc{smolagents,
title = {`smolagents`: a smol library to build great agentic systems.},
author = {Aymeric Roucher and Albert Villanova del Moral and Thomas Wolf and Leandro von Werra and Erik Kaunismäki},
howpublished = {\url{https://github.com/huggingface/smolagents}},
year = {2025}
}

Footnotes

[^1]: smolagents: a smol library to build great agentic systems.
[^2]: On the Theoretical Limitations of Embedding-Based Retrieval
[^3]: Precise Zero-Shot Dense Retrieval without Relevance Labels
[^4]: Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models
[^5]: Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
[^6]: Nomic Embed: Training a Reproducible Long Context Text Embedder
[^7]: Okapi at TREC-3