CodeSearch is a natural language query tool for codebases. It uses the OpenAI API to obtain embedding vectors and indexes them with the FAISS library. When you query the system, it matches the query vector against the index and displays the top 5 results in a web-based UI.
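The query step can be pictured as a nearest-neighbor lookup over the stored embedding vectors. The snippet below is a minimal sketch of that idea in plain Python, not the tool's actual code: it does a brute-force scan instead of calling FAISS, it assumes cosine similarity as the metric (the metric CodeSearch uses is not stated here), and the names `normalize` and `top_k` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def top_k(
    query: Sequence[float],
    index: Sequence[Sequence[float]],
    k: int = 5,
) -> List[Tuple[int, float]]:
    """Return (position, score) pairs for the k indexed vectors closest to the query.

    Assumption: cosine similarity; this is a brute-force stand-in for a FAISS index.
    """
    q = normalize(query)
    scored = [
        (pos, sum(a * b for a, b in zip(q, normalize(vec))))
        for pos, vec in enumerate(index)
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

In practice the vectors would come from the embedding API and have many more dimensions; a call such as `top_k(query_vector, stored_vectors)` returns the positions of the five best matches, which can then be mapped back to the indexed functions.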
When you enter a project path or GitHub URL, the tool finds all the code files in the project (including subdirectories), generates an Abstract Syntax Tree for each file (using the libCST or Tree-sitter library), and indexes each function along with its file path, class name, function name, line number, and embedding vector. It then stores all of this in a local PostgreSQL database. For a GitHub repo, it first clones the repo and then indexes it.
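To illustrate the function-level indexing described above, here is a rough sketch that walks a Python file's syntax tree and collects one record per function. It uses the standard-library `ast` module as a stand-in for libCST or Tree-sitter, and the `FunctionRecord` type and `extract_functions` helper are hypothetical names, not part of CodeSearch. Embedding and database storage are left out.

```python
import ast
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FunctionRecord:
    """Hypothetical record holding the metadata fields listed above."""
    filepath: str
    class_name: Optional[str]  # None for module-level functions
    function_name: str
    line_number: int
    source: str  # text that would be sent to the embedding model


def extract_functions(source: str, filepath: str) -> List[FunctionRecord]:
    """Collect every function and method defined in `source`."""
    records: List[FunctionRecord] = []

    def visit(node: ast.AST, class_name: Optional[str]) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.ClassDef):
                # Methods inside this class are tagged with its name.
                visit(child, child.name)
            elif isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                records.append(
                    FunctionRecord(
                        filepath=filepath,
                        class_name=class_name,
                        function_name=child.name,
                        line_number=child.lineno,
                        source=ast.get_source_segment(source, child) or "",
                    )
                )
                # Descend to pick up nested functions as well.
                visit(child, class_name)
            else:
                visit(child, class_name)

    visit(ast.parse(source), None)
    return records
```

Each record's `source` would then be embedded and saved together with the other fields. A real indexer would also need a parser for each supported language, which is where a multi-language library such as Tree-sitter comes in.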
- Fast Setup: Docker support lets you quickly set up a working environment
- Interactive UI: An interactive web-based UI for searching
- Language Support: Python and JavaScript are currently supported. More languages will be added soon
- GPU and CPU Support: The tool runs on GPU as well as CPU
- GitHub Support: Directly index and query a GitHub repo
- Multiple Project Support: Index and save multiple projects at a time
- Get your OpenAI API key from OpenAI
- Rename `/backend/codesearch/.env.sample` to `/backend/codesearch/.env` and add your API key and Org ID to it

To run on CPU:

- Rename `/backend/Dockerfile_cpu` to `/backend/Dockerfile` and `./docker-compose_cpu.yml` to `./docker-compose.yml`
- Run the Docker container: `docker compose up > logs.txt`

To run on GPU:

- Install the NVIDIA Container Toolkit from the official NVIDIA website
- Use the default `/backend/Dockerfile` and `./docker-compose.yml` files
- Run the Docker container: `docker compose up > logs.txt`

- After you run the `docker compose up` command, the application should be up and running at `http://localhost:4000/`
- Open `http://localhost:4000/` in your browser
- Click the `Index Project` button and enter the absolute path of the project to index
- Indexing should take about 30 seconds. You can follow the progress in the `logs.txt` file that is created
- Once indexing completes, enter your query in the search box; it returns the top 5 results for your search
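Because the container output is redirected to `logs.txt`, it is not always obvious when the UI is ready. The following optional helper is a small sketch that polls the address given above until it answers. The function name `wait_until_up` and its timeout values are assumptions made for illustration; it is not part of CodeSearch.

```python
import time
import urllib.error
import urllib.request


def wait_until_up(url: str, timeout_s: float = 60.0, interval_s: float = 1.0) -> bool:
    """Poll `url` until it returns any HTTP response or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status, so it is up.
            return True
        except (urllib.error.URLError, OSError):
            # Connection refused or similar: not ready yet, retry after a pause.
            time.sleep(interval_s)
    return False


# Example call (assumes the default address mentioned above):
# wait_until_up("http://localhost:4000/")
```

The function returns `True` as soon as the server responds and `False` if the timeout expires first.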