If you find any errors, please let me know.
Demo videos: SecondBrainDemoVideo.mp4, SecondBrainDemoVideo2.mp4
- Semantic search: searches based on deep semantic meaning.
- Lexical search: finds a needle in a haystack using keywords.
- Supported files: .txt, .pdf, .docx, .gdoc, .png, .jpg, .jpeg, .gif, .webp
- Multimodal in three ways:
  - Can embed and search for both text and images.
  - Can attach text documents and images, and use them to search.
  - Can use AI language models with vision.
- Optional AI Mode:
  - LLM Integration: Supports local models via LM Studio and cloud models, like GPT-5, via OpenAI.
  - Retrieval-Augmented Generation (RAG): Uses search results to ground high-quality AI responses in the knowledge base.
  - AI Mode is entirely optional; a search can be run without AI.
- Sync 100,000+ files.
- Security and privacy: local search ensures no data is sent to the internet.
- Works with Google Drive.
- Works offline with downloaded models.
- Open source: all the code is right here.
SecondBrainBackend.py
This file handles two things: syncing the directory and performing searches.
Syncing: Scans the target directory and detects new, updated, or deleted files. Extracts text from files and splits it into manageable chunks, while batching images together for speed. Converts text chunks and images into vector embeddings using sentence-transformer models and stores them in a persistent ChromaDB vector database. Image captions are generated and stored alongside the image embeddings, and a BM25 index of both text documents and image captions is built to enable lexical search.
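For a rough idea of what the text half of a sync pass looks like, here is a minimal sketch built on the same libraries (sentence-transformers, ChromaDB, and a LangChain text splitter). The helper name and ID scheme are illustrative, and the real SecondBrainBackend.py also handles images, captions, deletions, and incremental updates:

```python
# Illustrative sketch only -- not the actual backend code.
import chromadb
from sentence_transformers import SentenceTransformer
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_model = SentenceTransformer("BAAI/bge-large-en-v1.5")        # text_model_name
# Note: this splitter counts characters unless given a token-based length function.
splitter = RecursiveCharacterTextSplitter(chunk_size=200, chunk_overlap=0)
client = chromadb.PersistentClient(path="chroma_db")              # persistent vector store
collection = client.get_or_create_collection("BAAI/bge-large-en-v1.5")

def embed_file(path: str, text: str) -> None:
    """Chunk one document, embed the chunks, and upsert them into ChromaDB."""
    chunks = splitter.split_text(text)
    if not chunks:
        return
    embeddings = text_model.encode(chunks, batch_size=20).tolist()  # batch_size setting
    collection.upsert(
        ids=[f"{path}::{i}" for i in range(len(chunks))],
        embeddings=embeddings,
        documents=chunks,
        metadatas=[{"source": path}] * len(chunks),
    )
```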
Retrieval: Performs both a semantic and a lexical search, merges the outputs, reranks everything with Maximal Marginal Relevance (MMR), keeps the top N results, and optionally filters them for relevance using AI.
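MMR trades off relevance against diversity when picking the final results. A simplified version of the rerank, using only semantic similarity (the app additionally blends in lexical diversity via mmr_alpha), might look like this:

```python
import numpy as np

def mmr(query_emb, cand_embs, lam=0.5, top_n=6):
    """Pick top_n candidate indices, balancing relevance to the query against
    similarity to results that were already selected (lam = mmr_lambda)."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

    relevance = [cos(query_emb, c) for c in cand_embs]
    selected, remaining = [], list(range(len(cand_embs)))
    while remaining and len(selected) < top_n:
        def score(i):
            diversity = max((cos(cand_embs[i], cand_embs[j]) for j in selected), default=0.0)
            return lam * relevance[i] - (1 - lam) * diversity
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With lam=1.0 this degenerates to a plain relevance sort; with lam=0.0 it only avoids near-duplicates.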
SecondBrainFrontend.py
The frontend code controls the user interface for the application. It handles threading for non-blocking backend operations, prompting patterns for the AI, and logging. It manages text and image attachments to construct complex prompts for the LLM, lets the user send messages in a chat-like format, and provides an interactive settings page. It runs many operations at once, disabling and re-enabling buttons at the right moments to prevent crashes. Looks good, too, if I may say so myself.
Uniquely, image and text results can be attached in order to perform continuous searches, like web-surfing but through your own filebase. There is no other experience exactly like it. Simply click on an image result or the four squares by a text result and then click "attach result." Other similar amenities are available in the app.
config.json
This file is now created automatically by Second Brain and controls various features. To change it, open the Settings page, adjust a setting, and click Save & Close; the backend reloads to apply the changes.
credentials.json
This file must be added (see below) if using Google Drive (optional), as it allows the syncing of Google Doc (.gdoc) files.
image_labels.csv
This CSV is used as a pool for possible image labels. The image labels are chosen based on how close the image embedding is to each label embedding (a categorization task). It was constructed based on Google's Open Images dataset for object identification.
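Conceptually, the labeling step is a zero-shot classification: embed every label once, embed the image, and keep the closest labels. A hedged sketch (assuming one label per CSV row; the actual caption logic lives in SecondBrainBackend.py):

```python
import csv
from PIL import Image
from sentence_transformers import SentenceTransformer, util

clip = SentenceTransformer("clip-ViT-B-16")                   # image_model_name
with open("image_labels.csv", newline="", encoding="utf-8") as f:
    labels = [row[0] for row in csv.reader(f) if row]
label_embs = clip.encode(labels, convert_to_tensor=True)      # embed the label pool once

def label_image(path: str, top_k: int = 3) -> list[str]:
    """Return the top_k labels whose embeddings are closest to the image embedding."""
    img_emb = clip.encode(Image.open(path), convert_to_tensor=True)
    scores = util.cos_sim(img_emb, label_embs)[0]
    return [labels[i] for i in scores.topk(top_k).indices.tolist()]
```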
icon.ico
(Optional) The Second Brain logo, used as the window icon.
SecondBrainHotkey.ahk
If you want, you can make a hotkey (I use Ctrl+J) with AutoHotkey that runs SecondBrainFrontend.py from anywhere on your computer, which is very convenient. To do this, download AutoHotkey itself and edit the script so that the path points to your SecondBrain folder. To make the hotkey active when you start your computer, place the script in your Startup folder, which you can find (on Windows) by pressing Win+R and typing shell:startup.
Before running the application, ensure you have the following:
- Python: Second Brain requires Python 3.9 or higher.
- LM Studio: To use AI Mode with a local LLM, you must download and install LM Studio with a .gguf model, ideally with vision capabilities (unsloth/gemma-3-4b-it works well, but any model may be used).
- OpenAI API: To use a model from the OpenAI API, make sure you have an API key from platform.openai.com with sufficient funds.
- Dependencies: Install the required Python libraries (see Dependencies below) using pip.
- System requirements: The search engine can be used with GPU or CPU. It will automatically detect if a GPU exists and use it if available. Different text and image embedding models use different amounts of memory, and can be configured. For example, the default models use 2GB of VRAM/RAM.
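The automatic GPU detection mentioned above is the standard PyTorch check; something along these lines (the model name shown is the default):

```python
import torch
from sentence_transformers import SentenceTransformer

device = "cuda" if torch.cuda.is_available() else "cpu"   # embed_use_cuda=false forces "cpu"
text_model = SentenceTransformer("BAAI/bge-large-en-v1.5", device=device)
```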
Download SecondBrainBackend, SecondBrainFrontend, icon.ico, and image_labels.csv from this repository. Only these four files are needed. Not bad, right? Place them in a folder.
On the welcome screen, or in the settings, update the Sync Directory to point to the folder you want to use as your knowledge base.
(Optional - skip if irrelevant) To enable Google Doc syncing, follow the Google Cloud API instructions to get a credentials.json file using OAuth. Either place the file in the project directory, or place it somewhere else and update Path to credentials.json in the settings. The first time you sync, a browser window will open for you to authorize the application. Authentication is very finicky, and it may be necessary to delete the authorization token (token.json) and then reauthorize to get Drive syncing to work. You can do this with the "Reauthorize Drive" button.
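For reference, the first-time authorization with google-auth-oauthlib roughly follows the flow below (the scope shown is an assumption; Second Brain wraps this and caches token.json for later runs):

```python
from google_auth_oauthlib.flow import InstalledAppFlow

SCOPES = ["https://www.googleapis.com/auth/drive.readonly"]   # assumed scope
flow = InstalledAppFlow.from_client_secrets_file("credentials.json", SCOPES)
creds = flow.run_local_server(port=0)   # opens a browser window for consent
```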
To start the application, run SecondBrainFrontend.py from the project folder. During the first run, it may take a while to download the models.
Click the Sync Directory button to start embedding the files from your Sync Directory. Depending on the size of the directory, it can take a while (hours), but is necessary to enable search. This is faster with CUDA. The sync can be cancelled midway through by clicking the Cancel Sync button with no consequences. If restarted, the sync will continue where it left off.
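The resume behavior can be implemented by checking what is already in the collection before re-embedding a file; a rough sketch (the mtime metadata key is illustrative):

```python
import os

def needs_resync(collection, path: str) -> bool:
    """Skip files whose stored modification time matches the one on disk."""
    stored = collection.get(where={"source": path}, include=["metadatas"])
    if not stored["ids"]:
        return True                                   # never embedded before
    return stored["metadatas"][0].get("mtime") != os.path.getmtime(path)
```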
Text results are based on relevant chunks of text from the embedding database. You can click on image results and file paths to open them directly, attach them, see their parent folder, or copy their path.
Toggle the AI Mode checkbox to enable or disable LLM-augmented search. When enabled, Second Brain loads the selected LLM from LM Studio or the OpenAI API. Searches are then enhanced with AI-generated queries, results are filtered by the AI based on relevance, and a final "AI Insights" summary is always streamed in the results container. Vision-enabled models are optional but recommended, since vision is needed to filter images and give insights on them. When disabled, the LLM is unloaded to save system resources, and the app performs a direct vector/lexical search with no query expansion, filtering, or summary. Query expansion and filtering can be toggled individually (see Settings: LLM Query Expansion and LLM Filtering).
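As a hedged sketch of the "AI Insights" step with the OpenAI backend (the LM Studio backend exposes a similar chat API through the lmstudio package), the retrieved chunks are placed in the prompt and the reply is streamed:

```python
from openai import OpenAI

client = OpenAI()   # falls back to the OPENAI_API_KEY environment variable

def stream_insights(user_request: str, chunks: list[str], model: str = "gpt-5-mini") -> None:
    """Stream a summary grounded in the retrieved chunks."""
    context = "\n\n".join(f"[{i}] {c}" for i, c in enumerate(chunks))
    stream = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": "You are a personal search assistant."},
            {"role": "user", "content": f"{user_request}\n\nSearch results:\n{context}"},
        ],
        stream=True,
    )
    for event in stream:
        delta = event.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)
```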
If you attach a text document (.pdf, .docx, etc.), the entire text will be added if it is below a certain size (see Settings: Max Attachment Size). If it is too large, Second Brain extracts several relevant chunks to provide focused context for the search. If you attach an image, you can send it to find visually similar images in your database along with related documents. You can send an attachment without a message.
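The size rule above can be sketched like this (the names are illustrative, and the characters-per-token estimate is a rough stand-in for real token counting):

```python
from sentence_transformers import util

def attachment_context(text, query, model, splitter,
                       max_attachment_size=1024, n_attachment_chunks=3):
    """Return the whole attachment if it fits, otherwise its most relevant chunks."""
    if len(text) <= max_attachment_size * 4:          # ~4 characters per token
        return text
    chunks = splitter.split_text(text)
    chunk_embs = model.encode(chunks, convert_to_tensor=True)
    query_emb = model.encode(query, convert_to_tensor=True)
    scores = util.cos_sim(query_emb, chunk_embs)[0]
    top = scores.topk(min(n_attachment_chunks, len(chunks))).indices.tolist()
    return "\n\n".join(chunks[i] for i in top)
```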
If you find that Second Brain has produced a good insight, you can save it for future use. When you click the "Save Insight" button, found after the AI Insights section of a response, the query and response get saved to a .txt file. These text files will be embedded, which gives Second Brain a long-term memory of its own contributions. (You can re-read or delete entries by going to the saved_insights folder, in the Second Brain directory.)
config.json
Most of these settings can be controlled using the Settings page, but it is also possible to open the actual file to change some of the more obscure settings, like synthesize_results_prompt, if desired.
| Parameter Name | Function | Range | Default |
|---|---|---|---|
| credentials_path | Path to credentials.json which is used to authenticate Google Drive downloads. By default it will look in the same folder as SecondBrainFrontend.py | Any path | "credentials.json" |
| target_directory | Which directory to sync to. Embeds all valid files in the directory. | Any directory | "C:\\Users\\user\\My Drive" |
| text_model_name | SentenceTransformers text embedding model name. "BAAI/bge-m3", "BAAI/bge-large-en-v1.5", and "BAAI/bge-small-en-v1.5" work well, with the small version using fewer system resources, and m3 using the most. Also, BAAI/bge-m3 is multilingual. If you change the text model name, you need to remake all of your embeddings with that model - every embedding model has its own collection of the same name. | Any valid name | "BAAI/bge-large-en-v1.5" |
| image_model_name | SentenceTransformers image embedding model name. "clip-ViT-L-14", "clip-ViT-B-16", and "clip-ViT-B-32" work well, with the 32 version using fewer system resources, and L-14 using the most. If you change the image model name, you need to remake all of your embeddings with that model. However, you will not lose your old collection, should you wish to turn back. | Any valid name | "clip-ViT-B-16" |
| embed_use_cuda | Turning this to false forces the embedding models to use CPU. If set to true, uses CUDA if possible. I recommend using CUDA for syncing, but CPU is more than enough for individual searches, which means you can allocate more VRAM for a local model if desired. | true or false | true |
| batch_size | How many text chunks/images are processed in parallel with the embedders. Embedding is faster with higher values, given enough compute. | 1-50 | 20 |
| chunk_size | Maximum size of text chunks created by the text splitter, in tokens; a chunk_size of 200 with a chunk_overlap of 0 performs very well in chunking research. | 50-2000 | 200 |
| chunk_overlap | How much adjacent chunks overlap; used to preserve continuity when splitting text. | 0-200 | 0 |
| max_seq_length | Maximum input size for the text embedding model, in tokens. Lower values are faster. Must be larger than chunk_size. | Depends on model, 250-8192 | 512 |
| mmr_lambda | Prioritize diversity or relevance in search results; 0 = prioritize diversity only, 1 = prioritize relevance only. | 0.0-1.0 | 0.5 |
| mmr_alpha | The MMR rerank uses a hybrid semantic-lexical diversity metric; 0.0 uses only lexical diversity, 1.0 uses only semantic diversity when reranking. | 0.0-1.0 | 0.5 |
| search_multiplier | How many results to find for each query. | At least 1 | 5 |
| ai_mode | Whether to use AI to aid in searches. | true or false | true |
| show_buttons | Whether to show the button row at the top. Controlled by the button on the bottom left in the main screen. | true or false | true |
| llm_filter_results | Filtering results is somewhat slow. This gives the option to turn that part of ai_mode on/off. | true or false | false |
| llm_backend | Choose which AI backend to use. OpenAI is slower and not local but has smarter models. | "LM Studio" or "OpenAI" | "LM Studio" |
| lms_model_name | Can be any language model from LM Studio, but it must be already downloaded. Vision is preferred. | Any valid name | "unsloth/gemma-3-4b-it" |
| openai_model_name | Can be any OpenAI text model. (Some OpenAI models, like gpt-5, require additional verification to use.) | Any OpenAI model | "gpt-5-mini" |
| openai_api_key | When using OpenAI as a backend, an API key from platform.openai.com is required. Using it costs money, so the account must have enough funds. If this field is left blank (""), the OpenAI client will look for an environment variable called OPENAI_API_KEY and use that. Otherwise, the client will use the string found here. | Any API key | "" |
| max_results | Sets the maximum for both text and image results. | 1-30 | 6 |
| search_prefix | Phrase prefixed to the start of text search queries; recommended with the text embedding models BAAI/bge-large-en-v1.5 and BAAI/bge-small-en-v1.5. Not needed for BAAI/bge-m3. Set to "" to disable. | Any string | "Represent this sentence for searching relevant passages: " |
| query_multiplier | How many queries the AI is asked to make to augment the search, based on the user’s attachment and user prompt. Set to 0 to turn off the feature. | 0 or more | 5 |
| max_attachment_size | Will try to add an entire attachment to the AI as context if it is below this size, measured in tokens. Decrease this if the context window is small. If an attachment is larger than this size, only the most relevant chunks of the attachment will be added. | 256-8192 | 1024 |
| n_attachment_chunks | How many relevant text chunks to extract from attachments. In addition to LLM context, these are used as additional queries for lexical and semantic search. | 1-10 | 3 |
| system_prompt | Special instructions for how the AI should do its job. Feel free to change the special instructions to anything. | Any string | "You are a personal search assistant, made to turn user prompts into accurate and relevant search results, using information from the user's database. Special instruction: Be concise and pragmatic. Use simple, structured analogies to illustrate relationships. Sound casually confident and lightly playful." |
| generate_queries_prompt | Special instructions for how the AI should do its job. It's best not to change this, but you can. | Any string | "{system_prompt}\n\n{user_request}\n{attachment_context}\nBased on the user's prompt, generate {n} creative search queries that could retrieve relevant {content} to answer the user. These queries will go into a semantic search algorithm to retreive relevant {content} from the user's database. The queries should be broad enough to find a variety of related items. These queries will search a somewhat small and personal database (that is, the user's hard drive). Respond with a plain list with no supporting text or markdown.{reminder_suffix}" |
| evaluate_text_relevance_prompt | Special instructions for how the AI should do its job. It's best not to change this, but you can. | Any string | "{system_prompt}\n\n{attachment_context}\n{user_request}\nDocument excerpt to evaluate:\n"{chunk}"\n\nIs this excerpt worth keeping? Respond only with YES or NO.\n\nRelevance is the most important thing. Does the snippet connect to the user's request?\n\nIf the excerpt is gibberish, respond with NO.\n\n(Again: respond only with YES or NO.)" |
| evaluate_image_relevance_prompt | Special instructions for how the AI should do its job. It's best not to change this, but you can. | Any string | "{system_prompt}\n\n{attachment_context}\n{user_request}\nIs the provided image worth keeping? Respond only with YES or NO.\n\nRelevance is the most important thing. Does the photo connect to the user's request?\n\nIf the image is blank, corrupted, or unreadable, respond with NO.\n\nImage file path: {image_path}\n\nIf the user's query has an exact match within the file path, respond with YES.\n\n(Again: respond only with YES or NO.)" |
| synthesize_results_prompt | Special instructions for how the AI should do its job. It's best not to change this, but you can. | Any string | "{system_prompt}\n\nIt is {date_time}.\n\n{user_request}\n{attachment_context}\n{database_results}\nYour Task:\nBased exclusively on the information provided above, write a concise and helpful response. Your primary goal is to synthesize the information to guide the user towards what they want.\n\nInstructions:\n- The text search results are snippets from larger documents and may be incomplete.\n- Do not assume or guess the author of a document unless the source text makes it absolutely clear.\n- The documents don't have timestamps; don't assume the age of a document unless the source text makes it absolutely clear.\n- Cite every piece of information you use from the search results with its source, like so: (source_name).\n- If the provided search results are not relevant to the user's request, state that you could not find any relevant information.\n- Use markdown formatting (e.g., bolding, bullet points) to make the response easy to read.\n- If there are images, make sure to consider them for your response." |
| use_drive | The main control for whether to use Drive or not. Disabling it will make the Reauthorize button disappear, as well as prevent the system from trying to embed .gdoc files or request authorization. | true or false | true |
NOTE: is_final_microsoft_store_product is a flag at the top of SecondBrainFrontend.py. Setting it to True disables all print messages, removes CUDA availability, and removes all references to and functionality for Google Drive. It's best to leave this as False.
You can install the Python dependencies with the following command:
pip install requests pdfminer.six chromadb python-docx rank_bm25 flet numpy Pillow langchain_core transformers sentence-transformers scipy langchain-text-splitters openai lmstudio google-auth-oauthlib google-api-python-client torch keyring
requests
pdfminer.six
chromadb
python-docx
rank_bm25
flet
numpy
Pillow
langchain_core
transformers
sentence-transformers
scipy
langchain-text-splitters
openai
lmstudio
google-auth-oauthlib
google-api-python-client
torch (if using GPU, you need to download the version that works with your CUDA toolkit)
keyring
Hidden Variables
These can be found in the code to change minor features.
IMG_THUMBNAIL, stop_words, jpeg_quality, min_len, low_compression_threshold, high_compression_threshold, temperature, openai_vision_keywords, image_preview_size
https://lmstudio.ai/docs/python
https://research.trychroma.com/evaluating-chunking
https://github.com/ALucek/chunking-strategies/blob/main/chunking.ipynb