This README is the documentation for a Home Assistant (HA) integration called home-generative-agent. This project uses LangChain and LangGraph to create a generative AI agent that interacts with and automates tasks within a HA smart home environment. The agent understands your home's context, learns your preferences, and interacts with you and your home to accomplish activities you find valuable. Key features include creating automations, analyzing images, and managing home states using various LLMs (Large Language Models). The architecture involves both cloud-based and edge-based models for optimal performance and cost-effectiveness. Installation instructions, configuration details, and information on the project's architecture and the different models used are included. The project is open-source and welcomes contributions.
These are some of the features currently supported:
- Create complex Home Assistant automations.
- Image scene analysis and understanding.
- Home state analysis of entities, devices, and areas.
- Full agent control of allowed entities in the home.
- Short- and long-term memory using semantic search.
- Automatic summarization of home state to manage LLM context length.
This integration sets up the conversation platform, which lets users converse directly with the Home Generative Agent, and the image and sensor platforms, which create entities that display the latest camera image, the AI-generated summary, and recognized people in HA's UI. These entities can also be used in automations.
- Installation
- Configuration
- Sentinel (Proactive Anomaly Detection)
- Image and Sensor Entities
- Enroll People (Face Recognition)
- Architecture and Design
- Example Use Cases
- Makefile
- Contributions are welcome!
- Install the PostgreSQL with pgvector add-on by clicking the button below and configure it according to these directions. This provides persistent storage of conversations and memories with vector similarity search.
- home-generative-agent is available in the default HACS repository. You can install it directly through HACS or click the button below to open it there.
- Add Home Generative Agent as an assistant in your Home Assistant installation by going to Settings → Voice Assistants. Use a configuration similar to the figure below.
- Install all the Blueprints in the `blueprints` directory. You can use them to manually create automations that converse directly with the Agent. The Agent can also create automations for you from your conversations with it; see the examples below.
- (Optional) Install `ollama` on your edge device by following the instructions here.
  - Pull `ollama` models `gpt-oss`, `qwen3:8b`, `qwen3:1.7b`, `qwen2.5vl:7b`, and `mxbai-embed-large`.
- (Optional) Install face-service on your edge device if you want to use face recognition.
- Go to Developer Tools -> Actions -> Enroll Person in the HA UI to enroll a new person into the face database from an image file.
- If you want the dashboard enrollment card, add the Lovelace resource after installing the integration:
  - Settings -> Dashboards -> Resources -> Add
  - URL: `/hga-card/hga-enroll-card.js`
  - Type: `JavaScript Module`
- If you want the Sentinel proposals dashboard card, add this resource as well:
  - Settings -> Dashboards -> Resources -> Add
  - URL: `/hga-card/hga-proposals-card.js`
  - Type: `JavaScript Module`
- Install PostgreSQL with pgvector as shown above in Step 1.
- Using the tool of your choice, open your HA configuration directory (where you find `configuration.yaml`).
- If you do not have a `custom_components` directory, you must create it.
- In the `custom_components` directory, create a new sub-directory called `home_generative_agent`.
- Download all the files from the `custom_components/home_generative_agent/` directory in this repository.
- Place the files you downloaded in the new directory you created.
- Restart Home Assistant
- In the HA UI, go to "Configuration" -> "Integrations", click "+", and search for "Home Generative Agent".
- Follow steps 3 to 6 above.
Configuration is done entirely in the Home Assistant UI using subentry flows. A "feature" is a discrete capability exposed by the integration (for example Conversation, Camera Image Analysis, or Conversation Summarization). Each feature is enabled separately and has its own model/provider configuration.
- Add the integration (instruction-only screen).
- If you previously configured the integration via the legacy flow, your settings are automatically migrated into the new subentry-based UI.
- Click + Setup on the integration page.
- Enable optional features.
- Configure each enabled feature’s model settings.
- Configure the database.
- If no model provider exists, you’ll see a reminder to add one.
- Default features include Conversation, Camera Image Analysis, and Conversation Summarization.
- Click + Model Provider to add a provider (Edge/Cloud → provider → settings).
- The first provider is automatically assigned to all features with default models.
- Use a feature’s gear icon to adjust that feature’s model settings later.
- Click + Sentinel to configure proactive Sentinel behavior.
- This is where Sentinel runtime, cooldowns, discovery, explanation, and optional notify service are configured.
Embedding model selection: the integration uses the first model provider that supports embeddings (or the feature’s provider when it advertises embedding capability). If you want a different embedding model, add a provider that supports embeddings and select the desired embedding model name in that provider’s defaults, then re-run Setup or reload the integration.
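The selection rule above can be sketched in a few lines of Python. This is only an illustration: the `Provider` class, the `"embedding"` capability label, and the precedence (the feature's own provider first, then the first capable provider) are assumptions, not the integration's actual code.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Provider:
    """Hypothetical stand-in for a Model Provider subentry."""
    name: str
    capabilities: Set[str] = field(default_factory=set)
    embedding_model: Optional[str] = None


def pick_embedding_provider(feature_provider: Optional[Provider],
                            providers: List[Provider]) -> Optional[Provider]:
    """Prefer the feature's provider if it advertises embeddings,
    otherwise fall back to the first provider that does."""
    if feature_provider and "embedding" in feature_provider.capabilities:
        return feature_provider
    for provider in providers:  # providers are checked in configured order
        if "embedding" in provider.capabilities:
            return provider
    return None


chat_only = Provider("Primary Ollama", {"chat"})
embedder = Provider("Embed Ollama", {"chat", "embedding"}, "mxbai-embed-large")
chosen = pick_embedding_provider(chat_only, [chat_only, embedder])
```

Here `chosen` is the second provider, because the feature's own provider does not advertise embedding support.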
If you want separate Ollama servers per feature, add multiple Model Provider subentries and assign them in each feature’s settings. For example: create a “Primary Ollama” provider pointing at your chat server and a “Vision Ollama” provider pointing at your camera analysis server, then select the appropriate provider on the feature’s model settings step.
Global options (prompt, face recognition URL, context management, critical-action PIN, etc.) live in the integration’s Options flow. Sentinel settings are configured in the Sentinel subentry.
HGA can provide a built-in STT engine using the OpenAI Whisper API so you can use voice without a separate STT integration.
- Open Settings → Devices & Services → Home Generative Agent.
- Click + STT Provider.
- Choose OpenAI and give it a name.
- On the Credentials step, either:
- Reuse an existing OpenAI Model Provider subentry, or
- Select Use a separate key and enter a dedicated OpenAI API key.
- On Model & advanced options, pick a model (recommended: `gpt-4o-mini-transcribe`) and set optional fields:
  - `language` (optional): e.g., `en` or `en-US`
  - `prompt` (optional): hints for domain-specific vocabulary
  - `temperature` (optional): 0–1
  - `translate`: only supported by `whisper-1`; other models will fall back to transcription
- Go to Settings → Voice assistants → Assist pipelines and select STT - OpenAI (or your chosen name) for Speech-to-text.
Schema-first JSON for YAML requests controls how the agent handles YAML-style requests (automations, dashboards, or “show me YAML”).
When it is ON:
- The agent returns strict JSON that the integration converts to YAML for display.
- Automations are not auto-registered; the YAML is shown in chat.
- If you want a file you can use in Home Assistant, explicitly ask the agent to save the YAML. It writes under `/config/www/` and returns a `/local/...` URL.
When it is OFF:
- Dashboard generation is disabled; the agent will respond: “Please enable 'Schema-first JSON for YAML requests' in HGA's configuration and try again.”
- Automations are auto-registered; the YAML is not shown in chat.
- Other YAML-style requests follow the standard prompt behavior (no schema enforcement).
Note: the YAML rendered in the chat window may not preserve indentation due to UI rendering, so it may be invalid if copied directly. Use the saved file instead.
Example prompt: “Save this YAML to a file called garage-light.”
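As a rough illustration of the save behavior described above, the sketch below maps a requested file name to a path under `/config/www/` and the matching `/local/...` URL. The function name and the sanitization rule are hypothetical; only the directory and URL prefix come from the text.

```python
import re
from pathlib import PurePosixPath
from typing import Tuple

WWW_DIR = PurePosixPath("/config/www")  # directory named in the text
LOCAL_PREFIX = "/local"                 # URL prefix named in the text


def yaml_target(name: str) -> Tuple[str, str]:
    """Return (filesystem path, URL) for a YAML file saved by the agent."""
    # Assumption: keep only the base name and replace unsafe characters.
    stem = PurePosixPath(name).stem
    stem = re.sub(r"[^A-Za-z0-9_-]+", "-", stem).strip("-") or "untitled"
    filename = f"{stem}.yaml"
    return str(WWW_DIR / filename), f"{LOCAL_PREFIX}/{filename}"


path, url = yaml_target("garage-light")
```

Taking only the base name means a request such as `../etc/passwd` cannot escape the target directory in this sketch.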
Keep unlocking and opening actions behind a second check. Open Home Assistant → Settings → Devices & Services → Home Generative Agent → Configure and toggle Require critical action PIN (on by default). Enter a 4-10 digit PIN to set or replace it; the value is stored as a salted hash. Leaving the field blank while the toggle is on clears the stored PIN, and turning the toggle off removes the guard entirely. In the conversation agent settings for HGA, disable Prefer handling commands locally for Critical Action PIN protection to work properly.
The agent will demand the PIN before it:
- Unlocks or opens locks.
- Opens covers whose entity_id includes door/gate/garage, or opens garage doors.
- Uses HA intent tools on locks. Alarm control panels use their own alarm code and never the PIN.
If you have an alarm control panel, the agent will ask for that alarm's code when arming or disarming; this code is separate from the critical-action PIN.
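The list of protected actions can be read as a simple predicate. The sketch below is an interpretation, not the integration's code: the function name, the use of `device_class` to detect garage doors, and the exact service names checked are assumptions.

```python
from typing import Optional

SENSITIVE_COVER_WORDS = ("door", "gate", "garage")  # from the text


def requires_pin(domain: str, service: str, entity_id: str,
                 device_class: Optional[str] = None) -> bool:
    """Return True when a service call should be guarded by the PIN."""
    if domain == "lock":
        # Locking is not guarded; unlocking or opening is.
        return service in ("unlock", "open")
    if domain == "cover" and service == "open_cover":
        name = entity_id.lower()
        return device_class == "garage" or any(
            word in name for word in SENSITIVE_COVER_WORDS
        )
    # Alarm panels use their own alarm code, so they return False here.
    return False
```

For example, `requires_pin("cover", "open_cover", "cover.garage_door")` is true, while closing the same cover or opening `cover.bedroom_blinds` is not guarded.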
When you ask the agent to perform a protected action, it queues the request and asks for the PIN. Reply with the digits to complete the action; after five bad attempts or 10 minutes, the queued action expires and you must ask again. If the guard is enabled but no PIN is configured, the agent will reject the request until you set one in options.
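The queue-and-verify flow can be sketched as follows. The limits (4-10 digits, five bad attempts, 10 minutes) come from the text; the class and function names are hypothetical, and PBKDF2 is an assumed choice, since the README only says the PIN is stored as a salted hash.

```python
import hashlib
import hmac
import os
import time
from typing import Optional, Tuple

MAX_BAD_ATTEMPTS = 5   # from the text: five bad attempts
TTL_SECONDS = 10 * 60  # from the text: 10 minutes


def _digest(pin: str, salt: bytes) -> bytes:
    # Assumption: PBKDF2-HMAC-SHA256 as the salted hash.
    return hashlib.pbkdf2_hmac("sha256", pin.encode(), salt, 100_000)


def hash_pin(pin: str) -> Tuple[bytes, bytes]:
    """Validate a 4-10 digit PIN and return (salt, digest) for storage."""
    if not (pin.isascii() and pin.isdigit() and 4 <= len(pin) <= 10):
        raise ValueError("PIN must be 4-10 digits")
    salt = os.urandom(16)
    return salt, _digest(pin, salt)


class PendingAction:
    """A queued protected action waiting for the PIN."""

    def __init__(self, description: str, now: Optional[float] = None) -> None:
        self.description = description
        self.created = time.monotonic() if now is None else now
        self.bad_attempts = 0

    def try_pin(self, pin: str, salt: bytes, digest: bytes,
                now: Optional[float] = None) -> str:
        """Return "ok", "retry", or "expired"."""
        now = time.monotonic() if now is None else now
        expired = now - self.created > TTL_SECONDS
        if expired or self.bad_attempts >= MAX_BAD_ATTEMPTS:
            return "expired"
        if hmac.compare_digest(_digest(pin, salt), digest):
            return "ok"
        self.bad_attempts += 1
        return "expired" if self.bad_attempts >= MAX_BAD_ATTEMPTS else "retry"
```

The constant-time comparison via `hmac.compare_digest` is a general precaution for secret comparison, not something the README states.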
Sentinel adds proactive, deterministic anomaly detection and a review pipeline for generated rule proposals.
Sentinel is a singleton service per Home Generative Agent config entry. Configure exactly one Sentinel subentry.
- `snapshot`: Builds an authoritative JSON snapshot (entities, camera activity, derived context).
- `sentinel`: Runs deterministic rules on that snapshot.
- `discovery` (optional): Uses an LLM to suggest rule candidates (advisory only).
- `proposal` review: User promotes/approves/rejects candidates.
- `rule_registry`: Stores approved generated rules (including active/inactive state) for deterministic runtime evaluation.
- `audit`: Persists findings and user action outcomes.
Important: The LLM never executes actions or directly decides runtime safety behavior. Detection and actuation remain deterministic.
When Sentinel notifications are enabled:
- Mobile push explanation text is compact and plain-language (targeted for small screens).
- Explanation text is normalized before send (markdown/backticks removed, whitespace collapsed).
- If explanation text is missing or too long, Sentinel uses a deterministic fallback message.
- Fallback urgency wording depends on severity:
  - `high`: `Urgent: check and secure it now.`
  - `medium`: `Check soon and secure it if unexpected.`
  - `low`: `Review when convenient.`
- Mobile action buttons are:
  - `Acknowledge`, `Ignore`, `Later`.
  - `Execute` is shown for non-sensitive findings that include suggested actions.
  - `Ask Agent` is shown for sensitive findings that include suggested actions. This hands the finding to the conversation agent, which can verify a PIN or alarm code before acting.
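The normalization and fallback behavior for explanation text might look like the sketch below. The urgency strings come from the list above; the length limit, the function names, and the exact set of characters stripped are assumptions.

```python
import re
from typing import Optional

MAX_LEN = 160  # hypothetical limit; the README does not state one

FALLBACKS = {
    "high": "Urgent: check and secure it now.",
    "medium": "Check soon and secure it if unexpected.",
    "low": "Review when convenient.",
}


def normalize(text: str) -> str:
    """Strip common markdown markers and backticks, collapse whitespace."""
    text = re.sub(r"[`*#>]+", "", text)
    return re.sub(r"\s+", " ", text).strip()


def notification_text(explanation: Optional[str], severity: str) -> str:
    """Return cleaned explanation text, or a deterministic fallback."""
    cleaned = normalize(explanation or "")
    if not cleaned or len(cleaned) > MAX_LEN:
        return FALLBACKS.get(severity, FALLBACKS["low"])
    return cleaned
```

Underscores are deliberately left alone here so entity IDs such as `lock.front_door` survive normalization.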
When a user taps an action button, Sentinel uses a two-tier dispatch strategy: it first attempts to call the HGA conversation agent directly via conversation.process; if no conversation entity is available it falls back to firing a Home Assistant event so blueprints/automations can handle the request.
Execute:
- Agent available: calls the conversation agent with a natural-language prompt describing the finding and suggested actions. The agent checks live context, takes action, and its reply is pushed back as a mobile notification (when `notify_service` is configured).
- Agent unavailable: fires `hga_sentinel_execute_requested` so a blueprint or automation can handle it.
- Sensitive finding: blocked with status `blocked`.

Ask Agent:
- Agent available: calls the conversation agent with a security-focused prompt. The agent can verify a PIN or alarm code (if configured under Critical Action settings) before executing. Its reply is pushed back as a mobile notification.
- Agent unavailable: fires `hga_sentinel_ask_requested` (includes a `suggested_prompt` field) so a blueprint can route it to the agent.
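The two-tier dispatch can be sketched with injected callables standing in for the conversation agent and the event bus. The event names and the `blocked` status come from the text; the function names, the prompt wording, and the order of the checks are assumptions.

```python
from typing import Callable, Optional


def build_prompt(finding: dict) -> str:
    # Hypothetical wording; the real prompt text is not given in this README.
    actions = ", ".join(finding.get("suggested_actions", [])) or "none"
    return (f"Sentinel finding {finding['type']} "
            f"(severity {finding['severity']}). Suggested actions: {actions}.")


def dispatch(button: str, finding: dict,
             ask_agent: Optional[Callable[[str], object]],
             fire_event: Callable[[str, dict], None]) -> str:
    """Handle an "execute" or "ask" button tap; return the path taken."""
    if button == "execute" and finding.get("is_sensitive"):
        return "blocked"  # sensitive findings are never executed directly
    prompt = build_prompt(finding)
    if ask_agent is not None:  # tier 1: conversation agent is available
        ask_agent(prompt)
        return "agent"
    if button == "execute":    # tier 2: fire an event for blueprints
        fire_event("hga_sentinel_execute_requested", dict(finding))
    else:
        fire_event("hga_sentinel_ask_requested",
                   {**finding, "suggested_prompt": prompt})
    return "event"
```

Passing `ask_agent=None` models the case where no conversation entity is registered yet, such as early in startup.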
Both hga_sentinel_execute_requested and hga_sentinel_ask_requested share these fields:
- `requested_at`
- `anomaly_id`
- `type`
- `severity`
- `confidence`
- `triggering_entities`
- `suggested_actions`
- `is_sensitive`
- `evidence`
- `mobile_action_payload`

hga_sentinel_ask_requested additionally includes:

- `suggested_prompt`: a ready-to-use natural-language prompt for the conversation agent.
Three draft blueprints are included in the blueprints/ folder:
- `hga_sentinel_execute_router.yaml`
- `hga_sentinel_execute_escalate_high.yaml`
- `hga_sentinel_ask_router.yaml`
How to import in Home Assistant:
- Open `Settings` -> `Automations & Scenes` -> `Blueprints`.
- Import each YAML from this repository's `blueprints/` directory.
- Create automations from the imported blueprints and configure inputs.
What each blueprint does:
- `hga_sentinel_execute_router.yaml`: routes `hga_sentinel_execute_requested` by `suggested_actions` to scripts (`check_appliance`, `check_camera`, `check_sensor`, `close_entry`, `lock_entity`) with default fallback support.
- `hga_sentinel_execute_escalate_high.yaml`: handles only `severity: high` execute events and can send persistent notifications, mobile push, and optional TTS.
- `hga_sentinel_ask_router.yaml`: routes `hga_sentinel_ask_requested` events to the HGA conversation agent. The agent receives the `suggested_prompt` from the event, can verify a PIN if needed, and sends its response back as a notification.
Recommended usage:
- Start with `hga_sentinel_execute_escalate_high.yaml` for immediate high-priority visibility.
- Add `hga_sentinel_execute_router.yaml` when you have scripts ready for action-specific handling.
- Add `hga_sentinel_ask_router.yaml` as a fallback for sensitive findings when the built-in agent dispatch is not available (e.g., the conversation entity is not yet registered at startup).
Script contract for router targets:
- Router script calls pass one object in `data.sentinel_event`.
- `sentinel_event` matches the execute event payload and includes:
  - `requested_at`
  - `anomaly_id`
  - `type`
  - `severity`
  - `confidence`
  - `triggering_entities`
  - `suggested_actions`
  - `is_sensitive`
  - `evidence`
  - `mobile_action_payload`
Where to store these scripts in Home Assistant:
- Create them as regular HA scripts: `Settings` -> `Automations & Scenes` -> `Scripts` -> `+ Create Script` -> `Edit in YAML`.
- Save each with a stable script entity ID (for example `script.hga_check_camera_flow`) so it can be selected in `hga_sentinel_execute_router.yaml`.
- If you manage YAML directly, store them in `scripts.yaml` (or an included scripts file) and reload scripts.
Example script target for check_appliance:
alias: HGA Check Appliance Flow
mode: queued
fields:
  sentinel_event:
    description: Sentinel execute event payload
sequence:
  - action: persistent_notification.create
    data:
      title: "HGA Appliance Follow-up"
      message: >
        Type={{ sentinel_event.type }},
        severity={{ sentinel_event.severity }},
        entities={{ sentinel_event.triggering_entities | join(', ') }}.
  - action: notify.mobile_app_phone
    data:
      title: "HGA Appliance Follow-up"
      message: >
        Suggested actions:
        {{ sentinel_event.suggested_actions | join(', ') if sentinel_event.suggested_actions else 'none' }}

Example script target for check_camera:
alias: HGA Check Camera Flow
mode: queued
fields:
  sentinel_event:
    description: Sentinel execute event payload
sequence:
  - action: notify.mobile_app_phone
    data:
      title: "HGA Camera Follow-up"
      message: >
        Camera-related event {{ sentinel_event.type }}.
        Entities={{ sentinel_event.triggering_entities | join(', ') if sentinel_event.triggering_entities else 'none' }}.
  - action: persistent_notification.create
    data:
      title: "HGA Camera Follow-up"
      message: >
        Evidence: {{ sentinel_event.evidence }}

Example script target for lock_entity:
alias: HGA Lock Entity Follow-up
mode: queued
fields:
  sentinel_event:
    description: Sentinel execute event payload
sequence:
  - variables:
      lock_id: >
        {% set ids = sentinel_event.triggering_entities | default([], true) %}
        {{ ids[0] if ids else '' }}
  - choose:
      - conditions:
          - condition: template
            value_template: "{{ lock_id.startswith('lock.') }}"
        sequence:
          - action: lock.lock
            target:
              entity_id: "{{ lock_id }}"
    default:
      - action: persistent_notification.create
        data:
          title: "HGA Lock Entity Follow-up"
          message: >
            Could not resolve lock entity from event:
            {{ sentinel_event.triggering_entities | default([], true) }}

Tip: if your script needs the raw mobile action callback details, read `sentinel_event.mobile_action_payload`.
- `unlocked_lock_when_home`
- `alarm_disarmed_open_entry`
- `low_battery_sensors`
- `unavailable_sensors`
- `open_entry_when_home`
- `open_entry_while_away`
- `open_entry_at_night_when_home`
- `open_entry_at_night_while_away`
- `open_any_window_at_night_while_away`
- `motion_detected_at_night_while_alarm_disarmed`
- `motion_without_camera_activity`
- `motion_while_alarm_disarmed_and_home_present`
- `unknown_person_camera_no_home`
Discovery suggestions are deduped in the backend using deterministic semantic keys.
Novelty checks compare candidates against:
- Active rules in `rule_registry`
- Existing proposal drafts
- Recent discovery records
Discovery records may include:
- `semantic_key`: canonical normalized key for candidate meaning
- `dedupe_reason`: candidate disposition (`novel`, `existing_semantic_key`, `batch_duplicate`)
- `filtered_candidates`: candidates removed by dedupe with their reason
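A minimal sketch of deduplication by semantic key follows. The three `dedupe_reason` values come from the text; how the key is derived (template name plus sorted entity IDs) and the candidate fields `template` and `entities` are hypothetical.

```python
import re
from typing import Iterable, List, Set, Tuple


def semantic_key(candidate: dict) -> str:
    """Hypothetical canonical key: normalized template plus sorted entities."""
    template = re.sub(r"\W+", "_", candidate["template"].strip().lower())
    entities = ",".join(sorted(e.strip().lower() for e in candidate["entities"]))
    return f"{template}|{entities}"


def dedupe(candidates: Iterable[dict],
           known_keys: Set[str]) -> Tuple[List[dict], List[dict]]:
    """Split candidates into (novel, filtered) with a reason on each record."""
    seen: Set[str] = set()
    novel, filtered = [], []
    for candidate in candidates:
        key = semantic_key(candidate)
        if key in known_keys:    # already an active rule, draft, or record
            reason = "existing_semantic_key"
        elif key in seen:        # repeated within this discovery batch
            reason = "batch_duplicate"
        else:
            reason = "novel"
        seen.add(key)
        record = {**candidate, "semantic_key": key, "dedupe_reason": reason}
        (novel if reason == "novel" else filtered).append(record)
    return novel, filtered
```

Because the key is computed deterministically, the same candidate phrased with different casing or entity order maps to the same key.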
Discovery is configured in the Sentinel subentry:
- Home Assistant -> `Settings` -> `Devices & Services`
- Open `Home Generative Agent`
- Select `+ Sentinel` (or reconfigure the existing Sentinel subentry)
- Set Sentinel discovery options:
  - `sentinel_discovery_enabled`
  - `sentinel_discovery_interval_seconds`
  - `sentinel_discovery_max_records`
Discovery requires a configured chat model. If no model is available, the discovery loop is skipped.
Proposal draft statuses:
- `draft`
- `approved`
- `rejected`
- `unsupported`
- `covered_by_existing_rule`
covered_by_existing_rule means the candidate is semantically covered by an active rule and should not be approved as a separate rule. covered_rule_id is attached when available.
- `home_generative_agent.get_discovery_records`
- `home_generative_agent.promote_discovery_candidate`
- `home_generative_agent.get_proposal_drafts`
- `home_generative_agent.approve_rule_proposal`
- `home_generative_agent.reject_rule_proposal`
- `home_generative_agent.get_dynamic_rules`
- `home_generative_agent.deactivate_dynamic_rule`
- `home_generative_agent.reactivate_dynamic_rule`
- `home_generative_agent.get_audit_records`
Typical response fields:
- `status`
- `candidate_id`
- `rule_id`
- `covered_rule_id`
- `records`
- `enabled`
If you install hga-proposals-card.js, the card can drive the full review flow:
- Discovery candidates
- Filtered discovery candidates (with dedupe reasons)
- Proposal drafts (pending)
- Proposal history
It also supports:
- Promote to draft
- Reject candidate (local dismiss in browser storage)
- Approve/reject proposal
- Collapsible sections (Proposal Drafts is expanded by default)
- "Request New Template" shortcut that opens a prefilled GitHub issue form
- Immediate "Template Requested" feedback after click (stored per candidate in browser local storage)
- Deactivate/reactivate controls for historical approved rules
Installation:
- Go to `Settings -> Dashboards -> Resources -> Add Resource`.
- Add:
  - URL: `/hga-card/hga-proposals-card.js`
  - Type: `JavaScript Module`
- Add the card to a dashboard using a manual card config:

type: custom:hga-proposals-card
title: Sentinel Proposals

Notes:
- The card type must include the `custom:` prefix.
- If the card shows as unknown after adding the resource, hard refresh the browser and reload frontend resources.
- Legacy resource URLs under `/hga-enroll-card/...` still work for backward compatibility.
When updating card JS, bump the Lovelace resource query string (for example ?v=12) to avoid stale browser cache.
unsupported means the candidate could not be mapped to a supported deterministic template.
Preferred handling:
- Reject if not useful.
- If useful, request a new template via `.github/ISSUE_TEMPLATE/feature_rule_request.yml` (the card pre-populates relevant fields from the proposal and marks the candidate as "Template Requested" locally in the browser).
- After template support is added, re-approve the proposal to re-evaluate with current mapping logic.
Compatibility note: unavailable_sensors_while_home supports re-approving legacy drafts whose evidence_paths used domainless entity IDs (for example entities[entity_id=backyard_vmd3_0].state).
unavailable_sensors is also supported for candidates without explicit occupancy context (for example backyard_sensors_unavailable). It triggers only when all listed sensors are unavailable; if any required sensor is missing or not unavailable, no finding is produced.
motion_while_alarm_disarmed_and_home_present is supported for candidates that provide motion entities, an alarm entity, and one or more person.* entities in evidence paths. It triggers only when all required entities are present and states match exactly: alarm disarmed, motion on, and person home.
motion_detected_at_night_while_alarm_disarmed is supported for candidates that provide motion entities, an alarm entity, and derived.is_night evidence (for example candidate motion_at_night_disarmed). It triggers only when all required entities referenced by the rule are present, snapshot derived.is_night is true, alarm state is disarmed, and at least one motion entity is on. It returns no findings when required entities are missing.
low_battery_sensors is supported for battery entity candidates (for example sensor.elias_t_h_battery and sensor.girls_t_h_battery). It triggers when any listed sensor is at or below the configured threshold (default 40%) and produces no findings if any required entity is missing or has a non-numeric state.
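The trigger conditions for `low_battery_sensors` and `unavailable_sensors` can be illustrated with two small functions. This is a sketch of the stated semantics only: the function names and the flat `states` dictionary (entity ID to state string) are assumptions, not the integration's snapshot format.

```python
from typing import Dict, List


def low_battery_findings(states: Dict[str, str], sensors: List[str],
                         threshold: float = 40.0) -> List[str]:
    """Return sensors at or below the threshold.

    Produces no findings if any listed sensor is missing or non-numeric.
    """
    levels = {}
    for entity_id in sensors:
        try:
            levels[entity_id] = float(states.get(entity_id))
        except (TypeError, ValueError):
            return []  # missing entity or non-numeric state
    return [e for e, level in levels.items() if level <= threshold]


def all_sensors_unavailable(states: Dict[str, str], sensors: List[str]) -> bool:
    """True only when every listed sensor exists and is unavailable."""
    if not sensors or any(e not in states for e in sensors):
        return False
    return all(states[e] == "unavailable" for e in sensors)
```

Note the asymmetry the text describes: the battery rule fires when any sensor is low, while the unavailable rule fires only when all listed sensors are unavailable.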
- If card UI looks unchanged after an update, you are likely serving cached JS.
- If similar candidates keep appearing, inspect `dedupe_reason` and `filtered_candidates` in discovery records.
- If a proposal appears duplicate, check logs for:
  - `Rule registry ignored duplicate rule ...`
  - `... covered_by_existing_rule ...`
- Existing stored proposal drafts are not auto-migrated; statuses update when proposals are re-processed.
This section shows how to use the image and sensor platforms to display the latest camera image, the AI-generated summary, and recognized people in Home Assistant, or to use them in automations.
- Image entities (1 per camera): `image.<camera_slug>_last_event`. Shows the most recent snapshot published by the analyzer/service.
- Sensor entities (1 per camera): `sensor.<camera_slug>_recognized_people`
- Attributes include:
  - recognized_people: names from face recognition
  - summary: AI description of the last frame
  - latest_path: filesystem path of the published image
  - count, last_event, camera_id: aux info
The analyzer publishes snapshots automatically on motion/recording, and you can also invoke a service to capture and analyze now. For this to work, HA needs to have write access to your snapshots location (default: /media/snapshots) and your camera entities must exist in HA (camera.*).
In the examples below, replace the entity names with the actuals from your HA installation.
A service is provided to capture a fresh snapshot, analyze it, and publish it as the latest event.
- Service: `home_generative_agent.save_and_analyze_snapshot`
  - target: one or more camera.* entities
  - fields:
    - protect_minutes (optional, default: 30): protect the new file from pruning
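If you prefer to call the service from outside Home Assistant, the sketch below builds a request for Home Assistant's REST API (`POST /api/services/<domain>/<service>` with a long-lived access token). The base URL and token are placeholders, the helper name is hypothetical, and passing `entity_id` in the JSON body is assumed to behave like the `target` shown in this section.

```python
import json
import urllib.request
from typing import List


def build_snapshot_request(base_url: str, token: str, cameras: List[str],
                           protect_minutes: int = 30) -> urllib.request.Request:
    """Build (but do not send) a service call for save_and_analyze_snapshot."""
    body = {"entity_id": cameras, "protect_minutes": protect_minutes}
    url = (f"{base_url.rstrip('/')}"
           "/api/services/home_generative_agent/save_and_analyze_snapshot")
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_snapshot_request(
    "http://homeassistant.local:8123",  # placeholder address
    "YOUR_LONG_LIVED_TOKEN",            # placeholder token
    ["camera.frontgate"],
)
# urllib.request.urlopen(req)  # uncomment to actually send the request
```

The request is only constructed here; sending it requires a reachable Home Assistant instance and a valid token.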
Example -> Developer Tools -> Services:
service: home_generative_agent.save_and_analyze_snapshot
target:
  entity_id:
    - camera.frontgate
    - camera.backyard
data:
  protect_minutes: 30

Example Button Card in UI:
type: button
name: Refresh Frontgate
icon: mdi:camera
tap_action:
  action: call-service
  service: home_generative_agent.save_and_analyze_snapshot
  target:
    entity_id: camera.frontgate

- Simple Image and Markdown (two cards per camera)
type: grid
columns: 2
square: false
cards:
  - type: vertical-stack
    cards:
      - type: picture-entity
        entity: image.frontgate_last_event
        show_name: false
        show_state: false
      - type: markdown
        content: |
          **Summary**
          {{ state_attr('image.frontgate_last_event', 'summary') or '—' }}
          **Recognized**
          {% set names = state_attr('image.frontgate_last_event', 'recognized_people') or [] %}
          {{ names | join(', ') if names else 'None' }}

Duplicate the stack for each camera’s `image.<slug>_last_event`.
- All in one cameras view

title: Cameras
path: cameras
cards:
  - type: grid
    columns: 2
    square: false
    cards:
      # Repeat this block for each camera slug
      - type: vertical-stack
        cards:
          - type: picture-entity
            entity: image.frontgate_last_event
            show_name: false
            show_state: false
          - type: markdown
            content: |
              **Summary**
              {{ state_attr('image.frontgate_last_event', 'summary') or '—' }}
              **Recognized**
              {% set names = state_attr('image.frontgate_last_event', 'recognized_people') or [] %}
              {{ names | join(', ') if names else 'None' }}

- Overlay: Place Text on the Image
type: picture-elements
image: /api/image_proxy/image.frontgate_last_event
elements:
  - type: markdown
    content: >
      {% set names = state_attr('image.frontgate_last_event', 'recognized_people') or [] %}
      **{{ names | join(', ') if names else 'None' }}**
    style:
      top: 6%
      left: 50%
      width: 92%
      color: white
      text-shadow: 0 0 6px rgba(0,0,0,0.9)
      transform: translateX(-50%)
  - type: state-label
    entity: image.frontgate_last_event
    attribute: summary
    style:
      bottom: 6%
      left: 50%
      width: 92%
      color: white
      text-shadow: 0 0 6px rgba(0,0,0,0.9)
      transform: translateX(-50%)

Notify when people are recognized on any camera:
alias: Camera recognized people
mode: parallel
trigger:
  - platform: state
    entity_id:
      - sensor.frontgate_recognized_people
      - sensor.playroomdoor_recognized_people
      - sensor.backyard_recognized_people
condition: []
action:
  - variables:
      ent: "{{ trigger.entity_id }}"
      cam: "{{ state_attr(ent, 'camera_id') }}"
      names: "{{ state_attr(ent, 'recognized_people') or [] }}"
      summary: "{{ state_attr(ent, 'summary') or 'An event occurred.' }}"
      image_entity: "image.{{ cam.split('.')[-1] }}_last_event"
  - service: notify.mobile_app_phone
    data:
      title: "Camera: {{ cam }}"
      message: >
        {{ summary }}
        {% if names %} Recognized: {{ names | join(', ') }}.{% endif %}
      data:
        image: >
          {{ state_attr(image_entity, 'entity_picture') }}

- Event `hga_last_event_frame` is fired whenever a new “latest” frame is published.
{
  "camera_id": "camera.frontgate",
  "summary": "A person approaches the gate...",
  "path": "/media/snapshots/camera_frontgate/snapshot_YYYYMMDD_HHMMSS.jpg",
  "latest": "/media/snapshots/camera_frontgate/_latest/latest.jpg"
}

- Dispatcher signals (internal):
  - `SIGNAL_HGA_NEW_LATEST` -> updates `image.*_last_event`
  - `SIGNAL_HGA_RECOGNIZED` -> updates `sensor.*_recognized_people`
Most users won’t need to consume these directly; the platform entities update automatically.
You can enroll faces either via a service call or through the dashboard card.
Service: home_generative_agent.enroll_person
service: home_generative_agent.enroll_person
data:
  name: "Eva"
  file_path: "/media/faces/eva_face.jpg"

The file must be inside Home Assistant's `/media` folder so it is accessible to the integration.
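Before calling the service from a script, you might pre-check that a path really sits under `/media`. The sketch below is a hypothetical helper, not part of the integration; it only normalizes the path string and does not resolve symlinks.

```python
import posixpath

MEDIA_ROOT = "/media"  # folder named in the text


def is_under_media(file_path: str) -> bool:
    """Return True if the normalized path is a file location below /media."""
    normalized = posixpath.normpath(file_path)
    return normalized.startswith(MEDIA_ROOT + "/")
```

Normalizing first means a path such as `/media/../config/secrets.yaml` is rejected, and a sibling directory such as `/mediafiles` does not pass a naive prefix check.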
Add the custom card to any dashboard after registering the resource in Installation step 6.
type: custom:hga-enroll-card
title: Enroll Person
endpoint: /api/home_generative_agent/enroll

Use the file picker or drag-and-drop to upload one or more images. The card will enroll any images that contain a detectable face and skip those that do not.
Below is a high-level view of the architecture.
The general integration architecture follows the best practices as described in Home Assistant Core and is compliant with Home Assistant Community Store (HACS) publishing requirements.
The agent is built using LangGraph and uses the HA conversation component to interact with the user. The agent uses the Home Assistant LLM API to fetch the state of the home and understand the HA native tools it has at its disposal. I implemented all other tools available to the agent using LangChain. The agent employs several LLMs: a large, highly accurate primary model for high-level reasoning, and smaller specialized helper models for camera image analysis, primary model context summarization, and embedding generation for long-term semantic search. The models can be either cloud-based (best accuracy, highest cost) or edge-based (good accuracy, lowest cost). The edge models run under the Ollama framework on a computer located in the home. Recommended defaults and supported models are configurable in the integration UI, with defaults defined in const.py.
| Category | Provider | Default model | Purpose |
|---|---|---|---|
| Chat | OpenAI | gpt-5 | High-level reasoning and planning |
| Chat | Ollama | gpt-oss | High-level reasoning and planning |
| Chat | Gemini | gemini-2.5-flash-lite | High-level reasoning and planning |
| VLM | Ollama | qwen3-vl:8b | Image scene analysis |
| VLM | OpenAI | gpt-5-nano | Image scene analysis |
| VLM | Gemini | gemini-2.5-flash-lite | Image scene analysis |
| Summarization | Ollama | qwen3:1.7b | Primary model context summarization |
| Summarization | OpenAI | gpt-5-nano | Primary model context summarization |
| Summarization | Gemini | gemini-2.5-flash-lite | Primary model context summarization |
| Embeddings | Ollama | mxbai-embed-large | Embedding generation for semantic search |
| Embeddings | OpenAI | text-embedding-3-small | Embedding generation for semantic search |
| Embeddings | Gemini | gemini-embedding-001 | Embedding generation for semantic search |
LangGraph powers the conversation agent, enabling you to create stateful, multi-actor applications utilizing LLMs as quickly as possible. It extends LangChain's capabilities, introducing the ability to create and manage cyclical graphs essential for developing complex agent runtimes. A graph models the agent workflow, as seen in the image below.
The agent workflow has three nodes, each Python module modifying the agent's state, a shared data structure. The edges between the nodes represent the allowed transitions between them, with solid lines unconditional and dashed lines conditional. Nodes do the work, and edges tell what to do next.
The __start__ and __end__ nodes inform the graph where to start and stop. The agent node runs the primary LLM, and if it decides to use a tool, the action node runs the tool and then returns control to the agent. When the agent does not call a tool, control passes to summarize_and_remove_messages, which summarizes only when trimming is required to manage the LLM context.
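The control flow described above can be sketched in plain Python. This deliberately does not use the LangGraph API; the message dictionaries, the `tool_calls` key, and the callables passed in are hypothetical stand-ins chosen to show the routing only.

```python
from typing import Callable, Dict


def route_after_agent(state: dict) -> str:
    """Conditional edge: go to 'action' if the last message calls a tool."""
    last = state["messages"][-1]
    return "action" if last.get("tool_calls") else "summarize_and_remove_messages"


def run(state: dict, agent: Callable[[dict], dict],
        tools: Dict[str, Callable], summarize: Callable[[dict], dict]) -> dict:
    """Loop agent -> action -> agent until the agent stops calling tools."""
    while True:
        state["messages"].append(agent(state))           # agent node
        if route_after_agent(state) == "action":         # action node
            for call in state["messages"][-1]["tool_calls"]:
                result = tools[call["name"]](**call["args"])
                state["messages"].append({"role": "tool", "content": str(result)})
            continue                                      # back to the agent
        return summarize(state)                           # final node
```

The loop between the agent and action nodes is the cycle that LangGraph's graph model is meant to express.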
You need to carefully manage the context length of LLMs to balance cost, accuracy, and latency and avoid triggering rate limits such as OpenAI's Tokens per Minute restriction. The system controls the context length of the primary model by trimming the messages in the context if they exceed a max parameter which can be expressed in either tokens or messages, and the trimmed messages are replaced by a shorter summary inserted into the system message. These parameters are configurable in the UI, with defaults defined in const.py; their description is below.
| Parameter | Description | Default |
|---|---|---|
| `max_messages_in_context` | Messages to keep in context before deletion | 60 |
| `max_tokens_in_context` | Tokens to keep in context before deletion | 32000 |
| `manage_context_with_tokens` | If "true", use tokens to manage context, else use messages | "true" |
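To make the trimming behavior concrete, here is a minimal sketch using the three parameters from the table. The whitespace-based token count is a crude stand-in for a real tokenizer, and the function names are hypothetical.

```python
from typing import List, Tuple


def count_tokens(message: str) -> int:
    # Assumption: approximate tokens by whitespace-separated words.
    return len(message.split())


def trim_context(messages: List[str], max_messages: int = 60,
                 max_tokens: int = 32000,
                 manage_with_tokens: bool = True) -> Tuple[List[str], List[str]]:
    """Drop the oldest messages until the context fits.

    Returns (kept, trimmed); the trimmed messages are what would be
    summarized and folded into the system message.
    """
    kept = list(messages)
    trimmed: List[str] = []

    def over_limit() -> bool:
        if manage_with_tokens:
            return sum(count_tokens(m) for m in kept) > max_tokens
        return len(kept) > max_messages

    while kept and over_limit():
        trimmed.append(kept.pop(0))  # oldest message goes first
    return kept, trimmed


kept, trimmed = trim_context(["a b c", "d e", "f"], max_tokens=3)
```

In this example the oldest message is trimmed, leaving two messages that total three tokens.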
Latency, both in responding to user requests and in the agent taking timely action on the user's behalf, is a critical design consideration. I used several techniques to reduce latency, including running specialized, smaller helper LLMs on the edge and facilitating primary model prompt caching by structuring the prompts so that static content, such as instructions and examples, comes first and variable content, such as user-specific information, comes last. These techniques also reduce primary model usage costs considerably.
You can see the typical latency performance in the table below.
| Action | Latency (s) | Remark |
|---|---|---|
| HA intents | < 1 | e.g., turn on a light |
| Analyze camera image | < 3 | initial request |
| Add automation | < 1 | |
| Memory operations | < 1 | |
The agent can use HA tools as specified in the LLM API and other tools built in the LangChain framework as defined in tools.py. Additionally, you can extend the LLM API with tools of your own as well. The code gives the primary LLM the list of tools it can call, along with instructions on using them in its system message and in the docstring of the tool's Python function definition. If the agent decides to use a tool, the LangGraph node action is entered, and the node's code runs the tool. The node uses a simple error recovery mechanism that will ask the agent to try calling the tool again with corrected parameters in the event of making a mistake.
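The error recovery idea can be sketched as a wrapper that turns a tool failure into a message the agent can act on. The function name, the message shape, and the wording of the retry hint are assumptions, not the code in the `action` node.

```python
from typing import Callable


def run_tool(tool: Callable, args: dict) -> dict:
    """Run a tool; on failure, return an error message for the agent
    instead of raising, so it can retry with corrected parameters."""
    try:
        return {"role": "tool", "status": "ok", "content": str(tool(**args))}
    except Exception as err:  # broad on purpose: any failure is reported back
        return {
            "role": "tool",
            "status": "error",
            "content": f"Error: {err!r}. Please fix your mistakes and try again.",
        }


def get_state(entity_id: str) -> str:
    """Hypothetical tool used only for this example."""
    return "on"


good = run_tool(get_state, {"entity_id": "light.kitchen"})
bad = run_tool(get_state, {"entity": "light.kitchen"})  # wrong parameter name
```

The second call fails with a `TypeError`, which is reported back as an error message rather than stopping the workflow.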
The agent can call HA LLM API tools, including built-in intents like HassTurnOn and HassTurnOff. The integration normalizes lock intents to lock/unlock services and routes alarm intents to the alarm_control tool.
You can see the list of LangChain tools that the agent can use in the table below.
| LangChain Tool | Purpose |
|---|---|
| get_and_analyze_camera_image | run scene analysis on the image from a camera |
| upsert_memory | add or update a memory |
| add_automation | create and register a HA automation (available when Schema-first YAML mode is disabled) |
| write_yaml_file | write YAML to /config/www/ and return a /local/... URL |
| confirm_sensitive_action | confirm and execute a pending critical action with a PIN |
| alarm_control | arm or disarm an alarm control panel with the alarm code |
| get_entity_history | query HA database for entity history |
| resolve_entity_ids | resolve entity IDs from friendly names, areas, labels, and domains |
| get_current_device_state | |
I built the HA installation on a Raspberry Pi 5 with SSD storage, Zigbee, and LAN connectivity. I deployed the edge models under Ollama on an Ubuntu-based server with an AMD 64-bit 3.4 GHz CPU, Nvidia 3090 GPU, and 64 GB system RAM. The server is on the same LAN as the Raspberry Pi.
The snippet below shows that the agent is fluent in YAML, based on what it generated and registered as an HA automation (this is disabled when Schema-first YAML mode is enabled).

```yaml
alias: Check Litter Box Waste Drawer
triggers:
  - minutes: /30
    trigger: time_pattern
conditions:
  - condition: numeric_state
    entity_id: sensor.litter_robot_4_waste_drawer
    above: 90
actions:
  - data:
      message: The Litter Box waste drawer is more than 90% full!
    action: notify.notify
```
You can create an automation of the home state summary that runs periodically from the HA Blueprint hga_summary.yaml located in the blueprints folder.
You can see that the agent correctly generates the automation below.
```yaml
alias: Prepare Home for Arrival
description: Turn on front porch light and unlock garage door lock at 7:30 PM
mode: single
triggers:
  - at: "19:30:00"
    trigger: time
actions:
  - target:
      entity_id: light.front_porch_light
    action: light.turn_on
    data: {}
  - target:
      entity_id: lock.garage_door_lock
    action: lock.unlock
    data: {}
```

Below is the camera image the agent analyzed; you can see that two packages are visible.
Below is an example notification from this automation if any boxes or packages are visible.
The agent uses a tool that in turn uses the HA Blueprint hga_scene_analysis.yaml for these requests, so the Blueprint must be installed in your HA installation.
You can enable proactive video scene analysis from cameras visible to Home Assistant. When enabled, motion detection triggers the analysis, which is stored in a database for use by the agent; optionally, notifications of the analysis are sent to the mobile app. You can also enable anomaly detection, which sends notifications only based on a semantic search of the current analysis against the database. These options are set in the integration's config UI.
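
One way to picture anomaly detection by semantic search is sketched below. It assumes each scene analysis has already been turned into an embedding vector, and that an analysis counts as anomalous when no stored analysis is similar enough to it. The function names and the 0.8 threshold are hypothetical, and the integration's actual criteria may differ.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_anomaly(
    current: list[float],
    stored: list[list[float]],
    threshold: float = 0.8,  # assumed value, chosen only for illustration
) -> bool:
    """Flag the current analysis when no stored analysis is similar enough."""
    if not stored:
        return True  # nothing to compare against, so treat it as novel
    best = max(cosine_similarity(current, past) for past in stored)
    return best < threshold
```

With this approach, a scene that closely resembles earlier ones produces no notification, while a scene unlike anything in the database does.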
The image below is an example of a notification sent to the mobile app.
The Makefile provides a repeatable local dev workflow. It creates an hga venv using Python 3.13 and wires up common tasks (deps, lint, tests, type checking).
Common commands:

```bash
make venv         # create venv with pip/setuptools/wheel
make devdeps      # install dev-only deps
make testdeps     # install test deps
make runtimedeps  # regenerate + install runtime deps from manifest
```

Checks and formatting:

```bash
make lint       # regenerate runtime deps + ruff check (non-mutating)
make format     # ruff format (mutating)
make fix        # ruff --fix (mutating)
make typecheck  # pyright
```

Tests and cleanup:

```bash
make test   # pytest with runtime deps installed
make all    # devdeps + testdeps + runtimedeps + lint + test + check + typecheck
make clean  # remove the venv
```

Note: make lint will fail if requirements_runtime_manifest.txt is out of date. Run make runtimedeps or make lint to regenerate it.
If you want to contribute to this project, please read the Contribution guidelines.