This project explores and validates an innovative paradigm for developing Human-Machine Interfaces (HMIs) for embedded systems. The core idea is to leverage the natural language understanding and generation capabilities of Large Language Models (LLMs) to directly translate high-level user requirements into structured, declarative JSON configuration files. This configuration is then transmitted to a resource-constrained embedded device (e.g., an ESP32), where a lightweight rendering engine parses it and dynamically constructs the user interface using a graphics library such as LVGL.
This approach aims to address persistent challenges in traditional embedded UI development, including efficiency bottlenecks, high technical barriers, and long iteration cycles. By defining a strict JSON Schema to act as a "communication contract" between the LLM's output and the device-side parser, we ensure the controllability, stability, and predictability of the AI-generated UI. This repository includes a web-based simulator, deployed on Cloudflare Workers, for rapid prototyping and intuitive visualization of JSON-driven rendering results.
The system consists of two primary components: the Host and the Device, which are decoupled and coordinated via a standardized JSON data stream.
```mermaid
graph TD
    A[User] -->|Natural Language Input| B_HostApp[Host Application]

    subgraph "Host"
        B_HostApp --> C_NLP[NLP Pre-processing]
        C_NLP --> D_LLM["Large Language Model Core"]
        D_LLM -->|Structured Intent| E_JSON_Gen["JSON Generation & Validation"]
        E_JSON_Gen -->|Schema Validation| F_ValidJSON[JSON Configuration]
        F_ValidJSON --> G_Comm_Host[Communication Module]
    end

    subgraph "Embedded Device"
        H_Comm_MCU[Communication Module] --> I_JSON_Buffer[JSON Data Buffer]
        I_JSON_Buffer --> J_JSON_Parser[JSON Parser]
        J_JSON_Parser -->|UI Directives| K_UI_Engine["UI Rendering Engine"]
        K_UI_Engine -->|LVGL Calls| L_LVGL_Display[Screen Display]
        M_Data_Sources["Device Data Sources"] <--> N_Data_Binding[Data Binding]
        N_Data_Binding <--> K_UI_Engine
        O_Action_Lib["Pre-defined Action Library"] <--> P_Interaction_Handler[Interaction Handler]
        P_Interaction_Handler <--> K_UI_Engine
    end

    G_Comm_Host -->|JSON Data Stream| H_Comm_MCU

    style A fill:#f0f0f0,stroke:#333,stroke-width:2px
    style L_LVGL_Display fill:#f0f0f0,stroke:#333,stroke-width:2px
```
Workflow Description:
- Requirement Input: The user describes the UI requirements in natural language.
- Host-side Processing: The LLM, guided by its internal knowledge of the UI Schema, converts the natural language into a structured JSON configuration and validates it.
- Data Transmission: The validated JSON is sent to the target device via protocols like Wi-Fi, BLE, or serial communication.
- Embedded-side Processing:
  - The device receives and parses the JSON data using a lightweight parser (e.g., cJSON).
  - The UI rendering engine translates the parsed directives into LVGL API calls to create widgets, arrange layouts, and apply styles (a minimal sketch follows this list).
  - Data bindings are established between UI elements and internal device data sources, and user interactions (e.g., button clicks) are mapped to pre-defined actions in the firmware.
  - The final UI is presented on the screen.
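The embedded-side steps can be sketched in C. The following is a minimal, hypothetical sketch assuming cJSON and LVGL v8; the Schema keys used here (`widgets`, `type`, `text`) are illustrative assumptions, not the project's final contract:

```c
#include <string.h>
#include "cJSON.h"
#include "lvgl.h"

/* Map one parsed widget node onto LVGL calls. */
static void render_widget(lv_obj_t *parent, const cJSON *node)
{
    const cJSON *type = cJSON_GetObjectItemCaseSensitive(node, "type");
    if (!cJSON_IsString(type)) {
        return; /* reject nodes that violate the contract */
    }

    if (strcmp(type->valuestring, "label") == 0) {
        lv_obj_t *label = lv_label_create(parent);
        const cJSON *text = cJSON_GetObjectItemCaseSensitive(node, "text");
        if (cJSON_IsString(text)) {
            lv_label_set_text(label, text->valuestring);
        }
    } else if (strcmp(type->valuestring, "button") == 0) {
        lv_obj_t *btn = lv_btn_create(parent);
        (void)btn; /* hook up action_id dispatch and children here */
    }
    /* container handling and recursion over child nodes omitted */
}

/* Entry point: parse a received JSON buffer and build the UI tree. */
void render_ui_from_json(const char *json_buffer)
{
    cJSON *root = cJSON_Parse(json_buffer);
    if (root == NULL) {
        return; /* malformed JSON: keep the previous UI untouched */
    }

    const cJSON *widgets = cJSON_GetObjectItemCaseSensitive(root, "widgets");
    const cJSON *widget = NULL;
    cJSON_ArrayForEach(widget, widgets) {
        render_widget(lv_scr_act(), widget);
    }
    cJSON_Delete(root);
}
```

Keeping the previous UI untouched on parse failure is one way to honor the controllability goal: a malformed configuration never reaches the rendering engine.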
The JSON Schema is the cornerstone of this architecture, serving as the "protocol" between the host and the embedded device. Its key advantages are:
- Constraint: Provides a clear and strict framework for the LLM's output, preventing the generation of invalid or unsafe UI configurations (a hypothetical Schema fragment follows this list).
- Expressiveness: Capable of clearly describing the UI's hierarchical structure, layout, styling, data bindings, and interactive behaviors.
- Decoupling: Completely separates the definition of the UI's "presentation layer" from the "logic layer" code on the device.
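As a concrete illustration of the "Constraint" advantage, a hypothetical fragment of such a Schema (here in JSON Schema draft-07 form) could restrict a widget node to an enumerated set of types; all field names and limits below are illustrative assumptions:

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "Widget",
  "type": "object",
  "properties": {
    "type": { "enum": ["label", "button", "container"] },
    "text": { "type": "string", "maxLength": 64 },
    "on_click": {
      "type": "object",
      "properties": { "action_id": { "type": "string" } },
      "required": ["action_id"],
      "additionalProperties": false
    }
  },
  "required": ["type"],
  "additionalProperties": false
}
```

With `additionalProperties` set to `false`, any field the device-side parser would not understand causes validation to fail on the host, before the configuration is ever transmitted.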
Core Components of the Schema:
- Widgets: Defines UI element types (e.g., `label`, `button`, `container`) and their properties.
- Layout: Describes how elements are arranged (e.g., `row`, `column`), including spacing (`gap`) and alignment (`item_alignment`), as a simplified implementation of the Flexbox model.
- Styling: Defines visual attributes like colors, fonts, and borders. Using pre-defined style references (`style_ref`) is recommended to reduce JSON payload size.
- Data Binding: Associates UI element content with device variables through template placeholders (e.g., `{{data.value}}`).
- Actions: Maps UI events (e.g., a button click) to pre-defined action identifiers (`action_id`) in the firmware.
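Putting these components together, a hypothetical configuration might look as follows; the top-level shape (`screen`, `styles`, `on_click`) and the specific binding `data.temperature` are illustrative assumptions rather than the final Schema:

```json
{
  "screen": {
    "layout": { "type": "column", "gap": 8, "item_alignment": "center" },
    "styles": {
      "title": { "text_color": "#FFFFFF", "font_size": 20 }
    },
    "widgets": [
      {
        "type": "label",
        "style_ref": "title",
        "text": "Temp: {{data.temperature}} C"
      },
      {
        "type": "button",
        "text": "Refresh",
        "on_click": { "action_id": "refresh_sensor" }
      }
    ]
  }
}
```

A configuration of this shape can also be pasted directly into the web simulator described below to inspect how each field is interpreted.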
To rapidly validate the feasibility of this architecture and intuitively evaluate the expressiveness of the Schema, we have built a purely front-end UI preview tool and deployed it on Cloudflare Workers.
- Access URL: https://hmi-dev.w0x7ce.eu/
This tool includes a JavaScript implementation of the JSON rendering engine that can:
- Accept JSON configuration text input from the user.
- Parse the JSON in real time and render the corresponding UI in a simulated device screen area (320×240).
- Provide immediate feedback on JSON formatting errors or rendering issues.
This enables developers to quickly test and debug the UI Schema without requiring physical hardware, significantly accelerating the design iteration process.
- Visit the simulator link provided above.
- Copy the content of any JSON file from the `examples/` directory into the "JSON Config Input" text area on the left.
- Click the "Render/Refresh Preview" button.
- Observe the rendered output in the "UI Preview" area on the right.
- Increased Development Efficiency: Reduces hours or even days of embedded UI coding to minutes of describing and refining in natural language.
- Lowered Technical Barrier: Enables team members without deep embedded programming or GUI library experience, such as product managers and UI/UX designers, to participate in HMI construction.
- Guaranteed System Controllability: The Schema acts as a "firewall," ensuring that all UI rendering occurs within a pre-defined, safe, and controllable scope.
- Promotes Separation of Concerns (SoC): The UI's visual presentation is completely decoupled from the device's underlying business logic, allowing both parts to be developed, tested, and iterated on independently, thus improving code maintainability.
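As one sketch of what this separation can look like in firmware, the UI configuration only ever references actions by identifier, while the handlers themselves live entirely in device code (all names below are hypothetical):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical action library: the firmware registers named handlers once;
 * JSON configurations may only reference them by action_id. */
typedef void (*action_fn)(void);

typedef struct {
    const char *action_id;
    action_fn handler;
} action_entry_t;

static void refresh_sensor(void)   { /* device-specific logic */ }
static void toggle_backlight(void) { /* device-specific logic */ }

static const action_entry_t action_library[] = {
    { "refresh_sensor",   refresh_sensor },
    { "toggle_backlight", toggle_backlight },
};

/* Dispatch a UI event to the firmware action it was bound to.
 * Unknown identifiers are silently ignored, so a configuration can
 * never invoke anything outside the pre-defined library. */
void dispatch_action(const char *action_id)
{
    for (size_t i = 0; i < sizeof(action_library) / sizeof(action_library[0]); i++) {
        if (strcmp(action_library[i].action_id, action_id) == 0) {
            action_library[i].handler();
            return;
        }
    }
}
```

Because the table is fixed at compile time, UI iterations on the host can proceed without touching, rebuilding, or re-flashing the device firmware.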
The LLM-driven declarative UI generation architecture proposed in this study presents a promising new paradigm for HMI development in embedded systems. Through the proof-of-concept web simulator, we have demonstrated the technical feasibility and expressive power of this approach.
Future work will focus on:
- Implementing an efficient, low-memory-footprint JSON parsing and rendering engine on a real embedded platform (ESP32 + LVGL).
- Fine-tuning an LLM specifically for this task to improve its accuracy in generating high-quality, complex UI configurations.
- Investigating binary alternatives to JSON (e.g., CBOR, MessagePack) to optimize transmission efficiency and parsing performance.