feat(core): Initialize v0.1 engineering foundation #9

Merged
DXL-0702 merged 7 commits into main from develop
Jun 22, 2026

Conversation

DXL-0702 (Owner) commented Jun 22, 2026

close #1 , close #2

Summary by CodeRabbit

  • New Features

    • Added a /health endpoint that returns public service status, version, environment, and whether core dependencies are configured (sensitive fields omitted).
    • Added a web UI that polls /health and displays backend status, service name, and version.
    • Added a basic worker “ping” task.
  • Documentation

    • Updated local dev setup/run commands for backend and worker.
    • Refreshed documentation navigation by removing outdated issue links.
  • Chores

    • Added environment templates and expanded ignore rules.
    • Added/updated tooling and dependency configurations (linting, typing, tests).

coderabbitai Bot commented Jun 22, 2026

Review Change Stack

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 591fe0fb-a25c-4e7c-95a8-59e55ca4010e

📥 Commits

Reviewing files that changed from the base of the PR and between e2d5e66 and db7e56f.

📒 Files selected for processing (12)
  • requirements.txt
  • server/api/health.py
  • server/app.py
  • server/config.py
  • tests/__init__.py
  • tests/conftest.py
  • tests/settings_helpers.py
  • tests/test_config.py
  • tests/test_health.py
  • tests/test_worker_app.py
  • web/src/App.tsx
  • web/src/styles.css

📝 Walkthrough


This PR bootstraps the AgentClef v0.1 monorepo from scratch. It adds Python project tooling (pyproject.toml, requirements.txt, requirements-dev.txt), a Pydantic-based settings module with environment variable loading and validation, a FastAPI application factory with a GET /health endpoint, a Celery worker with a ping task, a React/Vite/TypeScript frontend scaffold with ESLint tooling, and a health-polling React UI displaying service status. Documentation links to a removed v0.1/issues.md file are also updated.

Changes

AgentClef v0.1 Scaffold, Backend Health, and Frontend

Layer / File(s) Summary
Project tooling, environment, and package stubs
pyproject.toml, requirements.txt, requirements-dev.txt, .gitignore, .env.example, README.md, README.zh-CN.md, docs/README.md, server/__init__.py, server/api/__init__.py, server/domain/__init__.py, server/models/__init__.py, server/schemas/__init__.py, shared/__init__.py, worker/__init__.py, worker/pipeline/__init__.py, worker/tasks/__init__.py, web/.env.example, tests/__init__.py
Adds pytest/ruff/mypy configuration, pinned dependencies (FastAPI, Uvicorn, Pydantic, SQLAlchemy, Alembic, psycopg, Redis, Celery, python-multipart, pytest, httpx, ruff, mypy), expanded .gitignore for environment/build/cache artifacts, .env.example with baseline app metadata and service configuration, and root-relative dev command updates in both READMEs. Removes v0.1/issues.md references throughout documentation. Package stubs include module-level docstrings.
Backend Settings class, validation, and test helpers
server/config.py, tests/test_config.py, tests/settings_helpers.py, tests/conftest.py
Introduces Settings (Pydantic BaseSettings) loading AGENTCLEF_-prefixed environment variables with custom EnvSettingsSource and DotEnvSettingsSource subclasses parsing cors_origins from comma-separated or JSON-array strings. Field validators enforce required string trimming/non-empty, environment set membership, CORS origin trimming/no-wildcard-with-credentials, positive upload limits, and LLM API key requirement when provider is enabled. get_settings() wrapped with lru_cache for singleton access. Test helpers (clear_agentclef_env, make_settings, make_settings_from_env) isolate Settings construction. Tests assert defaults, field validation, environment rejection, upload limits, CORS parsing variants, and llm_api_key enforcement. Pytest fixture clears settings cache before/after each test.
FastAPI app factory and /health endpoint
server/app.py, server/main.py, server/api/health.py, tests/test_health.py
create_app() in server/app.py builds FastAPI with app metadata from settings, stores settings in app.state, installs CORS middleware, and mounts health router. server/main.py reduces to simple bootstrap calling create_app() at import. server/api/health.py defines UploadLimits and HealthResponse Pydantic models and GET /health handler deriving dependency *_configured booleans from settings fields (postgres_dsn, redis_url, file_storage_path truthiness) and returning service metadata, upload limits, llm_provider (excluding sensitive fields). Integration test verifies HTTP 200 with full public JSON schema and absent sensitive data.
Celery worker entrypoint and configuration
worker/app.py, tests/test_worker_app.py
create_celery_app(settings: Settings | None = None) loads settings (from parameter or get_settings()), wires broker_url and result_backend to same redis_url, enables task_track_started, and includes "worker.tasks" in Celery configuration. Module-level celery_app instantiated and agentclef.ping task registered returning {"status": "ok"}. Tests verify redis URL propagation and task module inclusion.
Frontend Vite/React/TypeScript and build tooling scaffold
web/package.json, web/tsconfig.json, web/tsconfig.node.json, web/vite.config.ts, web/eslint.config.js, web/index.html, web/src/vite-env.d.ts
Defines project metadata and npm scripts (dev/build/preview/test/typecheck/lint) in package.json. Declares runtime dependencies (React 19, TanStack Query 5, Zustand 5, wavesurfer.js 7) and dev dependencies (Vite React plugin, TypeScript 5, ESLint with React plugins, Vitest 2). TypeScript configuration enables strict mode, React JSX transform, ES2020/ESNext targeting, and project compositing. Vite configuration adds React plugin and sets dev server port 5173. ESLint flat configuration applies JS/TypeScript/React presets and React Hooks/Refresh plugins. HTML entrypoint includes charset/viewport metadata, #root mount point, and /src/main.tsx module script.
Frontend health query and status card UI
web/src/lib/api.ts, web/src/App.tsx, web/src/main.tsx, web/src/styles.css
web/src/lib/api.ts exports HealthResponse type and fetchHealth() function deriving API_BASE_URL from VITE_API_BASE_URL env var (default localhost:8000), using AbortController with 7,500ms timeout, and returning parsed JSON from GET /health. web/src/App.tsx polls health via useQuery with 10-second refetch interval, rendering Backend status (Checking/Offline/Online), Service, and Version cards with "-" fallbacks. web/src/main.tsx mounts App under React.StrictMode and QueryClientProvider. web/src/styles.css provides global base styling, .app-shell centered grid, .hero-panel border/background, typography (.eyebrow, h1, .lede), .status-grid 3-column CSS grid, .status-card styles, and responsive collapse to 1 column at max-width: 720px.
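As a rough illustration of the /health row above, the following sketch derives the *_configured flags from the truthiness of settings fields and leaves sensitive fields out of the payload. It uses a plain dataclass rather than the PR's Pydantic models. `DemoSettings`, `build_health_payload`, the `app_version` value, and the `upload_limits` key names are hypothetical; the actual response shape in server/api/health.py may differ.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class DemoSettings:
    """Hypothetical stand-in for the PR's Settings class."""

    app_name: str = "AgentClef"
    app_version: str = "0.1.0"  # assumed value, for illustration only
    environment: str = "development"
    postgres_dsn: str = "postgresql+psycopg://agentclef:agentclef@localhost:5432/agentclef"
    redis_url: str = "redis://localhost:6379/0"
    file_storage_path: str = "./storage"
    upload_max_mb: int = 50
    upload_max_seconds: int = 300
    llm_provider: str = "disabled"
    llm_api_key: Optional[str] = None


def build_health_payload(settings: DemoSettings) -> dict[str, Any]:
    """Return only public fields; DSNs, paths, and API keys stay out."""
    return {
        "status": "ok",
        "service": settings.app_name,
        "version": settings.app_version,
        "environment": settings.environment,
        # Each flag reports whether a value is set, not whether the service is reachable.
        "postgres_configured": bool(settings.postgres_dsn),
        "redis_configured": bool(settings.redis_url),
        "file_storage_configured": bool(settings.file_storage_path),
        # Key names inside upload_limits are assumptions.
        "upload_limits": {
            "max_mb": settings.upload_max_mb,
            "max_seconds": settings.upload_max_seconds,
        },
        "llm_provider": settings.llm_provider,
    }
```

Because the flags only test for a non-empty value, a passing response says the service is configured, not that PostgreSQL or Redis is actually reachable.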

Sequence Diagram(s)

sequenceDiagram
  rect rgba(144, 238, 144, 0.5)
    Note over Browser,get_settings: Full health polling cycle
  end
  participant Browser
  participant App as App (React)
  participant useQuery as useQuery<br/>(React Query)
  participant fetchHealth as fetchHealth()
  participant FastAPI as FastAPI<br/>/health
  participant get_settings as get_settings()
  
  Browser->>App: mount
  App->>useQuery: init (queryKey=["health"],<br/>refetchInterval=10s)
  useQuery->>fetchHealth: invoke
  fetchHealth->>FastAPI: GET /health<br/>(timeout: 7.5s)
  FastAPI->>get_settings: load Settings
  get_settings-->>FastAPI: Settings (cached)
  FastAPI-->>fetchHealth: HealthResponse JSON
  fetchHealth-->>useQuery: HealthResponse
  useQuery-->>App: data + isLoading/isError
  App-->>Browser: Backend/Service/Version<br/>status cards
  Note over Browser,get_settings: refetch every 10 seconds
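The cycle in the diagram can be approximated outside the browser. The sketch below is a Python analogue, not the TypeScript in web/src/lib/api.ts: it issues GET /health with a 7.5-second timeout and maps the outcome to the Online/Offline labels and "-" fallbacks described in the walkthrough. `fetch_health`, `check_backend`, and the injectable `fetch` parameter are hypothetical names, and the default base URL is assumed from the walkthrough.

```python
import json
import urllib.request
from typing import Any, Callable

HEALTH_TIMEOUT_SECONDS = 7.5  # mirrors the 7,500 ms timeout in the walkthrough


def fetch_health(base_url: str = "http://localhost:8000") -> dict[str, Any]:
    """GET /health and parse the JSON body; raises on timeout or HTTP error."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=HEALTH_TIMEOUT_SECONDS) as response:
        return json.loads(response.read().decode("utf-8"))


def check_backend(fetch: Callable[[], dict[str, Any]] = fetch_health) -> dict[str, str]:
    """Map one health request to the three values shown on the status cards."""
    try:
        data = fetch()
    except (OSError, ValueError):
        # URLError, HTTPError, and socket timeouts are OSError subclasses;
        # a malformed JSON body raises a ValueError subclass.
        return {"backend": "Offline", "service": "-", "version": "-"}
    return {
        "backend": "Online",
        "service": str(data.get("service", "-")),
        "version": str(data.get("version", "-")),
    }
```

Calling `check_backend()` in a loop with a 10-second sleep would reproduce the refetch interval; passing a fake `fetch` keeps the logic testable without a running server.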

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐇 A scaffold from nothing, a burrow brand new,
/health checks the warren — all systems come through.
The Celery ping hops, the React cards glow,
Settings from .env, with defaults below.
The rabbit has planted the seeds of v0.1 — watch it grow! 🌱

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | Title is vague and uses generic terminology ('初始化 v0.1 工程基础' = 'Initialize v0.1 engineering foundation') that does not clearly convey the specific engineering work being performed. | Consider a more descriptive title such as 'feat: Set up core infrastructure for v0.1 (backend, frontend, config, health)' or similar to better reflect the multi-component initialization effort. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Linked Issues check | ✅ Passed | All acceptance criteria from issues #1 and #2 are met: backend directory structure and app factory created, frontend React+TypeScript+Vite project initialized, configuration loading system implemented with settings validation, GET /health endpoint implemented with proper response model, comprehensive test coverage added for config and health functionality. |
| Out of Scope Changes check | ✅ Passed | All changes align with the scope defined in linked issues #1 and #2. The PR establishes project structure, configuration management, and health check capabilities without implementing business APIs, model integration, or production deployment. |


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request establishes the initial project structure for AgentClef, setting up a FastAPI backend, a Celery worker, and a React/Vite frontend. Key feedback focuses on improving test isolation by preventing local environment files from loading during tests, adding validation to prevent CORS wildcard crashes when credentials are enabled, ensuring Celery auto-discovers tasks, and removing a documentation file from the gitignore to prevent broken links.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread tests/test_health.py
Comment on lines +1 to +19
from fastapi.testclient import TestClient

from server.main import app


def test_health_check_returns_public_runtime_state() -> None:
    client = TestClient(app)

    response = client.get("/health")

    assert response.status_code == 200
    payload = response.json()
    assert payload["status"] == "ok"
    assert payload["service"] == "AgentClef"
    assert payload["postgres_configured"] is True
    assert payload["redis_configured"] is True
    assert payload["file_storage_configured"] is True
    assert "file_storage_path" not in payload
    assert "llm_api_key" not in payload


high

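The test imports the global app instance directly, so running it loads the local .env file. If the local environment overrides a value (for example AGENTCLEF_APP_NAME), the test fails. To keep the test isolated, mock get_settings and build a fresh app instance inside the test so that it no longer depends on the local environment.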

from unittest.mock import patch
from fastapi.testclient import TestClient

from server.config import Settings
from server.main import create_app


def test_health_check_returns_public_runtime_state() -> None:
    default_settings = Settings(_env_file=None)
    with patch("server.main.get_settings", return_value=default_settings), \
         patch("server.api.health.get_settings", return_value=default_settings):
        app = create_app()
        client = TestClient(app)
        response = client.get("/health")

        assert response.status_code == 200
        payload = response.json()
        assert payload["status"] == "ok"
        assert payload["service"] == "AgentClef"
        assert payload["postgres_configured"] is True
        assert payload["redis_configured"] is True
        assert payload["file_storage_configured"] is True
        assert "file_storage_path" not in payload
        assert "llm_api_key" not in payload
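
A related isolation hazard is the lru_cache around get_settings() described in the walkthrough: once a value is cached, later environment changes stay invisible until the cache is cleared. The stdlib-only sketch below shows the effect. `get_app_name` is a hypothetical stand-in for get_settings(), and it reads os.environ directly rather than going through pydantic-settings.

```python
import os
from functools import lru_cache


@lru_cache
def get_app_name() -> str:
    """Hypothetical cached accessor; the real get_settings() returns a Settings object."""
    return os.environ.get("AGENTCLEF_APP_NAME", "AgentClef")


def demo() -> tuple[str, str, str]:
    """Return (value before override, stale cached value, value after cache_clear)."""
    saved = os.environ.pop("AGENTCLEF_APP_NAME", None)
    get_app_name.cache_clear()
    try:
        first = get_app_name()
        os.environ["AGENTCLEF_APP_NAME"] = "Overridden"
        stale = get_app_name()  # still served from the cache
        get_app_name.cache_clear()
        fresh = get_app_name()  # re-reads the environment
        return first, stale, fresh
    finally:
        # Restore the caller's environment and leave the cache empty.
        os.environ.pop("AGENTCLEF_APP_NAME", None)
        if saved is not None:
            os.environ["AGENTCLEF_APP_NAME"] = saved
        get_app_name.cache_clear()
```

This is why a fixture that calls cache_clear() before and after each test, as the walkthrough describes, keeps one test's environment from leaking into the next.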

Comment thread .gitignore
Comment on lines 4 to +5
# Local execution issues
/docs/v0.1/issues.md
docs/v0.1/issues.md


medium

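README.md links to docs/v0.1/issues.md in several places (for example the Status badge and the documentation links). However, .gitignore ignores that file, so it cannot be committed to the Git repository, and the README.md links on GitHub will return 404. If the file is part of the public documentation, remove it from .gitignore.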

Comment thread server/config.py
Comment on lines +30 to +37
    @field_validator("environment")
    @classmethod
    def validate_environment(cls, value: str) -> str:
        allowed = {"development", "test", "staging", "production"}
        normalized = value.lower()
        if normalized not in allowed:
            raise ValueError(f"environment must be one of: {', '.join(sorted(allowed))}")
        return normalized


medium

Because server/main.py configures CORSMiddleware with allow_credentials=True, a wildcard * in cors_origins would make FastAPI/Starlette raise a RuntimeError at runtime and crash the service. Add a validator for cors_origins to Settings that rejects *, which makes the system more robust.

Suggested change
-    @field_validator("environment")
-    @classmethod
-    def validate_environment(cls, value: str) -> str:
-        allowed = {"development", "test", "staging", "production"}
-        normalized = value.lower()
-        if normalized not in allowed:
-            raise ValueError(f"environment must be one of: {', '.join(sorted(allowed))}")
-        return normalized
+    @field_validator("environment")
+    @classmethod
+    def validate_environment(cls, value: str) -> str:
+        allowed = {"development", "test", "staging", "production"}
+        normalized = value.lower()
+        if normalized not in allowed:
+            raise ValueError(f"environment must be one of: {', '.join(sorted(allowed))}")
+        return normalized
+
+    @field_validator("cors_origins")
+    @classmethod
+    def validate_cors_origins(cls, value: list[str]) -> list[str]:
+        if "*" in value:
+            raise ValueError("Cannot use wildcard '*' in cors_origins when allow_credentials is True")
+        return value

Comment thread tests/test_config.py Outdated
Comment on lines +4 to +11
def test_settings_defaults_are_valid() -> None:
    settings = Settings()

    assert settings.app_name == "AgentClef"
    assert settings.api_port == 8000
    assert settings.environment == "development"
    assert settings.upload_max_mb > 0
    assert settings.upload_max_seconds > 0


medium

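Instantiating Settings() directly in a test loads the local .env file by default (if one exists), so the result depends on the local environment and the test becomes non-deterministic. Pass _env_file=None explicitly so that the test uses the defaults defined in code.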

Suggested change
-def test_settings_defaults_are_valid() -> None:
-    settings = Settings()
-    assert settings.app_name == "AgentClef"
-    assert settings.api_port == 8000
-    assert settings.environment == "development"
-    assert settings.upload_max_mb > 0
-    assert settings.upload_max_seconds > 0
+def test_settings_defaults_are_valid() -> None:
+    settings = Settings(_env_file=None)
+    assert settings.app_name == "AgentClef"
+    assert settings.api_port == 8000
+    assert settings.environment == "development"
+    assert settings.upload_max_mb > 0
+    assert settings.upload_max_seconds > 0

Comment thread worker/app.py Outdated
Comment on lines +6 to +10
def create_celery_app() -> Celery:
    settings = get_settings()
    app = Celery("agentclef_worker", broker=settings.redis_url, backend=settings.redis_url)
    app.conf.update(task_track_started=True)
    return app


medium

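When initializing the Celery instance, configure the include parameter explicitly (for example include=['worker.tasks']) so that Celery can automatically discover and register tasks defined in other modules. The repository already has a worker/tasks/ directory; without this setting, tasks written under it will not be recognized or loaded by Celery.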

Suggested change
-def create_celery_app() -> Celery:
-    settings = get_settings()
-    app = Celery("agentclef_worker", broker=settings.redis_url, backend=settings.redis_url)
-    app.conf.update(task_track_started=True)
-    return app
+def create_celery_app() -> Celery:
+    settings = get_settings()
+    app = Celery(
+        "agentclef_worker",
+        broker=settings.redis_url,
+        backend=settings.redis_url,
+        include=["worker.tasks"],
+    )
+    app.conf.update(task_track_started=True)
+    return app

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 9

🧹 Nitpick comments (1)
requirements-dev.txt (1)

1-14: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Separate runtime and development dependencies.

The current approach lists all dependencies (runtime and dev) in a single requirements-dev.txt file. Standard Python practice is to separate them:

  • Define runtime dependencies in requirements.txt (or pyproject.toml [project] dependencies).
  • Have requirements-dev.txt include the runtime dependencies and add only dev/test tools (pytest, ruff, mypy, httpx).

This improves clarity for maintainers and simplifies CI/production deployments.

📋 Example refactor: Split into two files

requirements.txt (runtime only):

fastapi>=0.115.0
uvicorn[standard]>=0.30.0
pydantic>=2.8.0
pydantic-settings>=2.4.0
sqlalchemy>=2.0.0
alembic>=1.13.0
psycopg[binary]>=3.2.0
redis>=5.0.0
celery>=5.4.0
python-multipart>=0.0.9

requirements-dev.txt (dev + test):

-r requirements.txt
pytest>=8.0.0
httpx>=0.27.0
ruff>=0.6.0
mypy>=1.10.0

Alternatively, define [project] dependencies in pyproject.toml and use [project.optional-dependencies] dev = [...] for the dev-only tools.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@requirements-dev.txt` around lines 1 - 14, Create two separate requirements
files to follow Python best practices: create a new requirements.txt file
containing only the runtime dependencies (fastapi, uvicorn, pydantic,
pydantic-settings, sqlalchemy, alembic, psycopg, redis, celery,
python-multipart) and update requirements-dev.txt to include a reference to
requirements.txt using -r requirements.txt at the top, followed only by the
development and testing tools (pytest, httpx, ruff, mypy). This separation
improves clarity for maintainers and simplifies production deployments by
clearly distinguishing which packages are needed at runtime versus only during
development.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@server/config.py`:
- Around line 18-29: Add validation to the configuration class to prevent
invalid runtime states. For the string fields postgres_dsn, redis_url, and
file_storage_path, add constraints using Pydantic's field validators to reject
empty or whitespace-only values. For llm_provider, add an enum constraint or
validator to only accept predefined valid provider values instead of arbitrary
strings. Additionally, add a model-level validator that ensures llm_api_key is
provided and non-empty when llm_provider is set to anything other than
"disabled", preventing the configuration from passing health checks with
incomplete LLM setup.

In `@server/main.py`:
- Around line 13-16: The CORS configuration in the diff allows an invalid
combination of wildcard origins with credentials enabled, which violates the W3C
CORS specification. Add a field validator to the Settings class in
server/config.py that checks the cors_origins field and rejects any
configuration containing the wildcard "*" when credentials are required. This
validator should run before the CORSMiddleware configuration is applied,
providing clear validation error messaging during the settings initialization
phase rather than relying on FastAPI's runtime ValueError.

In `@tests/test_config.py`:
- Around line 4-11: The test_settings_defaults_are_valid function only validates
the happy path where default settings are correct. Add additional test cases to
verify that the Settings class properly validates and rejects invalid
configurations by testing scenarios with missing required fields, invalid values
(such as negative upload_max_mb or upload_max_seconds), and invalid environment
values. Each negative test should assert that the Settings constructor or
validation raises appropriate exceptions with clear error messages that help
developers understand what configuration is invalid and why.

In `@tests/test_health.py`:
- Around line 3-19: The test_health_check_returns_public_runtime_state function
relies on a prebuilt global app and hardcoded service name, which makes it
brittle when environment overrides are present. Additionally, the test does not
assert the complete response contract - it's missing assertions for fields that
are consumed by the web client (version, environment, upload_limits, and
llm_provider). To fix this, refactor the test to set up the app state locally
within the test rather than relying on global state to ensure deterministic
behavior, and add assertions for all missing response fields (version,
environment, upload_limits, llm_provider) to fully validate the health check
endpoint's contract.

In `@web/package.json`:
- Around line 10-12: The lint script is currently a duplicate of the typecheck
script, both running tsc --noEmit for type checking only, which does not provide
actual linting coverage. Replace the lint script with a real linter command such
as ESLint (e.g., eslint . or eslint src/) to provide proper code linting, while
keeping the typecheck script unchanged for TypeScript type checking purposes.
- Around line 25-27: The `@vitejs/plugin-react` package at version ^4.0.0 is not
compatible with vite@^6.0.0. Versions 4.0.0-4.3.3 only support vite ^4.2.0 or
^5.0.0, while version 4.3.4 and later support vite 6. In the package.json file,
update the `@vitejs/plugin-react` dependency to at least ^4.3.4 to ensure
compatibility with vite@^6.0.0 and avoid peer dependency conflicts.

In `@web/src/App.tsx`:
- Around line 6-10: The useQuery hook in App.tsx is missing polling behavior. To
fix this, add the refetchInterval option to the useQuery configuration object to
specify how often the health check should be executed (in milliseconds), and
also add refetchIntervalInBackground set to true to ensure polling continues
even when the application window is not in focus. These options should be added
alongside the existing queryKey, queryFn, and retry properties.

In `@web/src/lib/api.ts`:
- Around line 18-23: The fetchHealth function lacks timeout protection, which
can cause requests to hang indefinitely during network issues. Implement an
AbortController-based timeout mechanism by creating an AbortController instance
before the fetch call, setting up a timeout that calls abort() on the controller
after a reasonable duration (such as 5-10 seconds), passing the controller's
signal to the fetch request options, and ensuring the timeout is cleaned up
after the response is received (in both success and error paths) to prevent
memory leaks.

In `@web/src/styles.css`:
- Line 8: The text-rendering property value has mixed-case keyword
`optimizeLegibility` which violates the stylelint `value-keyword-case` rule.
Locate the text-rendering property in the styles.css file and change the keyword
from `optimizeLegibility` to all lowercase `optimizelegibility` to satisfy the
configured linting rules.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 95c0fbc2-49c8-48e8-8e24-e056114c1e31

📥 Commits

Reviewing files that changed from the base of the PR and between a7c9137 and b28ad7d.

⛔ Files ignored due to path filters (1)
  • web/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (32)
  • .env.example
  • .gitignore
  • README.md
  • README.zh-CN.md
  • pyproject.toml
  • requirements-dev.txt
  • server/__init__.py
  • server/api/__init__.py
  • server/api/health.py
  • server/config.py
  • server/domain/__init__.py
  • server/main.py
  • server/models/__init__.py
  • server/schemas/__init__.py
  • shared/__init__.py
  • tests/test_config.py
  • tests/test_health.py
  • web/.env.example
  • web/index.html
  • web/package.json
  • web/src/App.tsx
  • web/src/lib/api.ts
  • web/src/main.tsx
  • web/src/styles.css
  • web/src/vite-env.d.ts
  • web/tsconfig.json
  • web/tsconfig.node.json
  • web/vite.config.ts
  • worker/__init__.py
  • worker/app.py
  • worker/pipeline/__init__.py
  • worker/tasks/__init__.py

Comment thread server/config.py
Comment on lines +18 to +29
postgres_dsn: str = "postgresql+psycopg://agentclef:agentclef@localhost:5432/agentclef"
redis_url: str = "redis://localhost:6379/0"
file_storage_path: str = "./storage"
cors_origins: list[str] = [
"http://localhost:5173",
"http://127.0.0.1:5173",
]
upload_max_mb: int = Field(default=50, gt=0)
upload_max_seconds: int = Field(default=300, gt=0)
llm_provider: str = "disabled"
llm_api_key: str | None = None


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

git ls-files | grep -i "config" | head -20

Repository: DXL-0702/AgentClef

Length of output: 160


🏁 Script executed:

find . -name "config.py" -type f

Repository: DXL-0702/AgentClef

Length of output: 81


🏁 Script executed:

cat server/config.py 2>/dev/null | head -50

Repository: DXL-0702/AgentClef

Length of output: 1408


🏁 Script executed:

rg "llm_provider|llm_api_key" --type py -A 3 -B 1 | head -50

Repository: DXL-0702/AgentClef

Length of output: 819


🏁 Script executed:

cat pyproject.toml 2>/dev/null || cat requirements.txt 2>/dev/null || find . -name "*.txt" -o -name "pyproject.toml" | grep -E "(requirements|pyproject)" | head -5

Repository: DXL-0702/AgentClef

Length of output: 260


🏁 Script executed:

rg "def.*health|@.*health" --type py -A 5 | head -40

Repository: DXL-0702/AgentClef

Length of output: 750


🏁 Script executed:

cat server/api/health.py

Repository: DXL-0702/AgentClef

Length of output: 1175


🏁 Script executed:

rg "postgres_dsn|redis_url" --type py -B 2 -A 2 | head -50

Repository: DXL-0702/AgentClef

Length of output: 1179


🏁 Script executed:

find . -name "pyproject.toml" -o -name "setup.py" | xargs cat 2>/dev/null | grep -A 20 "\[project\]" | head -30

Repository: DXL-0702/AgentClef

Length of output: 44


🏁 Script executed:

cat pyproject.toml

Repository: DXL-0702/AgentClef

Length of output: 260


🏁 Script executed:

rg "from pydantic|import pydantic" server/config.py -A 1

Repository: DXL-0702/AgentClef

Length of output: 170


🏁 Script executed:

rg "postgres_dsn|redis_url|file_storage_path" --type py -B 1 -A 5 | grep -v test | head -80

Repository: DXL-0702/AgentClef

Length of output: 1521


Add validation for critical configuration fields to prevent invalid runtime states.

The configuration lacks validation for postgres_dsn, redis_url, file_storage_path, and llm_provider. These allow whitespace-only or invalid values to pass the health check's bool() tests, resulting in misleading health responses. Additionally, llm_provider accepts arbitrary strings without validation against allowed values, and llm_api_key is not enforced as required when the provider is enabled. These gaps can cause confusing runtime errors or hide configuration problems behind a passing health check.
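
The truthiness gap is easy to reproduce in isolation. In this minimal sketch, `configured_by_truthiness` mimics a plain bool() check and `configured_after_trim` shows what trimming first would report; both function names are hypothetical and neither is taken from the PR.

```python
from typing import Optional


def configured_by_truthiness(value: Optional[str]) -> bool:
    """Plain bool() check: any non-empty string counts as configured."""
    return bool(value)


def configured_after_trim(value: Optional[str]) -> bool:
    """Stricter check: whitespace-only strings count as not configured."""
    return bool(value and value.strip())
```

A whitespace-only DSN such as "   " passes the first check and fails the second, which is the misleading health response the comment describes.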

Proposed fix
-from pydantic import Field, field_validator
+from pydantic import Field, field_validator, model_validator

 class Settings(BaseSettings):
     # ... existing fields ...
 
+    @field_validator("postgres_dsn", "redis_url", "file_storage_path", mode="before")
+    @classmethod
+    def validate_non_empty(cls, value: str) -> str:
+        normalized = value.strip()
+        if not normalized:
+            raise ValueError("must not be empty")
+        return normalized
+
+    @field_validator("llm_provider")
+    @classmethod
+    def validate_llm_provider(cls, value: str) -> str:
+        allowed = {"disabled", "openai", "anthropic"}
+        normalized = value.strip().lower()
+        if normalized not in allowed:
+            raise ValueError(f"llm_provider must be one of: {', '.join(sorted(allowed))}")
+        return normalized
+
+    @model_validator(mode="after")
+    def validate_llm_key_requirement(self) -> "Settings":
+        if self.llm_provider != "disabled" and not (self.llm_api_key or "").strip():
+            raise ValueError("llm_api_key is required when llm_provider is enabled")
+        return self

Comment thread server/main.py Outdated
Comment thread tests/test_config.py Outdated
Comment thread tests/test_health.py Outdated
Comment thread web/package.json Outdated
Comment thread web/package.json Outdated
Comment thread web/src/App.tsx
Comment thread web/src/lib/api.ts Outdated
Comment thread web/src/styles.css Outdated
@DXL-0702

Owner Author

@gemini-code-assist review

@DXL-0702

Owner Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 22, 2026

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request establishes the foundational architecture for AgentClef, introducing a FastAPI backend, a Celery worker, and a React frontend that displays service health. The review feedback suggests several key improvements: enhancing the robustness of CORS origin parsing in the configuration to support comma-separated strings, allowing settings injection in the Celery app factory to improve testability, using environment-safe global timers in the frontend API client to prevent SSR issues, and disabling background query polling to optimize client and server resources.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread server/config.py
Comment on lines +49 to +57
    @field_validator("cors_origins")
    @classmethod
    def validate_cors_origins(cls, value: list[str]) -> list[str]:
        origins = [origin.strip() for origin in value]
        if any(not origin for origin in origins):
            raise ValueError("cors_origins must not contain empty values")
        if "*" in origins:
            raise ValueError("cors_origins must not contain '*' when credentials are enabled")
        return origins

medium

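For CORS origins, cors_origins is declared as list[str] and processed in validate_cors_origins. In real deployments (Docker, Kubernetes, and similar environments), environment variables are usually passed as comma-separated strings (for example AGENTCLEF_CORS_ORIGINS="http://localhost:5173,http://127.0.0.1:5173") rather than as a standard JSON array.

To make the configuration more robust, consider adding a before validator that accepts both the JSON array format and the comma-separated string format.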

    @field_validator("cors_origins", mode="before")
    @classmethod
    def parse_cors_origins(cls, value: Any) -> list[str]:
        if isinstance(value, str):
            value = value.strip()
            if value.startswith("[") and value.endswith("]"):
                try:
                    import json
                    return json.loads(value)
                except Exception:
                    pass
            return [item.strip() for item in value.split(",") if item.strip()]
        return value

    @field_validator("cors_origins")
    @classmethod
    def validate_cors_origins(cls, value: list[str]) -> list[str]:
        origins = [origin.strip() for origin in value]
        if any(not origin for origin in origins):
            raise ValueError("cors_origins must not contain empty values")
        if "*" in origins:
            raise ValueError("cors_origins must not contain '*' when credentials are enabled")
        return origins

Comment thread worker/app.py
Comment on lines +1 to +15
from celery import Celery

from server.config import get_settings


def create_celery_app() -> Celery:
    settings = get_settings()
    app = Celery(
        "agentclef_worker",
        broker=settings.redis_url,
        backend=settings.redis_url,
        include=["worker.tasks"],
    )
    app.conf.update(task_track_started=True)
    return app

medium

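create_celery_app currently calls get_settings() directly to obtain the global configuration. This is inflexible in tests or in scenarios that need to adjust configuration dynamically (for example, pointing at a different Redis test instance).

Consider accepting settings as a parameter, as create_app in server/app.py does, to improve testability and flexibility.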

Suggested change
-from celery import Celery
-from server.config import get_settings
-def create_celery_app() -> Celery:
-    settings = get_settings()
-    app = Celery(
-        "agentclef_worker",
-        broker=settings.redis_url,
-        backend=settings.redis_url,
-        include=["worker.tasks"],
-    )
-    app.conf.update(task_track_started=True)
-    return app
+from celery import Celery
+from server.config import Settings, get_settings
+def create_celery_app(settings: Settings | None = None) -> Celery:
+    runtime_settings = settings or get_settings()
+    app = Celery(
+        "agentclef_worker",
+        broker=runtime_settings.redis_url,
+        backend=runtime_settings.redis_url,
+        include=["worker.tasks"],
+    )
+    app.conf.update(task_track_started=True)
+    return app
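
The injection pattern in this suggestion can be sketched without Celery installed. This is only an illustration under stated assumptions: FakeWorkerApp and create_worker_app are hypothetical stand-ins for celery.Celery and create_celery_app, and the Settings dataclass and its default URL are placeholders rather than the project's real configuration class.

```python
from __future__ import annotations

from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Settings:
    # Placeholder for server.config.Settings; only the field used here.
    redis_url: str = "redis://localhost:6379/0"


@lru_cache
def get_settings() -> Settings:
    # Cached global settings, used only when nothing is injected.
    return Settings()


@dataclass
class FakeWorkerApp:
    # Hypothetical stand-in for celery.Celery so the sketch runs anywhere.
    broker: str
    backend: str


def create_worker_app(settings: Settings | None = None) -> FakeWorkerApp:
    # Prefer injected settings; fall back to the cached global ones.
    runtime_settings = settings or get_settings()
    return FakeWorkerApp(
        broker=runtime_settings.redis_url,
        backend=runtime_settings.redis_url,
    )


# A test can point the worker at a separate Redis database without touching
# the environment or the cache.
test_app = create_worker_app(Settings(redis_url="redis://localhost:6379/15"))
default_app = create_worker_app()
```

The optional parameter keeps production call sites unchanged while letting tests supply their own settings object.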

Comment thread web/src/lib/api.ts
Comment on lines +19 to +36
export async function fetchHealth(): Promise<HealthResponse> {
  const controller = new AbortController();
  const timeoutId = window.setTimeout(() => {
    controller.abort();
  }, HEALTH_TIMEOUT_MS);

  try {
    const response = await fetch(`${API_BASE_URL}/health`, {
      signal: controller.signal,
    });
    if (!response.ok) {
      throw new Error(`Health check failed: ${response.status}`);
    }
    return response.json() as Promise<HealthResponse>;
  } finally {
    window.clearTimeout(timeoutId);
  }
}

medium

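Using window.setTimeout and window.clearTimeout in a non-browser environment (for example, unit tests running in Node.js, or server-side rendering) may throw a ReferenceError because window is undefined.

Consider using the global setTimeout and clearTimeout directly; they work in both browsers and Node.js. In addition, awaiting the parsed body before casting it, instead of writing response.json() as Promise<HealthResponse>, makes the code clearer.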

Suggested change
-export async function fetchHealth(): Promise<HealthResponse> {
-  const controller = new AbortController();
-  const timeoutId = window.setTimeout(() => {
-    controller.abort();
-  }, HEALTH_TIMEOUT_MS);
-  try {
-    const response = await fetch(`${API_BASE_URL}/health`, {
-      signal: controller.signal,
-    });
-    if (!response.ok) {
-      throw new Error(`Health check failed: ${response.status}`);
-    }
-    return response.json() as Promise<HealthResponse>;
-  } finally {
-    window.clearTimeout(timeoutId);
-  }
-}
+export async function fetchHealth(): Promise<HealthResponse> {
+  const controller = new AbortController();
+  const timeoutId = setTimeout(() => {
+    controller.abort();
+  }, HEALTH_TIMEOUT_MS);
+  try {
+    const response = await fetch(`${API_BASE_URL}/health`, {
+      signal: controller.signal,
+    });
+    if (!response.ok) {
+      throw new Error(`Health check failed: ${response.status}`);
+    }
+    const data = await response.json();
+    return data as HealthResponse;
+  } finally {
+    clearTimeout(timeoutId);
+  }
+}

Comment thread web/src/App.tsx
Comment on lines +6 to +12
  const { data, isLoading, isError } = useQuery({
    queryKey: ["health"],
    queryFn: fetchHealth,
    retry: false,
    refetchInterval: 10_000,
    refetchIntervalInBackground: true,
  });

medium

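Setting refetchIntervalInBackground: true means the frontend keeps sending a health check request to the backend every 10 seconds even after the user switches to another tab or minimizes the browser. This drains client battery and adds server load for no benefit.

For a general health check, or any background task that does not need to be strictly real-time, consider removing refetchIntervalInBackground: true so that polling runs only while the page is active.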

  const { data, isLoading, isError } = useQuery({
    queryKey: ["health"],
    queryFn: fetchHealth,
    retry: false,
    refetchInterval: 10_000,
  });

@DXL-0702

Owner Author

@gemini-code-assist review

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request establishes the foundational structure for the AgentClef transcription review workbench, introducing a FastAPI backend, a Celery worker, and a React frontend. It includes configuration management via Pydantic Settings, a health check endpoint, and a basic status dashboard on the frontend. The review feedback highlights two key areas for improvement: first, updating the frontend status dashboard to display appropriate loading or offline placeholders instead of misleading fallback values when the backend is unreachable; second, refining the backend health check logic, as the current boolean checks on configuration strings will always evaluate to true due to default values and non-empty validation.

Comment thread web/src/App.tsx
Comment on lines +29 to +36
<div className="status-card">
  <span>Service</span>
  <strong>{data?.service ?? "AgentClef"}</strong>
</div>
<div className="status-card">
  <span>Version</span>
  <strong>{data?.version ?? "0.1.0"}</strong>
</div>

medium

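When the backend is offline (isError is true) or the request is still loading, data is undefined, so the UI falls back to the static mock values "AgentClef" and "0.1.0". This can mislead users into thinking the backend is online and returned those values. Consider showing a placeholder (such as "-" or "Checking...") while the service is offline or loading, to give more accurate visual feedback.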

Suggested change
-<div className="status-card">
-  <span>Service</span>
-  <strong>{data?.service ?? "AgentClef"}</strong>
-</div>
-<div className="status-card">
-  <span>Version</span>
-  <strong>{data?.version ?? "0.1.0"}</strong>
-</div>
+<div className="status-card">
+  <span>Service</span>
+  <strong>{isLoading ? "Checking..." : isError ? "-" : data?.service}</strong>
+</div>
+<div className="status-card">
+  <span>Version</span>
+  <strong>{isLoading ? "Checking..." : isError ? "-" : data?.version}</strong>
+</div>

Comment thread server/api/health.py
Comment on lines +41 to +43
postgres_configured=bool(settings.postgres_dsn),
redis_configured=bool(settings.redis_url),
file_storage_configured=bool(settings.file_storage_path),

medium

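In the Settings class, postgres_dsn, redis_url, and file_storage_path are required fields with default values and non-empty validation. As a result, bool(settings.postgres_dsn), bool(settings.redis_url), and bool(settings.file_storage_path) are always True at runtime. The postgres_configured, redis_configured, and file_storage_configured fields are therefore redundant: they cannot reveal missing configuration or actual service connectivity. In a future version, consider replacing these indicators with real connectivity checks (for example, a PING against Redis and a simple SELECT 1 against Postgres), or, if running without these services is allowed, making the corresponding settings Optional.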
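
One way to move from "configured" flags to reachability flags is to wrap each dependency check in a small probe. The sketch below is illustrative only: probe, build_dependency_status, and the two fake checks are hypothetical names, and the fakes stand in for real client calls (a Redis PING, a Postgres SELECT 1) so the example runs without either service.

```python
from __future__ import annotations

from collections.abc import Callable


def probe(check: Callable[[], object]) -> bool:
    """Run one connectivity check and report success as a boolean."""
    try:
        check()
    except Exception:
        # Refused connections, timeouts, and auth errors all count as unreachable.
        return False
    return True


def build_dependency_status(checks: dict[str, Callable[[], object]]) -> dict[str, bool]:
    # Map each dependency name to the outcome of its probe.
    return {name: probe(check) for name, check in checks.items()}


def fake_redis_ping() -> bool:
    # Stand-in for sending PING to Redis through a real client.
    return True


def fake_postgres_select_one() -> int:
    # Stand-in for executing SELECT 1; simulates an unreachable database.
    raise ConnectionError("connection refused")


status = build_dependency_status(
    {
        "redis_reachable": fake_redis_ping,
        "postgres_reachable": fake_postgres_select_one,
    }
)
```

In a real endpoint each check would also need a short timeout so that an unreachable dependency cannot stall the health response.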

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/test_health.py (1)

8-26: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Make this test fully environment-isolated.

Disabling only env_file is not enough; Settings() still reads AGENTCLEF_* environment variables, so these hardcoded assertions can be flaky across CI/dev environments.

Suggested fix
 def test_health_check_returns_public_runtime_state(monkeypatch: MonkeyPatch) -> None:
+    for field_name in Settings.model_fields:
+        monkeypatch.delenv(f"AGENTCLEF_{field_name.upper()}", raising=False)
     monkeypatch.setattr(Settings, "model_config", {**Settings.model_config, "env_file": None})
     settings = Settings()
     app = create_app(settings)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_health.py` around lines 8 - 26, The test
test_health_check_returns_public_runtime_state needs to be fully
environment-isolated by mocking not just the Settings model_config env_file, but
also the actual environment variables that Settings reads. In addition to the
existing monkeypatch.setattr call for Settings.model_config, use
monkeypatch.delenv or monkeypatch.setenv to control all AGENTCLEF_* environment
variables (or mock the entire environment) before instantiating Settings() so
that the hardcoded assertions about the response payload are not affected by
whatever environment variables exist in the CI/dev environment where tests run.
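
Why clearing the environment matters can be shown with the standard library alone. This is a sketch under stated assumptions: without_prefixed_env and read_app_env are hypothetical helpers, and AGENTCLEF_APP_ENV is an illustrative variable name. In pytest, monkeypatch.delenv gives the same effect with automatic restoration.

```python
import os
from collections.abc import Iterator
from contextlib import contextmanager


@contextmanager
def without_prefixed_env(prefix: str) -> Iterator[None]:
    """Temporarily remove every environment variable starting with prefix."""
    removed = {key: value for key, value in os.environ.items() if key.startswith(prefix)}
    for key in removed:
        del os.environ[key]
    try:
        yield
    finally:
        # Drop anything set with the prefix inside the block, then restore.
        for key in [k for k in os.environ if k.startswith(prefix)]:
            del os.environ[key]
        os.environ.update(removed)


def read_app_env() -> str:
    # Stand-in for a settings loader: environment first, default second.
    return os.environ.get("AGENTCLEF_APP_ENV", "development")


os.environ["AGENTCLEF_APP_ENV"] = "staging"  # simulate a leaked CI/dev variable
leaked = read_app_env()
with without_prefixed_env("AGENTCLEF_"):
    isolated = read_app_env()
restored = read_app_env()
```

Outside the block the leaked value wins over the default, which is exactly what makes hardcoded assertions flaky; inside the block the loader sees only its defaults.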
🧹 Nitpick comments (1)
server/config.py (1)

22-49: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Extract duplicate logic to eliminate code duplication.

AgentClefEnvSettingsSource and AgentClefDotEnvSettingsSource have identical prepare_field_value implementations. Refactoring to a shared mixin or helper method would eliminate this duplication and ensure future changes to CORS parsing logic need only be made in one place.

♻️ Refactor using a mixin
+class CorsOriginsParseMixin:
+    """Mixin that adds cors_origins string parsing to settings sources."""
+    
+    def prepare_field_value(
+        self,
+        field_name: str,
+        field: FieldInfo,
+        value: Any,
+        value_is_complex: bool,
+    ) -> Any:
+        if field_name == "cors_origins" and isinstance(value, str):
+            parsed = parse_cors_origins_env(value)
+            if parsed is not None:
+                return parsed
+        return super().prepare_field_value(field_name, field, value, value_is_complex)
+
+
-class AgentClefEnvSettingsSource(EnvSettingsSource):
-    def prepare_field_value(
-        self,
-        field_name: str,
-        field: FieldInfo,
-        value: Any,
-        value_is_complex: bool,
-    ) -> Any:
-        if field_name == "cors_origins" and isinstance(value, str):
-            parsed = parse_cors_origins_env(value)
-            if parsed is not None:
-                return parsed
-        return super().prepare_field_value(field_name, field, value, value_is_complex)
+class AgentClefEnvSettingsSource(CorsOriginsParseMixin, EnvSettingsSource):
+    pass


-class AgentClefDotEnvSettingsSource(DotEnvSettingsSource):
-    def prepare_field_value(
-        self,
-        field_name: str,
-        field: FieldInfo,
-        value: Any,
-        value_is_complex: bool,
-    ) -> Any:
-        if field_name == "cors_origins" and isinstance(value, str):
-            parsed = parse_cors_origins_env(value)
-            if parsed is not None:
-                return parsed
-        return super().prepare_field_value(field_name, field, value, value_is_complex)
+class AgentClefDotEnvSettingsSource(CorsOriginsParseMixin, DotEnvSettingsSource):
+    pass
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/config.py` around lines 22 - 49, The prepare_field_value method
implementation is duplicated identically in both AgentClefEnvSettingsSource and
AgentClefDotEnvSettingsSource classes. Extract this common logic into a mixin
class that contains the prepare_field_value method with the CORS origins parsing
logic, then have both AgentClefEnvSettingsSource and
AgentClefDotEnvSettingsSource inherit from this mixin in addition to their
respective parent classes (EnvSettingsSource and DotEnvSettingsSource). Remove
the prepare_field_value method definition from both classes so they inherit the
implementation from the mixin instead.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@requirements.txt`:
- Line 10: The python-multipart dependency in requirements.txt currently has a
version constraint of >=0.0.9 which allows vulnerable versions. Update the
python-multipart requirement to specify a minimum version of 0.0.30 or later to
exclude the versions containing HIGH-severity vulnerabilities (CVE-2026-53538,
GHSA-vffw-93wf-4j4q, and CVE-2026-42561). After updating the constraint in the
requirements.txt file, regenerate your lockfile to ensure all transitive
dependencies are resolved with the updated constraints.

---

Outside diff comments:
In `@tests/test_health.py`:
- Around line 8-26: The test test_health_check_returns_public_runtime_state
needs to be fully environment-isolated by mocking not just the Settings
model_config env_file, but also the actual environment variables that Settings
reads. In addition to the existing monkeypatch.setattr call for
Settings.model_config, use monkeypatch.delenv or monkeypatch.setenv to control
all AGENTCLEF_* environment variables (or mock the entire environment) before
instantiating Settings() so that the hardcoded assertions about the response
payload are not affected by whatever environment variables exist in the CI/dev
environment where tests run.

---

Nitpick comments:
In `@server/config.py`:
- Around line 22-49: The prepare_field_value method implementation is duplicated
identically in both AgentClefEnvSettingsSource and AgentClefDotEnvSettingsSource
classes. Extract this common logic into a mixin class that contains the
prepare_field_value method with the CORS origins parsing logic, then have both
AgentClefEnvSettingsSource and AgentClefDotEnvSettingsSource inherit from this
mixin in addition to their respective parent classes (EnvSettingsSource and
DotEnvSettingsSource). Remove the prepare_field_value method definition from
both classes so they inherit the implementation from the mixin instead.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 007252e4-cd9d-4bc3-8921-1c54994cb8c0

📥 Commits

Reviewing files that changed from the base of the PR and between b28ad7d and e2d5e66.

⛔ Files ignored due to path filters (1)
  • web/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (17)
  • README.md
  • README.zh-CN.md
  • docs/README.md
  • requirements-dev.txt
  • requirements.txt
  • server/api/health.py
  • server/app.py
  • server/config.py
  • server/main.py
  • tests/test_config.py
  • tests/test_health.py
  • tests/test_worker_app.py
  • web/eslint.config.js
  • web/package.json
  • web/src/App.tsx
  • web/src/lib/api.ts
  • worker/app.py
✅ Files skipped from review due to trivial changes (3)
  • web/package.json
  • web/eslint.config.js
  • docs/README.md
🚧 Files skipped from review as they are similar to previous changes (5)
  • requirements-dev.txt
  • web/src/lib/api.ts
  • server/api/health.py
  • worker/app.py
  • web/src/App.tsx

Comment thread requirements.txt Outdated
@DXL-0702

Owner Author

@gemini-code-assist review

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request establishes the foundational structure for the AgentClef project, introducing a FastAPI backend, a Celery worker, and a React/Vite frontend dashboard. Feedback focuses on improving test reliability and code quality: specifically, addressing potential test flakiness from global caching of settings, enhancing the robustness of the CORS environment variable parser, adopting idiomatic FastAPI dependency injection for settings, and avoiding global state mutation in tests by passing _env_file=None directly to the Settings constructor.

Comment thread server/config.py
Comment on lines +138 to +140
@lru_cache
def get_settings() -> Settings:
    return Settings()

high

Using @lru_cache on get_settings can lead to flaky tests because settings are cached globally. When worker.app is imported in tests, it calls get_settings() at module level, caching the default settings before any monkeypatch can take effect in individual tests. To prevent this, you should add an autouse pytest fixture to clear the cache before/after each test, or provide a way to reset it.
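
The caching hazard described here can be reproduced with functools alone. In this sketch get_settings returns a plain dict as a stand-in for Settings(), and the default Redis URL is an assumption for illustration; cache_clear() is the call an autouse fixture would make around each test.

```python
import os
from functools import lru_cache


@lru_cache
def get_settings() -> dict[str, str]:
    # Stand-in for Settings(): reads the environment once, then stays cached.
    return {"redis_url": os.environ.get("AGENTCLEF_REDIS_URL", "redis://localhost:6379/0")}


os.environ.pop("AGENTCLEF_REDIS_URL", None)
first = get_settings()["redis_url"]  # caches the default value

os.environ["AGENTCLEF_REDIS_URL"] = "redis://localhost:6379/15"
stale = get_settings()["redis_url"]  # still the cached default

get_settings.cache_clear()  # reset, as a per-test fixture would
fresh = get_settings()["redis_url"]

# Leave the process environment and cache clean afterwards.
os.environ.pop("AGENTCLEF_REDIS_URL", None)
get_settings.cache_clear()
```

The stale read is the failure mode: an import-time call fills the cache before a test can change the environment.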

Comment thread server/config.py Outdated
Comment on lines +15 to +19
def parse_cors_origins_env(value: str) -> list[str] | None:
    normalized = value.strip()
    if not normalized or normalized.startswith("["):
        return None
    return [origin.strip() for origin in normalized.split(",") if origin.strip()]

medium

If the environment variable AGENTCLEF_CORS_ORIGINS is wrapped in single or double quotes (which is common in some deployment environments or shell exports), normalized will start with a quote character rather than [. This causes the parser to split the JSON array string by commas, resulting in malformed origins. Stripping outer quotes first makes the parser much more robust.

def parse_cors_origins_env(value: str) -> list[str] | None:
    normalized = value.strip()
    if (normalized.startswith("'") and normalized.endswith("'")) or (
        normalized.startswith('"') and normalized.endswith('"')
    ):
        normalized = normalized[1:-1].strip()
    if not normalized or normalized.startswith("["):
        return None
    return [origin.strip() for origin in normalized.split(",") if origin.strip()]

Comment thread server/api/health.py Outdated
Comment on lines +30 to +35
@router.get("/health", response_model=HealthResponse)
def health_check(request: Request) -> HealthResponse:
    settings = getattr(request.app.state, "settings", None)
    if settings is None:
        settings = get_settings()
    settings = cast(Settings, settings)

medium

Instead of manually retrieving settings from request.app.state and casting them, use FastAPI's dependency injection with Depends(get_settings). This is the idiomatic FastAPI way, improves type safety, and allows you to easily override settings in tests using app.dependency_overrides.

from fastapi import Depends

@router.get("/health", response_model=HealthResponse)
def health_check(settings: Settings = Depends(get_settings)) -> HealthResponse:

Comment thread server/app.py Outdated
Comment on lines +8 to +11
def create_app(settings: Settings | None = None) -> FastAPI:
    runtime_settings = settings or get_settings()
    app = FastAPI(title=runtime_settings.app_name, version=runtime_settings.app_version)
    app.state.settings = runtime_settings

medium

If you adopt dependency injection for settings in your API routes, you can register the settings override directly in app.dependency_overrides instead of storing it in app.state.settings. This is cleaner and more idiomatic.

def create_app(settings: Settings | None = None) -> FastAPI:
    runtime_settings = settings or get_settings()
    app = FastAPI(title=runtime_settings.app_name, version=runtime_settings.app_version)
    if settings:
        app.dependency_overrides[get_settings] = lambda: settings

Comment thread tests/test_config.py Outdated
Comment on lines +8 to +16
def isolate_settings_sources(monkeypatch: MonkeyPatch) -> None:
    for field_name in Settings.model_fields:
        monkeypatch.delenv(f"AGENTCLEF_{field_name.upper()}", raising=False)
    monkeypatch.setattr(Settings, "model_config", {**Settings.model_config, "env_file": None})


def isolated_settings(monkeypatch: MonkeyPatch, **overrides: object) -> Settings:
    isolate_settings_sources(monkeypatch)
    return Settings(**overrides)  # type: ignore[arg-type]

medium

Instead of mutating the global Settings.model_config class attribute via monkeypatch.setattr (which can cause side effects and flaky tests across the test suite), you can pass _env_file=None directly to the Settings constructor to prevent it from loading the .env file.

Suggested change
-def isolate_settings_sources(monkeypatch: MonkeyPatch) -> None:
-    for field_name in Settings.model_fields:
-        monkeypatch.delenv(f"AGENTCLEF_{field_name.upper()}", raising=False)
-    monkeypatch.setattr(Settings, "model_config", {**Settings.model_config, "env_file": None})
-def isolated_settings(monkeypatch: MonkeyPatch, **overrides: object) -> Settings:
-    isolate_settings_sources(monkeypatch)
-    return Settings(**overrides)  # type: ignore[arg-type]
+def isolate_settings_sources(monkeypatch: MonkeyPatch) -> None:
+    for field_name in Settings.model_fields:
+        monkeypatch.delenv(f"AGENTCLEF_{field_name.upper()}", raising=False)
+def isolated_settings(monkeypatch: MonkeyPatch, **overrides: object) -> Settings:
+    isolate_settings_sources(monkeypatch)
+    return Settings(_env_file=None, **overrides)  # type: ignore[arg-type]

Comment thread tests/test_health.py Outdated
Comment on lines +8 to +12
def test_health_check_returns_public_runtime_state(monkeypatch: MonkeyPatch) -> None:
    for field_name in Settings.model_fields:
        monkeypatch.delenv(f"AGENTCLEF_{field_name.upper()}", raising=False)
    monkeypatch.setattr(Settings, "model_config", {**Settings.model_config, "env_file": None})
    settings = Settings()

medium

Avoid mutating the global Settings.model_config class attribute. Pass _env_file=None directly to the Settings constructor instead.

Suggested change
-def test_health_check_returns_public_runtime_state(monkeypatch: MonkeyPatch) -> None:
-    for field_name in Settings.model_fields:
-        monkeypatch.delenv(f"AGENTCLEF_{field_name.upper()}", raising=False)
-    monkeypatch.setattr(Settings, "model_config", {**Settings.model_config, "env_file": None})
-    settings = Settings()
+def test_health_check_returns_public_runtime_state(monkeypatch: MonkeyPatch) -> None:
+    for field_name in Settings.model_fields:
+        monkeypatch.delenv(f"AGENTCLEF_{field_name.upper()}", raising=False)
+    settings = Settings(_env_file=None)

Comment thread tests/test_worker_app.py Outdated
Comment on lines +7 to +11
def isolated_settings(monkeypatch: MonkeyPatch, **overrides: object) -> Settings:
    for field_name in Settings.model_fields:
        monkeypatch.delenv(f"AGENTCLEF_{field_name.upper()}", raising=False)
    monkeypatch.setattr(Settings, "model_config", {**Settings.model_config, "env_file": None})
    return Settings(**overrides)  # type: ignore[arg-type]

medium

Avoid mutating the global Settings.model_config class attribute. Pass _env_file=None directly to the Settings constructor instead.

Suggested change
-def isolated_settings(monkeypatch: MonkeyPatch, **overrides: object) -> Settings:
-    for field_name in Settings.model_fields:
-        monkeypatch.delenv(f"AGENTCLEF_{field_name.upper()}", raising=False)
-    monkeypatch.setattr(Settings, "model_config", {**Settings.model_config, "env_file": None})
-    return Settings(**overrides)  # type: ignore[arg-type]
+def isolated_settings(monkeypatch: MonkeyPatch, **overrides: object) -> Settings:
+    for field_name in Settings.model_fields:
+        monkeypatch.delenv(f"AGENTCLEF_{field_name.upper()}", raising=False)
+    return Settings(_env_file=None, **overrides)  # type: ignore[arg-type]

@DXL-0702

Owner Author

@gemini-code-assist review

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request initializes the AgentClef project structure, establishing the FastAPI backend, Celery worker, and Vite/React frontend. It introduces configuration management via Pydantic, a basic health check endpoint, and initial test suites. Feedback focuses on fixing a parsing bug in parse_cors_origins_env for comma-separated CORS origins with individual quotes, and improving the health check endpoint by performing actual connectivity checks for PostgreSQL and Redis instead of merely verifying that their configuration strings are non-empty.

Comment thread server/config.py Outdated
Comment on lines +15 to +24
def parse_cors_origins_env(value: str) -> list[str] | None:
    normalized = value.strip()
    if (
        (normalized.startswith("'") and normalized.endswith("'"))
        or (normalized.startswith('"') and normalized.endswith('"'))
    ):
        normalized = normalized[1:-1].strip()
    if not normalized or normalized.startswith("["):
        return None
    return [origin.strip() for origin in normalized.split(",") if origin.strip()]

high

There is a bug in parse_cors_origins_env when parsing comma-separated CORS origins that contain individual quotes (e.g., AGENTCLEF_CORS_ORIGINS="http://localhost:5173", "http://127.0.0.1:5173").

Because the outer quotes are stripped first, the resulting string becomes http://localhost:5173", "http://127.0.0.1:5173. When split by ,, the individual elements retain the internal quotes (e.g., http://localhost:5173" and "http://127.0.0.1:5173), which are invalid origins and will silently break CORS.

We can fix this by stripping quotes from each individual element after splitting.

def parse_cors_origins_env(value: str) -> list[str] | None:
    normalized = value.strip()
    if (
        (normalized.startswith("'") and normalized.endswith("'"))
        or (normalized.startswith('"') and normalized.endswith('"'))
    ):
        normalized = normalized[1:-1].strip()
    if not normalized or normalized.startswith("["):
        return None

    origins = []
    for part in normalized.split(","):
        cleaned = part.strip().strip("'\"").strip()
        if cleaned:
            origins.append(cleaned)
    return origins
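
A quick way to see what this version does is to run it on a few inputs. The function body below repeats the suggestion so the block is self-contained; the sample values are illustrative, and a None result means the value is left to the default settings-source handling.

```python
from __future__ import annotations


def parse_cors_origins_env(value: str) -> list[str] | None:
    normalized = value.strip()
    if (
        (normalized.startswith("'") and normalized.endswith("'"))
        or (normalized.startswith('"') and normalized.endswith('"'))
    ):
        # Drop one layer of matching outer quotes.
        normalized = normalized[1:-1].strip()
    if not normalized or normalized.startswith("["):
        # Empty values and JSON arrays are not handled by this parser.
        return None

    origins = []
    for part in normalized.split(","):
        # Strip whitespace and any quotes left around each element.
        cleaned = part.strip().strip("'\"").strip()
        if cleaned:
            origins.append(cleaned)
    return origins


plain = parse_cors_origins_env("http://localhost:5173,http://127.0.0.1:5173")
quoted_items = parse_cors_origins_env('"http://localhost:5173", "http://127.0.0.1:5173"')
json_array = parse_cors_origins_env('["http://localhost:5173"]')
```

Both comma-separated forms yield the same two clean origins, while the JSON array form returns None.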

Comment thread server/api/health.py
Comment on lines +33 to +35
postgres_configured=bool(settings.postgres_dsn),
redis_configured=bool(settings.redis_url),
file_storage_configured=bool(settings.file_storage_path),

medium

The fields postgres_configured, redis_configured, and file_storage_configured currently only check if the respective configuration strings are non-empty.

However, because postgres_dsn, redis_url, and file_storage_path are validated as required non-empty strings in Settings (via validate_required_string), these fields will always evaluate to True at runtime. This makes them redundant as configuration flags.

To make the health check more useful, consider performing actual lightweight connectivity checks (e.g., executing a simple SELECT 1 query on Postgres, or sending a PING command to Redis) to verify that these services are not only configured but also reachable and healthy.

@DXL-0702

Owner Author

@gemini-code-assist full review: are there any new security, test-failure, or API-contract issues?

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request establishes the foundational structure for the AgentClef transcription review workbench, introducing a FastAPI backend, a Celery worker, and a React frontend. Key additions include application configuration management, a health check endpoint, and corresponding unit tests. The review feedback highlights two main areas for improvement: first, a parsing bug in parse_cors_origins_env that causes startup failures when handling single-quoted JSON arrays for CORS origins; second, the redundancy of the configuration flags in the health check endpoint, which should ideally perform active connectivity checks rather than just verifying non-empty configuration strings.

Comment thread server/config.py
Comment on lines +15 to +29
def parse_cors_origins_env(value: str) -> list[str] | None:
    normalized = value.strip()
    if (
        (normalized.startswith("'") and normalized.endswith("'"))
        or (normalized.startswith('"') and normalized.endswith('"'))
    ):
        normalized = normalized[1:-1].strip()
    if not normalized or normalized.startswith("["):
        return None
    origins: list[str] = []
    for origin in normalized.split(","):
        cleaned = origin.strip().strip("'\"").strip()
        if cleaned:
            origins.append(cleaned)
    return origins

high

There is a bug in how parse_cors_origins_env handles JSON arrays wrapped in single quotes (e.g., '["http://localhost:5173"]'), which is the format used in .env.example.

When a JSON array has outer single quotes, parse_cors_origins_env strips them, sees that the stripped string starts with [ (line 22), and returns None to delegate parsing to Pydantic. However, prepare_field_value then passes the original, unstripped value (with the single quotes) to super().prepare_field_value. Pydantic's built-in JSON parser fails to parse strings starting with single quotes, raising a ValidationError and causing the application to fail on startup.

To fix this, we should parse the JSON array directly using the standard json library after stripping the outer quotes.

def parse_cors_origins_env(value: str) -> list[str] | None:
    normalized = value.strip()
    if (
        (normalized.startswith("'") and normalized.endswith("'"))
        or (normalized.startswith('"') and normalized.endswith('"'))
    ):
        normalized = normalized[1:-1].strip()
    if not normalized:
        return None
    if normalized.startswith("["):
        import json
        try:
            parsed = json.loads(normalized)
            if isinstance(parsed, list):
                return [str(item) for item in parsed]
        except json.JSONDecodeError:
            return None
    origins: list[str] = []
    for origin in normalized.split(","):
        cleaned = origin.strip().strip("'\"").strip()
        if cleaned:
            origins.append(cleaned)
    return origins

Comment thread server/api/health.py
Comment on lines +33 to +35
postgres_configured=bool(settings.postgres_dsn),
redis_configured=bool(settings.redis_url),
file_storage_configured=bool(settings.file_storage_path),

medium

The fields postgres_configured, redis_configured, and file_storage_configured will always evaluate to True in a running application, making these checks redundant.

In server/config.py, the validate_required_string validator ensures that postgres_dsn, redis_url, and file_storage_path cannot be empty or whitespace-only strings. If any of these are empty, a ValidationError is raised and the application fails to start. Therefore, if the application is running and this endpoint is reachable, these settings are guaranteed to be non-empty.

To make these health indicators meaningful, consider either:

  1. Allowing these configuration fields to be optional/nullable in Settings (if the services are indeed optional).
  2. Performing actual lightweight connectivity checks (e.g., executing a simple query on PostgreSQL or a PING command on Redis) to verify that the services are not only configured but also reachable and healthy.

@DXL-0702 DXL-0702 merged commit 6ed7285 into main Jun 22, 2026
1 check was pending
@coderabbitai coderabbitai Bot mentioned this pull request Jun 22, 2026
Development

Successfully merging this pull request may close these issues.

AC-V01-002: Implement backend configuration loading and health check
AC-V01-001: Initialize the project skeleton