
feat(ai): add Google Gemini LLM provider support #3968

Open
rajnisk wants to merge 20 commits into OWASP:feature/nestbot-ai-assistant from rajnisk:feature/nestbot-ai-gemini-provider

Conversation


@rajnisk rajnisk commented Feb 16, 2026

Resolves #3693
Related to #908

Proposed change

This PR adds Google Gemini as an alternative LLM provider for the NestBot AI Assistant, enabling contributors to run NestBot locally with a free Gemini API key instead of an OpenAI key.

This is part of the larger NestBot AI Assistant feature work tracked in #908.

Changes

  • New: GoogleEmbedder class for Google embeddings using gemini-embedding-001 model (768 dimensions)
  • Updated: llm_config.py to support provider selection (OpenAI/Google) with automatic fallback
  • Updated: embeddings/factory.py to return Google embedder when LLM_PROVIDER=google
  • Added: Django settings for Gemini (GOOGLE_API_KEY, GOOGLE_MODEL_NAME, LLM_PROVIDER)
  • Added: Tests for Google LLM provider configuration
  • Updated: pyproject.toml to include google-genai extra for CrewAI
  • Updated: Custom dictionary to include genai and generativeai
  • Updated: .env.example with new Gemini environment variables

Technical Details

  • Uses google.genai SDK (with fallback to deprecated google.generativeai for compatibility)
  • Implements REST API fallback if SDK is not available
  • Default embedding model: gemini-embedding-001 (recommended, replaces deprecated models)
  • Default LLM model: gemini-2.0-flash
  • Maintains backward compatibility with OpenAI provider (default)

Configuration

Set these environment variables to use Google provider:

DJANGO_LLM_PROVIDER=google
DJANGO_GOOGLE_API_KEY=<your-google-api-key>
DJANGO_GOOGLE_MODEL_NAME=gemini-2.0-flash  # Optional, has default

For OpenAI provider (default):

DJANGO_LLM_PROVIDER=openai  # or omit (defaults to openai)
DJANGO_OPEN_AI_SECRET_KEY=<your-openai-key>

Testing

  • ✅ All existing tests pass
  • ✅ New tests added for Google provider configuration
  • ✅ make check-test-backend passes locally
  • ✅ Tested with Google Gemini API key locally
  • ✅ Verified embedding generation works correctly
  • ✅ Verified LLM provider selection works as expected

Files Changed

  • backend/apps/ai/common/llm_config.py - Provider selection logic
  • backend/apps/ai/embeddings/google.py - New Google embedder implementation
  • backend/apps/ai/embeddings/factory.py - Factory updates for Google provider
  • backend/settings/base.py - New Django settings
  • backend/tests/apps/ai/common/llm_config_test.py - Test coverage
  • backend/pyproject.toml - Dependency updates
  • backend/.env.example - Environment variable documentation
  • cspell/custom-dict.txt - Spell checker updates

Checklist

  • Required: I followed the contributing workflow
  • Required: I verified that my code works as intended and resolves the issue as described
  • Required: I ran make check-test-backend locally: all warnings addressed, tests passed
  • I used AI for code, documentation, tests, or communication related to this PR

- Add GoogleEmbedder implementation for embeddings
- Update LLM config to support provider selection (OpenAI/Google)
- Add Django settings for Gemini (GOOGLE_API_KEY, GOOGLE_MODEL_NAME, LLM_PROVIDER)
- Update embedding factory to support Google provider
- Add tests for Google LLM provider configuration
- Update pyproject.toml to include google-genai extra
- Add genai and generativeai to custom dictionary

This enables contributors to run NestBot locally using a free Google Gemini API key
as an alternative to OpenAI.

Resolves part of OWASP#3693

coderabbitai bot commented Feb 16, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Switches LLM/embedder configuration from environment-variable reads to Django settings, adds a Google GenAI embedder and factory path, updates settings/pyproject/.env example, extends tests for Google provider and fallback logging, and updates spell-check dictionary entries.

Changes

  • backend/.env.example (Environment example): Added DJANGO_GOOGLE_API_KEY and DJANGO_LLM_PROVIDER placeholder entries.
  • backend/settings/base.py (Settings): Added LLM-related settings OPEN_AI_MODEL_NAME, GOOGLE_API_KEY, GOOGLE_MODEL_NAME, and LLM_PROVIDER; changed the OPEN_AI_SECRET_KEY declaration to a generic SecretValue().
  • backend/apps/ai/common/llm_config.py (LLM configuration): Replaced os env reads with django.conf.settings, added a module logger, switched provider selection to settings.LLM_PROVIDER (openai/google), removed the Anthropic branch, and added fallback to OpenAI with error logging for unrecognized providers.
  • backend/apps/ai/embeddings/factory.py (Embeddings factory): Factory now selects GoogleEmbedder when settings.LLM_PROVIDER == "google", otherwise returns OpenAIEmbedder.
  • backend/apps/ai/embeddings/google.py (Google embedder, new): New GoogleEmbedder(Embedder) implementing embed_query, embed_documents, and get_dimensions, using google.genai.Client and settings.GOOGLE_API_KEY with a fixed embedding dimension (1536).
  • backend/pyproject.toml (Dependency manifest): Updated the crewai dependency to include extras = ["google-genai"].
  • backend/tests/apps/ai/common/llm_config_test.py (Tests): Updated to use DJANGO_-prefixed env keys and new default models (gpt-4o-mini, gemini-2.0-flash); added Google-provider cases and assertions on fallback/logging for unsupported providers.
  • cspell/custom-dict.txt (Tooling / Spell-check): Added genai and generativeai entries.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Linked Issues check ⚠️ Warning: The PR resolves issue #3693 but does not implement the coding requirements listed there. Issue #3693 requires reaction management fixes, RAG search robustness improvements, greeting handling detection, and test coverage expansion, none of which are present in the code changes. Resolution: implement the required features from #3693: reaction management logic, RAG search retry strategies, greeting detection, and comprehensive test coverage for all intent categories and edge cases.

✅ Passed checks (4 passed)

  • Title check ✅: The title accurately summarizes the main change: adding Google Gemini as an LLM provider, which is the core purpose of this PR.
  • Description check ✅: The description is detailed and clearly related to the changeset, explaining the Google Gemini provider implementation, configuration, and testing performed.
  • Out of Scope Changes check ✅: All changes are focused on adding Google Gemini LLM provider support, which aligns with the PR objectives. No unrelated changes detected.
  • Docstring Coverage ✅: Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@rajnisk rajnisk marked this pull request as draft February 16, 2026 13:36

@cubic-dev-ai cubic-dev-ai bot left a comment


4 issues found across 8 files

Confidence score: 2/5

  • High risk of runtime failure in backend/apps/ai/embeddings/google.py: using deprecated google.generativeai APIs with the new google.genai SDK will raise AttributeError when deployed.
  • Silent fallback to OpenAI in backend/apps/ai/common/llm_config.py can mask misconfiguration and change provider behavior without warning, which is user-impacting.
  • Pay close attention to backend/apps/ai/embeddings/google.py, backend/apps/ai/common/llm_config.py, backend/tests/apps/ai/common/llm_config_test.py, backend/settings/base.py - SDK API mismatch, provider fallback behavior, test not covering defaults, and inconsistent env var naming.
Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="backend/apps/ai/embeddings/google.py">

<violation number="1" location="backend/apps/ai/embeddings/google.py:51">
P0: Bug: `genai.configure()` and `genai.embed_content()` are APIs from the deprecated `google.generativeai` package, not the new `google.genai` SDK imported on line 5. When the new SDK is installed, this will raise `AttributeError` at runtime.

The new `google.genai` SDK uses a `Client` object pattern:

```python
client = genai.Client(api_key=self.api_key)
result = client.models.embed_content(model=self.model, contents=text)
```

The `genai.configure()` + `genai.embed_content()` pattern only works with the old deprecated `google.generativeai` package. Since the primary import on line 5 is the new SDK, this code will fail in the happy path.
</violation>

P2: Inconsistent naming convention for OpenAI settings: `OPEN_AI_SECRET_KEY` (with underscore) vs `OPENAI_MODEL_NAME` (without). This means the env vars are `DJANGO_OPEN_AI_SECRET_KEY` and `DJANGO_OPENAI_MODEL_NAME`, two different patterns for the same provider. Consider using `OPEN_AI_MODEL_NAME` to be consistent with the existing `OPEN_AI_SECRET_KEY`.

(Based on your team's feedback about maintaining consistent and accurate naming.) [FEEDBACK_USED]

P2: This test claims to verify the default model behavior but explicitly sets `DJANGO_GOOGLE_MODEL_NAME` in the environment, so it never exercises the actual default from `settings.GOOGLE_MODEL_NAME = values.Value(default="gemini-2.0-flash")`. Compare with `test_get_llm_openai_default`, which omits `DJANGO_OPENAI_MODEL_NAME` to truly test the default path. Remove `DJANGO_GOOGLE_MODEL_NAME` from the patched env to actually test default model resolution.

P1: Silent fallback to OpenAI on unrecognized `LLM_PROVIDER` can mask configuration errors. If a user typos the provider name (e.g., `gogle`), they'll silently get OpenAI instead of a clear error. The previous behavior of raising `ValueError` was safer. Consider raising an error for unrecognized providers rather than falling back silently, or at minimum using `logger.error` instead of `logger.warning`.
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 6

🤖 Fix all issues with AI agents
In `@backend/.env.example`:
- Around line 24-25: The two new .env keys are out of alphabetical order; move
the lines for DJANGO_GOOGLE_API_KEY=None and DJANGO_LLM_PROVIDER=None into the
DJANGO_* block sorted alphabetically: place DJANGO_GOOGLE_API_KEY=None
immediately after DJANGO_ELEVENLABS_API_KEY and then place
DJANGO_LLM_PROVIDER=None directly after DJANGO_GOOGLE_API_KEY and before
DJANGO_OPEN_AI_SECRET_KEY so all DJANGO_* entries remain alphabetized.

In `@backend/apps/ai/embeddings/google.py`:
- Around line 109-119: The comment claiming "SDK handles batching automatically"
is inaccurate because the code in the self.use_sdk branch calls
genai.embed_content in a per-text loop over texts; update the code to either (A)
use the SDK's actual batch embedding API (replace the per-text loop calling
genai.embed_content with the SDK batch method and extract embeddings) or (B) if
no batch API is available, change the comment to accurately state that
embeddings are requested sequentially and keep the loop; locate the branch using
self.use_sdk, genai, genai.embed_content, model, and texts to implement the
chosen fix.
- Line 86: The code constructs the REST fallback URL with the API key embedded
(endpoint =
f"{self.base_url}/models/{self.model}:embedContent?key={self.api_key}"), which
exposes credentials in logs; change the request to omit the key from the URL and
instead send it in an HTTP header (use 'x-goog-api-key': self.api_key) when
building the embedding request in the Google embeddings class/method that
constructs endpoint and performs the REST call, and also ensure any logging or
error handlers that might log the full URL redact query parameters (remove or
mask self.api_key) before logging.
- Around line 34-46: The constructor (__init__) currently ignores the passed
model and hardcodes self._dimensions = 768 which can lead to incorrect
get_dimensions() values; update __init__ in the Google embedder to derive
_dimensions from the requested model by either (a) adding a model→dimensions
mapping and setting self._dimensions = MODEL_DIMENSIONS.get(self.model) with a
sensible default, or (b) validating the provided self.model against supported
models and raising a clear error if it's unsupported (or forcing it to
"gemini-embedding-001"); ensure references to self.model and get_dimensions()
remain consistent and include a clear fallback/validation path so downstream
vector sizes are correct.
- Around line 5-23: The import logic is wrong because google.genai (new package)
lacks genai.configure() and genai.embed_content(); update the import and runtime
branching so the code supports both APIs: try importing google.generativeai
first (the deprecated package) and use its genai.configure() and
genai.embed_content() paths when available, otherwise import google.genai and
instantiate google.genai.Client and call the Client-based embedding methods
(e.g., Client embeddings/embedding creation calls) instead of
genai.configure()/genai.embed_content(); add a runtime feature check (hasattr or
try/except) around genai.configure and embed_content to choose the correct code
path and preserve existing behavior for both google.generativeai and
google.genai.
- Around line 75-97: The SDK call genai.embed_content returns an
EmbedContentResponse object (with an embeddings list), not a dict; update
embed_query (function embed_query) to read the vector from
result.embeddings[0].values instead of result["embedding"], and update
embed_documents (function embed_documents) to collect vectors with [e.values for
e in result.embeddings] (and perform the batching as a single SDK call instead
of iterating per-item). Ensure both code paths (use_sdk branch) use these
attribute accesses and keep the REST fallback unchanged.
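
The response-shape fix described above can be illustrated with a small sketch. `SimpleNamespace` stands in for the SDK's `EmbedContentResponse` (the real type lives in `google.genai`), and the helper names are hypothetical:

```python
# Sketch of reading vectors via attribute access (result.embeddings[0].values)
# rather than dict keys, as the review above describes. `fake` is a stand-in
# for the SDK response object; it is not a real google.genai type.
from types import SimpleNamespace


def query_vector(result) -> list[float]:
    """Vector for a single embedded query."""
    return list(result.embeddings[0].values)


def document_vectors(result) -> list[list[float]]:
    """One vector per document, read from a single batched response."""
    return [list(e.values) for e in result.embeddings]


# Stand-in response carrying two embeddings:
fake = SimpleNamespace(
    embeddings=[
        SimpleNamespace(values=[0.1, 0.2]),
        SimpleNamespace(values=[0.3, 0.4]),
    ]
)
```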
🧹 Nitpick comments (3)
backend/apps/ai/common/llm_config.py (1)

36-47: Simplify the fallback guard — "openai" and "google" are unreachable here.

Since both known providers already return on lines 23 and 29, the provider not in ("openai", "google") check on line 37 is always true when reached. You can simplify to just if provider:.

♻️ Suggested simplification
-    if provider and provider not in ("openai", "google"):
+    if provider:
backend/settings/base.py (1)

221-232: LGTM — good documentation on the Value() vs SecretValue() trade-off for GOOGLE_API_KEY.

The comment clearly explains why SecretValue() isn't used. One thing to note: when LLM_PROVIDER=google but GOOGLE_API_KEY is unset, it will be None at runtime, which will surface as an opaque API error. Consider adding an early startup check (e.g., in AppConfig.ready() or in get_llm()) that raises a clear ImproperlyConfigured error. Based on learnings, the redundant environ_name removal on OPEN_AI_SECRET_KEY is correct.

backend/tests/apps/ai/common/llm_config_test.py (1)

54-78: Consider adding a test for unset/empty LLM_PROVIDER to cover the silent fallback path.

The current unsupported-provider test verifies warning + fallback, but there's no test for when LLM_PROVIDER is None or "" (the "not specified" path in llm_config.py that falls back to OpenAI without logging a warning). This is a distinct code path worth covering.

- Fix import order: use google.generativeai first (has configure/embed_content)
- Add support for new google.genai Client API with runtime feature detection
- Fix return value access: use result.embeddings[0].values (object, not dict)
- Make dimensions configurable: module-level MODEL_DIMENSIONS with validation
- Update comments: accurate sequential processing descriptions
- Fix linting errors: noqa comments, specific exceptions, refactored raise

Addresses CodeRabbit review suggestions for better API compatibility
and correct embedding response handling.

@cubic-dev-ai cubic-dev-ai bot left a comment


4 issues found across 2 files (changes from recent commits).

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="backend/apps/ai/embeddings/google.py">

<violation number="1" location="backend/apps/ai/embeddings/google.py:117">
P1: Bug: `self.base_url` is never set when the new SDK path is selected, but the fallback to REST API (triggered when the SDK call fails) uses `self.base_url`, causing an `AttributeError`. Set `self.base_url` unconditionally in `__init__` so the REST fallback always works.</violation>

<violation number="2" location="backend/apps/ai/embeddings/google.py:157">
P1: Bug: Wrong result access pattern for the deprecated `google.generativeai` API. The deprecated `embed_content()` returns a **dictionary** with an `"embedding"` key, not an object with `.embeddings` attribute. The original code (`result["embedding"]`) was correct. This will crash at runtime with `AttributeError` when the deprecated SDK is used.</violation>

<violation number="3" location="backend/apps/ai/embeddings/google.py:165">
P1: Bug: Wrong parameter name for the new `google.genai` Client API. The official docs use `contents` (plural), not `content` (singular). This will likely raise a `TypeError` or silently fail.</violation>

<violation number="4" location="backend/apps/ai/embeddings/google.py:221">
P1: Bug: Same incorrect result access for deprecated API in `embed_documents`. Should use `result["embedding"]` instead of `result.embeddings[0].values`.</violation>
</file>
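
Several comments above also target the REST fallback, in particular moving the API key out of the URL and into an `x-goog-api-key` header so it cannot leak into logged URLs. A hedged sketch of building such a request; `build_embed_request` is a hypothetical helper, not code from the PR:

```python
# Sketch: construct an embedContent REST request with the API key in the
# x-goog-api-key header rather than as a ?key= query parameter.
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def build_embed_request(model: str, api_key: str, text: str):
    """Return (url, headers, json_payload) for a single embedContent call."""
    url = f"{BASE_URL}/models/{model}:embedContent"  # no key in the URL
    headers = {
        "x-goog-api-key": api_key,  # keeps the credential out of logs
        "Content-Type": "application/json",
    }
    payload = {"content": {"parts": [{"text": text}]}}
    return url, headers, payload
```

The returned triple can then be passed to something like `requests.post(url, headers=headers, json=payload)`.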

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

🤖 Fix all issues with AI agents
In `@backend/apps/ai/embeddings/google.py`:
- Around line 159-176: The current parsing in the self.use_new_sdk branch of the
embed extraction (around client.models.embed_content and variable result) can
silently return an empty list; change it so that if no embedding is found from
any checked paths (result.embeddings[0].values, result.embedding.values, or
dict-like result.get("embedding")...), you raise a clear exception (e.g.,
ValueError) that includes contextual information (model name, input length, and
a safe repr(result) or result keys) instead of returning []; this makes callers
aware of malformed SDK responses and aids debugging.
- Around line 11-29: The deprecation warning for google.generativeai is emitted
at import time even when the Google provider isn't selected; move the
warnings.warn call out of the top-level import block and instead emit it inside
the embedder's initializer (e.g., the class __init__ where the embedder is
instantiated) only when use_deprecated_api is True and genai_deprecated was
chosen; keep the import/hasattr checks in the module to set
genai_deprecated/use_deprecated_api, but defer calling warnings.warn until the
embedder's __init__ so the warning appears only when the deprecated SDK is
actually used.
- Around line 112-127: Initialize self.base_url unconditionally in the class
constructor before the genai_client_module branching so it exists whether using
the new SDK, deprecated SDK, or REST fallback; specifically set self.base_url =
"https://generativelanguage.googleapis.com/v1beta" prior to the try/except where
you instantiate genai_client_module.Client (and keep the existing else fallback
unchanged), so embed_query and embed_documents can safely reference
self.base_url even if the SDK path later fails at runtime.
- Around line 269-281: The batch REST payload omits the per-item "model" field
required by Google's batchEmbedContents API; update the requests JSON
comprehension to include "model": self.model in each request object (the code
that builds json={"requests": [...]}) so each entry is {"model": self.model,
"content": {"parts": [{"text": text}]}}; keep using self.base_url, endpoint, and
self.api_key as before and ensure response handling
(response.raise_for_status(), response.json(), and returning embeddings) remains
unchanged.
- Around line 149-157: The deprecated google.generativeai.embed_content() call
in the code path using self.use_deprecated_sdk returns a dict (e.g.
{"embedding": [...]}) not an object with .embeddings, so change the embed_query
and embed_documents handling to read the embedding from the dict safely: when
genai.embed_content(...) returns, check for dict keys ("embedding" or
"embeddings") and extract the vector accordingly (e.g. result.get("embedding")
or result.get("embeddings")[0] if needed), while preserving the existing return
shape; update the branches in the methods that call genai.embed_content (the
code guarded by self.use_deprecated_sdk and using genai) to support both dict
and object shapes to avoid AttributeError.
🧹 Nitpick comments (1)
backend/apps/ai/embeddings/google.py (1)

224-267: Per-item REST fallback in the new SDK path defeats batching and duplicates logic.

When the new SDK path fails for individual documents (lines 252-266), each item falls back to individual REST calls instead of collecting failures and using the batch endpoint. Additionally, the REST call logic and SDK response parsing are duplicated across embed_query and embed_documents.

Consider extracting a _embed_single_rest(text) helper and a _parse_sdk_result(result) helper to reduce duplication and centralize error handling.

- Fix import order: use google.generativeai first (has configure/embed_content)
- Add support for new google.genai Client API with runtime feature detection
- Fix return value access: use result.embeddings[0].values (object, not dict)
- Make dimensions configurable: module-level MODEL_DIMENSIONS with validation
- Update comments: accurate sequential processing descriptions
- Fix linting errors: noqa comments, specific exceptions, helper function for raises
- Defer deprecation warning to __init__ (only fires when embedder is used)
- Initialize base_url early to prevent AttributeError in SDK fallback paths
- Raise error instead of returning empty data for unrecognized API structures
- Use x-goog-api-key header instead of query parameter for API key security
- Add required model field to batchEmbedContents requests per API docs

Addresses CodeRabbit review suggestions for better API compatibility,
correct embedding response handling, improved security, and API compliance.
- Fix inconsistent naming: OPENAI_MODEL_NAME → OPEN_AI_MODEL_NAME
- Fix test_get_llm_google to actually test default model
- Change silent fallback to logger.error for better visibility
- Verify SDK usage is correct (already handles both APIs)
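
The "Make dimensions configurable" commit above describes a module-level `MODEL_DIMENSIONS` mapping with validation. A minimal sketch of that shape, with the single entry taken from the PR description (gemini-embedding-001 at 768 dimensions) and the helper name illustrative:

```python
# Sketch of a model -> embedding-dimension mapping with validation, as the
# commit above describes. Only gemini-embedding-001/768 comes from this PR;
# anything else added here would be an assumption.
MODEL_DIMENSIONS = {
    "gemini-embedding-001": 768,
}

DEFAULT_EMBEDDING_MODEL = "gemini-embedding-001"


def resolve_dimensions(model: str) -> int:
    """Validate the model name and return its embedding dimension."""
    try:
        return MODEL_DIMENSIONS[model]
    except KeyError:
        supported = ", ".join(sorted(MODEL_DIMENSIONS))
        raise ValueError(
            f"Unsupported embedding model {model!r}; supported: {supported}"
        ) from None
```

Raising on an unknown model keeps downstream vector sizes consistent instead of silently defaulting.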

@cubic-dev-ai cubic-dev-ai bot left a comment


1 issue found across 1 file (changes from recent commits).

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="backend/apps/ai/embeddings/google.py">

<violation number="1" location="backend/apps/ai/embeddings/google.py:310">
P1: Bug: The `model` field in the batch embed request body is missing the required `models/` prefix. According to the Google Generative AI API documentation, the model field in each `EmbedContentRequest` must use the format `models/{model}`. Without this prefix, the batch REST API call will fail.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@backend/apps/ai/embeddings/google.py`:
- Around line 174-204: The try/except around the new SDK path currently catches
ValueError, which causes _raise_embedding_extraction_error() to be swallowed and
silently fall back to REST; update the handler so the helper's error is not
caught (either remove ValueError from the except tuple in the block surrounding
client.models.embed_content, or change _raise_embedding_extraction_error() to
raise a distinct exception type and let the except catch only
AttributeError/TypeError), and also remove or replace the placeholder comment
about the implementation; locate symbols use_new_sdk,
client.models.embed_content, and _raise_embedding_extraction_error to apply the
fix.
- Around line 298-320: The batchEmbedContents request body is setting "model":
self.model without the required "models/" prefix; update the list comprehension
that builds the request items (the dict with "model" in the REST fallback block
that posts to endpoint and uses batchEmbedContents) to set "model" to
f"models/{self.model}" (or equivalent string concatenation) so each item's model
matches the URL format; keep all other fields the same and ensure the generated
JSON keys still match the API schema.
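
The `models/` prefix fix above amounts to a one-line change in the batch payload builder. A sketch, assuming a payload shape matching the `batchEmbedContents` request format quoted in the reviews (`build_batch_payload` is a hypothetical helper):

```python
# Sketch of the batchEmbedContents request body with the required "models/"
# prefix on each per-item model field, per the review above.
def build_batch_payload(model: str, texts: list[str]) -> dict:
    """Per-item model must be 'models/{model}' to match the API schema."""
    return {
        "requests": [
            {
                "model": f"models/{model}",
                "content": {"parts": [{"text": text}]},
            }
            for text in texts
        ]
    }
```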


@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (5)
backend/settings/base.py (1)

221-232: Good cleanup on OPEN_AI_SECRET_KEY and well-documented rationale for GOOGLE_API_KEY.

The removal of the redundant environ_name is correct. The comment explaining why Value() is used instead of SecretValue() for GOOGLE_API_KEY is helpful.

One consideration: since OPEN_AI_SECRET_KEY uses SecretValue(), it must be set in the environment at startup — even for Google-only deployments — because SecretValue raises ImproperlyConfigured during class resolution. This is consistent with the fallback-to-OpenAI design, but it means a pure Google-only setup isn't possible without also providing an OpenAI key (or a dummy value). If that's intentional, it might be worth a brief comment. If Google-only setups should be supported in the future, OPEN_AI_SECRET_KEY would need the same Value(default=None) treatment.

Based on learnings, the redundant environ_name removal on OPEN_AI_SECRET_KEY aligns with the established pattern that django-configurations automatically prefixes with DJANGO_ and uses the setting name by default.

backend/apps/ai/common/llm_config.py (2)

36-47: Simplify the redundant guard on line 37.

At this point in the function, provider is guaranteed to be neither "openai" nor "google" (both returned early). The provider not in ("openai", "google") check is always True here. Simplify to just if provider:.

♻️ Suggested simplification
     # Fallback to OpenAI if provider not recognized or not specified
-    if provider and provider not in ("openai", "google"):
+    if provider:
         logger.error(
             "Unrecognized LLM_PROVIDER '%s'. Falling back to OpenAI. "
             "Supported providers: 'openai', 'google'",
             provider,
         )

22-47: Consider extracting the repeated OpenAI LLM construction.

The OpenAI LLM(...) call on lines 23–27 is duplicated in the fallback on lines 43–47. If this grows (e.g., adding more kwargs), you'd need to update both. A small helper or variable could reduce drift risk.

♻️ One possible approach
 def get_llm() -> LLM:
-    """Get configured LLM instance.
-
-    Returns:
-        LLM: Configured LLM instance based on settings.
-
-    """
+    """Get configured LLM instance based on settings."""
     provider = settings.LLM_PROVIDER
 
+    def _openai_llm() -> LLM:
+        return LLM(
+            model=settings.OPEN_AI_MODEL_NAME,
+            api_key=settings.OPEN_AI_SECRET_KEY,
+            temperature=0.1,
+        )
+
     if provider == "openai":
-        return LLM(
-            model=settings.OPEN_AI_MODEL_NAME,
-            api_key=settings.OPEN_AI_SECRET_KEY,
-            temperature=0.1,
-        )
+        return _openai_llm()
     if provider == "google":
         return LLM(
             model=settings.GOOGLE_MODEL_NAME,
             base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
             api_key=settings.GOOGLE_API_KEY,
             temperature=0.1,
         )
 
     # Fallback to OpenAI if provider not recognized or not specified
-    if provider and provider not in ("openai", "google"):
+    if provider:
         logger.error(
             "Unrecognized LLM_PROVIDER '%s'. Falling back to OpenAI. "
             "Supported providers: 'openai', 'google'",
             provider,
         )
-    return LLM(
-        model=settings.OPEN_AI_MODEL_NAME,
-        api_key=settings.OPEN_AI_SECRET_KEY,
-        temperature=0.1,
-    )
+    return _openai_llm()
backend/tests/apps/ai/common/llm_config_test.py (2)

54-78: Good coverage of the fallback path with logger verification.

The decorator ordering and parameter mapping (mock_llm, mock_logger) are correct. The test properly verifies both the error logging and the OpenAI fallback.

One gap worth noting: there's no test for when LLM_PROVIDER is empty/falsy (e.g., ""). In that case the code silently falls back to OpenAI without logging an error — a subtly different path from the unsupported-provider case. Consider adding a test for that edge case to lock down the behavior.


80-125: Good coverage for Google provider paths.

Both default and custom model tests for the Google provider look correct. The expected base_url, api_key, and temperature values match the implementation in llm_config.py.

One edge case to consider: what happens when DJANGO_LLM_PROVIDER=google but DJANGO_GOOGLE_API_KEY is not set? The setting defaults to None, so get_llm() would pass api_key=None to LLM(...), which would likely fail at call time. A test covering this scenario could document whether that's an expected failure mode or whether get_llm() should validate the key and fall back.

- Add NoReturn type hints to raise functions
- Remove unreachable return statements

Fixes type checker errors about missing return statements.
- Remove placeholder comment from new SDK implementation
- Fix new SDK parameter: use 'contents' (plural) instead of 'content'
- Verify ValueError handling is correct (not caught, propagates)
- Verify batch model format is correct (models/{model})

All CodeRabbit AI review issues resolved.
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@backend/apps/ai/embeddings/google.py`:
- Around line 248-267: The code currently raises a ValueError via
_raise_embedding_extraction_error() which propagates out and prevents the
intended REST fallback; update the exception handling around the SDK extraction
block (the except that currently reads except (AttributeError, TypeError) as e)
to also catch ValueError (e.g., except (AttributeError, TypeError, ValueError)
as e) so that the code warns and continues to the REST fallback, or
alternatively change _raise_embedding_extraction_error() to emit a warning and
return/continue instead of raising so the REST fallback code after the
try/except is reached.
- Around line 110-118: The function _extract_embedding_from_result currently
assumes embedding values are dicts with a "values" key and will crash for
deprecated responses where "embedding" or "embeddings"[0] is a plain list;
update _extract_embedding_from_result to check the type of result["embedding"]
and result["embeddings"][0] and, if they are lists (or tuples), return them
directly instead of calling .get("values"), otherwise fall back to accessing the
"values" key or raising via
_raise_deprecated_api_dict_error/_raise_deprecated_api_error as appropriate so
both new and deprecated dict shapes are handled safely.
🧹 Nitpick comments (3)
backend/apps/ai/embeddings/google.py (3)

57-62: Return type should be NoReturn, not implicit None.

The other two helper functions (_raise_deprecated_api_error, _raise_deprecated_api_dict_error) correctly use -> NoReturn. This one always raises but declares no return type, which means type checkers infer -> None.

Fix
-def _raise_embedding_extraction_error() -> None:
+def _raise_embedding_extraction_error() -> NoReturn:

234-235: Remove "placeholder" comment before merging.

Line 235 says "this is a placeholder implementation" — this should be removed or replaced with accurate documentation, as it undermines confidence in the code path.


232-253: Duplicated new-SDK extraction logic — extract to a helper.

The embedding extraction pattern (lines 244-253 and 322-333) is duplicated between embed_query and embed_documents. Consider extracting it to a helper function similar to _extract_embedding_from_result for the deprecated path, to keep the methods focused and reduce duplication.

Also applies to: 310-333

@rajnisk

rajnisk commented Feb 16, 2026

@coderabbitai resume

@coderabbitai

coderabbitai bot commented Feb 16, 2026

✅ Actions performed

Reviews resumed.

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (3)
backend/apps/ai/embeddings/google.py (3)

57-62: Return type should be NoReturn, not None.

This function unconditionally raises. The other two helpers (_raise_deprecated_api_error, _raise_deprecated_api_dict_error) correctly annotate -> NoReturn. This one should match for consistency and to help static analysis.

Suggested fix
-def _raise_embedding_extraction_error() -> None:
+def _raise_embedding_extraction_error() -> NoReturn:

282-357: embed_documents mixes per-item SDK calls with per-item REST fallback — consider batch REST as the fallback.

Currently, in the new SDK path (lines 306-357), if the SDK fails for one text, the code falls back to individual REST calls per item. However, the pure REST path (lines 359-381) already uses the batch batchEmbedContents endpoint, which is more efficient. If the SDK consistently fails (e.g., incompatible version), every item will trigger a separate REST call instead of a single batch request.

Consider collecting failed items and batch-calling the REST endpoint once, rather than issuing per-item REST requests inside the loop.


337-337: Falsy check if embedding_values: will treat a valid empty embedding [] as failure.

While unlikely in practice (embeddings should always be non-empty), using if embedding_values is not None: is semantically more correct and avoids a subtle bug if the API ever returns an empty list.

Suggested fix
-                if embedding_values:
+                if embedding_values is not None:

- Fix ValueError handling: catch ValueError to allow REST fallback
- Fix embedding extraction: handle plain lists in addition to dicts with 'values' key
- Ensure all SDK extraction failures gracefully fall back to REST API

All CodeRabbit AI review issues resolved for LGTM approval.
- Remove _raise_embedding_extraction_error() calls
- Let code fall through naturally to REST API when extraction fails
- Remove unused function and constant
- Remove ValueError from except clauses
- Fix line length linting error

Fixes CodeRabbit issue about ValueError being caught immediately.
@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 1 file (changes from recent commits).

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="backend/apps/ai/embeddings/google.py">

<violation number="1" location="backend/apps/ai/embeddings/google.py:319">
P2: Inconsistent exception handling: `embed_query` warns on SDK failure, but `embed_documents` silently swallows the same exceptions with `pass`. Consider adding a warning here too, matching the pattern in `embed_query`, so developers can diagnose unexpected SDK behavior during batch operations.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

coderabbitai[bot]
coderabbitai bot previously approved these changes Feb 16, 2026
…ling

- Add warning to embed_documents when SDK fails, matching embed_query pattern
- Ensures consistent exception handling across both methods
- Makes it easier to diagnose SDK issues during batch operations

Addresses CodeRabbit suggestion for consistent exception handling.
@rajnisk rajnisk marked this pull request as ready for review February 16, 2026 19:07
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
backend/apps/ai/embeddings/google.py (2)

52-60: embed_documents can use a single batched SDK call instead of N sequential calls.

The google.genai SDK accepts a list for the contents parameter, returning all embeddings in one API round-trip. The current per-text loop makes N separate calls, which adds latency and quota overhead proportional to document count.

♻️ Proposed refactor
 def embed_documents(self, texts: list[str]) -> list[list[float]]:
-    results = []
-    for text in texts:
-        result = self.client.models.embed_content(
-            model=self.model,
-            contents=text,
-            config=types.EmbedContentConfig(output_dimensionality=1536),
-        )
-        results.append(result.embeddings[0].values)
-    return results
+    result = self.client.models.embed_content(
+        model=self.model,
+        contents=texts,
+        config=types.EmbedContentConfig(output_dimensionality=1536),
+    )
+    return [emb.values for emb in result.embeddings]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/apps/ai/embeddings/google.py` around lines 52 - 60, The
embed_documents implementation currently loops and calls
self.client.models.embed_content per text; change it to make one batched call by
passing the entire texts list as contents to self.client.models.embed_content
(keep model=self.model and output_dimensionality=1536), then iterate over the
returned response.embeddings to extract each embedding.values into the results
list and return that list; update any variable names in the embed_documents
method to reflect the single response handling instead of per-text calls.

14-23: Extract the hardcoded 1536 as a class-level constant to keep _dimensions and output_dimensionality in sync.

The value 1536 appears independently in _dimensions (line 23) and both embed_content calls (lines 38, 57). If they drift, get_dimensions() will report a dimension that differs from what the API actually returns.

♻️ Proposed refactor
+_EMBEDDING_DIMENSIONS = 1536
+
 class GoogleEmbedder(Embedder):
     def __init__(self, model: str = "gemini-embedding-001") -> None:
         self.client = genai.Client(api_key=settings.GOOGLE_API_KEY)
         self.model = model
-        self._dimensions = 1536
+        self._dimensions = _EMBEDDING_DIMENSIONS

Then reference _EMBEDDING_DIMENSIONS in the config of both embed_content calls.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/apps/ai/embeddings/google.py` around lines 14 - 23, Extract the
hardcoded 1536 into a class-level constant named _EMBEDDING_DIMENSIONS (on the
Google embedder class) and set self._dimensions = self._EMBEDDING_DIMENSIONS in
__init__; then update both calls to genai.Client().embed_content (the
embed_content usages in this file) to pass
output_dimensionality=self._EMBEDDING_DIMENSIONS in their config so
embed_content and get_dimensions() remain in sync; verify references in methods
embed_content and get_dimensions (or any other place using the literal 1536) are
replaced with the constant.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/apps/ai/embeddings/google.py`:
- Around line 35-40: The embed_content call is passing output_dimensionality as
a direct kwarg which the google.genai SDK expects inside a config object; update
every call to self.client.models.embed_content (both occurrences) to remove the
direct output_dimensionality kwarg and instead pass
config={"outputDimensionality": 1536} (or the SDK-expected config key) so the
model returns 1536-dim vectors that match the class _dimensions; keep
model=self.model and contents=text as before.

---

Nitpick comments:
In `@backend/apps/ai/embeddings/google.py`:
- Around line 52-60: The embed_documents implementation currently loops and
calls self.client.models.embed_content per text; change it to make one batched
call by passing the entire texts list as contents to
self.client.models.embed_content (keep model=self.model and
output_dimensionality=1536), then iterate over the returned response.embeddings
to extract each embedding.values into the results list and return that list;
update any variable names in the embed_documents method to reflect the single
response handling instead of per-text calls.
- Around line 14-23: Extract the hardcoded 1536 into a class-level constant
named _EMBEDDING_DIMENSIONS (on the Google embedder class) and set
self._dimensions = self._EMBEDDING_DIMENSIONS in __init__; then update both
calls to genai.Client().embed_content (the embed_content usages in this file) to
pass output_dimensionality=self._EMBEDDING_DIMENSIONS in their config so
embed_content and get_dimensions() remain in sync; verify references in methods
embed_content and get_dimensions (or any other place using the literal 1536) are
replaced with the constant.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
backend/apps/ai/embeddings/google.py (2)

14-23: Extract the hardcoded dimension 1536 into a single module-level constant.

The value 1536 appears three times independently — on lines 23, 38, and 57. A future change to the dimension (e.g. switching to 3072 or 768) requires three separate edits with no guarantee they stay in sync.

♻️ Proposed refactor
+DEFAULT_MODEL = "gemini-embedding-001"
+DEFAULT_DIMENSIONS = 1536
+
 class GoogleEmbedder(Embedder):
-    def __init__(self, model: str = "gemini-embedding-001") -> None:
+    def __init__(self, model: str = DEFAULT_MODEL) -> None:
         self.client = genai.Client(api_key=settings.GOOGLE_API_KEY)
         self.model = model
-        self._dimensions = 1536
+        self._dimensions = DEFAULT_DIMENSIONS

Then reference DEFAULT_DIMENSIONS in the two config= calls as well.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/apps/ai/embeddings/google.py` around lines 14 - 23, Extract the
hardcoded 1536 into a module-level constant named DEFAULT_DIMENSIONS and use it
everywhere instead of literal numbers: set DEFAULT_DIMENSIONS = 1536 at top of
the module, replace self._dimensions = 1536 in __init__ with self._dimensions =
DEFAULT_DIMENSIONS, and update both config= calls that currently pass 1536 to
reference DEFAULT_DIMENSIONS so all three places (the __init__ assignment and
the two config parameters) use the single constant.

52-60: embed_documents makes N sequential API calls; the SDK supports a single batched call.

The Google GenAI SDK accepts a list of strings as contents, returning one embedding per entry in a single request. The current loop multiplies latency and API quota consumption linearly with document count.

♻️ Proposed batch refactor
     def embed_documents(self, texts: list[str]) -> list[list[float]]:
         ...
-        results = []
-        for text in texts:
-            result = self.client.models.embed_content(
-                model=self.model,
-                contents=text,
-                config={"output_dimensionality": 1536},
-            )
-            results.append(result.embeddings[0].values)
-        return results
+        if not texts:
+            return []
+        result = self.client.models.embed_content(
+            model=self.model,
+            contents=texts,
+            config={"output_dimensionality": 1536},
+        )
+        if not result.embeddings:
+            msg = f"Google embedding API returned no embeddings for model {self.model!r}"
+            raise ValueError(msg)
+        return [embedding.values for embedding in result.embeddings]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/apps/ai/embeddings/google.py` around lines 52 - 60, The
embed_documents implementation is making N sequential calls; change it to a
single batched call by passing the full texts list as the contents argument to
self.client.models.embed_content (instead of looping), then map the returned
embeddings to the same shape (extracting embedding.values for each item) and
return that list; update the method named embed_documents (and any helper using
the per-text loop) to call self.client.models.embed_content once with
contents=texts and config={"output_dimensionality": 1536} and convert
result.embeddings into the list of lists expected by the caller.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/apps/ai/embeddings/google.py`:
- Line 40: The code returns result.embeddings[0].values without validating the
API response, which will raise IndexError/TypeError on empty/None embeddings;
update the two return sites that reference result.embeddings (the lines
returning result.embeddings[0].values) to first verify that result.embeddings is
a non-empty sequence and that result.embeddings[0] has a values attribute (or
raise a clear ValueError/RuntimeError including the raw result/response); ensure
the check is defensive (None, empty list, or missing values) and raise a
descriptive error message so callers can handle API failures gracefully.

---

Nitpick comments:
In `@backend/apps/ai/embeddings/google.py`:
- Around line 14-23: Extract the hardcoded 1536 into a module-level constant
named DEFAULT_DIMENSIONS and use it everywhere instead of literal numbers: set
DEFAULT_DIMENSIONS = 1536 at top of the module, replace self._dimensions = 1536
in __init__ with self._dimensions = DEFAULT_DIMENSIONS, and update both config=
calls that currently pass 1536 to reference DEFAULT_DIMENSIONS so all three
places (the __init__ assignment and the two config parameters) use the single
constant.
- Around line 52-60: The embed_documents implementation is making N sequential
calls; change it to a single batched call by passing the full texts list as the
contents argument to self.client.models.embed_content (instead of looping), then
map the returned embeddings to the same shape (extracting embedding.values for
each item) and return that list; update the method named embed_documents (and
any helper using the per-text loop) to call self.client.models.embed_content
once with contents=texts and config={"output_dimensionality": 1536} and convert
result.embeddings into the list of lists expected by the caller.

- Add validation to prevent IndexError when API returns no embeddings
- Raise ValueError with descriptive message for better error handling
- Apply fix to both embed_query and embed_documents methods
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
backend/apps/ai/embeddings/google.py (2)

38-38: output_dimensionality should reference self._dimensions instead of a hardcoded literal.

_dimensions is set once in __init__ but the config dict in both embed_query (line 38) and embed_documents (line 60) repeat the literal 1536. These two values will silently diverge if _dimensions is ever changed.

♻️ Proposed fix
-            config={"output_dimensionality": 1536},
+            config={"output_dimensionality": self._dimensions},

(Apply in both embed_query and embed_documents.)

Also applies to: 60-60

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/apps/ai/embeddings/google.py` at line 38, Replace the hardcoded 1536
in the config dicts with the instance field so the embed functions always use
the initialized dimension: update the config={"output_dimensionality": 1536}
entries in both embed_query and embed_documents to reference self._dimensions
(e.g., config={"output_dimensionality": self._dimensions}) so the methods use
the class's _dimensions value rather than a literal.

55-66: embed_documents makes N serial API calls — use a single batch call instead.

The google-genai SDK supports passing a list of strings to contents in client.models.embed_content(), returning all embeddings in one response.

♻️ Proposed batch refactor
 def embed_documents(self, texts: list[str]) -> list[list[float]]:
     ...
-    results = []
-    for text in texts:
-        result = self.client.models.embed_content(
-            model=self.model,
-            contents=text,
-            config={"output_dimensionality": self._dimensions},
-        )
-        if not result.embeddings:
-            msg = f"Google embedding API returned no embeddings for model {self.model!r}"
-            raise ValueError(msg)
-        results.append(result.embeddings[0].values)
-    return results
+    if not texts:
+        return []
+    result = self.client.models.embed_content(
+        model=self.model,
+        contents=texts,
+        config={"output_dimensionality": self._dimensions},
+    )
+    if not result.embeddings or len(result.embeddings) != len(texts):
+        msg = f"Google embedding API returned {len(result.embeddings) if result.embeddings else 0} embeddings for {len(texts)} texts using model {self.model!r}"
+        raise ValueError(msg)
+    return [emb.values for emb in result.embeddings]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/apps/ai/embeddings/google.py` around lines 55 - 66, The current loop
in embed_documents that calls self.client.models.embed_content per text should
be replaced with a single batched call: call self.client.models.embed_content
once with contents=texts and the same config ({"output_dimensionality": 1536}),
then validate the response (ensure result.embeddings exists and length matches
len(texts)), extract each embedding's .values into a list in the original order,
and return that list; update error handling to raise a ValueError if embeddings
are missing or counts mismatch.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/apps/ai/embeddings/google.py`:
- Line 23: The embeddings produced at self._dimensions = 1536 are not
pre-normalized, so update the GoogleEmbedder to normalize vectors when returning
them (or switch to requesting the 3072-dimension model); specifically add a
private _normalize method and apply it to outputs from embed_query and
embed_documents (and any internal call that returns raw embedding lists) so
cosine similarity consumers receive unit vectors; keep the existing
self._dimensions field if you choose normalization, or change it to 3072 and
adjust API request model/params if you prefer the pre-normalized output.

---

Nitpick comments:
In `@backend/apps/ai/embeddings/google.py`:
- Line 38: Replace the hardcoded 1536 in the config dicts with the instance
field so the embed functions always use the initialized dimension: update the
config={"output_dimensionality": 1536} entries in both embed_query and
embed_documents to reference self._dimensions (e.g.,
config={"output_dimensionality": self._dimensions}) so the methods use the
class's _dimensions value rather than a literal.
- Around line 55-66: The current loop in embed_documents that calls
self.client.models.embed_content per text should be replaced with a single
batched call: call self.client.models.embed_content once with contents=texts and
the same config ({"output_dimensionality": 1536}), then validate the response
(ensure result.embeddings exists and length matches len(texts)), extract each
embedding's .values into a list in the original order, and return that list;
update error handling to raise a ValueError if embeddings are missing or counts
mismatch.
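The normalization requested in the inline comment above can be a small pure helper. This is a sketch under the assumption that embeddings arrive as plain Python lists of floats (the `l2_normalize` name is illustrative, not existing code):

```python
import math


def l2_normalize(vector: list[float]) -> list[float]:
    """Scale an embedding to unit length so cosine similarity between two
    normalized vectors reduces to a plain dot product."""
    norm = math.sqrt(sum(component * component for component in vector))
    if norm == 0.0:
        return vector  # leave zero vectors untouched
    return [component / norm for component in vector]
```

In `GoogleEmbedder`, this would wrap the raw values returned by both `embed_query` and `embed_documents` before they reach the vector store.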

- Add L2 normalization for 1536-dimensional embeddings
- Replace os.environ patching with direct settings mock
- Fix decorator order to match unittest.mock behavior
- Remove unused imports and simplify test structure
- Add noqa comments for test secret keys
- Fixes CI/CD test failures in llm_config_test.py
coderabbitai[bot]
coderabbitai bot previously approved these changes Feb 20, 2026
@rudransh-shrivastava rudransh-shrivastava left a comment

Hi, I tested the changes and they work.

However, I have some suggested changes:

I ran into some issues using gemini-2.0-flash; I think they updated their rate limits:

Image

I think we need to use an updated model:

Image

https://ai.google.dev/gemini-api/docs/changelog#02-18-2026


Optionally, you may also add Gemini support for the image extraction feature added in #3925

@rajnisk

rajnisk commented Feb 24, 2026

Hi, I tested the changes and they work.

However, I have some suggested changes:

I ran into some issues using gemini-2.0-flash; I think they updated their rate limits:

Image

I think we need to use an updated model:

Image

https://ai.google.dev/gemini-api/docs/changelog#02-18-2026

Optionally, you may also add Gemini support for the image extraction feature added in #3925

Yes, they have adjusted the quota for the 2.0 and 2.5 series models.
If they have completely removed 2.0 from the free quota, we can use 2.5-flash.

- Remove base_url from Google LLM config (CrewAI handles it automatically)
- Remove unnecessary comments from settings
- Remove model name settings (use os.getenv instead)
- Revert LLM_PROVIDER to use os.getenv instead of Django settings
- Update factory.py to use os.getenv for LLM_PROVIDER
- Update default model to gemini-2.5-flash for free tier compatibility
- Update tests to use os.getenv patching
coderabbitai[bot]
coderabbitai bot previously approved these changes Feb 24, 2026
@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 11 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="backend/apps/ai/common/utils.py">

<violation number="1" location="backend/apps/ai/common/utils.py:46">
P2: Embedding API errors are no longer caught; OpenAI SDK exceptions from embed_documents will now propagate and can break chunk creation instead of returning an empty list.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

coderabbitai[bot]
coderabbitai bot previously approved these changes Feb 26, 2026
@sonarqubecloud

@preeeetham
Hey, nice work on decoupling the OpenAI dependency — the factory pattern refactor and removing self.openai_client from the base commands is a solid improvement. A few things I noticed while going through the diff:

The biggest concern is around error handling in utils.py. The old code caught OpenAIError alongside AttributeError and TypeError, but now only the latter two are caught. The Google SDK can throw its own exceptions (like google.api_core.exceptions.GoogleAPIError) which would bubble up uncaught and potentially crash the entire batch processing pipeline. Worth adding the relevant Google exceptions to that handler, or at minimum catching a broader base exception around the embedding call.
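A provider-agnostic wrapper is one way to restore the old behavior. The sketch below takes the recoverable exception types as a parameter, since the exact classes (`OpenAIError`, `google.api_core.exceptions.GoogleAPIError`, etc.) depend on which SDKs are installed; names are illustrative, not the PR's code:

```python
import logging

logger = logging.getLogger(__name__)


def safe_embed(embed_fn, texts, provider_errors):
    """Run an embedding call, degrading provider-SDK failures to an empty
    list so one bad batch does not crash the whole pipeline.

    provider_errors is a tuple of SDK exception types to treat as
    recoverable; anything outside it still propagates.
    """
    try:
        return embed_fn(texts)
    except provider_errors as exc:
        logger.warning("Embedding call failed, skipping batch: %s", exc)
        return []
```

In `utils.py` the tuple would be built from whichever SDK error bases are importable, keeping `AttributeError`/`TypeError` handling separate from transport-level failures.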

Also, GOOGLE_API_KEY is defined as values.Value(default=None) in settings/base.py while OPEN_AI_SECRET_KEY uses SecretValue(). This means if someone sets LLM_PROVIDER=google but forgets the API key, they'll get a confusing SDK error deep in a request instead of a clear startup failure. Making it a SecretValue or adding an explicit check in the embedder init would help.

A couple smaller things — _dimensions is set to 1536 in the GoogleEmbedder but the PR description mentions 768, and gemini-embedding-001 defaults to 768 per Google's docs. That should be clarified so it matches whatever the existing vector store expects. And factory.py reads LLM_PROVIDER via os.getenv while llm_config.py uses django.conf.settings — worth making those consistent to avoid env var naming surprises with django-configurations.

