
[BUG] Possible incorrect token counting for Gemini with thinking #346

@weizheheng

Description

Basic checks

  • I searched existing issues - this hasn't been reported
  • I can reproduce this consistently
  • This is a RubyLLM bug, not my application code

What's broken?

When using Gemini 2.5 Flash with thinking, the response returns an additional attribute, `thoughtsTokenCount`, and the total token count appears to be made up of `promptTokenCount + candidatesTokenCount + thoughtsTokenCount`.

[Image: raw API response showing `usageMetadata` with `thoughtsTokenCount`]
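
For reference, the relevant part of the raw response looks roughly like this (the field names are from the Gemini API's `usageMetadata`; the token counts are invented for illustration):

```ruby
# Illustrative usageMetadata from a Gemini 2.5 Flash thinking response
# (the counts below are made up):
usage_metadata = {
  'promptTokenCount'     => 10,
  'candidatesTokenCount' => 125,
  'thoughtsTokenCount'   => 238,
  'totalTokenCount'      => 373 # 10 + 125 + 238
}
```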

Right now in ruby_llm, `output_tokens` only takes `candidatesTokenCount` into account:

```ruby
output_tokens: data.dig('usageMetadata', 'candidatesTokenCount')
```
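
A possible fix (a sketch only; I haven't checked whether anything in ruby_llm relies on `output_tokens` staying `nil` when the key is absent) would be to sum the two counts:

```ruby
# Sketch of a possible fix: include thoughtsTokenCount when present.
# nil.to_i == 0 in Ruby, so responses without thinking keep the same
# count (though a missing candidatesTokenCount becomes 0 instead of nil).
output_tokens: data.dig('usageMetadata', 'candidatesTokenCount').to_i +
               data.dig('usageMetadata', 'thoughtsTokenCount').to_i
```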

A similar issue has been reported in another repo.

How to reproduce

  1. Call `chat.ask` using Gemini 2.5 Flash with thinking enabled.
  2. Observe that `response.output_tokens` equals `candidatesTokenCount` alone, not `candidatesTokenCount + thoughtsTokenCount` (see the sketch below).
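
For example (a minimal sketch, assuming RubyLLM's standard chat API and the `gemini-2.5-flash` model id):

```ruby
require 'ruby_llm'

RubyLLM.configure do |config|
  config.gemini_api_key = ENV['GEMINI_API_KEY']
end

chat = RubyLLM.chat(model: 'gemini-2.5-flash')
response = chat.ask('Briefly explain how quicksort works.')

# With thinking, the raw usageMetadata reports candidatesTokenCount and
# thoughtsTokenCount separately; output_tokens only reflects the former.
puts response.output_tokens
```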

Expected behavior

Expecting `output_tokens` to include `thoughtsTokenCount` when it is present, so the counts add up to the API's `totalTokenCount`.

What actually happened

`output_tokens` doesn't include `thoughtsTokenCount` when it is present.

Environment

Ruby version: 3.3.3
RubyLLM version: 1.3.1 (we forked it to add `thoughtsTokenCount`, but the latest version still behaves the same)
Provider: Gemini
OS: macOS
