After a PDF is parsed, its images are uploaded as base64 and consume a large number of tokens

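[//]: # (Remove the space inside the box and fill in an x)
+ [x] I have confirmed that there is currently no similar issue
+ [x] I have confirmed that I have upgraded to the latest version
+ [x] I have read the project README and documentation in full and did not find a solution
+ [x] I understand and am willing to follow up on this issue, help with testing, and provide feedback
+ [x] I will ask in a polite and respectful manner and will not use uncivil language (this applies equally to everyone commenting here; anyone who does not comply will be blocked)
+ [x] I understand and accept the above, and understand that the maintainers have limited time: **issues that do not follow the rules may be ignored or closed directly**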

**Problem description**
I tried to have the gpt-4o model summarize this PDF: [fpubh-12-1368933.pdf](https://github.com/user-attachments/files/18389670/fpubh-12-1368933.pdf)

The request failed with this error:
`This model's maximum context length is 128000 tokens. However, your messages resulted in 180050 tokens. Please reduce the length of the messages. (type: invalid_request_error)`

<img width="1679" alt="image" src="https://github.com/user-attachments/assets/460c4c27-b945-420f-8d2a-3fce7d29cd3d" />

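For how a multimodal model counts the tokens used by an image, see OpenAI's official page: https://openai.com/api/pricing . This is a small PDF containing only a few images, so it should not come anywhere near 180050 tokens.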
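To put a number on that expectation, here is a rough sketch of the tile-based rule that OpenAI has documented for gpt-4o image inputs, as I understand it: 85 tokens for a low-detail image; for a high-detail image, fit it within 2048x2048, shrink it so the shortest side is at most 768 px, then charge 85 base tokens plus 170 per 512 px tile. The function name `estimate_image_tokens` is my own, and the constants are assumptions that may change, so treat the result as an estimate rather than a billing figure.

```python
import math


def estimate_image_tokens(width: int, height: int, detail: str = "high") -> int:
    """Estimate image tokens under the assumed 85 + 170-per-tile rule."""
    if detail == "low":
        # Low-detail images are assumed to cost a flat base amount.
        return 85

    # Step 1: scale down (never up) to fit inside a 2048 x 2048 square.
    scale = min(1.0, 2048 / max(width, height))
    w, h = width * scale, height * scale

    # Step 2: scale down (never up) so the shortest side is at most 768 px.
    scale = min(1.0, 768 / min(w, h))
    w, h = w * scale, h * scale

    # Step 3: count the 512 px tiles needed to cover the image.
    tiles = math.ceil(w / 512) * math.ceil(h / 512)
    return 85 + 170 * tiles


# The 1553 x 704 image found in the parsed output of this PDF.
print(estimate_image_tokens(1553, 704))  # 1445
```

Under these assumptions, an image of this size sent as a real image input would cost on the order of a thousand tokens or so, nowhere near the 128000-token limit.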

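Looking at the content returned by https://blob.chatnio.net/ (the parsed output is at https://gist.github.com/whitewatercn/d1dd7488a158e0e0b0a29d9008ff1a77 ), there is one extremely long base64-encoded image. Running it through a decoding tool confirms that it is an image from the PDF that was converted to base64. Its properties are: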
```
Width: 1553
Height: 704
File size: 84064 bytes
The length of the Base64 encoded string is: 333728
```
I suspect this extremely long encoded string is being treated as plain text and is consuming a large number of tokens.
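A back-of-the-envelope check of that suspicion, using only the string length reported above. The characters-per-token ratios are assumptions: about 4 characters per token is the usual rule of thumb for English text, and random-looking base64 may well tokenize less efficiently, so 2 is included as a more pessimistic guess. An exact count would need the model's tokenizer.

```python
import math


def estimate_text_tokens(num_chars: int, chars_per_token: float) -> int:
    """Rough token estimate for a string of num_chars characters."""
    return math.ceil(num_chars / chars_per_token)


B64_LENGTH = 333728  # length of the base64 string reported above

# Assumed ratios; the real ratio depends on the tokenizer.
for ratio in (4.0, 2.0):
    print(ratio, estimate_text_tokens(B64_LENGTH, ratio))
# 4.0 83432
# 2.0 166864
```

Even with the optimistic ratio, this single string alone would take up most of a 128000-token context window if it were sent as text.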

<img width="992" alt="image" src="https://github.com/user-attachments/assets/5697b002-b428-4fc4-b2bb-1650d15aebe8" />



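An earlier issue ( https://github.com/coaidev/coai/issues/215 ) says:
> When the model does not support image recognition, a base64-encoded image is treated as text, and overly long text (tokens) leads to excessive cost, so a check is used to omit the image and prevent this.
> https://github.com/coaidev/coai/issues/215#issuecomment-2186874383

However, I am using gpt-4o, which does support image input.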
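For illustration only, this is roughly the behavior I would expect for a vision-capable model: base64 images in the parsed text are lifted out and sent as `image_url` content parts (the content-part shape used by the OpenAI Chat Completions API) instead of staying inline as text. This is a sketch, not how coai implements it. The function name `to_content_parts` is hypothetical, and it assumes the parser embeds images as `data:image/...;base64,...` URIs, which I have not verified against the actual parser output.

```python
import re

# Assumption: images appear in the parsed text as data URIs like
# "data:image/png;base64,<payload>".
DATA_URI = re.compile(r"data:image/[A-Za-z0-9.+-]+;base64,[A-Za-z0-9+/]+={0,2}")


def to_content_parts(parsed: str) -> list[dict]:
    """Split parsed text into text parts and image_url parts."""
    parts: list[dict] = []
    pos = 0
    for match in DATA_URI.finditer(parsed):
        text = parsed[pos:match.start()]
        if text.strip():
            parts.append({"type": "text", "text": text})
        # Send the image as an image input rather than as raw text.
        parts.append({"type": "image_url", "image_url": {"url": match.group(0)}})
        pos = match.end()
    tail = parsed[pos:]
    if tail.strip():
        parts.append({"type": "text", "text": tail})
    return parts


sample = "Figure 1 follows.\ndata:image/png;base64,iVBORw0KGgo=\nEnd of page."
for part in to_content_parts(sample):
    print(part["type"])
```

The resulting list would be used as the `content` of a user message, so the image would be billed as an image rather than as several hundred thousand characters of text.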



