Call for Bad Cases of BAGEL #11

# [Welcome to Discord](https://discord.com/invite/Z836xxzy)

<div align="center">
  <img src="https://github.com/user-attachments/assets/1ae40d36-592b-4a28-a83b-4524541df568" width="200">
</div>


# Welcome to WeChat

<div align="center">
  <img src="https://github.com/user-attachments/assets/12604ac8-44e8-4e45-a31a-1eedfe62f9cd" width="200">
</div>

---

We are collecting failure cases (bad cases) of the **BAGEL** model to better understand its current limitations and to help advance multimodal AI research.

If you have encountered situations where BAGEL performs poorly or produces unexpected results, please share them in the comments below. Kindly organize each case using the following format:

**Problem:**  
A brief description of what went wrong in this case.

**Prompt:**  
The input text you provided.

**Image:**  
The related visual input.

**Hyperparameters:**  
The inference hyperparameters you used: those in `inference_hyper`, the random seed (if possible), and whether thinking mode was enabled.

**Response:**  
The output generated by BAGEL.
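
If you are reporting several cases, a small helper can keep them in the format above. The following is a minimal sketch, not part of BAGEL: the function name `format_bad_case`, the `seed` and `thinking` keys, and all example values are hypothetical.

```python
def format_bad_case(problem, prompt, image, hyperparameters, response):
    """Render one bad-case report using the five headings requested above."""
    # One bullet per hyperparameter so the values are easy to scan.
    hyper_lines = "\n".join(f"- `{key}`: `{value}`" for key, value in hyperparameters.items())
    sections = [
        ("Problem", problem),
        ("Prompt", prompt),
        ("Image", image),
        ("Hyperparameters", hyper_lines),
        ("Response", response),
    ]
    # Blank line between sections, heading in bold followed by its body.
    return "\n\n".join(f"**{name}:**\n{body}" for name, body in sections)


# Hypothetical example values, for illustration only.
report = format_bad_case(
    problem="The edited image is blurry.",
    prompt="Replace the sky with a sunset.",
    image="(attach the input image here)",
    hyperparameters={"cfg_text_scale": 4.0, "seed": 42, "thinking": False},
    response="(attach the output image or paste the text response here)",
)
print(report)
```

The printed text can be pasted directly into a comment; images still need to be attached by hand.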

Your examples are invaluable for identifying edge cases, improving BAGEL’s robustness, and ultimately delivering a more powerful and user-friendly multimodal foundation model to the community.

Thank you for your support and contributions!

---
**About Inference Hyperparameters:**
- **`cfg_text_scale`:** Controls how strongly the model follows the text prompt. `1.0` disables text guidance. Typical range: `4.0–8.0`.
- **`cfg_image_scale`:** Controls how much the model preserves input image details. `1.0` disables image guidance. Typical range: `1.0–2.0`.
- **`cfg_interval`:** Fraction of denoising steps where CFG is applied. Later steps can skip CFG to reduce computation. Typical: `[0.4, 1.0]`.
- **`timestep_shift`:** Shifts the distribution of denoising steps. Higher values allocate more steps at the start (affects layout); lower values allocate more at the end (improves details).
- **`num_timesteps`:** Total denoising steps. Typical: `50`.
- **`cfg_renorm_min`:** Minimum value for CFG-Renorm. `1.0` disables renorm. Typical: `0`.
- **`cfg_renorm_type`:** CFG-Renorm method:  
  - `global`: Normalize over all tokens and channels (default for T2I).
  - `local`: Normalize per channel.
  - `text_channel`: Like `local`, but applies only to the text condition (good for editing, may cause blur).
- **If edited images appear blurry, try `global` CFG-Renorm, decrease `cfg_renorm_min` or decrease `cfg_scale`.**
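
To make the list above concrete, here is a minimal sketch of an `inference_hyper` dictionary and of how `timestep_shift` and `cfg_interval` might interact. It rests on several assumptions that this issue does not confirm: the shift formula is the one commonly used in flow-matching samplers, the interval is treated as `low < t <= high` with `t` running from 1.0 (noise) to 0.0 (image), the timestep grid has `num_timesteps` points, and the `timestep_shift` and `cfg_image_scale` values are placeholders. The helper names are hypothetical.

```python
# Key names follow the list above; values are illustrative picks from the
# stated typical ranges, except where marked as assumed.
inference_hyper = {
    "cfg_text_scale": 4.0,        # typical range 4.0-8.0; 1.0 disables text guidance
    "cfg_image_scale": 1.5,       # typical range 1.0-2.0; 1.0 disables image guidance
    "cfg_interval": [0.4, 1.0],   # CFG is applied only while t lies in this interval
    "timestep_shift": 3.0,        # assumed value; the list gives no typical number
    "num_timesteps": 50,          # typical value from the list
    "cfg_renorm_min": 0.0,        # typical value; 1.0 disables renorm
    "cfg_renorm_type": "global",  # "global", "local", or "text_channel"
}


def shifted_timesteps(num_timesteps, shift):
    """Return a timestep grid from 1.0 down to 0.0 (needs num_timesteps >= 2).

    Assumed shift formula: t' = shift * t / (1 + (shift - 1) * t).
    A shift above 1 moves grid points toward t = 1.0, the start of denoising.
    """
    grid = [1.0 - i / (num_timesteps - 1) for i in range(num_timesteps)]
    return [shift * t / (1.0 + (shift - 1.0) * t) for t in grid]


def cfg_is_applied(t, cfg_interval):
    """Assumed convention: CFG runs when low < t <= high."""
    low, high = cfg_interval
    return low < t <= high


timesteps = shifted_timesteps(
    inference_hyper["num_timesteps"], inference_hyper["timestep_shift"]
)
cfg_steps = sum(cfg_is_applied(t, inference_hyper["cfg_interval"]) for t in timesteps)
print(f"CFG applied at {cfg_steps} of {len(timesteps)} timesteps")
```

Under these assumptions, raising `timestep_shift` places more grid points early in denoising, where layout is decided, and the lower bound of `cfg_interval` determines how many of the late steps skip CFG.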
