
wbruna (Contributor) commented on Aug 14, 2025

The CLIP weights need to be converted to f32 for textual inversions (fbd42b6, for #601), but that increases the amount of allocated VRAM even when embeddings aren't being used.

On a typical SDXL render on Vulkan, this change reduces peak VRAM usage by around 190 MB.
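
A minimal sketch of the gating idea, assuming hypothetical names (`Weight`, `plan_clip_alloc`, `use_embeddings`); the PR's actual change lives in the project's CLIP loading code:

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

enum class DType { F16, F32 };

// Hypothetical weight descriptor, standing in for the loader's tensors.
struct Weight {
    DType dtype;
    std::size_t n_elems;
    std::size_t bytes() const { return n_elems * (dtype == DType::F32 ? 4 : 2); }
};

// Upcast to f32 only when textual-inversion embeddings are in play;
// otherwise keep the checkpoint's native precision and skip allocating
// the widened copies.
std::size_t plan_clip_alloc(std::vector<Weight>& weights, bool use_embeddings) {
    std::size_t total = 0;
    for (auto& w : weights) {
        if (use_embeddings && w.dtype == DType::F16) {
            w.dtype = DType::F32;  // embedding lookups need f32 math
        }
        total += w.bytes();
    }
    return total;
}

int main() {
    // Rough stand-ins for two CLIP text-encoder tensors (token embedding
    // table and one projection); real models hold many more.
    std::vector<Weight> clip = {{DType::F16, 49408u * 768u},
                                {DType::F16, 768u * 768u}};
    std::printf("no embeddings:   %zu bytes\n", plan_clip_alloc(clip, false));
    std::printf("with embeddings: %zu bytes\n", plan_clip_alloc(clip, true));
    return 0;
}
```

The saving is just the difference between the f16 and f32 footprints of the tensors that would otherwise be converted unconditionally.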

I'm including a refactor of the clip_skip handling, since this change would otherwise conflict with it. Please let me know if I should submit it separately instead.

Commit message (clip_skip refactor):

All handlers are constructed with the default clip_skip value, and it is always set at inference time, so there isn't much point in keeping it as a persistent attribute. Instead, just propagate the parameter value down from get_learned_condition*.
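
A rough illustration of that direction, assuming hypothetical names (`ClipTextEncoder`, `resolve_hidden_layer`); the repository's real signatures differ:

```cpp
#include <string>

// Hypothetical encoder: clip_skip is no longer a member set at
// construction and rewritten before each run.
struct ClipTextEncoder {
    int n_layers = 12;

    // clip_skip == 1 takes the last layer's output, 2 skips one, etc.
    int resolve_hidden_layer(int clip_skip) const {
        return n_layers - clip_skip;
    }
};

// get_learned_condition* simply forwards the per-call value down.
int get_learned_condition(const ClipTextEncoder& enc,
                          const std::string& prompt, int clip_skip = 1) {
    (void)prompt;  // prompt handling elided in this sketch
    return enc.resolve_hidden_layer(clip_skip);
}
```

Passing clip_skip per call also avoids stale state when consecutive renders use different clip_skip settings.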
Commit message (f32 conversion):

The CLIP weights need to be converted to f32 for textual inversions (fbd42b6), but that increases the amount of allocated VRAM even when embeddings aren't being used.
wbruna force-pushed the clip_memory_usage_embeddings branch from 42f2fa6 to 489069c on September 13, 2025.