Add exportable coreml codebook quantization op #2443

jerryzh168 · 2025-06-26T03:31:12Z

Summary:
Added CoreML codebook quant (Palettization): https://apple.github.io/coremltools/docs-guides/source/opt-palettization-overview.html#palettization-overview

* supports group_size `per_grouped_channel`
* doesn't support vector quantization yet, but it will be easy to turn on if needed
* ops added: choose_qparams_and_quantize_codebook, dequantize_codebook
* also enabled support for export; these two ops will be preserved after export
* added CodebookWeightOnlyConfig(dtype, group_size), which can be used with quantize_ to quantize the Tensor
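For readers unfamiliar with palettization, the idea behind the two ops can be sketched in plain Python. This is a conceptual illustration only, not the code in this PR: the helper names (`kmeans_1d`, `quantize_codebook_sketch`, `dequantize_codebook_sketch`) are hypothetical, the weight is a list of rows rather than a Tensor, and it assumes that `group_size` means "this many consecutive output channels share one codebook", fitted with scalar k-means.

```python
def kmeans_1d(values, k, iters=20):
    """Lloyd's k-means on scalars with deterministic quantile initialization."""
    ordered = sorted(values)
    n = len(ordered)
    centroids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            buckets[nearest].append(v)
        # Keep the old centroid when a bucket ends up empty.
        centroids = [
            sum(b) / len(b) if b else centroids[j] for j, b in enumerate(buckets)
        ]
    return centroids


def quantize_codebook_sketch(weight, nbits, group_size):
    """Return (codes, codebooks); each run of `group_size` rows shares a codebook."""
    assert len(weight) % group_size == 0
    codes, codebooks = [], []
    for start in range(0, len(weight), group_size):
        rows = weight[start:start + group_size]
        flat = [v for row in rows for v in row]
        cb = kmeans_1d(flat, 2 ** nbits)  # 2**nbits entries per codebook
        codebooks.append(cb)
        for row in rows:
            # Each weight is replaced by the index of its nearest codebook entry.
            codes.append(
                [min(range(len(cb)), key=lambda c: abs(v - cb[c])) for v in row]
            )
    return codes, codebooks


def dequantize_codebook_sketch(codes, codebooks, group_size):
    """Look every code up in the codebook of the group its row belongs to."""
    return [
        [codebooks[r // group_size][c] for c in row] for r, row in enumerate(codes)
    ]


weight = [
    [0.0, 0.1, 1.0, 1.1],
    [0.0, 0.1, 1.0, 1.1],
    [5.0, 5.2, 9.0, 9.2],
    [5.0, 5.2, 9.0, 9.2],
]
codes, codebooks = quantize_codebook_sketch(weight, nbits=1, group_size=2)
restored = dequantize_codebook_sketch(codes, codebooks, group_size=2)
```

With 1-bit codes each group of two rows is stored as two codebook values plus one bit per weight, and `restored` approximates `weight` to within the spread of each cluster.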

Test Plan:
python test/prototype/test_coreml_codebook.py

Reviewers:

Subscribers:

Tasks:

Tags:

pytorch-bot · 2025-06-26T03:31:16Z

✅ No Failures

As of commit 8fb10b2 with merge base 5a50667:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

test/prototype/test_coreml_codebook.py

torchao/prototype/quantization/coreml_codebook/api.py

torchao/prototype/quantization/coreml_codebook/codebook_ops.py

torchao/prototype/quantization/coreml_codebook/codebook_quantized_tensor.py

metascroy · 2025-06-26T20:27:05Z

torchao/prototype/quantization/coreml_codebook/codebook_quantized_tensor.py

+            codes = codes.to(torch.int32)
+        return dequantize_codebook(
+            codes,
+            self.codebook,


Should we have a granularity?

We'll use block_size for now; it can be extended to other granularities.
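To make the block_size reply concrete, here is a small sketch of how a block_size could determine which weights share a codebook. It assumes the common convention that block_size has one entry per tensor dimension and that all elements inside one block share quantization parameters; whether the ops in this PR follow exactly this convention is not confirmed by the thread, and the helper names `num_blocks` and `block_index` are hypothetical.

```python
from math import prod


def num_blocks(shape, block_size):
    """Number of blocks (and hence codebooks) when `shape` is tiled by `block_size`."""
    assert len(shape) == len(block_size)
    assert all(s % b == 0 for s, b in zip(shape, block_size))
    return prod(s // b for s, b in zip(shape, block_size))


def block_index(position, block_size):
    """Index of the block that contains the element at `position`."""
    return tuple(p // b for p, b in zip(position, block_size))


shape = (8, 16)                # (out_channels, in_features)
grouped_channel = (2, 16)      # two full rows per block, i.e. group_size = 2
per_tensor = (8, 16)           # a single block covering the whole weight

blocks_grouped = num_blocks(shape, grouped_channel)
blocks_tensor = num_blocks(shape, per_tensor)
owner = block_index((5, 3), grouped_channel)
```

Under this reading, other granularities are just different block_size tuples, which is one way the extension mentioned in the reply could work.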


facebook-github-bot added the CLA Signed label Jun 26, 2025