Anole-HF cannot be converted to Anole-Torch accurately

Hi, 

I converted Anole-HF (https://huggingface.co/leloy/Anole-7b-v0.1-hf) to Anole-Torch using the [bin_to_pth.py](https://github.com/GAIR-NLP/anole/blob/main/training/bin_to_pth.py) script.

I observe that the `{q/k}_norm.weight`, `{q/k}_norm.bias`, and `self_attn.{q/k}_proj.weight` parameters do not match between the converted Anole-Torch checkpoint (from HF) and the original Anole-Torch checkpoint (Anole-7b-v0.1) shared as part of this repo.
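For reference, a mismatch like this can be surfaced with a small state-dict comparison. This is only a sketch: `mismatched_keys` is a hypothetical helper, and the toy keys below are placeholders, not the real checkpoint names.

```python
import torch


def mismatched_keys(sd_a, sd_b, atol=1e-6):
    """Return sorted keys that are missing from one dict or whose tensors differ."""
    bad = []
    for key in sorted(set(sd_a) | set(sd_b)):
        if key not in sd_a or key not in sd_b:
            bad.append(key)
            continue
        a, b = sd_a[key], sd_b[key]
        # Compare in float32 so bf16/fp16 checkpoints can be checked the same way.
        if a.shape != b.shape or not torch.allclose(a.float(), b.float(), atol=atol):
            bad.append(key)
    return bad


# Toy stand-ins for the two checkpoints (hypothetical key names).
reference = {"wq.weight": torch.eye(4), "wo.weight": torch.ones(4, 4)}
converted = {"wq.weight": torch.eye(4).flip(0), "wo.weight": torch.ones(4, 4)}

diff = mismatched_keys(reference, converted)
print(diff)
```

With real checkpoints, the two dicts would come from `torch.load(...)` on the original and the converted `.pth` files.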

I believe this is because [convert_chameleon_weights_to_hf.py](https://github.com/huggingface/transformers/blob/main/src/transformers/models/chameleon/convert_chameleon_weights_to_hf.py) applies some special operations (`permute` and `t()`) to the Torch checkpoint to produce the HF checkpoint, and `bin_to_pth.py` does not revert them.
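To illustrate what reverting could look like for the `q_proj`/`k_proj` weights, here is a minimal sketch. It assumes the forward `permute` has the Llama-style form (`view(n_heads, head_dim // 2, 2, dim).transpose(1, 2)`), which I have not checked line by line against the Chameleon script; `unpermute` is a hypothetical name, and the sizes are toy values.

```python
import torch


def permute(w, n_heads):
    # Assumed forward direction (original layout -> HF layout).
    dim1, dim2 = w.shape
    return w.view(n_heads, dim1 // n_heads // 2, 2, dim2).transpose(1, 2).reshape(dim1, dim2)


def unpermute(w, n_heads):
    # Hypothetical inverse (HF layout -> original layout): swap the same two axes back.
    dim1, dim2 = w.shape
    return w.reshape(n_heads, 2, dim1 // n_heads // 2, dim2).transpose(1, 2).reshape(dim1, dim2)


n_heads, head_dim, hidden = 4, 8, 32
# arange gives every row distinct values, so any row reordering is detectable.
w = torch.arange(n_heads * head_dim * hidden, dtype=torch.float32).reshape(n_heads * head_dim, hidden)

w_hf = permute(w, n_heads)
w_back = unpermute(w_hf, n_heads)

changed = not torch.equal(w, w_hf)        # the forward permute reorders rows
roundtrip_ok = torch.equal(w, w_back)     # the inverse restores the original layout
print(changed, roundtrip_ok)
```

The `{q/k}_norm` parameters would presumably need an analogous inverse of whatever reshaping the forward script applies to them; that part is not shown here.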

Could you fix `bin_to_pth.py` so that the conversion is accurate?
