Question on duplex attention (k-means) code #45

First, thank you for this amazing work!

I suspect that a level of indentation is missing at the following line of the code:

https://github.com/dorarad/gansformer/blob/3a9efa4545be25604b70560b7f491ec3633c14a3/pytorch_version/training/networks.py#L784

My reason is that, if the code is executed as written, the actual key values (to_tensor) never seem to be involved in computing the attention scores when k-means is enabled. If I am mistaken, would you mind explaining why line 787 replaces the original attention scores with the values computed there (where the embedding "to_centroids" seems to be initialized as a mapping of the queries)?
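
To make the concern concrete, here is a toy sketch in plain Python. It is not the code from networks.py: `toy_weights`, `dot_scores`, and the way `centroids` is derived from the queries are all hypothetical stand-ins, chosen only to mirror the pattern I think I am seeing (scores computed from the keys, then overwritten by scores computed against centroids that depend only on the queries).

```python
import math

def softmax(row):
    # Numerically stable softmax over one row of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def dot_scores(a, b):
    # Scaled dot product between every vector in a and every vector in b.
    scale = 1.0 / math.sqrt(len(a[0]))
    return [[scale * sum(x * y for x, y in zip(u, v)) for v in b] for u in a]

def toy_weights(queries, keys, kmeans):
    # Ordinary attention scores: these depend on the keys.
    att = dot_scores(queries, keys)
    if kmeans:
        # ASSUMPTION: centroids are a fixed mapping of the queries
        # (a hypothetical stand-in, not the repository's initialization).
        centroids = [[0.5 * x for x in q] for q in queries]
        # The overwrite in question: the key-based scores are discarded.
        att = dot_scores(queries, centroids)
    return [softmax(r) for r in att]

queries = [[1.0, 0.0], [0.0, 1.0]]
keys_a = [[1.0, 2.0], [3.0, 4.0]]
keys_b = [[-5.0, 0.5], [2.0, -1.0]]

# With the overwrite, two different key sets give identical weights.
same_with_kmeans = toy_weights(queries, keys_a, True) == toy_weights(queries, keys_b, True)
# Without k-means, the weights change when the keys change.
same_without = toy_weights(queries, keys_a, False) == toy_weights(queries, keys_b, False)
print(same_with_kmeans, same_without)
```

In this toy version the keys have no effect on the result once the k-means branch runs, which is the behavior I am asking about. If the real code intends something different, for example because the centroids are later updated from the keys, this sketch does not capture that.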
