@byroot byroot commented Aug 29, 2025

While writing into ivars of other types is too complicated to realistically generate the ASM for it, we can at least provide the ivar index so as not to have to look up the shape tree every time.
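
For illustration only, here is a toy model of why a precomputed index helps. The `ToyShape` class and its methods are hypothetical and do not mirror CRuby's actual shape implementation; the sketch only contrasts walking a parent chain on every write with resolving the index once and reusing it.

```ruby
# Toy model: every name here is hypothetical, not CRuby's real data structure.
class ToyShape
  attr_reader :parent, :name, :index

  def initialize(parent = nil, name = nil)
    @parent = parent
    @name = name
    # The ivar added by this shape lands in the next free slot.
    @index = parent ? parent.depth : nil
    @children = {}
  end

  # Number of ivars described by this shape (the root describes none).
  def depth
    parent ? parent.depth + 1 : 0
  end

  # Transition to the child shape that adds ivar_name.
  def add(ivar_name)
    @children[ivar_name] ||= ToyShape.new(self, ivar_name)
  end

  # Slow path: walk towards the root until the ivar is found.
  # Returns [index, number_of_shapes_visited].
  def index_of(ivar_name)
    shape = self
    steps = 0
    while shape.parent
      steps += 1
      return [shape.index, steps] if shape.name == ivar_name
      shape = shape.parent
    end
    [nil, steps]
  end
end

root  = ToyShape.new
shape = root.add(:@a).add(:@b).add(:@c)

# Without a cached index, each write to @a repeats this walk.
slow_index, steps = shape.index_of(:@a)

# With a cached index (resolved once, e.g. at compile time),
# the write is a direct slot store.
cached_index = slow_index
fields = Array.new(shape.depth)
fields[cached_index] = 42
```

In this toy model the walk visits three shapes to find `@a`, while the cached path touches only the target slot.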

compare-ruby: ruby 3.5.0dev (2025-08-29T07:25:05Z yjit-set-gen-ivar dc555a48e7) +YJIT +PRISM [arm64-darwin24]
built-ruby: ruby 3.5.0dev (2025-08-29T07:25:05Z yjit-set-gen-ivar dc555a48e7) +YJIT +PRISM [arm64-darwin24]

|                         |compare-ruby|built-ruby|
|:------------------------|-----------:|---------:|
|vm_ivar_set_on_instance  |      1.664k|    1.672k|
|                         |           -|     1.00x|
|vm_ivar_set_on_generic   |     172.193|   257.789|
|                         |           -|     1.50x|
|vm_ivar_set_on_class     |     184.145|   377.723|
|                         |           -|     2.05x|
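
The benchmark scripts themselves are not shown here. As a rough guess at what the three cases exercise, the sketch below sets an ivar on a plain object (`T_OBJECT`), on an instance of a core-type subclass (whose ivars are stored as "generic" ivars), and on a class. The class names are made up for illustration.

```ruby
# Hypothetical receivers; the real benchmark definitions may differ.

class PlainObject               # instances are T_OBJECT
  def initialize
    @a = nil
  end

  def set(value)
    @a = value
  end
end

class ArrayWithIvar < Array     # instances are T_ARRAY, so @a is a generic ivar
  def set(value)
    @a = value
  end
end

class ClassLevel                # @a lives on the class object itself
  @a = nil

  def self.set(value)
    @a = value
  end
end

plain   = PlainObject.new
generic = ArrayWithIvar.new

plain.set(1)
generic.set(1)
ClassLevel.set(1)
```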

The vm_ivar_set_on_class case is still noticeably slower than vm_ivar_set_on_instance because rb_ivar_set_at performs ractor checks. We could bypass them with assume_single_ractor_mode to achieve ~2.2x the baseline.

It's also noticeably slower because the benchmark sets an immediate: YJIT skips triggering write barriers for those, but RB_OBJ_WRITE always does. If we skip the write barrier we end up at ~3.5x the baseline.
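
To make the difference concrete, here is a toy model of the two write paths. It is only a sketch: `toy_immediate?` and `ToyStore` are hypothetical, the fixnum range assumes a 64-bit build, and other immediates (static symbols, flonums) are deliberately left out.

```ruby
# Rough approximation of "immediate" values; not an exhaustive check.
FIXNUM_RANGE = (-(2**62))..(2**62 - 1) # assumes a 64-bit build

def toy_immediate?(value)
  case value
  when nil, true, false then true
  when Integer then FIXNUM_RANGE.cover?(value)
  else false
  end
end

# Counts how many write barriers a sequence of writes would trigger.
class ToyStore
  attr_reader :barriers

  def initialize(size, always_barrier:)
    @fields = Array.new(size)
    @always_barrier = always_barrier
    @barriers = 0
  end

  def write(index, value)
    # The "always" path models an unconditional barrier; the other path
    # skips it when the value is an immediate.
    @barriers += 1 if @always_barrier || !toy_immediate?(value)
    @fields[index] = value
  end
end

always   = ToyStore.new(1, always_barrier: true)
skipping = ToyStore.new(1, always_barrier: false)

3.times { |i| always.write(0, i) }
3.times { |i| skipping.write(0, i) }
skipping.write(0, "heap string")
```

With only immediates written, the unconditional path pays for three barriers and the skipping path for none; the heap string triggers a barrier on either path.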

compare-ruby: ruby 3.5.0dev (2025-08-29T07:25:05Z yjit-set-gen-ivar dc555a48e7) +YJIT +PRISM [arm64-darwin24]
built-ruby: ruby 3.5.0dev (2025-08-29T07:25:05Z yjit-set-gen-ivar dc555a48e7) +YJIT +PRISM [arm64-darwin24]

|                         |compare-ruby|built-ruby|
|:------------------------|-----------:|---------:|
|vm_ivar_set_on_instance  |      1.671k|    1.666k|
|                         |       1.00x|         -|
|vm_ivar_set_on_generic   |     172.174|   348.817|
|                         |           -|     2.03x|
|vm_ivar_set_on_class     |     176.337|   611.242|
|                         |           -|     3.47x|
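
The ratio rows can be rechecked from the figures above, assuming (as the ratios imply) that each one is built-ruby divided by compare-ruby. The `speedup` helper is just for this check:

```ruby
# Recompute the ratio rows from the table above.
def speedup(compare, built)
  (built / compare).round(2)
end

generic_ratio = speedup(172.174, 348.817)
class_ratio   = speedup(176.337, 611.242)

puts "vm_ivar_set_on_generic: #{generic_ratio}x"
puts "vm_ivar_set_on_class:   #{class_ratio}x"
```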

If the benchmark were modified to set a heap value rather than an immediate, the T_OBJECT case would be much closer to the other two:

compare-ruby: ruby 3.5.0dev (2025-08-29T07:25:05Z yjit-set-gen-ivar dc555a48e7) +YJIT +PRISM [arm64-darwin24]
built-ruby: ruby 3.5.0dev (2025-08-29T12:42:23Z yjit-set-gen-ivar ae7017f86d) +YJIT +PRISM [arm64-darwin24]

|                         |compare-ruby|built-ruby|
|:------------------------|-----------:|---------:|
|vm_ivar_set_on_instance  |     442.883|   442.278|
|                         |       1.00x|         -|
|vm_ivar_set_on_generic   |     165.336|   243.261|
|                         |           -|     1.47x|
|vm_ivar_set_on_class     |     145.395|   345.946|
|                         |           -|     2.38x|

@matzbot matzbot requested a review from a team August 29, 2025 12:47
@byroot byroot marked this pull request as draft August 29, 2025 13:17
@byroot byroot force-pushed the yjit-set-gen-ivar branch from 7d378a7 to 82c74de Compare August 29, 2025 13:37
launchable-app bot commented Aug 29, 2025

Tests Failed

✖️no tests failed ✔️62232 tests passed(68 flakes)

byroot commented Aug 29, 2025

Unfortunately, I don't understand where the bug comes from 😿