Conversation

@lewurm (Contributor) commented on Aug 30, 2019

Fixes #14080

Consider the following example:

```csharp
static void CommonCallTarget () { }

static void TailA () {
    tailcall CommonCallTarget ();
}

static void TailB () {
    tailcall CommonCallTarget ();
}
```

Since `TailA` and `TailB` are tailcalling into `CommonCallTarget`, resolution at patch time is a bit tricky: a tailcall is a jump-like instruction, so the patching machinery won't know where it was called from. That's why we maintain a global hashtable, `jump_target_hash`, in which each jump site registers itself to be patched. At patch time we know the target method (`CommonCallTarget` in the example), but since we don't know where we are coming from, we simply apply all patches registered for that target.
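
To make the bookkeeping concrete, here is a minimal sketch of that register-then-apply-all pattern, assuming GLib-style containers; `JumpSite`, `register_jump_site`, `patch_all_jump_sites` and `patch_jump` are hypothetical names, not Mono's actual API:

```c
#include <glib.h>

typedef struct JumpSite {
	guint8 *code;            /* address of the jump instruction to patch */
	struct JumpSite *next;
} JumpSite;

/* target method -> list of jump sites waiting to be patched */
static GHashTable *jump_target_hash;

void patch_jump (guint8 *code, guint8 *target);  /* arch-specific patcher */

static void
register_jump_site (gpointer target_method, guint8 *code)
{
	JumpSite *site = g_new0 (JumpSite, 1);
	site->code = code;
	site->next = g_hash_table_lookup (jump_target_hash, target_method);
	g_hash_table_insert (jump_target_hash, target_method, site);
}

/* Once the target is compiled we don't know which jump site got us here,
 * so every site registered for that target gets patched. */
static void
patch_all_jump_sites (gpointer target_method, guint8 *target_code)
{
	JumpSite *site = g_hash_table_lookup (jump_target_hash, target_method);
	while (site) {
		JumpSite *next = site->next;
		patch_jump (site->code, target_code);
		g_free (site);
		site = next;
	}
	g_hash_table_remove (jump_target_hash, target_method);
}
```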

This has worked this way for ages, so why did it sometimes crash on arm64?
When the patching happens, we check whether the displacement between the jump site and the target fits into the branch instruction's displacement field (24 bit). If it does not, which happens rarely, we have to allocate a _thunk_:

mono/mini/mini-arm64.c, lines 928 to 942 at 36296ce (https://github.com/mono/mono/blob/36296ce291f8a7b19de3eccb7a32c7e4ed9df8f2/mono/mini/mini-arm64.c#L928-L942):

```c
static void
arm_patch_full (MonoCompile *cfg, MonoDomain *domain, guint8 *code, guint8 *target, int relocation)
{
	switch (relocation) {
	case MONO_R_ARM64_B:
		if (arm_is_bl_disp (code, target)) {
			arm_b (code, target);
		} else {
			gpointer thunk;

			thunk = create_thunk (cfg, domain, code, target);
			g_assert (arm_is_bl_disp (code, thunk));
			arm_b (code, thunk);
		}
		break;
```
So instead of jumping to the target directly, we branch to the thunk: a little trampoline that loads the full address of the target and then branches to it (a sketch of such a thunk follows the list below). The thunk lives close to the jump site, because during compilation we reserve some space after the generated code specifically for this scenario. To find that reserved space, however, we need the JitInfo of the jump site, and that is where the race originates. Let's say:

* Thread A compiles `TailA` and then jumps into it; one patch point is now in the `jump_target_hash`.
* Now Thread B compiles `TailB` and registers its patch point, but has _not_ yet registered its JitInfo.
* Then Thread A continues, does the tailcall into `CommonCallTarget` and enters the patching machinery, which sees two patches. Now assume that applying the patch for `TailB` requires a displacement that is too large: the machinery tries to allocate a thunk, but cannot find the JitInfo it needs to emit it, so it crashes as reported in #14080 (condition 'ji' not met, with 'dynamic' and multithreading).
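
As mentioned above, here is a minimal sketch of what such a thunk can look like on arm64, assuming the common load-literal-and-branch pattern; `emit_branch_thunk` is a hypothetical name and the code is not copied from Mono's `create_thunk`:

```c
#include <stdint.h>
#include <string.h>

/* Emit a 16-byte thunk at `code` (assumed 8-byte aligned) that performs an
 * absolute branch to `target` via the scratch register x16 (IP0). */
static uint8_t *
emit_branch_thunk (uint8_t *code, void *target)
{
	uint32_t ldr_x16 = 0x58000050;  /* ldr x16, .+8  (load the literal below) */
	uint32_t br_x16  = 0xd61f0200;  /* br  x16 */
	uint64_t addr    = (uint64_t)(uintptr_t)target;

	memcpy (code, &ldr_x16, 4);
	memcpy (code + 4, &br_x16, 4);
	memcpy (code + 8, &addr, 8);    /* 64-bit literal: the full target address */
	return code + 16;
}
```

Since the thunk itself is reached by a plain `b`, it must sit within the branch's displacement range of the jump site, which is exactly why it is carved out of the space reserved right after the method's code.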

As far as I can tell this only affects ARM64, ARM and PPC.
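
The fix follows from the commit title ("[mini] publish global patches after JitInfo has been added"): a thread must register its JitInfo before its jump sites become visible in the global `jump_target_hash`. A minimal sketch of that ordering; `generate_code`, `register_jit_info` and `publish_jump_sites` are hypothetical names:

```c
#include <glib.h>

typedef struct MonoMethod MonoMethod;

guint8 *generate_code (MonoMethod *method);
void    register_jit_info (MonoMethod *method, guint8 *code);
void    publish_jump_sites (MonoMethod *method, guint8 *code);

static void
compile_method_sketch (MonoMethod *method)
{
	guint8 *code = generate_code (method);

	/* Publish the JitInfo first: any thread that later patches one of our
	 * jump sites may need it to allocate a thunk in the space reserved
	 * after `code`. */
	register_jit_info (method, code);

	/* Only now may other threads see, and patch, our jump sites. */
	publish_jump_sites (method, code);
}
```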

@lewurm (Contributor, Author) commented on Aug 30, 2019:

@monojenkins build failed

@lewurm (Contributor, Author) commented on Aug 30, 2019:

@monojenkins squash

monojenkins merged commit 06e63b3 into mono:master on Aug 30, 2019
@marek-safar (Member) commented:

@monojenkins backport 2019-08

ManickaP pushed a commit to ManickaP/runtime that referenced this pull request on Jan 20, 2020:
[mini] publish global patches after JitInfo has been added (…#16589)
Commit migrated from mono/mono@06e63b3