@monojenkins
Contributor

Fixes #14080

Consider the following example:

static void CommonCallTarget () { }

static void TailA () {
    tailcall CommonCallTarget ();
}

static void TailB () {
    tailcall CommonCallTarget ();
}

Since TailA and TailB both tailcall into CommonCallTarget, resolving the call at patch time is a bit tricky: a tailcall is a jump-like instruction, so the patching machinery doesn't know where it was called from. That's why we maintain a global hashtable, jump_target_hash, in which each jump site is registered to be patched. At patch time we know the target method (CommonCallTarget in the example), but since we don't know where we are coming from, we simply apply all patches registered for that target.

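This has worked for ages, so why did it sometimes crash on arm64?
When the patching happens, we check whether the displacement between the jump site and the target fits into the branch instruction's immediate field (a 26-bit word offset for the arm64 B instruction, roughly ±128 MiB). If not, which doesn't happen very often, we have to allocate a thunk: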

mono/mono/mini/mini-arm64.c

Lines 928 to 942 in 36296ce

static void
arm_patch_full (MonoCompile *cfg, MonoDomain *domain, guint8 *code, guint8 *target, int relocation)
{
    switch (relocation) {
    case MONO_R_ARM64_B:
        if (arm_is_bl_disp (code, target)) {
            arm_b (code, target);
        } else {
            gpointer thunk;
            thunk = create_thunk (cfg, domain, code, target);
            g_assert (arm_is_bl_disp (code, thunk));
            arm_b (code, thunk);
        }
        break;

So instead of jumping to the target directly, we branch to the thunk: a small trampoline that loads the full address of the target and then branches to it. The thunk lives close to the jump site, because during compilation we reserve some space after the generated code specifically for this scenario. To emit it there, however, we need the JitInfo of the jump site, and that is where the race originates. Let's say:

  • Thread A compiles TailA and then jumps into it. Thus one patch point is in jump_target_hash.
  • Now Thread B compiles TailB and registers its patch point, but has not yet registered its JitInfo.
  • Then Thread A continues, performs the tailcall into CommonCallTarget, and enters the patching machinery, which sees two patches. Now assume that, when applying the patch for TailB, the displacement is too large: the machinery tries to allocate a thunk but can't find the JitInfo it needs to emit the thunk. So it crashes, as reported in #14080 (condition 'ji' not met, with 'dynamic' and multithreading).

As far as I can tell this only affects ARM64, ARM and PPC.

Backport of #16589.

/cc @marek-safar @lewurm

@lewurm
Contributor

lewurm commented Sep 9, 2019

@monojenkins squash

@monojenkins
Contributor Author

Cannot squash because the following required status checks are not successful:

  • "Windows x64" state is "failure"

@marek-safar marek-safar merged commit 4a0b4f4 into mono:2019-08 Sep 9, 2019