[2019-08] [mini] publish global patches after JitInfo has been added #16730

monojenkins · 2019-09-09T09:10:38Z

Consider the following example:

static void CommonCallTarget () { }

static void TailA () {
    tailcall CommonCallTarget ();
}

static void TailB () {
    tailcall CommonCallTarget ();
}

Since TailA and TailB both tailcall into CommonCallTarget, resolution at patch-time is a bit tricky: a tailcall is a jump-like instruction, so the patching machinery cannot know where it was called from. That's why we maintain a global hashtable jump_target_hash in which each jump-site is registered for patching. At patch-time we know the target method (CommonCallTarget in the example), but since we don't know where we are coming from, we just apply all patches for that target.
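To make the "apply all patches for that target" behaviour concrete, here is a minimal sketch in C. It is not the runtime's code: `JumpTargetEntry`, `register_jump_site` and `patch_all_sites` are hypothetical names, and a fixed-size array stands in for one entry of the real jump_target_hash.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SITES 16

/* Hypothetical stand-in for one jump_target_hash entry: a target method
 * plus every jump-site that is waiting to be patched to it. */
typedef struct {
    const char *target;        /* illustrative key: the target method's name */
    void **sites[MAX_SITES];   /* slots that must end up pointing at the target */
    size_t n_sites;
} JumpTargetEntry;

/* Called when a method containing a tailcall to `e->target` is compiled. */
static void register_jump_site(JumpTargetEntry *e, void **site)
{
    assert(e->n_sites < MAX_SITES);
    e->sites[e->n_sites++] = site;
}

/* At patch-time the caller is unknown, so every registered site is patched.
 * Returns the number of sites that were rewritten. */
static size_t patch_all_sites(JumpTargetEntry *e, void *target_addr)
{
    size_t patched = 0;
    for (size_t i = 0; i < e->n_sites; i++) {
        *e->sites[i] = target_addr;
        patched++;
    }
    return patched;
}
```

Registering one slot for TailA and one for TailB and then patching for CommonCallTarget rewrites both slots, regardless of which caller triggered the patch.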

This has worked for ages, so why did it sometimes crash on arm64?
When the patching happens, we check whether the displacement between jump-site and target fits into the branch instruction (24bit). If it does not, which is rare, we have to allocate a thunk:

mono/mono/mini/mini-arm64.c

Lines 928 to 942 in 36296ce

    
    static void
    arm_patch_full (MonoCompile *cfg, MonoDomain *domain, guint8 *code, guint8 *target, int relocation)
    {
        switch (relocation) {
        case MONO_R_ARM64_B:
            if (arm_is_bl_disp (code, target)) {
                arm_b (code, target);
            } else {
                gpointer thunk;

                thunk = create_thunk (cfg, domain, code, target);
                g_assert (arm_is_bl_disp (code, thunk));
                arm_b (code, thunk);
            }
            break;
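The decision made in arm_patch_full above can be sketched on its own. This is a rough illustration under stated assumptions, not the runtime's arm_is_bl_disp: `fits_branch_disp` and `choose_branch_dest` are hypothetical helpers, addresses are plain integers, and the branch is assumed to encode a signed offset counted in 4-byte words. The width of the offset field is passed as a parameter because it depends on the architecture.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if a branch located at `code` can reach `target` directly,
 * assuming the instruction holds a signed offset of `imm_bits` bits that
 * is counted in 4-byte instruction words (an assumption of this sketch). */
static int fits_branch_disp(uint64_t code, uint64_t target, int imm_bits)
{
    int64_t disp = (int64_t)(target - code);
    if (disp & 3)
        return 0;                           /* not word aligned */
    int64_t words = disp / 4;
    int64_t limit = (int64_t)1 << (imm_bits - 1);
    return words >= -limit && words < limit;
}

/* Mirrors the if/else in the snippet: branch to the target when it is in
 * range, otherwise to a thunk that must itself be in range. */
static uint64_t choose_branch_dest(uint64_t code, uint64_t target,
                                   uint64_t thunk, int imm_bits)
{
    if (fits_branch_disp(code, target, imm_bits))
        return target;
    assert(fits_branch_disp(code, thunk, imm_bits));
    return thunk;
}
```

With a 24-bit field, as quoted in the description, a target a few kilobytes away is reached directly, while a target 64 MB away is routed through the nearby thunk.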

So instead of jumping to the target directly, we branch to the thunk, a small trampoline that loads the full address of the target and then branches to it. The thunk lives close to the jump-site, because during compilation we reserve some space after the generated code specifically for this scenario. To emit the thunk there, however, we need the JitInfo of the jump-site, and that is where the race originates. Let's say:

* Thread A compiles TailA and then jumps into it. Thus one patch point is in the jump_target_hash.
* Now Thread B compiles TailB and registers the patch point, but has not yet registered its JitInfo.
* Then Thread A continues, does the tailcall into CommonCallTarget, and enters the patching machinery, which sees two patches.

Now assume that when applying the patch for TailB the displacement is too large. The patching code then tries to allocate a thunk but cannot find the JitInfo it needs to emit it. So it crashes as reported in "condition 'ji' not met, with 'dynamic' and multithreading" (#14080).
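The ordering problem in the scenario above can be replayed deterministically, without real threads. The sketch below is a simplified model, not the runtime's data structures: `Runtime`, `add_jit_info`, `publish_patch` and `apply_all_patches` are hypothetical names, methods are identified by name, and every patch is assumed to need a thunk. Where the real runtime would hit the failed 'ji' condition, the model returns false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_METHODS 8

/* Hypothetical model: methods with a registered JitInfo, and methods whose
 * jump-site has been published in the global patch table. */
typedef struct {
    const char *jit_info[MAX_METHODS];
    size_t n_jit_info;
    const char *patch_site[MAX_METHODS];
    size_t n_patch_sites;
} Runtime;

static void add_jit_info(Runtime *rt, const char *method)
{
    assert(rt->n_jit_info < MAX_METHODS);
    rt->jit_info[rt->n_jit_info++] = method;
}

static void publish_patch(Runtime *rt, const char *method)
{
    assert(rt->n_patch_sites < MAX_METHODS);
    rt->patch_site[rt->n_patch_sites++] = method;
}

static bool has_jit_info(const Runtime *rt, const char *method)
{
    for (size_t i = 0; i < rt->n_jit_info; i++)
        if (strcmp(rt->jit_info[i], method) == 0)
            return true;
    return false;
}

/* Models patch-time with every site needing a thunk: emitting a thunk
 * requires the JitInfo of the method that contains the jump-site. */
static bool apply_all_patches(const Runtime *rt)
{
    for (size_t i = 0; i < rt->n_patch_sites; i++)
        if (!has_jit_info(rt, rt->patch_site[i]))
            return false;   /* the real runtime would crash here */
    return true;
}
```

Publishing the patch for TailB before its JitInfo leaves a window in which patching fails; adding the JitInfo first and publishing the patch afterwards, as the title of this PR describes, closes that window.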

As far as I can tell this only affects ARM64, ARM and PPC.

Backport of #16589.

/cc @marek-safar @lewurm

Fixes mono#14080

lewurm · 2019-09-09T13:10:49Z

@monojenkins squash

monojenkins · 2019-09-09T15:56:12Z

Cannot squash because the following required status checks are not successful:

"Windows x64" state is "failure"

monojenkins requested review from SamMonoRT, lambdageek, lewurm and vargaz as code owners September 9, 2019 09:10

monojenkins added this to the 2019-08 (6.6.xx) milestone Sep 9, 2019

lewurm approved these changes Sep 9, 2019


marek-safar merged commit 4a0b4f4 into mono:2019-08 Sep 9, 2019
