Add a desugaring step in the base linker. #5096

sjrd · 2024-12-29T22:17:52Z

Previously, the emitters and the optimizer all had to perform the same desugaring for LinkTimeProperty nodes. Instead, we now perform the desugaring in the BaseLinker, after the reachability analysis.

The reachability analysis records whether each method needs desugaring or not. For those that do, we cache the resulting desugaring. Methods that do not require desugaring are not processed, and so incur no additional cost.
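As a rough illustration of the idea described above (not the actual linker code), the following sketch uses a toy IR and a hypothetical `Method` record carrying the flag that the analysis would compute. `Tree`, `Method`, `Desugarer` and `transformCount` are all made-up names for this example; the real implementation works on `org.scalajs.ir.Trees` and keys its cache differently.

```scala
import scala.collection.mutable

object DesugarSketch {
  // Toy IR for illustration only; NOT the real org.scalajs.ir.Trees API.
  sealed trait Tree
  final case class IntLiteral(value: Int) extends Tree
  final case class LinkTimeProperty(name: String) extends Tree
  final case class Block(stats: List[Tree]) extends Tree

  // Hypothetical method record. `needsDesugaring` stands in for the flag
  // recorded by the reachability analysis.
  final case class Method(name: String, body: Tree, needsDesugaring: Boolean)

  final class Desugarer(linkTimeProps: Map[String, Int]) {
    // Cache keyed by the whole (immutable) method, so a changed body misses.
    private val cache = mutable.Map.empty[Method, Method]

    // Counts actual transformations, only to make the caching observable.
    var transformCount: Int = 0

    private def transform(tree: Tree): Tree = tree match {
      case LinkTimeProperty(name) => IntLiteral(linkTimeProps(name))
      case Block(stats)           => Block(stats.map(transform))
      case other                  => other
    }

    def desugar(method: Method): Method = {
      if (!method.needsDesugaring) {
        method // untouched: no traversal, no allocation
      } else {
        cache.getOrElseUpdate(method, {
          transformCount += 1
          method.copy(body = transform(method.body), needsDesugaring = false)
        })
      }
    }
  }
}
```

Methods whose flag is unset are returned as the same instance, which is what keeps the cost at zero for the common case.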

The machinery is heavy. It definitely outweighs the benefits in terms of duplication for LinkTimeProperty alone. However, the same machinery will be used to desugar NewLambda nodes. This commit serves as a stepping stone in that direction.

This PR is meant as a stepping stone towards #5003, so that it is easier to review piecemeal. However, it makes no sense to actually merge this PR before we have consensus that this way of desugaring at BaseLinker time is the correct way to deal with NewLambda in #5003.

tanishiking

Desugaring happens before the optimizer 👍

scala-js/linker/shared/src/main/scala/org/scalajs/linker/frontend/LinkerFrontendImpl.scala

Lines 67 to 74 in 98e0875

    
    val linkResult = logger.timeFuture("Linker") {
      linker.link(irFiles, moduleInitializers, logger,
          preOptimizerRequirements)
    }
    val optimizedResult = optOptimizer.fold(linkResult) { optimizer =>
      linkResult.flatMap(optimize(_, symbolRequirements, optimizer, logger))
    }

LGTM from my side

gzm0 · 2025-01-03T14:34:19Z

I think formalizing desugaring as a phase / step makes sense. Essentially, IIUC, it would host transformations / lowerings that are necessary for later phases to work properly, so unlike the optimizer, they are required for correctness.

What I'm wondering is if it makes sense to include these in the BaseLinker and not as an additional "top-level" step in the LinkerFrontendImpl. From a cursory look at this PR, it seems that doing so would require:

An additional pass over all LinkedClasses
Forwarding the "needs desugaring flags"

It feels that both of these would be acceptable to do.

What we would get from this:

Clearer separation between linking and desugaring.
Maintain proximity of Linker / Refiner.
Ability to ClassDef / IRCheck in-between linking and desugaring.
Ability to measure desugaring execution time separately.

As a natural follow-up (to my eyes at least), I wonder if it makes sense / is feasible to move method / class synthesis to such a phase as well. From what I can see, the main challenge is probably how to encode this in LinkedClass. But maybe a more heterogeneous interface between phases is desirable.

sjrd · 2025-01-03T15:25:56Z

IIUC, what you're proposing is:

Base linker: synthesizes classes and methods, but does not transform bodies
Desugar: rewrites LinkTimeProperty and NewLambda nodes inside bodies
Optimizer (optional)
Refiner (if optimizer was used)
Emitter
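The phase ordering listed above can be sketched as plain function composition. This is only a toy model: `LinkingUnit` and the `Phase` alias are hypothetical names for this example, each phase merely records that it ran, and the real frontend chains `Future`s rather than pure functions.

```scala
object PipelineSketch {
  // Hypothetical stand-in for the data passed between phases; it only
  // records which phases have run, in order.
  final case class LinkingUnit(phasesRun: List[String])

  type Phase = LinkingUnit => LinkingUnit

  private def phase(name: String): Phase =
    unit => unit.copy(phasesRun = unit.phasesRun :+ name)

  val baseLinker: Phase = phase("BaseLinker")
  val desugar: Phase = phase("Desugar")
  val optimizer: Phase = phase("Optimizer")
  val refiner: Phase = phase("Refiner")
  val emitter: Phase = phase("Emitter")

  // Desugar always runs; the refiner runs only if the optimizer was used.
  def pipeline(useOptimizer: Boolean): Phase = {
    val frontend = baseLinker.andThen(desugar)
    val optimized =
      if (useOptimizer) frontend.andThen(optimizer).andThen(refiner)
      else frontend
    optimized.andThen(emitter)
  }
}
```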

That makes sense. Forwarding the needsDesugaring flag is going to be annoying, though (probably more than it should be). A LinkedClass contains simple Lists of MemberDefs, without any additional metadata. Should we introduce an indirection such as Linked[T] containing a T and needsDesugaring: Boolean? We used to have something like that a long time ago, but I don't remember why. Otherwise we can reserve a flag inside the members' optimizerHints bit set, although that's not very elegant.
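To make the two options concrete, here is a minimal sketch of what each could look like. Everything in it is an assumption made for illustration: `MethodDef` is a stand-in, `Hints` is not the real optimizer hints class, and the bit position is arbitrary.

```scala
object LinkedWrapperSketch {
  // Stand-in for a member definition; not the real IR class.
  final case class MethodDef(name: String)

  // Option A: an explicit wrapper carrying the metadata next to the member.
  final case class Linked[+T](value: T, needsDesugaring: Boolean)

  // Option B: a reserved bit inside a hints bit set (hypothetical layout).
  final case class Hints(bits: Int) {
    def needsDesugaring: Boolean = (bits & Hints.NeedsDesugaringMask) != 0

    def withNeedsDesugaring(value: Boolean): Hints =
      if (value) Hints(bits | Hints.NeedsDesugaringMask)
      else Hints(bits & ~Hints.NeedsDesugaringMask)
  }

  object Hints {
    // Arbitrary bit chosen for this example only.
    final val NeedsDesugaringMask = 1 << 30
    val empty: Hints = Hints(0)
  }
}
```

Option A changes the type of every member list in a `LinkedClass`, while option B keeps the types intact but hides linker-internal state in a field meant for something else.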

In order for Desugar to rewrite NewLambda nodes, it will also need to receive a list of the synthetic classes. So that's one more thing to transfer between those phases.

As a natural follow-up (to my eyes at least), I wonder if it makes sense / is feasible to move method / class synthesis to such a phase as well. From what I can see, the main challenge is probably how to encode this in LinkedClass. But maybe a more heterogeneous interface between phases is desirable.

I'm not so sure about that. The existing method synthesis is deeply linked to the reachability analysis. Arguably, it's what makes the job of the base linker actually a linker (and not just a dead code elimination pass, like the Refiner). If we move it out of the BaseLinker, then we have to pass a substantial subset of the Analysis to the second phase, and at the same time, BaseLinker wouldn't be doing much anymore. In fact, arguably it might as well return ClassDefs and an Analysis, instead of LinkedClasses.

Writing the above actually makes me wonder: perhaps BaseLinker is already the phase that performs the transformations required for correctness. It's not possible to emit code for a bunch of ClassDefs without passing them through the BaseLinker, after all.

gzm0 · 2025-01-03T18:07:40Z

IIUC, what you're proposing is

Yes.

We used to have something like that a long time ago, but I don't remember why

For versions. But we ended up re-using the method's version field: #4772

Forwarding the needsDesugaring flag is going to be annoying, though (probably more than it should be).

my thoughts:

we could be more aggressive: don't determine whether something needs desugaring in analysis. solely rely on caching (we have done something similar with infos by moving them from the compiler to the linker).
we could introduce a transient-like construct for this.
we could re-introduce the LinkedMethod wrapper for this.

In order for Desugar to rewrite NewLambda nodes, it will also need to receive a list of the synthetic classes. So that's one more thing to transfer between those phases.

Does it? IIUC, the mapping from descriptor to class name is deterministic:

scala-js/linker/shared/src/main/scala/org/scalajs/linker/frontend/SyntheticClassKind.scala

Line 39 in aa39914

val className: ClassName = Lambda.makeClassName(descriptor)
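The point can be illustrated with a toy version of such a mapping. The `Descriptor` fields and the naming scheme below are invented for this example and do not reflect what `Lambda.makeClassName` actually does; the only property being shown is that a pure function of the descriptor lets two phases agree on a name without exchanging a list.

```scala
object LambdaNameSketch {
  // Hypothetical descriptor; the real one has different fields.
  final case class Descriptor(
      superClass: String, interfaces: List[String], methodName: String)

  // Pure function of the descriptor. Length-prefixing each part keeps
  // distinct descriptors from colliding after concatenation.
  def makeClassName(descriptor: Descriptor): String = {
    val parts =
      descriptor.superClass :: descriptor.interfaces ::: List(descriptor.methodName)
    "Lambda_" + parts.map(p => p.length.toString + "_" + p).mkString("_")
  }
}
```

With this property, the base linker could synthesize the class and a later desugar phase could recompute the same name from the `NewLambda` node alone.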

I'm not so sure about that. The existing method synthesis is deeply linked to the reachability analysis. Arguably, it's what makes the job of the base linker actually a linker (and not just a dead code elimination pass, like the Refiner). If we move it out of the BaseLinker, then we have to pass a substantial subset of the Analysis to the second phase, and at the same time, BaseLinker wouldn't be doing much anymore. In fact, arguably it might as well return ClassDefs and an Analysis, instead of LinkedClasses.

Ah, sorry, I wasn't clear. Determining what needs to be synthesized for sure must be part of the linker.

However, actually synthesizing these things (what the method synthesizer does ATM) could (IIUC) happen later.

I have looked into this a bit more specifically and it would change MethodSynthesizer quite a bit. But my hunch is this might be for the better: right now, MethodSynthesizer looks up targets via IR loader. This could (but should not) trigger an actual IR load.

It is not entirely clear to me how we would forward synthesis information from the base linker. Something-something transient probably?

perhaps BaseLinker is already the phase that performs the transformations required for correctness.

I think that depends on how we define correctness.

The base linker is what is required for some IR to be "complete" and correct in that sense. Most notably, it is not possible to run IR checking before the base linker.
The desugar transformation is what is required for downstream phases to work correctly. However, it is possible to (and we probably should) run the IR checker before the desugar transformation.
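A minimal sketch of what phase-aware checking could look like, assuming a checker that accepts sugar nodes before desugaring and rejects them afterwards. The toy `Tree` types, the `Phase` values and `check` are hypothetical and are not the real IR checker API.

```scala
object IRCheckSketch {
  // Toy IR for illustration only.
  sealed trait Tree
  final case class IntLiteral(value: Int) extends Tree
  final case class LinkTimeProperty(name: String) extends Tree
  final case class Block(stats: List[Tree]) extends Tree

  sealed trait Phase
  case object BeforeDesugar extends Phase
  case object AfterDesugar extends Phase

  /** Returns error messages; empty if `tree` is valid at `phase`. */
  def check(tree: Tree, phase: Phase): List[String] = tree match {
    case LinkTimeProperty(name) if phase == AfterDesugar =>
      List("LinkTimeProperty(" + name + ") must not survive desugaring")
    case Block(stats) =>
      stats.flatMap(check(_, phase))
    case _ =>
      Nil
  }
}
```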

sjrd · 2025-01-04T17:34:28Z

I did a PoC of the separate phase in #5101. That was while sick and on the train, so there's probably a lot to improve there.

Being able to IR check before and after desugaring is definitely a nice perk of that approach. Also currently the desugaring itself (without the additional IR check pass) took less than 20 ms on my slow laptop, so it's basically free. That may change if we process NewLambda nodes this way, as many more methods will need desugaring in that case.

Previously, the emitters and the optimizer all had to perform the same desugaring for `LinkTimeProperty` nodes. Instead, we now perform the desugaring in the `BaseLinker`, after the reachability analysis. The reachability analysis records whether each method needs desugaring or not. Methods that do not require desugaring are not processed, and so incur no additional cost. Since very few methods need desugaring, we do not cache the results. The machinery is heavy. It definitely outweighs the benefits in terms of duplication for `LinkTimeProperty` alone. However, the same machinery will be used to desugar `NewLambda` nodes. This commit serves as a stepping stone in that direction.

sjrd · 2025-02-06T15:48:40Z

Superseded by #5101.

sjrd requested a review from gzm0 December 29, 2024 22:17

sjrd mentioned this pull request Dec 29, 2024

Introduce NewLambda to synthesize instances of SAM types. #5003

Merged

tanishiking approved these changes Dec 30, 2024

sjrd force-pushed the ir-desugaring branch from d7823df to db9f5a3 Compare December 30, 2024 09:54

gzm0 mentioned this pull request Jan 3, 2025

LinkTimeIf - on top of desugaring in BaseLinker #5100

Closed

sjrd mentioned this pull request Jan 4, 2025

Add a desugaring pass between the base linker and the optimizer. #5101

Merged

sjrd force-pushed the ir-desugaring branch from db9f5a3 to 97bedfd Compare January 10, 2025 10:01

sjrd closed this Feb 6, 2025

sjrd deleted the ir-desugaring branch February 6, 2025 15:48
