Add a desugaring step in the base linker. #5096

Closed
wants to merge 1 commit into from

sjrd
Member

@sjrd sjrd commented Dec 29, 2024

Previously, the emitters and the optimizer all had to perform the same desugaring for LinkTimeProperty nodes. Instead, we now perform the desugaring in the BaseLinker, after the reachability analysis.

The reachability analysis records whether each method needs desugaring or not. For those that do, we cache the resulting desugaring. Methods that do not require desugaring are not processed, and so incur no additional cost.
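The flag-driven scheme described above can be illustrated with a minimal sketch. It uses a toy IR, not the real Scala.js IR: `Tree`, `Method`, `desugarTree`, `desugarMethod` and the property name `example/level` are all hypothetical stand-ins chosen for illustration.

```scala
object DesugarSketch {
  // Toy IR: only the node kinds needed for the illustration (assumption).
  sealed trait Tree
  final case class IntLiteral(value: Int) extends Tree
  final case class LinkTimeProperty(name: String) extends Tree
  final case class Block(stats: List[Tree]) extends Tree

  // Hypothetical method representation carrying the flag that the
  // reachability analysis would record.
  final case class Method(name: String, body: Tree, needsDesugaring: Boolean)

  // Replaces every LinkTimeProperty node by the literal configured for
  // this link run.
  def desugarTree(tree: Tree, props: Map[String, Int]): Tree = tree match {
    case LinkTimeProperty(name) => IntLiteral(props(name))
    case Block(stats)           => Block(stats.map(desugarTree(_, props)))
    case lit: IntLiteral        => lit
  }

  // Methods that are not flagged are returned untouched (same instance),
  // so they incur no traversal cost.
  def desugarMethod(m: Method, props: Map[String, Int]): Method =
    if (!m.needsDesugaring) m
    else m.copy(body = desugarTree(m.body, props), needsDesugaring = false)
}
```

Only flagged methods are traversed; an unflagged method comes back as the very same object.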

The machinery is heavy. It definitely outweighs the benefits in terms of duplication for LinkTimeProperty alone. However, the same machinery will be used to desugar NewLambda nodes. This commit serves as a stepping stone in that direction.


This PR is meant as a stepping stone towards #5003, so that it is easier to review piecemeal. However, it makes no sense to actually merge this PR before we have consensus that this way of desugaring at BaseLinker time is the correct way to deal with NewLambda in #5003.

Contributor

@tanishiking tanishiking left a comment

Desugaring happens before the optimizer 👍

val linkResult = logger.timeFuture("Linker") {
  linker.link(irFiles, moduleInitializers, logger,
      preOptimizerRequirements)
}
val optimizedResult = optOptimizer.fold(linkResult) { optimizer =>
  linkResult.flatMap(optimize(_, symbolRequirements, optimizer, logger))
}

LGTM from my side

@gzm0
Contributor

gzm0 commented Jan 3, 2025

I think formalizing desugaring as a phase / step makes sense. Essentially, IIUC, it would host transformations / lowerings that are necessary for later phases to work properly, so unlike the optimizer, they are required for correctness.

What I'm wondering is if it makes sense to include these in the BaseLinker and not as an additional "top-level" step in the LinkerFrontendImpl. From a cursory look at this PR, it seems that doing so would require:

  • An additional pass over all LinkedClasses
  • Forwarding the "needs desugaring" flags

It feels that both of these would be acceptable to do.

What we would get from this:

  • Clearer separation between linking and desugaring.
  • Maintain proximity of Linker / Refiner.
  • Ability to run ClassDef / IR checks between linking and desugaring.
  • Ability to measure desugaring execution time separately.

As a natural follow-up (to my eyes at least), I wonder if it makes sense / is feasible to move method / class synthesis to such a phase as well. From what I can see, the main challenge is probably how to encode this in LinkedClass. But maybe a more heterogeneous interface between phases is desirable.

@sjrd
Member Author

sjrd commented Jan 3, 2025

IIUC, what you're proposing is:

  1. Base linker: synthesizes classes and methods, but does not transform bodies
  2. Desugar: rewrite LinkTimeProperty and NewLambda nodes inside bodies
  3. Optimizer (optional)
  4. Refiner (if optimizer was used)
  5. Emitter
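The phase ordering listed above can be sketched as plain function composition. This is only an illustration of the ordering: `Program`, `phase` and `pipeline` are hypothetical names, and each phase merely records that it ran instead of transforming real linker data.

```scala
object PhaseOrderSketch {
  // Stand-in for the data passed between phases: the list of phases
  // that have run so far (assumption made for illustration).
  type Program = List[String]

  // Builds a dummy phase that appends its own name.
  def phase(name: String): Program => Program = _ :+ name

  val baseLinker: Program => Program = phase("BaseLinker")
  val desugar: Program => Program = phase("Desugar")
  val optimizer: Program => Program = phase("Optimizer")
  val refiner: Program => Program = phase("Refiner")
  val emitter: Program => Program = phase("Emitter")

  // The optimizer is optional; the refiner runs only if the optimizer did.
  def pipeline(useOptimizer: Boolean): Program => Program = {
    val optional: List[Program => Program] =
      if (useOptimizer) List(optimizer, refiner) else Nil
    Function.chain(List(baseLinker, desugar) ++ optional ++ List(emitter))
  }
}
```

In this sketch, desugaring always runs, which reflects the point that it is required for correctness, whereas the optimizer and refiner can be dropped together.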

That makes sense. Forwarding the needsDesugaring flag is going to be annoying, though (probably more than it should be). A LinkedClass contains simple Lists of MemberDefs, without any additional metadata. Should we introduce an indirection such as Linked[T] containing a T and needsDesugaring: Boolean? We used to have something like that a long time ago, but I don't remember why. Otherwise we can reserve a flag inside the members' optimizerHints bit set, although that's not very elegant.
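A minimal sketch of what such a `Linked[T]` indirection might look like, assuming simplified stand-ins for `MethodDef` and `LinkedClass` (the real classes carry far more data, and the helper `methodsToDesugar` is hypothetical):

```scala
object LinkedWrapperSketch {
  // Wrapper pairing a member with the metadata recorded by the analysis.
  final case class Linked[+T](value: T, needsDesugaring: Boolean)

  // Simplified stand-ins for the real IR and linker classes (assumption).
  final case class MethodDef(name: String)
  final case class LinkedClass(name: String, methods: List[Linked[MethodDef]])

  // A later Desugar phase could then select only the flagged members.
  def methodsToDesugar(cls: LinkedClass): List[MethodDef] =
    cls.methods.collect { case Linked(m, true) => m }
}
```

The cost of this design is that every consumer of `LinkedClass` has to unwrap its members, which is part of why forwarding the flag is more intrusive than it first appears.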

In order for Desugar to rewrite NewLambda nodes, it will also need to receive a list of the synthetic classes. So that's one more thing to transfer between those phases.

> As a natural follow-up (to my eyes at least), I wonder if it makes sense / is feasible to move method / class synthesis to such a phase as well. From what I can see, the main challenge is probably how to encode this in LinkedClass. But maybe a more heterogeneous interface between phases is desirable.

I'm not so sure about that. The existing method synthesis is deeply linked to the reachability analysis. Arguably, it's what makes the job of the base linker actually a linker (and not just a dead code elimination pass, like the Refiner). If we move it out of the BaseLinker, then we have to pass a substantial subset of the Analysis to the second phase, and at the same time, BaseLinker wouldn't be doing much anymore. In fact, arguably it might as well return ClassDefs and an Analysis, instead of LinkedClasses.

Writing the above actually makes me wonder: perhaps BaseLinker is already the phase that performs the transformations required for correctness. It's not possible to emit code for a bunch of ClassDefs without passing them through the BaseLinker, after all.

@gzm0
Contributor

gzm0 commented Jan 3, 2025

> IIUC, what you're proposing is:

Yes.

> We used to have something like that a long time ago, but I don't remember why

For versions. But we ended up re-using the method's version field: #4772

> Forwarding the needsDesugaring flag is going to be annoying, though (probably more than it should be).

My thoughts:

  • We could be more aggressive: don't determine during analysis whether something needs desugaring, and rely solely on caching (we did something similar with infos when we moved them from the compiler to the linker).
  • We could introduce a transient-like construct for this.
  • We could re-introduce the LinkedMethod wrapper for this.
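The first option (no flag from the analysis, caching only) could look roughly like the sketch below. It assumes each method carries a version that changes whenever its body changes; `Method`, `Cache` and the string-based `desugar` function are hypothetical simplifications, not the linker's actual types.

```scala
import scala.collection.mutable

object DesugarCacheSketch {
  // Simplified method: the body is a plain string for illustration.
  final case class Method(name: String, version: Int, body: String)

  // Caches the desugared method per name; an entry is reused only while
  // the incoming version matches the one it was computed from.
  final class Cache(desugar: String => String) {
    private val entries = mutable.Map.empty[String, (Int, Method)]
    var misses = 0 // counts how often desugaring actually ran

    def get(m: Method): Method = entries.get(m.name) match {
      case Some((version, result)) if version == m.version =>
        result
      case _ =>
        misses += 1
        val result = m.copy(body = desugar(m.body))
        entries.update(m.name, (m.version, result))
        result
    }
  }
}
```

With this approach every method is processed once on the first run, and incremental runs only redo the methods whose version changed.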

> In order for Desugar to rewrite NewLambda nodes, it will also need to receive a list of the synthetic classes. So that's one more thing to transfer between those phases.

Does it? IIUC, the mapping from descriptor to class name is deterministic:

val className: ClassName = Lambda.makeClassName(descriptor)

> I'm not so sure about that. The existing method synthesis is deeply linked to the reachability analysis. Arguably, it's what makes the job of the base linker actually a linker (and not just a dead code elimination pass, like the Refiner). If we move it out of the BaseLinker, then we have to pass a substantial subset of the Analysis to the second phase, and at the same time, BaseLinker wouldn't be doing much anymore. In fact, arguably it might as well return ClassDefs and an Analysis, instead of LinkedClasses.

Ah, sorry, I wasn't clear. Determining what needs to be synthesized for sure must be part of the linker.

However, actually synthesizing these things (what the method synthesizer does ATM) could (IIUC) happen later.

I have looked into this a bit more specifically, and it would change MethodSynthesizer quite a bit. But my hunch is that this might be for the better: right now, MethodSynthesizer looks up targets via the IR loader. This could (but should not) trigger an actual IR load.

It is not entirely clear to me how we would forward synthesis information from the base linker. Something-something transient probably?

> perhaps BaseLinker is already the phase that performs the transformations required for correctness.

I think that depends on how we define correctness.

  • The base linker is what is required for some IR to be "complete" and correct in that sense. Most notably, it is not possible to run IR checking before the base linker.
  • The desugar transformation is what is required for downstream phases to work correctly. However, it is possible to (and we probably should) run the IR checker before the desugar transformation.

@sjrd
Member Author

sjrd commented Jan 4, 2025

I did a PoC of the separate phase in #5101. That was while sick and on the train, so there's probably a lot to improve there.

Being able to IR check before and after desugaring is definitely a nice perk of that approach. Also, the desugaring itself (without the additional IR check pass) currently takes less than 20 ms on my slow laptop, so it's basically free. That may change if we process NewLambda nodes this way, as many more methods will need desugaring in that case.

Previously, the emitters and the optimizer all had to perform the
same desugaring for `LinkTimeProperty` nodes. Instead, we now
perform the desugaring in the `BaseLinker`, after the reachability
analysis.

The reachability analysis records whether each method needs
desugaring or not. Methods that do not require desugaring are not
processed, and so incur no additional cost. Since very few methods
need desugaring, we do not cache the results.

The machinery is heavy. It definitely outweighs the benefits in
terms of duplication for `LinkTimeProperty` alone. However, the
same machinery will be used to desugar `NewLambda` nodes. This
commit serves as a stepping stone in that direction.
@sjrd
Member Author

sjrd commented Feb 6, 2025

Superseded by #5101.

@sjrd sjrd closed this Feb 6, 2025
@sjrd sjrd deleted the ir-desugaring branch February 6, 2025 15:48