Better heap-safety for iterative algorithms on lazy data types #7990
Simple implementation that always nulls (instead of aliasing where possible). No checks for illegal usage.
I like the idea. I don't feel like it's something that the compiler needs to do automatically, in general, keeping Some ideas
Not sure we need to do that, see also my comment below. The inliner nulls out local variables (e.g. the one holding
We never got around to implementing a register allocator in the Scala backend because (1) the JVM does it and (2) having a slot per local variable makes other local analyses / optimizations much simpler. Also, I'm not sure if it would be a good idea to rely on register allocation for "correctness" (i.e., to rely on the register allocator to null out Side-note: "compatible types" is not a necessary requirement, a local variable slot can hold objects of different types (even reference / primitive) within the same method.
It doesn't need to (especially if you have an annotation to trigger it on demand) but it would be nice to do it automatically when it's free (i.e. you don't need a null assignment). I haven't done any benchmarking specifically for this case yet but in several cases of benchmarking collection methods I noticed that tail-recursive versions were faster than iterative versions. I suspect it is precisely because of this difference (the iterative version has to keep an extra variable around). Note that you cannot rely on the VM to not drop the reference anyway (see https://shipilev.net/jvm/anatomy-quarks/8-local-var-reachability/).
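To make the "extra variable" point concrete, here is a minimal sketch of the two shapes being compared. The `Node`/`Cons`/`End` list and the `Traversals` object are hypothetical names made up for illustration; they are not taken from the PR or from the collections library. The comments describe the frame layout as I understand it for the Scala 2 backend; what the JIT actually keeps alive at run time is not guaranteed either way.

```scala
import scala.annotation.tailrec

// Hypothetical minimal cons list, used only for illustration.
sealed trait Node
final case class Cons(head: Int, tail: Node) extends Node
case object End extends Node

object Traversals {
  // Iterative: the parameter `start` keeps its own local slot for the
  // whole loop, in addition to the cursor `cur`.
  def sumIterative(start: Node): Long = {
    var cur: Node = start
    var acc = 0L
    var done = false
    while (!done) cur match {
      case Cons(h, t) => acc += h; cur = t
      case End        => done = true
    }
    acc
  }

  // Tail-recursive: the self-call is compiled to a jump. As far as I
  // understand the Scala 2 backend, the jump reassigns `node` in place,
  // so the frame holds no separate reference to the first cell.
  @tailrec
  def sumTailrec(node: Node, acc: Long = 0L): Long = node match {
    case Cons(h, t) => sumTailrec(t, acc + h)
    case End        => acc
  }
}
```

Both methods compute the same result; the only difference of interest is how many references into the list the method frame holds while the loop runs.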
Right. In fact, it should also work for all local variables. Are there any cases where the code generator creates new local variables for sub-expressions that could accidentally keep a reference alive? If this is the case, we may want to allow it for arbitrary expressions.
That was my original plan but it's useless for ensuring that your method actually drops the reference when you intend it to.
Right, this would just be a general optimization. If you want to be sure you still need to annotate it (similar to how
Good point. I didn't consider this for
That's a good question, I'd have to look in detail, but I can imagine this being the case. For example to emit try-catch expressions.
Note to self (or anyone else who might already know the answer): After looking at the LazyList PR again, it's clear that simple forwarders (like
After working on scala/bug#11443, it has become less clear. Using a
@szeiger should we close this one?
I don't think I'll be able to continue with this any time soon
This PR would enable heap safe implementations without relying on
Difficulties
trait Growable[A] {
  final def ++=(elems: IterableOnce[A]): this.type = this.addAll(elems: @nullOut)
  def addAll(elems: IterableOnce[A]): this.type = {
    val it = (elems: @nullOut).iterator
    while (it.hasNext) {
      addOne(it.next())
    }
    this
  }
}
On the other hand, we could fix heap safety and even get rid of current overrides in
I was going to argue that we can use
IMO we can do without, especially if this is going to be an internal feature. The compiler also doesn't check how local variables are used after setting them to Scala 3 explicit nulls has the necessary flow typing, but it would need to be tweaked to understand the annotation.
This is an idea I explored after seeing #7916. Some gitter discussion at https://gitter.im/scala/contributors?at=5cb6f1596a84d76ed8a99167.
What I implemented so far is an annotation `@dropthis` that can be applied to an expression `this` in a method in order to drop the reference (after pushing it onto the stack) so it can be garbage-collected. This allows writing iterative algorithms (see the `foreach3` test case) that do not keep an unnecessary reference to the original receiver of the method, which is currently impossible in Scala.

So far this is a simple proof of concept. There are no checks for incorrect use of the annotation. If you put it on any non-`this` expression, it is ignored. If you use it and try to dereference `this` afterwards, you get an NPE. The next obvious step is to add these checks. The easiest way to do the flow analysis would probably be right in the backend at the ASM level but then you would get different semantics when running with a non-JVM backend, so we should do it earlier.

Furthermore, in cases where `this` gets aliased to a `var` with the same type (like in `foreach2`) it should be possible to perform this optimization without the need for an explicit annotation. We could reuse slot 0 in the LVT (which holds the receiver) for the var. In other cases (e.g. using an `Iterator` instead of manually traversing a linked list) the type may not match, so this is not an option. We don't want to null out the receiver unnecessarily in all methods, either. In these cases you would still have to use the annotation.

There is currently no optimization for aliasing of local variables or reuse of LVT slots in the backend. Every local variable gets a separate slot and everything is scoped to its full static scope.
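For readers who have not looked at the test cases, the following sketch shows the shape of the problem and where the annotation would go. Everything in it is an assumption for illustration: `MyList`, `MyCons`, `MyNil` and the two method names are made up, the `(this: @dropthis)` ascription form is my guess at the usage (modelled on ordinary expression annotations), and the `dropthis` class defined here is only a stand-in so the code compiles with a stock compiler, where it has no effect at all.

```scala
import scala.annotation.StaticAnnotation

// Stand-in annotation so the sketch compiles with a stock compiler.
// Here it does nothing; in the PR the annotation is interpreted by the compiler.
final class dropthis extends StaticAnnotation

// Hypothetical hand-rolled linked list, not the PR's actual test code.
sealed abstract class MyList {
  def isEmpty: Boolean
  def head: Int
  def tail: MyList

  // `this` occupies local-variable slot 0 for the whole call, so the first
  // cell stays referenced from the frame while `cur` walks the list.
  final def foreachHoldingThis(f: Int => Unit): Unit = {
    var cur: MyList = this
    while (!cur.isEmpty) { f(cur.head); cur = cur.tail }
  }

  // Same loop, with the receiver marked at the point where it is aliased.
  // With the PR's compiler support, slot 0 would be nulled after this read;
  // with the stand-in annotation above, both methods behave identically.
  final def foreachDroppingThis(f: Int => Unit): Unit = {
    var cur: MyList = (this: @dropthis)
    while (!cur.isEmpty) { f(cur.head); cur = cur.tail }
  }
}

final class MyCons(val head: Int, val tail: MyList) extends MyList {
  def isEmpty: Boolean = false
}

object MyNil extends MyList {
  def isEmpty: Boolean = true
  def head: Int = throw new NoSuchElementException("head of empty list")
  def tail: MyList = throw new NoSuchElementException("tail of empty list")
}
```

In the second method nothing refers to `this` after the annotated read, which is the usage pattern the description requires: dereferencing `this` later would fail with an NPE once the slot is actually nulled.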
- `@dropthis`
- `@dropthis` on non-`this` expressions
- `this` into a `var` and reuse slot 0
- `@dropthis`-annotated methods