This repository was archived by the owner on Jan 20, 2022. It is now read-only.

@pankajroark
Contributor

AlsoProducers. The root cause actually has to do with the summer. In the
toStream method we keep a map from Producer to stream to make sure we don't run
parts of the graph again and again. The map uses value equality. The
problem is that the summer includes a store, which is mutable. If a summer is
referenced by two AlsoProducers and the store has changed by the time the
summer is revisited, then the key differs from the original
summer and won't be found in the map, leading to the summer being
planned again. d3b5808 seems to fix the issue and passes a unit test
because that test has only one AlsoProducer, where planning left and
right before forcing left means that the store doesn't change in between.
But it likely wouldn't fix the case where there are multiple AlsoProducers at
different levels. Even if it did fix those cases, this is an indirect
solution, and it's better to avoid taking any chances and fix the root of
the issue.

This is a case where reference equality fits the bill. I don't know of
an immutable implementation of a map that uses reference equality, so
I'm using java.util.IdentityHashMap. If there is one, I would love to use
that instead.
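
The failure mode described above can be reproduced in isolation. The sketch below is not Summingbird code: `FakeSummer` is a hypothetical stand-in for a summer whose equality and hash depend on a mutable store, and an immutable `HashMap` stands in for the planner's Producer-to-stream map.

```scala
import scala.collection.immutable.HashMap

object MutableKeyDemo {
  // Hypothetical stand-in for a summer: equality and hash follow mutable state.
  final class FakeSummer(var storeSize: Int) {
    override def equals(that: Any): Boolean = that match {
      case s: FakeSummer => storeSize == s.storeSize
      case _ => false
    }
    override def hashCode: Int = storeSize
  }

  val summer = new FakeSummer(0)

  // "Plan" the summer once and remember the resulting stream under value equality.
  val planned: HashMap[FakeSummer, String] = HashMap(summer -> "stream-1")
  val foundBefore: Boolean = planned.contains(summer)

  // The store changes between the first and the second visit.
  summer.storeSize = 1

  // The very same instance now hashes differently, so the lookup misses
  // and a planner relying on this map would plan the summer again.
  val foundAfter: Boolean = planned.contains(summer)

  def main(args: Array[String]): Unit =
    println(s"found before mutation: $foundBefore, after mutation: $foundAfter")
}
```

Keying the map on reference identity avoids the miss, because identity does not change when the store does.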

val st = s.asInstanceOf[Stream[T]]
(st, m + (outerProducer -> st))
def toStream[T, K, V](outerProducer: Prod[T], jamfs: JamfMap): (Stream[T], JamfMap) = {
val stream = jamfs.get(outerProducer)
Collaborator

could we do: Option(jamfs.get(outerProducer)) match { so that the diff can be smaller? This is harder to read.
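
For context on the idiom suggested here: `java.util` maps return `null` for a missing key, and `Option(...)` turns that `null` into `None`, so the lookup can be pattern matched the same way as a Scala map lookup. A minimal sketch, where the keys and the string values are hypothetical stand-ins for producers and streams:

```scala
import java.util.IdentityHashMap

object OptionWrapDemo {
  // Hypothetical cache keyed by reference identity.
  val cache = new IdentityHashMap[AnyRef, String]()
  val planned = new Object
  val unplanned = new Object
  cache.put(planned, "stream")

  // Option(null) is None, so a missing key falls into the None case.
  def lookup(key: AnyRef): String =
    Option(cache.get(key)) match {
      case Some(s) => s"reuse $s"
      case None => "plan it"
    }

  val hit: String = lookup(planned)
  val miss: String = lookup(unplanned)

  def main(args: Array[String]): Unit = {
    println(hit)
    println(miss)
  }
}
```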

private def toStream[T](outerProducer: Prod[T], jamfs: JamfMap): (Stream[T], JamfMap) =
jamfs.get(outerProducer) match {
case Some(s) => (s, jamfs)
def toStream[T, K, V](outerProducer: Prod[T], jamfs: JamfMap): (Stream[T], JamfMap) =
Contributor

K, V don't seem to be used?

Contributor Author

Good catch, fixed.

@pankajroark
Contributor Author

Some context for the review. This change is basically a revert of d3b5808 followed by use of IdentityHashMap instead of Map for JamfMap. By the way, I don't know what Jamf stands for and am curious to find out.


private type Prod[T] = Producer[Memory, T]
private type JamfMap = HMap[Prod, Stream]
private type JamfMap = util.IdentityHashMap[Prod[_], Stream[_]]
Collaborator

This is mutable now, and before it was immutable. Does this not cause any problems? Can you add some comments explaining how it is safe? Also, why thread it through if you are mutating it? There is no need to return JamfMap if you are just returning the input.

I would prefer not to introduce a mutable map, and instead add:

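// Wraps a value so that equality and hashing use reference identity.
// The AnyRef bound is needed because eq is only defined on references.
class Identity[T <: AnyRef](val unwrap: T) {
  override def equals(that: Any): Boolean = that match {
    case i: Identity[_] => unwrap eq i.unwrap.asInstanceOf[AnyRef]
    case _ => false
  }
  override def hashCode: Int = System.identityHashCode(unwrap)
}
private type Prod[T] = Identity[Producer[Memory, T]]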

then wrap keys with new Identity(key).

I really hate to give up the reasoning we get with immutable types, and we should only do so for big performance wins, but we don't care about performance here (planning is fast, and only happens at submit).
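
A minimal sketch of how such a wrapper behaves as a key in an ordinary immutable `Map`. It assumes an `AnyRef` bound on the type parameter so that `eq` is available, and `FakeSummer` is a hypothetical stand-in for a node whose value equality depends on mutable state; neither is taken from the merged change.

```scala
object IdentityKeyDemo {
  // Wrapper whose equality and hash are based on reference identity.
  final class Identity[T <: AnyRef](val unwrap: T) {
    override def equals(that: Any): Boolean = that match {
      case i: Identity[_] => unwrap eq i.unwrap.asInstanceOf[AnyRef]
      case _ => false
    }
    override def hashCode: Int = System.identityHashCode(unwrap)
  }

  // Hypothetical node: case-class equality includes the mutable field.
  final case class FakeSummer(var storeSize: Int)

  val summer = FakeSummer(0)

  // Immutable map, but keyed on identity through the wrapper.
  val planned: Map[Identity[FakeSummer], String] =
    Map(new Identity(summer) -> "stream-1")

  // Mutating the node after planning does not change its identity.
  summer.storeSize = 1
  val stillFound: Boolean = planned.contains(new Identity(summer))

  // A distinct instance that is equal by value is a different key.
  val twinFound: Boolean = planned.contains(new Identity(FakeSummer(1)))

  def main(args: Array[String]): Unit =
    println(s"same instance found: $stillFound, equal twin found: $twinFound")
}
```

The map itself stays immutable, so it can still be threaded through and returned from each planning step.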

Contributor Author

This is a good idea. I would really love to keep everything immutable too. Let me give this a shot.

@johnynek
Collaborator

Jamf was an old inside joke between @singhala, @sritchie and me, based on Jules Winnfield's wallet and changing the word "Bad" to "Jive Ass".

@pankajroark
Contributor Author

He he, I love Pulp Fiction dialogue.

I've updated the review to use an immutable map with reference equality.

lazy val lforcedEmpty = left.filter(_ => false)
(right.append(lforcedEmpty), rightM)
val lforcedEmpty = left.filter(_ => false)
val (right, rightM) = toStream(r, leftM)
Collaborator

Why do we have to revert this change? Calling toStream(r, leftM) first seems more correct to me, as does using the lazy concatenation (rather than the strict one that we have reverted to).

Contributor Author

It breaks a bunch of our tests, which rely on the current execution order. Since we don't need to change the ordering to fix the duplication issue, it would be great if we could preserve the same execution order.

Collaborator

As we discussed, can you just copy this file into the repo rather than hardcode a bug just because a few tests assumed something false?

I fear the "keeping bugs for a few tests" is not scalable. I doubt you would be happy if you hit a bug that I want us to keep because my tests assumed the buggy behavior.

Collaborator

just to link to the prior discussion:
https://github.com/twitter/summingbird/pull/647/files#r47308385

#647 may indeed be fixed more than one way, but I still don't like that planning forces the left hand side.

Could you get the tests to pass if you did something like:

def merge[A, B](a: Stream[A], b: Stream[B]): Stream[Either[A, B]] = {
  val itA = a.iterator
  val itB = b.iterator
  // true when the next element should come from the left stream
  var left = itA.hasNext
  // try to alternate as much as possible
  def next: Option[Either[A, B]] =
    if (!(itA.hasNext || itB.hasNext)) None
    else if (left) {
      left = false
      if (itA.hasNext) Some(Left(itA.next())) else next
    } else {
      left = true
      if (itB.hasNext) Some(Right(itB.next())) else next
    }
  Stream.continually(next).takeWhile(_.isDefined).map(_.get)
}

Then we can return merge(left, right).collect { case Right(r) => r } as the result of the Also. In this way, we are closer to modeling a real streaming system.

If the contract we are presenting here is that the left will always be consumed at planning, I just don't like that. It leads to a side-effect from planning which was never our intent. I think the test needs to not assume that.

I think we need to make a clear case for why the behavior is a certain way, and that case cannot be "some broken test at Twitter will fail if we change this". If that is the only reason, you should have an internal fork of the platform.
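
To make the interleaving concrete, here is a small self-contained run of the merge sketch from this comment. The input streams are made-up sample data, and `Stream` is kept to match the code under review (it is deprecated in favor of `LazyList` from Scala 2.13 on).

```scala
object MergeDemo {
  // Interleaves two streams, alternating sides while both have elements.
  def merge[A, B](a: Stream[A], b: Stream[B]): Stream[Either[A, B]] = {
    val itA = a.iterator
    val itB = b.iterator
    var left = itA.hasNext
    def next: Option[Either[A, B]] =
      if (!(itA.hasNext || itB.hasNext)) None
      else if (left) {
        left = false
        if (itA.hasNext) Some(Left(itA.next())) else next
      } else {
        left = true
        if (itB.hasNext) Some(Right(itB.next())) else next
      }
    Stream.continually(next).takeWhile(_.isDefined).map(_.get)
  }

  // Sample inputs: the left side is longer than the right side.
  val interleaved: List[Either[Int, String]] =
    merge(Stream(1, 2, 3), Stream("a", "b")).toList

  // What an Also would return: only the right-hand results.
  val rightsOnly: List[String] =
    merge(Stream(1, 2, 3), Stream("a", "b")).collect { case Right(r) => r }.toList

  def main(args: Array[String]): Unit = {
    println(interleaved)
    println(rightsOnly)
  }
}
```

Left elements are pulled only as the merged stream is consumed, so the left side is driven by consuming the result rather than by planning.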

Contributor Author

One of the broken tests actually got fixed by this cl. Other teams have agreed to work on fixing their tests or comment out the broken test for now. I've updated the cl to restore the lazy evaluation of left producer. Please take a look.


package com.twitter.summingbird.memory

import java.util
Collaborator

we aren't using this anymore are we?

@johnynek
Collaborator

Okay, given that this was always an example and toy platform, if it is really important for twitter for it to have the (unrealistic) semantics that in an also, the left is fully evaluated before the right, I'm okay with that (since it didn't cause an issue for anyone that we know of yet).

If the tests can be fixed to not assume that, even better.

I'll leave it up to @pankajroark

👍

@pankajroark
Contributor Author

In this case we've decided to remove/fix the tests that are failing due to wrong assumptions. So we're going ahead and keeping the change to make execution of left producer lazy.

@sritchie
Collaborator

Jamf is exactly what @johnynek says; it's also a reference to Dr. Laszlo Jamf, from Gravity's Rainbow: https://en.wikipedia.org/wiki/Gravity%27s_Rainbow

@pankajroark
Contributor Author

@sritchie Never heard of Gravity's Rainbow. I've got to read it now :)

@pankajroark pankajroark merged commit f89f2cf into develop Jun 21, 2016
@johnynek johnynek deleted the pg/memory_fix_ref_equality_try branch June 24, 2016 19:48