@@ -291,7 +291,7 @@ trait DagOptimizer[P <: Platform[P]] {
// TODO: we need a case class here to not lose the irreducible, which may be named
case ValueFlatMappedProducer(in, fn) =>
// we know that (K, V) <: T due to the case match, but scala can't see it
-def cast[K, V](p: Prod[(K, V)]): Prod[T] = p.asInstanceOf[Prod[T]]
+def cast[K, V](p: Prod[(K, V)]): Prod[T] = IdentityKeyedProducer[P, K, V](p).asInstanceOf[Prod[T]]
Collaborator:

is line 283 above going to have the same issue?

Collaborator:

I'm still not understanding why this is an error. Can you post a gist to a stack trace that hit the issue?

Contributor (author):

Here's the stack trace from the test if we don't wrap in IdentityKeyedProducer:

[error] ! StormPlatform matches Scala for left join with flatMapValues jobs
[error]   ClassCastException: com.twitter.summingbird.FlatMappedProducer cannot be cast to com.twitter.summingbird.KeyedProducer (StripNamedNodes.scala:124)
[error] com.twitter.summingbird.planner.StripNamedNode$.castToKeyed(StripNamedNodes.scala:29)
[error] com.twitter.summingbird.planner.StripNamedNode$$anonfun$functionize$12.apply(StripNamedNodes.scala:124)
[error] com.twitter.summingbird.planner.StripNamedNode$$anonfun$functionize$12.apply(StripNamedNodes.scala:124)
[error] com.twitter.summingbird.planner.StripNamedNode$$anonfun$processLevel$1.apply(StripNamedNodes.scala:38)
[error] com.twitter.summingbird.planner.StripNamedNode$$anonfun$processLevel$1.apply(StripNamedNodes.scala:35)
[error] com.twitter.summingbird.planner.StripNamedNode$.processLevel(StripNamedNodes.scala:35)
[error] com.twitter.summingbird.planner.StripNamedNode$$anonfun$mutateGraph$2.apply(StripNamedNodes.scala:149)
[error] com.twitter.summingbird.planner.StripNamedNode$$anonfun$mutateGraph$2.apply(StripNamedNodes.scala:147)
[error] com.twitter.summingbird.planner.StripNamedNode$.mutateGraph(StripNamedNodes.scala:147)
[error] com.twitter.summingbird.planner.StripNamedNode$.stripNamedNodes(StripNamedNodes.scala:156)
[error] com.twitter.summingbird.planner.StripNamedNode$.apply(StripNamedNodes.scala:168)
[error] com.twitter.summingbird.planner.OnlinePlan$.apply(OnlinePlan.scala:223)
[error] com.twitter.summingbird.storm.Storm.plan(StormPlatform.scala:332)
[error] com.twitter.summingbird.storm.StormTestRun$.apply(StormTestRun.scala:78)
[error] com.twitter.summingbird.storm.StormTestRun$.simpleRun(StormTestRun.scala:98)
[error] com.twitter.summingbird.storm.StormLaws$$anonfun$6.apply(StormLaws.scala:232)
[error] com.twitter.summingbird.storm.StormLaws$$anonfun$6.apply(StormLaws.scala:227)
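
For illustration, here is a simplified, self-contained analogy of that failure (the types below are stand-ins, not Summingbird's actual classes). An asInstanceOf downcast only succeeds if the runtime class already extends the target type, so the FlatMappedProducer built by in.flatMap has to be wrapped before anything downstream treats it as keyed:

object CastAnalogy {
  // Stand-in types; assumed shapes, not the real Summingbird hierarchy.
  sealed trait Producer[T]
  trait KeyedProducer[K, V] extends Producer[(K, V)]
  case class FlatMappedProducer[T, U](fn: T => TraversableOnce[U]) extends Producer[U]
  case class IdentityKeyedProducer[K, V](of: Producer[(K, V)]) extends KeyedProducer[K, V]

  val fm: Producer[(Int, String)] =
    FlatMappedProducer[Int, (Int, String)](i => List(i -> i.toString))

  // What castToKeyed effectively does; this throws ClassCastException at runtime
  // because FlatMappedProducer does not extend KeyedProducer:
  // fm.asInstanceOf[KeyedProducer[Int, String]]

  // Wrapping first yields a node that really is a KeyedProducer, so a
  // downstream cast is safe:
  val keyed: KeyedProducer[Int, String] = IdentityKeyedProducer(fm)
}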

Collaborator:

Okay. This code was actually the predecessor of the Dag optimizer code. I bet we could remove it now and replace it with the remove-name-nodes dag rule. I feel like that might be a cleaner solution, since this code (the strip-name-nodes code above) is actually making a false assumption.

cast(in.flatMap { case (k, v) => fn(v).map((k, _)) })
}
}
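
The alternative floated in that last comment (dropping StripNamedNode and handling names inside the Dag optimizer) might look roughly like the sketch below. The rule name and the PartialRule/ExpressionDag shapes are assumptions here rather than code from this PR, and the rule would have to live where the Prod[T] alias is in scope; the idea is just that a NamedProducer carries no planning semantics, so every reference to it can be redirected to its input:

  // Sketch only (assumed API shapes): a rewrite rule that strips NamedProducer
  // wrappers during optimization, instead of a separate StripNamedNode pass.
  object RemoveNamedNodes extends PartialRule[Prod] {
    def applyWhere[T](on: ExpressionDag[Prod]) = {
      // A named node is semantically identical to its input for planning.
      case NamedProducer(in, _) => in
    }
  }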
@@ -193,6 +193,26 @@ object TestGraphs {
.flatMap(postJoinFn)
.sumByKey(store)

def leftJoinWithFlatMapValuesInScala[T, U, JoinedU, K, V: Monoid](source: TraversableOnce[T])(service: K => Option[JoinedU])(preJoinFn: T => TraversableOnce[(K, U)])(postJoinFn: ((U, Option[JoinedU])) => TraversableOnce[V]): Map[K, V] =
MapAlgebra.sumByKey(
source
.flatMap(preJoinFn)
.map { case (k, v) => (k, (v, service(k))) }
.flatMap { case (k, v) => postJoinFn(v).map { v => (k, v) } }
)

def leftJoinJobWithFlatMapValues[P <: Platform[P], T, U, JoinedU, K, V: Monoid](
source: Producer[P, T],
service: P#Service[K, JoinedU],
store: P#Store[K, V])(preJoinFn: T => TraversableOnce[(K, U)])(postJoinFn: ((U, Option[JoinedU])) => TraversableOnce[V]): TailProducer[P, (K, (Option[V], V))] =
source
.name("My named source")
.flatMap(preJoinFn)
.leftJoin(service)
.name("My named flatmap")
.flatMapValues(postJoinFn)
.sumByKey(store)

def leftJoinWithStoreInScala[T1, T2, U, JoinedU: Monoid, K: Ordering, V: Monoid](source1: TraversableOnce[T1], source2: TraversableOnce[T2])(simpleFM1: T1 => TraversableOnce[(Long, (K, JoinedU))])(simpleFM2: T2 => TraversableOnce[(Long, (K, U))])(postJoinFn: ((Long, (K, (U, Option[JoinedU])))) => TraversableOnce[(Long, (K, V))]): (Map[K, JoinedU], Map[K, V]) = {

val firstStore = MapAlgebra.sumByKey(
@@ -110,9 +110,13 @@ object StormLaws extends Specification {
List((k -> joinedV.getOrElse(10)))
}

val nextFn1 = { pair: ((Int, Option[Int])) =>
val (v, joinedV) = pair
List((joinedV.getOrElse(10)))
}

val serviceFn = sample[Int => Option[Int]]
val service = ReadableServiceFactory[Int, Int](() => ReadableStore.fromFn(serviceFn))

// ALL TESTS START AFTER THIS LINE

"StormPlatform matches Scala for single step jobs" in {
@@ -219,6 +223,21 @@
) must beTrue
}

"StormPlatform matches Scala for left join with flatMapValues jobs" in {
val original = sample[List[Int]]
val staticFunc = { i: Int => List((i -> i)) }

val returnedState =
StormTestRun.simpleRun[Int, Int, Int](original,
TestGraphs.leftJoinJobWithFlatMapValues[Storm, Int, Int, Int, Int, Int](_, service, _)(staticFunc)(nextFn1)
)

Equiv[Map[Int, Int]].equiv(
TestGraphs.leftJoinWithFlatMapValuesInScala(original)(serviceFn)(staticFunc)(nextFn1),
returnedState.toScala
) must beTrue
}

"StormPlatform matches Scala for repeated tuple leftJoin jobs" in {
val original = sample[List[Int]]
val staticFunc = { i: Int => List((i -> i)) }