Improve LINQ perf of chained Concats #6131
Conversation
Array.Copy(n._sources, 0, sources, 0, n._sources.Length);
sources[n._sources.Length] = second;
return new ConcatNIterator<TSource>(sources);
}
I realized I can likely avoid these array allocations by chaining the concat iterators and changing how GetEnumerable is implemented. I'll try it out tomorrow. Not sure it'll be better or not.
How much would it hurt the common "append" case (e.g.
I was taking a look at the same thing, though planning to wait until after #6127 and particularly #6129 were in. You can take a look at JonHanna@9894d37 though it's far from ready. Differences:
Anyway, this LGTM, but you might find one or more of the ideas in mine worth considering.
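The trade-off under discussion — copying all prior sources into a fresh array on every chained Concat versus just linking each new source back to the previous iterator — can be illustrated with a small Python sketch. Class and method names here are invented for illustration only, not the actual corefx types:

```python
class ArrayConcat:
    """Each chained concat copies all prior sources into a new list."""
    def __init__(self, sources):
        self._sources = sources

    def concat(self, second):
        # O(n) copy per chained Concat call -> O(n^2) allocation overall.
        return ArrayConcat(self._sources + [second])

    def __iter__(self):
        for source in self._sources:
            yield from source


class LinkedConcat:
    """Each chained concat just links back to the previous iterator."""
    def __init__(self, previous, next_source):
        self._previous = previous
        self._next = next_source

    def concat(self, second):
        # O(1): no array allocation or copying at Concat time.
        return LinkedConcat(self, second)

    def _sources(self):
        # Walk the chain newest-to-oldest, then reverse to oldest-first.
        node, chain = self, []
        while isinstance(node, LinkedConcat):
            chain.append(node._next)
            node = node._previous
        chain.append(node)  # the very first source
        return reversed(chain)

    def __iter__(self):
        for source in self._sources():
            yield from source


a = ArrayConcat([[1, 2]]).concat([3]).concat([4, 5])
b = LinkedConcat([1, 2], [3]).concat([4, 5])
assert list(a) == list(b) == [1, 2, 3, 4, 5]
```

Both yield the same sequence; the linked form defers all per-source bookkeeping to enumeration time instead of paying for it on every Concat call.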
IEnumerable<TSource> source = GetEnumerable(i);
if (source == null) break;

ICollection<TSource> c = source as ICollection<TSource>;
You could also test for source as IIListProvider and call GetCount(onlyIfCheap) on it.
Indeed, the approach used in OrderedEnumerable<T> would work here, which covers that and also the non-generic ICollection by deferring to Count() in either case, but predicting whether Count() itself will be constant-time or not.
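As a rough illustration of that counting strategy, here is a hedged Python sketch (the helper name is invented): sources exposing a constant-time length are summed directly, and anything else either forces a bail-out under the only-if-cheap flag or falls back to full enumeration, analogous to deferring to Count():

```python
def concat_count(sources, only_if_cheap=False):
    """Sum the counts of the concatenated sources.

    Returns -1 under only_if_cheap when any source would require a
    full enumeration to count (the analogue of a non-cheap Count()).
    """
    total = 0
    for source in sources:
        if hasattr(source, "__len__"):       # ~ ICollection<T>: O(1) count
            total += len(source)
        elif only_if_cheap:
            return -1                        # counting would not be cheap
        else:
            total += sum(1 for _ in source)  # ~ falling back to Count()
    return total


assert concat_count([[1, 2], [3]]) == 3
assert concat_count([[1], iter([2, 3])], only_if_cheap=True) == -1
assert concat_count([[1], iter([2, 3])]) == 3
```

The point raised in the thread is that even this per-source type testing has a cost, which is why scaling back to an unconditional -1 under onlyIfCheap was considered.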
Honestly, I was actually hesitant to even take it this far... I'm a bit concerned that having to look at all of the constituent enumerables, do all of the casting and type checks, etc., could actually hurt cases where onlyIfCheap is true and it needs to be really, really cheap. I think I may just scale this back to always return -1 if onlyIfCheap.
I didn't consider this to be a strong case for IIListProvider at all, for similar reasons (though I did when the sources are all lists, which was my main focus in my stab at it). Of course, since my plan was to have separate implementations when all inputs are lists, I'd already set things up to make cheap counts less likely here, with some of the most common cases handled elsewhere.
If you do scale it back that way, the check could be removed here entirely and just depend on Count() doing the right thing. Another possibility is to be more thorough in the "small" classes, and lazy in the case with more than 3 items.
> the check could be removed here entirely and just depend on Count() doing the right thing
Yup, done.
I started on that path, looked at a bunch of existing use cases and what value would actually be had for doing the type checks, adding all the special paths, etc., and it didn't seem worthwhile. If it turns out to be valuable, it's just "more code" and could be added in the future.
Yeah, I think that's separate, and IMO chains of concats is much more common than chains of Unions. Again, though, it's just "more code" that could be added later.
That's a good idea. I'll do that.
Sure. There are lots of potential combinations. I simply handled the one that seemed to provide the best return on investment. I'm trying to weigh the possible gains for the most common cases with keeping the code complexity low. It's possible additional cases would be valuable in the future.
I was thinking I'd probably keep the prepend as a reasonably likely case that needs just one more check, but drop the check within append that catches a concatenation of concatenations as more trouble than it's worth.
Thanks for the review, @JonHanna. I updated it to avoid the arrays entirely and to address your feedback, plus added a few more tests.
Force-pushed from 1951db6 to ed635e1
Yeah, I was just led to think of it due to the way they correspond to two types of SQL
return new Concat2Iterator<TSource>(_first, _second);
}

internal override ConcatIterator<TSource> Append(IEnumerable<TSource> next)
Concat is perhaps a better name for what is a specialised Concat rather than a specialised Append (though mea culpa on having also used "Append").
I didn't look at your changes, so we came up with "Append" independently... that probably means something ;) Even so, I've changed it to be Concat.
That it would be a perfectly good word if there wasn't an Append in LINQ, and we aren't used to there now being an Append in LINQ, probably.
Force-pushed from ed635e1 to 450f29c
internal override IEnumerable<TSource> GetEnumerable(int index)
{
    return
        index < _nextIndex ? _previousConcat.GetEnumerable(index) :
Perhaps this can be done without recursion? Iterating through the prev chain could be cheaper.
I'm not against that, but this was by far the simplest mechanism I could come up with, and the cost here should in general be very minimal. Since we'd need to process from the oldest to the newest (which is the opposite direction of the chain), and since we can't rewrite the chain, how would you recommend doing this without manually building up a stack (which has its own costs)?
(The recursion is in effect no different than what's already being done today, just on each call to MoveNext and Current rather than once per enumerable here.)
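To make the cost being debated concrete, here is a minimal Python sketch of the recursive lookup (a hypothetical structure with invented names, not the actual implementation): each node resolves indices below its own by delegating to the previous node, so fetching source i takes i hops, and enumerating all n sources performs O(n^2) link traversals in total:

```python
class Node:
    """One link in a concat chain: knows its own source and its index."""
    def __init__(self, previous, next_source, next_index):
        self._previous = previous
        self._next = next_source
        self._next_index = next_index

    def get_enumerable(self, index):
        if index < self._next_index:
            # Recurse into the older part of the chain: index i costs i hops.
            return self._previous.get_enumerable(index)
        return self._next


class First:
    """The head of the chain, holding source index 0."""
    def __init__(self, source):
        self._source = source

    def get_enumerable(self, index):
        assert index == 0
        return self._source


chain = First(["a"])
for i, src in enumerate([["b"], ["c"], ["d"]], start=1):
    chain = Node(chain, src, i)

# Enumeration asks for indices 0..3 from the newest node; lookup i
# walks i links, so total work is quadratic in the number of sources.
result = [x for i in range(4) for x in chain.get_enumerable(i)]
assert result == ["a", "b", "c", "d"]
```

As noted above, this is one GetEnumerable call per source rather than a MoveNext/Current pair per item, so the quadratic factor applies to the source count, not the item count.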
This seems to be O(n^2) on the number of concatenated fragments, so for long chains of short fragments this can become a dominant factor.
Would it make sense to memoize the chain into a List if we know the chain is sufficiently long? 16 would be my guess at "long enough" :-)
That would obviously make sense only at the top-level.
> This seems to be O(n^2) on the number of concatenated fragments
My point is, today it's O(n^2) on the number of items in the enumerables (for each of a MoveNext and a Current call per item). This change makes it O(n^2) on the number of enumerables (for a GetEnumerable call per enumerable). So, yes, for long chains of very short fragments, it could approach within a constant multiple of what it is today, though still much less.
> Would it make sense to memoize the chain into a List if we know the chain is sufficiently long?
That's what I initially had, actually, where ConcatNIterator stored an array of the enumerables rather than a link back to its previous one, but it requires allocating such an array/list, which is why I moved away from it. Are you asking that I add back such a thing for use with long chains, e.g. use Concat2Iterator for chains of 2, Concat3Iterator for chains of 3, ConcatLinkedIterator for chains of 4-16 (what's currently in this PR called ConcatNIterator), and a new ConcatArrayIterator for chains longer than 16? Or are you suggesting that when enumeration starts, build up an array of the enumerables once and then iterate through that?
I'm open to doing things like that. I just want to highlight that what's here in the PR is strictly better than what's currently checked in, at least in this regard.
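The memoization variant being suggested might look like this Python sketch (invented helpers; for brevity the chain is modeled as nested `(previous, next_source)` tuples): flatten the chain into a list once when enumeration starts, trading a single allocation for O(1) per-source lookups instead of walking links:

```python
def flatten_chain(node):
    """Materialize a linked concat chain into an oldest-first list."""
    sources = []
    while isinstance(node, tuple):        # (previous, next_source) links
        previous, next_source = node
        sources.append(next_source)
        node = previous
    sources.append(node)                  # the very first source
    sources.reverse()
    return sources


def enumerate_chain(node):
    # One flatten pass up front: O(n) total lookups instead of O(n^2).
    for source in flatten_chain(node):
        yield from source


chain = ((([1], [2]), [3]), [4])
assert list(enumerate_chain(chain)) == [1, 2, 3, 4]
```

This is exactly the allocation-versus-cycles trade-off acknowledged later in the thread: the flattened list costs one allocation per enumeration, which only pays off for sufficiently long chains.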
And we could null-out _previousConcat if memoizing
Maybe I misunderstood what you were trying to accomplish. This is still O(n^2) in the number of enumerables, you've just traded iteration for function calls... is that all you were going for? I thought you wanted an iteration mechanism to make it O(n).
Yes, this one is just to avoid recursion. Considering the n^2, it could be a noticeable change.
Ok, sorry, I completely misunderstood what you were going for.
The memoization suggestion is to avoid n^2 for big n, but that is indeed an allocation vs. cycles trade-off and, as such, I agree, not necessarily a win.
Yeah, I agree for the purpose of avoiding the deeply recursive call chain, this makes sense. I was misunderstanding what you were trying to achieve with it. I'll fix it up.
The Concat operator today is very simple: it iterates through the first source yielding all items, then does the same for the second. This works great in isolation, but when chained, the cost grows as yielding each item from the Nth source results in calling through the MoveNext/Current interface methods of the previous N-1 concats. While this is the nature of LINQ operators in general, it's particularly pernicious with Concat, which is often used to assemble data from many sources.

This commit introduces a special concat iterator that avoids that recursive cost. This comes at the small expense of N+1 interface calls per iteration, where N is the number of sources involved in the concatenation chain. Chains of two sources and three sources are special-cased, after which an array is allocated and used to hold all of the sources (this could be tweaked in the future to have specializations for more sources if, for example, we found that four was a very common number). Other benefits include the size of the concat iterator being a bit smaller than it was previously when generated by the compiler, and it now taking part in the IIListProvider interface, so that for example ToList operations are faster when any of the sources are ILists.

Example results on my machine:
- Enumerating a Concat of 2 Range(0, 10) enumerables: ~15% faster
- Enumerating a Concat of 3 Range(0, 10) enumerables: ~30% faster
- Enumerating a Concat of 10 Range(0, 100) enumerables: ~4x faster
- Enumerating a Concat of 100 Range(0, 1) enumerables: ~2x faster
And add a few more tests.
Force-pushed from 450f29c to 7f573ef
Improve LINQ perf of chained Concats
In the unlikely case of this many concatenations, if we produced a ConcatNIterator with int.MaxValue then state would overflow before it matched its index.
private readonly IEnumerable<TSource> _next;
private readonly int _nextIndex;

internal ConcatNIterator(ConcatIterator<TSource> previousConcat, IEnumerable<TSource> next, int nextIndex)
If you are sure that nextIndex is >= 0, why don't you use uint?
Improve LINQ perf of chained Concats
Commit migrated from dotnet/corefx@5790919
cc: @VSadov, @JonHanna
Related to https://github.com/dotnet/corefx/issues/2075