
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Conversation

stephentoub
Member

The Concat operator today is very simple: it iterates through the first source yielding all items, then does the same for the second. This works great in isolation, but when chained, the cost grows: yielding each item from the Nth source results in calling through the MoveNext/Current interface methods of the previous N-1 concats. While this is the nature of LINQ operators in general, it's particularly pernicious with Concat, which is often used to assemble data from many sources.

This commit introduces a special concat iterator that avoids that recursive cost. This comes at the small expense of N+1 interface calls per iteration, where N is the number of sources involved in the concatenation chain. Chains of two sources and three sources are special-cased, after which an array is allocated and used to hold all of the sources (this could be tweaked in the future to have specializations for more sources if, for example, we found that four was a very common number). Other benefits include the concat iterator being a bit smaller than the one previously generated by the compiler, and it now takes part in the IIListProvider interface, so that, for example, ToList operations are faster when any of the sources are ILists.

Example results on my machine:

  • Enumerating a Concat of 2 Range(0, 10) enumerables: ~15% faster
  • Enumerating a Concat of 3 Range(0, 10) enumerables: ~30% faster
  • Enumerating a Concat of 10 Range(0, 100) enumerables: ~4x faster
  • Enumerating a Concat of 100 Range(0, 1) enumerables: ~2x faster

cc: @VSadov, @JonHanna
Related to https://github.com/dotnet/corefx/issues/2075
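For context, the chain shape the description refers to can be sketched with the public LINQ surface (an illustrative example, not code from this PR):

```csharp
using System.Collections.Generic;
using System.Linq;

// Each Concat wraps the previous result. With the classic yield-based
// implementation, an item yielded by the first source is forwarded through
// the MoveNext/Current of every enclosing concat iterator, so the per-item
// cost grows with the length of the chain.
IEnumerable<int> chain = Enumerable.Range(0, 10)
    .Concat(Enumerable.Range(10, 10))
    .Concat(Enumerable.Range(20, 10))
    .Concat(Enumerable.Range(30, 10));

foreach (int item in chain)
{
    // Items from the innermost Range(0, 10) pass through three
    // intermediate iterators before reaching this loop body.
}
```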

Array.Copy(n._sources, 0, sources, 0, n._sources.Length);
sources[n._sources.Length] = second;
return new ConcatNIterator<TSource>(sources);
}
Member Author

I realized I can likely avoid these array allocations by chaining the concat iterators and changing how GetEnumerable is implemented. I'll try it out tomorrow. Not sure whether it'll be better or not.

@svick

svick commented Feb 16, 2016

How much would it hurt the common "append" case (e.g. a.Concat(b).Concat(c).Concat(d)), if the less common "prepend" case (e.g. a.Concat(b.Concat(c.Concat(d)))) was optimized too?

@JonHanna
Contributor

I was taking a look at the same thing, though planning to wait until after #6127 and particularly #6129 were in.

You can take a look at JonHanna@9894d37 though it's far from ready.

Differences:

  1. I also (though it's not completed here) optimise for IList<T> sources. That was in fact my original goal, but I made the same observation as you in the course of that, because it caused me to think about the costs of Concat more generally. (In particular, XUnit itself appears to do some concats of concats. It also behaves very strangely if your Concat has a bug in it and ends up telling you all your tests passed because it missed most of them 😉). There's no reason why that couldn't be added on to this PR at a later date.
  2. I also tackle Union. Likewise, there's no reason why that couldn't still be done if we take this PR.
  3. I use an abstract-method approach to finding the appropriate ConcatIterator to return. I don't know whether that would prove to be better or worse than that used here. (Cost of virtual lookup vs type-checking would be the main difference, I imagine).
  4. I handle x.Concat(y.Concat(z)) as well, though again there's no reason this couldn't be added here.
  5. I handle x.Concat(y).Concat(a.Concat(b)), though I always skip the explicitly-numbered classes in this case. That would be easy to bring to this.
  6. I've a very different approach to MoveNext(). It's hard to weigh the two just from looking at the code.
  7. I don't set a limit on how large an array can be created. Good idea!
  8. The most important difference; my approach isn't finished, in which regard this approach clearly wins 😄

Anyway, this LGTM, but you might find one or more of the ideas in mine worth considering.
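The abstract-method approach mentioned in point 3 might look roughly like this (a sketch only; the class names mirror snippets quoted later in this review, but the member bodies are assumptions, and `ConcatNIterator<TSource>` refers to the three-argument class shown further down the thread):

```csharp
internal abstract class ConcatIterator<TSource>
{
    // Each iterator subtype decides what the next link in the chain is,
    // replacing per-call `is`/`as` type checks with one virtual dispatch.
    internal abstract ConcatIterator<TSource> Concat(IEnumerable<TSource> next);

    // Returns the index-th source in the chain, or null when past the end.
    internal abstract IEnumerable<TSource> GetEnumerable(int index);
}

internal sealed class Concat2Iterator<TSource> : ConcatIterator<TSource>
{
    private readonly IEnumerable<TSource> _first;
    private readonly IEnumerable<TSource> _second;

    internal Concat2Iterator(IEnumerable<TSource> first, IEnumerable<TSource> second)
    {
        _first = first;
        _second = second;
    }

    internal override ConcatIterator<TSource> Concat(IEnumerable<TSource> next) =>
        new ConcatNIterator<TSource>(this, next, nextIndex: 2);

    internal override IEnumerable<TSource> GetEnumerable(int index) =>
        index == 0 ? _first :
        index == 1 ? _second :
        null;
}
```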

IEnumerable<TSource> source = GetEnumerable(i);
if (source == null) break;

ICollection<TSource> c = source as ICollection<TSource>;
Contributor

You could also test for source as IIListProvider and call GetCount(onlyIfCheap) on it.

Contributor

Indeed the approach used in OrderedEnumerable<T> would work here, which covers that and also the non-generic ICollection by deferring to Count() in either case, while predicting whether Count() itself will be constant-time or not.
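For illustration, an onlyIfCheap-style count over a concat chain might be shaped like this (a sketch: IIListProvider's GetCount(bool) is a real corefx internal, but this body is an assumption built from the snippet under review, not the merged code):

```csharp
// Counts items across all sources in the chain. When onlyIfCheap is true,
// bail out with -1 as soon as any source would require full enumeration.
public int GetCount(bool onlyIfCheap)
{
    int count = 0;
    for (int i = 0; ; i++)
    {
        IEnumerable<TSource> source = GetEnumerable(i);
        if (source == null) break; // past the end of the chain

        ICollection<TSource> c = source as ICollection<TSource>;
        if (c == null)
        {
            if (onlyIfCheap) return -1; // enumerating just to count isn't cheap
            count += source.Count();
        }
        else
        {
            checked { count += c.Count; }
        }
    }
    return count;
}
```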

Member Author

Honestly, I was actually hesitant to even take it this far... I'm a bit concerned that having to look at all of the constituent enumerables, do all of the casting and type checks, etc., could actually hurt cases where onlyIfCheap and it needs to be really, really cheap. I think I may just scale this back to always return -1 if onlyIfCheap.

Contributor

I didn't consider this to be a strong case for IIListProvider at all, for similar reasons (but did if the sources are all lists, which was my main interest in my stab at it). Of course, since my plan was to have separate implementations when all inputs are lists, I'd already set things up so as to make cheap counts less likely here, with some of the most common cases handled elsewhere.

Contributor

If you do scale it back that way, the check could be removed here entirely and just depend on Count() doing the right thing. Another possibility is to be more thorough in the "small" classes, and lazy in the case with more than 3 items.

Member Author

the check could be removed here entirely and just depend on Count() doing the right thing

Yup, done.

@stephentoub
Member Author

I also (though it's not completed here) optimise for IList<T> sources.

I started on that path, looked at a bunch of existing use cases and what value would actually be had for doing the type checks, adding all the special paths, etc., and it didn't seem worthwhile. If it turns out to be valuable, it's just "more code" and could be added in the future.

I also tackle Union. Likewise, there's no reason why that couldn't still be done if we take this PR.

Yeah, I think that's separate, and IMO chains of concats is much more common than chains of Unions. Again, though, it's just "more code" that could be added later.

I use an abstract-method approach to finding the appropriate ConcatIterator to return.

That's a good idea. I'll do that.

I handle x.Concat(y.Concat(z)) as well, though again there's no reason this couldn't be added here.

Sure. There are lots of potential combinations. I simply handled the one that seemed to provide the best return on investment. I'm trying to weigh the possible gains for the most common cases with keeping the code complexity low. It's possible additional cases would be valuable in the future.

@JonHanna
Contributor

I was thinking I'd probably keep the prepend as a reasonably likely case that needs just one more check, but drop the check within append that catches a concatenation of concatenations as more trouble than it's worth.

@stephentoub
Member Author

Thanks for the review, @JonHanna. I updated it to avoid the arrays entirely and to address your feedback, plus added a few more tests.

@JonHanna
Contributor

Yeah, I think that's separate, and IMO chains of concats is much more common than chains of Unions.

Yeah, I was just led to think of it due to the way they correspond to two types of SQL UNION. That said, since this makes most of the rest of that experiment obsolete, I'll look at adding that part to #6129, though probably cut-down to not care about prepends (chains of unions going backwards are going to be rarer still).

return new Concat2Iterator<TSource>(_first, _second);
}

internal override ConcatIterator<TSource> Append(IEnumerable<TSource> next)
Contributor

Concat is perhaps a better name for what is a specialised Concat rather than a specialised Append (though mea culpa on having also used "Append").

Member Author

I didn't look at your changes, so we came up with "Append" independently... that probably means something ;) Even so, I've changed it to be Concat.

Contributor

That it would be a perfectly good word if there wasn't an Append in linq, and we aren't used to there now being an Append in linq, probably.

internal override IEnumerable<TSource> GetEnumerable(int index)
{
return
index < _nextIndex ? _previousConcat.GetEnumerable(index) :
Member

Perhaps this can be done without recursion? Iterating through prev chain could be cheaper.

Member Author

I'm not against that, but this was by far the simplest mechanism I could come up with, and the cost here should in general be very minimal. Since we'd need to process from the oldest to the newest (which is the opposite direction of the chain), and since we can't rewrite the chain, how would you recommend doing this without manually building up a stack (which has its own costs)?

(The recursion is in effect no different than what's already being done today, just on each call to MoveNext and Current rather than once per enumerable here.)

Member

This seems to be O(n^2) on the number of concatenated fragments, so for long chains of short fragments this can become a dominant factor.
Would it make sense to memoize the chain into a List if we know the chain is sufficiently long? 16 would be my guess at "long enough" :-)
That would obviously make sense only at the top-level.

Member Author

This seems to be O(n^2) on the number of concatenated fragments

My point is, today it's O(n^2) on the number of items in the enumerables (for each of a MoveNext and a Current call per item). This change makes it O(n^2) on the number of enumerables (for a GetEnumerable call per enumerable). So, yes, for long chains of very short fragments, it could approach within a constant multiple of what it is today, though still much less.

Would it make sense to memoize the chain into a List if we know the chain is sufficiently long?

That's what I initially had, actually, where ConcatNIterator stored an array of the enumerables rather than a link back to its previous one, but it requires allocating such an array/list, which is why I moved away from it. Are you asking that I add back such a thing for use with long chains, e.g. use Concat2Iterator for chains of 2, Concat3Iterator for chains of 3, ConcatLinkedIterator for chains of 4-16 (what's currently in this PR called ConcatNIterator), and a new ConcatArrayIterator for chains longer than 16? Or are you suggesting that when enumeration starts, build up an array of the enumerables once and then iterate through that?

I'm open to doing things like that. I just want to highlight that what's here in the PR is strictly better than what's currently checked in, at least in this regard.

Member

And we could null-out _previousConcat if memoizing

Member Author

Maybe I misunderstood what you were trying to accomplish. This is still O(n^2) in the number of enumerables, you've just traded iteration for function calls... is that all you were going for? I thought you wanted an iteration mechanism to make it O(n).

Member

Yes, this one is just to avoid recursion. Considering the n^2, it could be a noticeable change.

Member Author

Ok, sorry, I completely misunderstood what you were going for.

Member

The memoization suggestion is to avoid n^2 for big n, but that is indeed an allocation vs. cycles trade and, as such, I agree, not necessarily a win.

Member Author

Yeah, I agree for the purpose of avoiding the deeply recursive call chain, this makes sense. I was misunderstanding what you were trying to achieve with it. I'll fix it up.
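The non-recursive lookup the thread converges on might be sketched as follows (field names follow the snippets quoted in this review; the exact merged code may differ):

```csharp
// Walk the linked chain of ConcatNIterators backwards until reaching the
// node that owns the requested index, instead of recursing through
// _previousConcat.GetEnumerable(index). Returns null when index is past
// the end of the chain, which is how callers detect completion.
internal override IEnumerable<TSource> GetEnumerable(int index)
{
    ConcatNIterator<TSource> node = this;
    while (index < node._nextIndex)
    {
        ConcatNIterator<TSource> previousN =
            node._previousConcat as ConcatNIterator<TSource>;
        if (previousN == null)
        {
            // The chain bottoms out at a Concat2Iterator, which owns
            // indices 0 and 1.
            return node._previousConcat.GetEnumerable(index);
        }
        node = previousN;
    }
    return index == node._nextIndex ? node._next : null;
}
```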

And add a few more tests.
stephentoub added a commit that referenced this pull request Feb 17, 2016
Improve LINQ perf of chained Concats
@stephentoub stephentoub merged commit 5790919 into dotnet:master Feb 17, 2016
@stephentoub stephentoub deleted the concat_perf branch February 17, 2016 00:13
@karelz karelz modified the milestone: 1.0.0-rtm Dec 3, 2016
@lindexi
Member

In the unlikely case of this many concatenations, if we produced a ConcatNIterator with int.MaxValue then state would overflow before it matched its index.

private readonly IEnumerable<TSource> _next;
private readonly int _nextIndex;

internal ConcatNIterator(ConcatIterator<TSource> previousConcat, IEnumerable<TSource> next, int nextIndex)
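The overflow concern raised above suggests a guard when extending the chain; a sketch of what that might look like (the method name and the exact threshold are assumptions, not necessarily the merged code):

```csharp
internal override ConcatIterator<TSource> Concat(IEnumerable<TSource> next)
{
    if (_nextIndex == int.MaxValue - 2)
    {
        // In the unlikely case of this many concatenations, start a fresh
        // Concat2Iterator wrapping the whole existing chain, so the
        // enumerator's state counter can never overflow past its index.
        return new Concat2Iterator<TSource>(this, next);
    }
    return new ConcatNIterator<TSource>(this, next, _nextIndex + 1);
}
```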
Member

If you're sure that nextIndex is >= 0, why don't you use uint?

picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
Improve LINQ perf of chained Concats

Commit migrated from dotnet/corefx@5790919
7 participants