Java Stream API Guide

This document serves as a comprehensive guide to the Java Stream API and Collectors, detailing the creation, intermediate, and terminal operations of streams, as well as practical examples and common pitfalls. It covers primitive streams, optional handling, parallel streams, and various collector functionalities, providing insights into their usage and performance considerations. Additionally, it includes tips for interviews and best practices for utilizing the Stream API effectively.

Uploaded by

techbuzz3934

Java Stream API & Collectors: Complete Practitioner’s Guide
Generated on August 16, 2025 15:15

How to Read This Guide


This document is a practical, interview-ready reference to the Java Stream ecosystem:
Stream/BaseStream, primitive streams (IntStream, LongStream, DoubleStream), Optional types, and the
Collectors toolkit. For each method, you’ll find purpose, key behaviors, examples, and corner cases.
Some of the most useful APIs are tagged "(common)".

Streams in One Minute


• A Stream is a sequence of elements supporting aggregate operations in a pipelined fashion.
• Pipelines have three parts: source → intermediate operations → terminal operation.
• Operations are lazily evaluated until a terminal operation is invoked.
• Streams don’t store data; they view data. Most operations are non-mutating.
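
The laziness point above can be sketched as follows. This is a minimal illustration, not part of the original guide: the class name LazyDemo and the visited list are hypothetical, and the side effect inside filter exists only to make the evaluation order visible (toList() needs JDK 16+).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class LazyDemo {
    // Hypothetical probe: records which elements the filter actually visited.
    static final List<String> visited = new ArrayList<>();

    static Stream<String> buildPipeline(List<String> source) {
        // This only describes the work; nothing runs until a terminal operation.
        return source.stream()
                .filter(s -> { visited.add(s); return s.length() > 1; })
                .map(String::toUpperCase);
    }

    public static void main(String[] args) {
        List<String> source = List.of("a", "bb", "ccc");
        Stream<String> pipeline = buildPipeline(source);
        System.out.println("visited before terminal op: " + visited); // []

        List<String> result = pipeline.toList(); // terminal operation triggers the pipeline
        System.out.println("result: " + result);                      // [BB, CCC]
        System.out.println("visited after terminal op: " + visited);  // [a, bb, ccc]
        System.out.println("source unchanged: " + source);            // [a, bb, ccc]
    }
}
```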

Creating Streams (Sources)


• Collection.stream(), parallelStream(): from in-memory collections.
• Stream.of(T...): from varargs; Stream.ofNullable(T): zero or one element from a possibly-null value (JDK 9+).
• Stream.generate(Supplier): infinite stream; bound it with limit(n).
• Stream.iterate(seed, UnaryOperator): infinite; iterate(seed, hasNext, next) is bounded by the hasNext predicate (JDK 9+).
• Arrays.stream(array): from arrays; object arrays give a Stream<T>, while int[]/long[]/double[] give the matching primitive stream.
• Files.lines(Path), BufferedReader.lines(): I/O-backed streams; remember to close the underlying resource.
• Pattern.splitAsStream(CharSequence): splits text into a stream of substrings.
--- code ---

// Common sources
Stream<String> a = Stream.of("a", "b", "c"); // common
Stream<String> maybeOne = Stream.ofNullable(System.getenv("USER")); // JDK 9+
Stream<Integer> evens = Stream.iterate(0, x -> x + 2).limit(5); // 0,2,4,6,8
--- end code ---
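
A few of the other sources listed above, as a hedged sketch. The class and method names (SourcesDemo and its helpers) are invented for illustration; the stream calls themselves are standard JDK APIs (three-argument iterate needs JDK 9+, toList() needs JDK 16+).

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Stream;

class SourcesDemo {
    // Three-argument iterate (JDK 9+): stops as soon as hasNext returns false.
    static List<Integer> boundedIterate() {
        return Stream.iterate(1, x -> x <= 100, x -> x * 3).toList(); // 1, 3, 9, 27, 81
    }

    // generate is infinite, so it is bounded here with limit.
    static List<String> generated() {
        return Stream.generate(() -> "x").limit(3).toList();
    }

    // Arrays.stream on an int[] yields an IntStream, not a Stream<Integer>.
    static int primitiveSum() {
        return Arrays.stream(new int[] {1, 2, 3}).sum();
    }

    static List<String> split() {
        return Pattern.compile(",").splitAsStream("a,b,c").toList();
    }

    // ofNullable turns null into an empty stream instead of throwing.
    static long nullableCount() {
        return Stream.ofNullable((String) null).count();
    }

    public static void main(String[] args) {
        System.out.println(boundedIterate());
        System.out.println(generated());
        System.out.println(primitiveSum());
        System.out.println(split());
        System.out.println(nullableCount());
    }
}
```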

Intermediate Operations

Stateless
• filter(Predicate): keep matching elements.
• map(Function): transform each element.
• mapToInt/Long/Double: project to a primitive stream.
• flatMap(Function<T,Stream<R>>): map each element to a stream and flatten one level.
• flatMapToInt/Long/Double: flatten to primitive streams.
• mapMulti(BiConsumer<T,Consumer<R>>) (JDK 16+): emit 0..N results per input without creating an intermediate stream per element (can be cheaper than flatMap).
• peek(Consumer): debug/observe; side effects discouraged.

Stateful
• distinct(): deduplicate via equals/hashCode.
• sorted() / sorted(Comparator): natural/custom order.
• limit(n), skip(n): truncation / offset.
• takeWhile(Predicate), dropWhile(Predicate) (JDK 9+): take the longest prefix whose elements match the predicate, or drop that prefix and keep the rest.

Stream configuration
• sequential(), parallel(): set the execution mode for the whole pipeline; the last call before the terminal operation wins.
• unordered(): drop the ordering constraint when encounter order does not matter.
• onClose(Runnable): register a callback that runs when the stream is closed.
--- code ---

// Example: flatten, deduplicate, and sort tags
List<String> tags = posts.stream()
    .flatMap(p -> p.getTags().stream())
    .map(String::toLowerCase) // common
    .distinct()
    .sorted()
    .toList(); // JDK 16+
--- end code ---
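
The less familiar operations from the lists above (mapMulti, takeWhile, dropWhile) might look like this in practice. This is a sketch under assumptions: IntermediateDemo and its method names are hypothetical, and mapMulti requires JDK 16+.

```java
import java.util.List;

class IntermediateDemo {
    // mapMulti: for each even number emit the number and ten times the number;
    // odd numbers emit nothing. The explicit <Integer> tells the compiler the output type.
    static List<Integer> expandEvens(List<Integer> in) {
        return in.stream()
                .<Integer>mapMulti((n, sink) -> {
                    if (n % 2 == 0) {
                        sink.accept(n);
                        sink.accept(n * 10);
                    }
                })
                .toList();
    }

    // takeWhile keeps elements until the predicate first fails.
    static List<Integer> prefix(List<Integer> in) {
        return in.stream().takeWhile(n -> n < 3).toList();
    }

    // dropWhile discards that same prefix and keeps everything after it,
    // including later elements that would match the predicate again.
    static List<Integer> rest(List<Integer> in) {
        return in.stream().dropWhile(n -> n < 3).toList();
    }

    public static void main(String[] args) {
        System.out.println(expandEvens(List.of(1, 2, 3, 4))); // [2, 20, 4, 40]
        System.out.println(prefix(List.of(1, 2, 3, 1)));      // [1, 2]
        System.out.println(rest(List.of(1, 2, 3, 1)));        // [3, 1]
    }
}
```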

Terminal Operations
• forEach(Consumer) / forEachOrdered(Consumer): consume; the ordered variant preserves encounter order.
• toArray() / toArray(IntFunction<A[]>): materialize an array.
• reduce(identity, accumulator) / reduce(accumulator) / reduce(identity, accumulator, combiner): fold; the form without an identity returns an Optional.
• collect(Collector) / collect(supplier, accumulator, combiner): general (mutable) reduction.
• min(Comparator) / max(Comparator): smallest/largest element as an Optional; count(): number of elements.
• anyMatch/allMatch/noneMatch(Predicate): short-circuit checks.
• findFirst()/findAny(): Optional results; in parallel, findAny may be faster.
• toList() (JDK 16+): unmodifiable List (common & recommended).
--- code ---
// Reduce vs Collect
int sum = numbers.stream().reduce(0, Integer::sum); // reduce
int sum2 = numbers.stream().collect(Collectors.summingInt(x -> x)); // collect
--- end code ---
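
The remaining terminal operations can be exercised with a small sketch like the one below. TerminalDemo and its helper names are assumptions made for illustration only.

```java
import java.util.List;
import java.util.Optional;

class TerminalDemo {
    // Three-argument reduce: the accumulator folds a String into an int,
    // and the combiner merges partial sums (needed when the result type differs from T).
    static int totalLength(List<String> words) {
        return words.stream().reduce(0, (acc, w) -> acc + w.length(), Integer::sum);
    }

    // Without an identity there may be no result, hence the Optional.
    static Optional<Integer> product(List<Integer> nums) {
        return nums.stream().reduce((a, b) -> a * b);
    }

    // anyMatch stops at the first match.
    static boolean anyNegative(List<Integer> nums) {
        return nums.stream().anyMatch(n -> n < 0);
    }

    // allMatch is vacuously true on an empty stream.
    static boolean allPositive(List<Integer> nums) {
        return nums.stream().allMatch(n -> n > 0);
    }

    // The generator form of toArray produces a typed array.
    static String[] asArray(List<String> words) {
        return words.stream().toArray(String[]::new);
    }

    public static void main(String[] args) {
        System.out.println(totalLength(List.of("ab", "cde"))); // 5
        System.out.println(product(List.of(2, 3, 4)));          // Optional[24]
        System.out.println(product(List.of()));                 // Optional.empty
        System.out.println(anyNegative(List.of(1, -2, 3)));     // true
        System.out.println(allPositive(List.of()));             // true
        System.out.println(asArray(List.of("x", "y")).length);  // 2
    }
}
```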

Primitive Streams (IntStream, LongStream, DoubleStream): What’s Special?
• Avoid boxing overhead; provide numeric ops: sum(), average(), summaryStatistics(), range(), rangeClosed() (range and rangeClosed exist on IntStream and LongStream only).
• Conversions: mapToObj, boxed, asLongStream, asDoubleStream
• Corner case: average() returns OptionalDouble; handle empty streams.
--- code ---

IntSummaryStatistics s = IntStream.of(1,2,3).summaryStatistics();
System.out.println(s.getCount()+", "+s.getSum()+", "+s.getAverage());
--- end code ---
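
A short sketch of the range, conversion, and empty-average points above. The PrimitiveDemo class and its method names are hypothetical.

```java
import java.util.List;
import java.util.stream.IntStream;

class PrimitiveDemo {
    // rangeClosed includes the upper bound; range excludes it.
    static int sumTo(int n) {
        return IntStream.rangeClosed(1, n).sum();
    }

    // average() returns OptionalDouble, so an empty input needs an explicit default.
    static double averageOrZero(int... values) {
        return IntStream.of(values).average().orElse(0.0);
    }

    // mapToObj leaves the primitive world and produces a Stream<String>.
    static List<String> labels() {
        return IntStream.range(0, 3).mapToObj(i -> "item" + i).toList();
    }

    // boxed converts IntStream to Stream<Integer> so it can be collected to a List.
    static List<Integer> sortedBoxed() {
        return IntStream.of(3, 1, 2).sorted().boxed().toList();
    }

    public static void main(String[] args) {
        System.out.println(sumTo(100));             // 5050
        System.out.println(averageOrZero());        // 0.0
        System.out.println(averageOrZero(1, 2, 3)); // 2.0
        System.out.println(labels());               // [item0, item1, item2]
        System.out.println(sortedBoxed());          // [1, 2, 3]
    }
}
```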

Optionals from Streams


• findFirst/findAny/min/max → Optional<T>
• Primitive variants: OptionalInt/Long/Double
• Common handling: orElse, orElseGet, orElseThrow, ifPresent, ifPresentOrElse (JDK 9+)
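
The handling methods above might be used as in the sketch below. OptionalDemo and its helpers are illustrative names, not something the guide defines.

```java
import java.util.Comparator;
import java.util.List;
import java.util.OptionalInt;

class OptionalDemo {
    // orElse supplies a default when findFirst finds nothing.
    static String firstLongOrDefault(List<String> words, int minLen) {
        return words.stream()
                .filter(w -> w.length() >= minLen)
                .findFirst()
                .orElse("none");
    }

    // orElseThrow turns "no element" into an explicit failure.
    static int maxOrThrow(List<Integer> nums) {
        return nums.stream()
                .max(Comparator.naturalOrder())
                .orElseThrow(() -> new IllegalArgumentException("empty input"));
    }

    // ifPresentOrElse (JDK 9+) handles both branches without isPresent checks.
    static String describe(OptionalInt value) {
        StringBuilder sb = new StringBuilder();
        value.ifPresentOrElse(
                v -> sb.append("value=").append(v),
                () -> sb.append("empty"));
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(firstLongOrDefault(List.of("a", "bbb"), 2)); // bbb
        System.out.println(firstLongOrDefault(List.of("a"), 2));        // none
        System.out.println(maxOrThrow(List.of(3, 9, 4)));               // 9
        System.out.println(describe(OptionalInt.of(7)));                // value=7
        System.out.println(describe(OptionalInt.empty()));              // empty
    }
}
```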

Parallel Streams — Use with Care


• Good for CPU-bound, associative operations over large, non-contentious data.
• Avoid with I/O, synchronization, or tiny datasets.
• Ensure combiner in reduce/collect is associative and side-effect free.
--- code ---

// Parallel frequency count (counting is associative, so it is safe in parallel)
Map<String, Long> freq = words.parallelStream()
    .collect(Collectors.groupingByConcurrent(w -> w,
        Collectors.counting()));
--- end code ---
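
To make the associativity requirement concrete, here is a minimal sketch with hypothetical names (ParallelDemo, sumSequential, sumParallel, doubled). It only shows that an associative, side-effect-free reduction gives the same answer in both modes; it makes no claim about speed.

```java
import java.util.List;
import java.util.stream.LongStream;

class ParallelDemo {
    static long sumSequential(long n) {
        return LongStream.rangeClosed(1, n).sum();
    }

    // Long::sum is associative and has identity 0, so partial results
    // computed on different threads can be merged in any grouping.
    static long sumParallel(long n) {
        return LongStream.rangeClosed(1, n).parallel().reduce(0L, Long::sum);
    }

    // toList() keeps encounter order for an ordered source, even in parallel.
    static List<Integer> doubled(List<Integer> in) {
        return in.parallelStream().map(n -> n * 2).toList();
    }

    public static void main(String[] args) {
        System.out.println(sumSequential(1_000));     // 500500
        System.out.println(sumParallel(1_000));       // 500500
        System.out.println(doubled(List.of(1, 2, 3))); // [2, 4, 6]
    }
}
```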

Collectors — The Swiss Army Knife

Materializing
• toList() (no guarantee on the type or mutability of the returned List) and toUnmodifiableList() (JDK 10+)
• toSet(), toUnmodifiableSet() (JDK 10+)
• toCollection(Supplier<C>): choose collection type (e.g., LinkedHashSet).
• joining() / joining(delim[, prefix, suffix]): concatenate CharSequence elements.
Maps
• toMap(keyMapper, valueMapper): throws IllegalStateException on duplicate keys.
• toMap(kMapper, vMapper, mergeFn): resolve duplicates (common).
• toMap(kMapper, vMapper, mergeFn, mapSupplier): choose map type.
• toUnmodifiableMap(...) (JDK 10+)
• toConcurrentMap(...): concurrent accumulation.
--- code ---

// Safe toMap with merge on duplicate keys (keep larger value)
Map<String,Integer> bestScore =
    entries.stream().collect(Collectors.toMap(
        e -> e.name(),
        e -> e.score(),
        Integer::max
    ));
--- end code ---
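
The four-argument toMap, the duplicate-key failure, and joining can be sketched as follows. The Entry record and the ToMapDemo class are assumptions standing in for whatever type the entries collection above holds (records need JDK 16+).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ToMapDemo {
    // Hypothetical element type for illustration.
    record Entry(String name, int score) {}

    // The map supplier picks LinkedHashMap, so keys keep first-seen order.
    static Map<String, Integer> bestScoreOrdered(List<Entry> entries) {
        return entries.stream().collect(Collectors.toMap(
                Entry::name, Entry::score, Integer::max, LinkedHashMap::new));
    }

    // The two-argument form fails fast when a key repeats.
    static boolean throwsOnDuplicate(List<Entry> entries) {
        try {
            entries.stream().collect(Collectors.toMap(Entry::name, Entry::score));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    static String joinedNames(List<Entry> entries) {
        return entries.stream()
                .map(Entry::name)
                .distinct()
                .collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        List<Entry> entries = List.of(
                new Entry("ann", 3), new Entry("bob", 5), new Entry("ann", 7));
        System.out.println(bestScoreOrdered(entries));  // {ann=7, bob=5}
        System.out.println(throwsOnDuplicate(entries)); // true
        System.out.println(joinedNames(entries));       // [ann, bob]
    }
}
```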

Grouping & Partitioning


• groupingBy(classifier): Map<K,List<V>>
• groupingBy(classifier, downstream): Map<K,R>
• groupingBy(classifier, mapFactory, downstream): choose map type (e.g., LinkedHashMap, TreeMap).
• groupingByConcurrent(...): concurrent version.
• partitioningBy(predicate): Map<Boolean,List<T>>
• partitioningBy(predicate, downstream)
--- code ---

// Group employees by department and count
Map<String, Long> counts =
    emps.stream().collect(Collectors.groupingBy(
        Employee::dept, Collectors.counting()
    ));
--- end code ---
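
The map-factory and partitioning variants listed above could look like this. The Employee record here is a hypothetical stand-in with name, dept, and score components; GroupingDemo is likewise an invented name.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class GroupingDemo {
    // Hypothetical element type for illustration.
    record Employee(String name, String dept, int score) {}

    // partitioningBy always produces both keys, true and false,
    // even when one side ends up empty.
    static Map<Boolean, List<String>> passFail(List<Employee> emps, int threshold) {
        return emps.stream().collect(Collectors.partitioningBy(
                e -> e.score() >= threshold,
                Collectors.mapping(Employee::name, Collectors.toList())));
    }

    // The map factory makes the result a TreeMap, so departments come out sorted.
    static Map<String, Long> countsSorted(List<Employee> emps) {
        return emps.stream().collect(Collectors.groupingBy(
                Employee::dept, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Employee> emps = List.of(
                new Employee("ann", "ops", 80),
                new Employee("bob", "dev", 55),
                new Employee("cy", "dev", 90));
        System.out.println(passFail(emps, 60));  // {false=[bob], true=[ann, cy]}
        System.out.println(countsSorted(emps));  // {dev=2, ops=1}
    }
}
```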

Math & Stats


• counting()
• summingInt/Long/Double(mapper)
• averagingInt/Long/Double(mapper)
• summarizingInt/Long/Double(mapper)
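
A brief sketch of the statistics collectors above, using hypothetical names (StatsDemo, lengthStats, averageLength). Note that averagingInt yields 0.0 for an empty stream rather than an Optional.

```java
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.stream.Collectors;

class StatsDemo {
    // One pass yields count, sum, min, max, and average together.
    static IntSummaryStatistics lengthStats(List<String> words) {
        return words.stream().collect(Collectors.summarizingInt(String::length));
    }

    // averagingInt returns a Double; for an empty input the result is 0.0.
    static double averageLength(List<String> words) {
        return words.stream().collect(Collectors.averagingInt(String::length));
    }

    public static void main(String[] args) {
        List<String> words = List.of("a", "bbb", "cc");
        IntSummaryStatistics s = lengthStats(words);
        System.out.println(s.getCount() + ", " + s.getSum() + ", "
                + s.getMin() + ", " + s.getMax()); // 3, 6, 1, 3
        System.out.println(averageLength(words));     // 2.0
        System.out.println(averageLength(List.of())); // 0.0
    }
}
```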

Transformers & Advanced


• mapping(mapper, downstream): map + collect in one pass.
• flatMapping(mapperToStream, downstream) (JDK 9+): flatMap + collect.
• filtering(predicate, downstream) (JDK 9+): filter within group.
• reducing(identity, mapper, op): reduction as a collector.
• collectingAndThen(downstream, finisher): post-process result.
• teeing(down1, down2, merger) (JDK 12+): combine two collectors.
--- code ---

// Top N per group using collectingAndThen
Map<String, List<Employee>> top3 =
    emps.stream().collect(Collectors.groupingBy(
        Employee::dept,
        Collectors.collectingAndThen(
            Collectors.toList(),
            list -> list.stream()
                .sorted(Comparator.comparing(Employee::score).reversed())
                .limit(3)
                .toList()
        )
    ));
--- end code ---
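
Two of the other collectors from the list, teeing and filtering, as a hedged sketch. The Employee record and AdvancedDemo class are hypothetical; the lambda parameters carry explicit types only to keep type inference simple.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class AdvancedDemo {
    // Hypothetical element type for illustration.
    record Employee(String name, String dept, int score) {}

    // teeing (JDK 12+) feeds every element to two collectors and merges their results.
    static double averageScore(List<Employee> emps) {
        return emps.stream().collect(Collectors.teeing(
                Collectors.summingInt(Employee::score),
                Collectors.counting(),
                (Integer sum, Long count) -> count == 0 ? 0.0 : sum.doubleValue() / count));
    }

    // filtering (JDK 9+) runs inside each group, so a department whose members
    // are all filtered out still appears with an empty list. Calling filter()
    // before groupingBy would drop that department entirely.
    static Map<String, List<String>> highScorersByDept(List<Employee> emps, int min) {
        return emps.stream().collect(Collectors.groupingBy(
                Employee::dept,
                Collectors.filtering(
                        (Employee e) -> e.score() >= min,
                        Collectors.mapping(Employee::name, Collectors.toList()))));
    }

    public static void main(String[] args) {
        List<Employee> emps = List.of(
                new Employee("ann", "ops", 40),
                new Employee("bob", "dev", 60),
                new Employee("cy", "dev", 80));
        System.out.println(averageScore(emps));           // 60.0
        System.out.println(highScorersByDept(emps, 70));  // ops maps to [], dev maps to [cy]
    }
}
```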

Collector Mechanics (for Custom Collectors)


• A Collector has supplier, accumulator, combiner, finisher, characteristics
• Characteristics: UNORDERED, CONCURRENT, IDENTITY_FINISH
• Rule: combiner must merge two partial results associatively; safe under parallelism.
--- code ---

// Minimal custom Collector: joining ints with brackets
Collector<Integer,StringJoiner,String> bracketJoin =
    Collector.of(
        () -> new StringJoiner(", ", "[", "]"),
        (sj, i) -> sj.add(String.valueOf(i)),
        (a, b) -> a.merge(b),
        StringJoiner::toString
    );
String s = Stream.of(1,2,3).collect(bracketJoin); // [1, 2, 3]
--- end code ---

Corner Cases & Gotchas (Quick Hits)


• Empty min/max/average → empty Optional; handle the default.
• Collectors.toMap throws IllegalStateException on duplicate keys unless a merge function is provided.
• peek never runs without a terminal operation, and may be skipped even with one (e.g., when count() can be computed directly from the source, JDK 9+); don’t rely on side effects.
• Parallel forEach is unordered; use forEachOrdered for order (slower).
• Stream.toList() is unmodifiable; trying to add throws UnsupportedOperationException.
• Files.lines creates a stream that must be closed (try-with-resources).
• distinct uses equals/hashCode; mutable elements can break it.
• Avoid shared mutable state in lambdas; use collectors instead.
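
Two of the gotchas above (closing Files.lines and the unmodifiable result of Stream.toList()) are sketched below. GotchasDemo, its helper names, and the temporary file are assumptions made for the example; Files.writeString and isBlank need JDK 11+.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Stream;

class GotchasDemo {
    // try-with-resources closes the stream and releases the underlying file handle.
    static long countNonBlankLines(Path file) throws IOException {
        try (Stream<String> lines = Files.lines(file)) {
            return lines.filter(l -> !l.isBlank()).count();
        }
    }

    // Writes a small temporary file and counts its non-blank lines.
    static long demoCount() {
        try {
            Path tmp = Files.createTempFile("stream-demo", ".txt");
            try {
                Files.writeString(tmp, "a\n\nb\n");
                return countNonBlankLines(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns true when the list rejects modification.
    static boolean isUnmodifiable(List<String> list) {
        try {
            list.add("x");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(demoCount());                                // 2
        System.out.println(isUnmodifiable(Stream.of("a", "b").toList())); // true
    }
}
```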
Most Useful Day-to-Day APIs
• filter → map → collect(toList())
• flatMap for one-to-many transformations
• groupingBy + counting/summing/collectingAndThen
• toMap with merge function
• toList() (JDK 16+) over Collectors.toList() when you want unmodifiable results
• takeWhile/dropWhile for streaming prefixes/suffixes

Worked Examples

Frequency Map
--- code ---

Map<String, Long> freq =
    words.stream()
        .map(String::toLowerCase)
        .collect(Collectors.groupingBy(w -> w,
            Collectors.counting()));
--- end code ---

First Non-Empty String


--- code ---

// isBlank() (JDK 11+) also rejects whitespace-only strings
Optional<String> first =
    strings.stream().filter(s -> s != null && !s.isBlank()).findFirst();
--- end code ---

Safe toMap with Duplicates


--- code ---

Map<String, String> latest =
    entries.stream().collect(Collectors.toMap(
        e -> e.key(),
        e -> e.value(),
        (a,b) -> b // keep last
    ));
--- end code ---

API Reference — Stream<T> (by category)

Creation
of, ofNullable, empty, generate, iterate (2 overloads), builder; Arrays.stream;
Collection.stream/parallelStream; Files.lines; Pattern.splitAsStream, BufferedReader.lines
Intermediate
filter, map, mapToInt/Long/Double, flatMap, flatMapToInt/Long/Double, mapMulti, distinct, sorted,
peek, limit, skip, takeWhile, dropWhile, parallel, sequential, unordered, onClose

Terminal
forEach, forEachOrdered, toArray, reduce(3), collect(Collector), collect(supplier,acc,combiner), min, max,
count, anyMatch, allMatch, noneMatch, findFirst, findAny, toList (JDK 16+)

API Reference — IntStream/LongStream/DoubleStream


range, rangeClosed (Int/Long); sum, average, min, max, count, summaryStatistics; map, mapToObj,
flatMap, mapMulti; distinct, sorted, limit, skip; boxed; asDoubleStream/asLongStream; parallel,
sequential; collect; reduce; anyMatch/allMatch/noneMatch; findFirst/findAny; toArray; iterate/generate

API Reference — Collectors


toList, toUnmodifiableList, toSet, toUnmodifiableSet, toCollection, toMap (3 overloads),
toUnmodifiableMap, toConcurrentMap (3 overloads), joining (3 overloads), counting,
summingInt/Long/Double, averagingInt/Long/Double, summarizingInt/Long/Double, mapping, filtering,
flatMapping, reducing (3 overloads), collectingAndThen, partitioningBy (2 overloads), groupingBy (3
overloads), groupingByConcurrent (3 overloads), teeing (JDK 12+)

Under the Hood: Spliterators


• A Stream is backed by a Spliterator with characteristics like ORDERED,
DISTINCT, SORTED, SIZED, NONNULL, IMMUTABLE, CONCURRENT, SUBSIZED
• Parallel splits work best with balanced, efficiently splittable sources (e.g., ArrayList).
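
The characteristics and splitting behavior above can be inspected directly. This is a minimal sketch with invented names (SpliteratorDemo, describe, splitSizes); it checks only a subset of the characteristic flags.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Spliterator;
import java.util.TreeSet;

class SpliteratorDemo {
    // Lists which of a few characteristics a spliterator reports.
    static String describe(Spliterator<?> s) {
        StringBuilder sb = new StringBuilder();
        if (s.hasCharacteristics(Spliterator.ORDERED)) sb.append("ORDERED ");
        if (s.hasCharacteristics(Spliterator.DISTINCT)) sb.append("DISTINCT ");
        if (s.hasCharacteristics(Spliterator.SORTED)) sb.append("SORTED ");
        if (s.hasCharacteristics(Spliterator.SIZED)) sb.append("SIZED ");
        if (s.hasCharacteristics(Spliterator.SUBSIZED)) sb.append("SUBSIZED ");
        return sb.toString().trim();
    }

    // trySplit hands off a prefix of the elements; an array-backed list splits in half.
    static long[] splitSizes(List<Integer> list) {
        Spliterator<Integer> rest = list.spliterator();
        Spliterator<Integer> prefix = rest.trySplit(); // may be null if the source cannot split
        return new long[] { prefix == null ? 0 : prefix.estimateSize(), rest.estimateSize() };
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(List.of(1, 2, 3, 4, 5, 6, 7, 8));
        System.out.println(describe(list.spliterator()));
        System.out.println(describe(new TreeSet<>(list).spliterator()));
        long[] sizes = splitSizes(list);
        System.out.println(sizes[0] + " + " + sizes[1]); // 4 + 4
    }
}
```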

Interview Tips & Patterns


• Explain laziness and stateless vs stateful operations with examples.
• Know toMap duplicate handling and groupingBy + downstream combos.
• Avoid side effects; prefer collectors and pure functions.
• Choose sequential vs parallel based on workload and data size.
