Stream Operations
Streams, introduced in Java 8, provide a powerful and expressive way to process collections of data. They enable functional-style operations on collections, allowing for concise and readable code while supporting parallel execution.
Introduction to Streams
A stream is a sequence of elements supporting sequential and parallel aggregate operations. Streams are not data structures; they take input from collections, arrays, or I/O channels and don’t modify the original data source.
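As a minimal illustrative sketch of that last point (assuming the usual java.util and java.util.stream imports), mapping over a list produces a new result and leaves the source collection untouched:

// Illustrative sketch: the pipeline builds a new list; the source list is unchanged
List<String> fruits = Arrays.asList("apple", "banana", "cherry");

List<String> upperCased = fruits.stream()
    .map(String::toUpperCase)
    .collect(Collectors.toList());

System.out.println(fruits);     // [apple, banana, cherry] - unchanged
System.out.println(upperCased); // [APPLE, BANANA, CHERRY]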
Key Characteristics
- Not a data structure: Streams don’t store elements; they convey elements from a source through a pipeline of operations
- Functional in nature: Operations produce results without modifying the source
- Laziness-seeking: Many stream operations are lazy, allowing for optimization
- Possibly unbounded: Streams can represent infinite sequences
- Consumable: Elements are visited only once during the life of a stream
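Because streams are consumable, reusing a stream after its terminal operation fails. A minimal sketch (illustrative):

Stream<String> names = Stream.of("Alice", "Bob");

names.forEach(System.out::println); // terminal operation consumes the stream
// names.count();                   // would now throw IllegalStateException:
//                                  // "stream has already been operated upon or closed"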
Stream Pipeline Structure
A stream pipeline consists of:
- Source: Where the stream comes from (collection, array, generator function, I/O channel)
- Intermediate operations: Transform the stream into another stream (filter, map, sorted, etc.)
- Terminal operation: Produces a result or side-effect (collect, forEach, reduce, etc.)
// Stream pipeline example
List<String> result = people.stream()        // Source
    .filter(p -> p.getAge() > 18)            // Intermediate operation
    .map(Person::getName)                    // Intermediate operation
    .collect(Collectors.toList());           // Terminal operation

Creating Streams
From Collections
List<String> list = Arrays.asList("apple", "banana", "cherry");
// Create a stream from a collection
Stream<String> stream = list.stream();
// Create a parallel stream
Stream<String> parallelStream = list.parallelStream();

From Arrays
String[] array = {"apple", "banana", "cherry"};
// Create a stream from an array
Stream<String> stream = Arrays.stream(array);
// Create a stream from array with range
Stream<String> partialStream = Arrays.stream(array, 1, 3); // banana, cherry

From Static Factory Methods
// Empty stream
Stream<String> emptyStream = Stream.empty();
// Stream with explicit values
Stream<String> valueStream = Stream.of("apple", "banana", "cherry");
// Infinite streams
Stream<Integer> infiniteIntegers = Stream.iterate(0, n -> n + 1);
Stream<Double> infiniteRandoms = Stream.generate(Math::random);
// Bounded streams from infinite sources
Stream<Integer> first10Integers = Stream.iterate(0, n -> n + 1).limit(10);

From Other Sources
// From a file (lines as stream)
try (Stream<String> lines = Files.lines(Paths.get("file.txt"))) {
    // Process lines
}
// From a string (characters as stream)
IntStream charStream = "hello".chars();
// From a regex pattern
Pattern pattern = Pattern.compile(",");
Stream<String> patternStream = pattern.splitAsStream("a,b,c");

Intermediate Operations
Intermediate operations transform a stream into another stream. They are lazy and are executed only when a terminal operation is invoked.
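A small sketch of this laziness (illustrative; assumes the people list used throughout this page): nothing is printed until the terminal operation runs.

Stream<Person> pipeline = people.stream()
    .filter(p -> {
        System.out.println("filtering " + p.getName()); // not printed yet
        return p.getAge() >= 18;
    });

// Only when the terminal operation is invoked does the filter actually execute
List<Person> filtered = pipeline.collect(Collectors.toList());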
Filtering Operations
List<Person> people = getPeople(); // Assume this returns a list of Person objects
// Filter by predicate
Stream<Person> adults = people.stream()
    .filter(person -> person.getAge() >= 18);
// Limit number of elements
Stream<Person> first10Adults = adults.limit(10);

// Skip elements (note: a stream can only be operated on once, so build a fresh one here)
Stream<Person> skippedAdults = people.stream()
    .filter(person -> person.getAge() >= 18)
    .skip(5);
// Remove duplicates
Stream<String> uniqueNames = people.stream()
    .map(Person::getName)
    .distinct();

Mapping Operations
// Transform each element
Stream<String> names = people.stream()
    .map(Person::getName);
// Transform to primitives (avoid boxing/unboxing)
IntStream ages = people.stream()
    .mapToInt(Person::getAge);
// Flatten nested streams
List<List<String>> nestedList = Arrays.asList(
    Arrays.asList("a", "b"),
    Arrays.asList("c", "d"));
Stream<String> flattenedStream = nestedList.stream()
    .flatMap(Collection::stream); // [a, b, c, d]

Sorting Operations
// Natural order sorting
Stream<String> sortedNames = people.stream()
    .map(Person::getName)
    .sorted();
// Custom comparator
Stream<Person> sortedByAge = people.stream()
    .sorted(Comparator.comparingInt(Person::getAge));
// Reverse order
Stream<Person> sortedByAgeDesc = people.stream()
    .sorted(Comparator.comparingInt(Person::getAge).reversed());
// Multiple criteria
Stream<Person> sortedComplex = people.stream()
    .sorted(Comparator.comparing(Person::getLastName)
        .thenComparing(Person::getFirstName)
        .thenComparingInt(Person::getAge));

Peeking Operations
// Peek is useful for debugging
Stream<String> peekStream = people.stream()
    .filter(p -> p.getAge() > 30)
    .peek(p -> System.out.println("Filtered: " + p.getName()))
    .map(Person::getName)
    .peek(name -> System.out.println("Mapped: " + name));

Terminal Operations
Terminal operations produce a result or side-effect and end the stream pipeline.
Collection Operations
// Collect to List
List<String> nameList = people.stream()
    .map(Person::getName)
    .collect(Collectors.toList());
// Collect to Set
Set<String> nameSet = people.stream()
    .map(Person::getName)
    .collect(Collectors.toSet());
// Collect to Map
Map<String, Integer> nameToAgeMap = people.stream()
    .collect(Collectors.toMap(
        Person::getName, // Key mapper
        Person::getAge   // Value mapper
    ));
// Handle duplicate keys
Map<String, Person> lastNameToPerson = people.stream()
    .collect(Collectors.toMap(
        Person::getLastName,
        Function.identity(),
        (existing, replacement) -> existing // Keep first occurrence on duplicate
    ));
// Collect to specific collection type
LinkedList<String> linkedList = people.stream()
    .map(Person::getName)
    .collect(Collectors.toCollection(LinkedList::new));

Reduction Operations
// Count
long count = people.stream().count();
// Any/All/None match
boolean anyAdults = people.stream().anyMatch(p -> p.getAge() >= 18);
boolean allAdults = people.stream().allMatch(p -> p.getAge() >= 18);
boolean noAdults = people.stream().noneMatch(p -> p.getAge() >= 18);
// Find operations
Optional<Person> anyPerson = people.stream().findAny();
Optional<Person> firstPerson = people.stream().findFirst();
// Min/Max
Optional<Person> oldest = people.stream()
    .max(Comparator.comparingInt(Person::getAge));
Optional<Person> youngest = people.stream()
    .min(Comparator.comparingInt(Person::getAge));
// Reduce
Optional<Integer> totalAge = people.stream()
    .map(Person::getAge)
    .reduce(Integer::sum);
// Reduce with identity value
int totalAgeWithIdentity = people.stream()
    .map(Person::getAge)
    .reduce(0, Integer::sum);
// Reduce with combiner (for parallel streams)
int totalAgeParallel = people.stream()
    .reduce(0,
        (subtotal, person) -> subtotal + person.getAge(), // Accumulator
        Integer::sum                                      // Combiner
    );

Iteration and Side-Effects
// ForEach (side-effect operation)
people.stream().forEach(System.out::println);
// ForEachOrdered (maintains encounter order in parallel streams)
people.parallelStream().forEachOrdered(System.out::println);
// ToArray
Person[] personArray = people.stream().toArray(Person[]::new);

Advanced Collectors
The Collectors utility class provides many powerful collectors for common reduction operations.
Grouping and Partitioning
// Group by a classifier
Map<String, List<Person>> peopleByLastName = people.stream()
    .collect(Collectors.groupingBy(Person::getLastName));
// Group by with downstream collector
Map<String, Long> countByLastName = people.stream()
    .collect(Collectors.groupingBy(
        Person::getLastName,
        Collectors.counting()
    ));
// Multi-level grouping
Map<String, Map<Integer, List<Person>>> peopleByLastNameAndAge = people.stream()
    .collect(Collectors.groupingBy(
        Person::getLastName,
        Collectors.groupingBy(Person::getAge)
    ));
// Partition by (special case of groupingBy with boolean predicate)
Map<Boolean, List<Person>> adultPartition = people.stream()
    .collect(Collectors.partitioningBy(p -> p.getAge() >= 18));
// Access partitioned results
List<Person> adults = adultPartition.get(true);
List<Person> minors = adultPartition.get(false);

Statistics and Summarization
// Counting
long personCount = people.stream()
    .collect(Collectors.counting());
// Summing
int totalAge = people.stream()
    .collect(Collectors.summingInt(Person::getAge));
// Averaging
double averageAge = people.stream()
    .collect(Collectors.averagingInt(Person::getAge));
// Min/Max by comparator
Optional<Person> oldestPerson = people.stream()
    .collect(Collectors.maxBy(Comparator.comparingInt(Person::getAge)));
// Statistics summary
IntSummaryStatistics ageStats = people.stream()
    .collect(Collectors.summarizingInt(Person::getAge));
System.out.println("Count: " + ageStats.getCount());
System.out.println("Sum: " + ageStats.getSum());
System.out.println("Min: " + ageStats.getMin());
System.out.println("Max: " + ageStats.getMax());
System.out.println("Average: " + ageStats.getAverage());

String Joining
// Simple joining
String allNames = people.stream()
    .map(Person::getName)
    .collect(Collectors.joining());
// Joining with delimiter
String commaSeparatedNames = people.stream()
    .map(Person::getName)
    .collect(Collectors.joining(", "));
// Joining with delimiter, prefix, and suffix
String formattedNames = people.stream()
    .map(Person::getName)
    .collect(Collectors.joining(", ", "[", "]"));

Custom Collectors
// Custom collector to create an ImmutableList (using Guava)
ImmutableList<Person> immutablePeople = people.stream()
    .collect(Collectors.collectingAndThen(
        Collectors.toList(),
        ImmutableList::copyOf
    ));
// Custom collector with mapping
Map<String, Person> nameToPersonMap = people.stream()
    .collect(Collectors.toMap(
        Person::getName,
        Function.identity()
    ));

Parallel Streams
Parallel streams leverage the fork/join framework to execute operations in parallel across multiple threads.
// Create a parallel stream from a collection
List<String> result = people.parallelStream()
    .filter(p -> p.getAge() > 18)
    .map(Person::getName)
    .collect(Collectors.toList());
// Convert sequential stream to parallel
Stream<Person> parallelStream = people.stream().parallel();
// Convert parallel stream to sequential
Stream<Person> sequentialStream = people.parallelStream().sequential();

When to Use Parallel Streams
Good candidates for parallel streams:
- Large data sets
- Operations that are computationally intensive
- Operations where elements can be processed independently
- When you have enough cores available
Poor candidates for parallel streams:
- Small data sets (overhead may exceed benefits)
- Operations with side effects
- Operations that rely on encounter order
- When using sources or sinks that are inherently sequential
// Example: CPU-intensive calculation on large dataset
long sum = IntStream.range(0, 10_000_000)
    .parallel()
    .map(n -> performExpensiveComputation(n))
    .sum();

Common Use Cases and Patterns
Filtering and Transforming
// Find all unique, non-empty, lowercase words
List<String> words = Arrays.asList("Hello", "World", "", "Java", "hello", "world");
List<String> uniqueLowercase = words.stream()
    .filter(s -> !s.isEmpty())
    .map(String::toLowerCase)
    .distinct()
    .collect(Collectors.toList());
// Result: [hello, world, java]

Finding and Matching
// Find the first person over 30 with a specific last name
Optional<Person> match = people.stream()
    .filter(p -> p.getAge() > 30)
    .filter(p -> "Smith".equals(p.getLastName()))
    .findFirst();
// Check if any person is from a specific city
boolean anyFromNewYork = people.stream()
    .anyMatch(p -> "New York".equals(p.getCity()));

Aggregation and Summarization
// Calculate total salary by department
Map<String, Double> totalSalaryByDept = employees.stream()
    .collect(Collectors.groupingBy(
        Employee::getDepartment,
        Collectors.summingDouble(Employee::getSalary)
    ));
// Find highest paid employee by department
Map<String, Optional<Employee>> topEarnerByDept = employees.stream()
    .collect(Collectors.groupingBy(
        Employee::getDepartment,
        Collectors.maxBy(Comparator.comparingDouble(Employee::getSalary))
    ));

Data Transformation
// Convert list of employees to a map of employee ID to employee
Map<Long, Employee> employeeMap = employees.stream()
    .collect(Collectors.toMap(
        Employee::getId,
        Function.identity()
    ));
// Extract a specific property from all objects
List<String> allEmails = employees.stream()
    .map(Employee::getEmail)
    .collect(Collectors.toList());

Complex Data Processing
// Find average salary by department for employees with age > 30
Map<String, Double> avgSalaryByDept = employees.stream()
    .filter(e -> e.getAge() > 30)
    .collect(Collectors.groupingBy(
        Employee::getDepartment,
        Collectors.averagingDouble(Employee::getSalary)
    ));
// Count employees by department and position
Map<String, Map<String, Long>> countByDeptAndPosition = employees.stream()
    .collect(Collectors.groupingBy(
        Employee::getDepartment,
        Collectors.groupingBy(
            Employee::getPosition,
            Collectors.counting()
        )
    ));

Collecting to Custom Objects
// Collect stream results into a custom summary object
class DepartmentSummary {
    private final String name;
    private final long employeeCount;
    private final double totalSalary;
    private final double averageSalary;

    // Constructor, getters...
}
List<DepartmentSummary> summaries = employees.stream()
    .collect(Collectors.groupingBy(Employee::getDepartment))
    .entrySet().stream()
    .map(entry -> {
        String dept = entry.getKey();
        List<Employee> deptEmployees = entry.getValue();
        long count = deptEmployees.size();
        double totalSalary = deptEmployees.stream()
            .mapToDouble(Employee::getSalary)
            .sum();

        return new DepartmentSummary(
            dept,
            count,
            totalSalary,
            totalSalary / count
        );
    })
    .collect(Collectors.toList());

Interview Questions on Stream Operations
1. What is the difference between intermediate and terminal operations in streams?
Answer:
- Intermediate operations (like filter, map, sorted) transform a stream into another stream, are lazy (not executed until a terminal operation is invoked), and can be chained together.
- Terminal operations (like collect, forEach, reduce) produce a result or side-effect, trigger the execution of the stream pipeline, and end the stream (it cannot be reused after).
2. Explain the difference between map and flatMap operations.
Answer:
- map transforms each element of a stream into exactly one element of another stream using a one-to-one mapping function.
- flatMap transforms each element of a stream into zero or more elements of another stream using a one-to-many mapping function, and then flattens these multiple streams into a single stream.
// map: Stream<T> -> Stream<R> (one-to-one)
Stream<String> upperCaseNames = names.stream()
    .map(String::toUpperCase);
// flatMap: Stream<T> -> Stream<Stream<R>> -> Stream<R> (one-to-many + flattening)
Stream<String> allWords = sentences.stream()
    .flatMap(sentence -> Arrays.stream(sentence.split(" ")));

3. What is the purpose of the collect operation?
Answer: The collect operation is a terminal operation that transforms elements of a stream into a different form by accumulating them into a collection, summarizing them, or performing some other aggregation operation. It uses a Collector which encapsulates the functions needed to accumulate elements into a mutable result container and optionally transform the result.
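As an illustrative sketch (using the people list from earlier examples), the three-argument form of collect makes the mutable reduction explicit:

List<String> collectedNames = people.stream()
    .map(Person::getName)
    .collect(ArrayList::new,     // Supplier: creates the mutable result container
             ArrayList::add,     // Accumulator: adds one element to the container
             ArrayList::addAll); // Combiner: merges two containers (used in parallel)

In practice the Collector-based overload, collect(Collectors.toList()), bundles these same three functions for you.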
4. How do parallel streams work in Java?
Answer: Parallel streams leverage the fork/join framework to execute operations in parallel across multiple threads. When a parallel stream is created, the stream is split into multiple substreams, operations are performed on these substreams in parallel, and then the results are combined. This can lead to performance improvements for large data sets and computationally intensive operations, but introduces complexities like non-deterministic ordering and thread-safety concerns.
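A small illustrative sketch (output varies by machine and available cores): the work below is typically carried out by worker threads of ForkJoinPool.commonPool().

// Collect the names of the threads that actually processed elements
Set<String> workerThreads = IntStream.rangeClosed(1, 1_000)
    .parallel()
    .mapToObj(i -> Thread.currentThread().getName())
    .collect(Collectors.toSet());

System.out.println(workerThreads);
// e.g. [main, ForkJoinPool.commonPool-worker-1, ForkJoinPool.commonPool-worker-2, ...]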
5. What are the potential issues with using parallel streams?
Answer:
- Thread safety: Operations must be stateless and non-interfering
- Overhead: For small data sets, the overhead of parallelization may exceed benefits
- Non-deterministic ordering: Results may come in different order than the input
- Common pool contention: Uses the common ForkJoinPool which might be used by other parts of the application
- Cost of merging results: Some operations are expensive to merge across threads
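To make the thread-safety point above concrete, here is a minimal sketch of a side-effecting parallel pipeline that can lose elements or fail, together with the safe collector-based alternative:

// Broken: accumulating into a shared, non-thread-safe list from many threads
List<Integer> unsafe = new ArrayList<>();
IntStream.range(0, 10_000)
    .parallel()
    .forEach(unsafe::add); // may lose elements or throw ArrayIndexOutOfBoundsException

// Safe: let the stream perform the mutable reduction for you
List<Integer> safe = IntStream.range(0, 10_000)
    .parallel()
    .boxed()
    .collect(Collectors.toList());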
6. Explain the difference between findFirst and findAny.
Answer:
- findFirst() returns an Optional describing the first element of the stream, respecting the encounter order if defined.
- findAny() returns any element of the stream, without guaranteeing which one. In parallel streams, findAny() may be more efficient as it doesn’t require coordination to respect encounter order.
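A brief illustrative sketch (the findAny result only tends to vary in practice for parallel streams):

List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);

Optional<Integer> first = numbers.parallelStream()
    .filter(n -> n % 2 == 0)
    .findFirst(); // always Optional[2]; encounter order is respected

Optional<Integer> any = numbers.parallelStream()
    .filter(n -> n % 2 == 0)
    .findAny();   // any even number, e.g. Optional[6]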
7. How would you implement a custom collector?
Answer: A custom collector can be implemented by providing the four functions required by the Collector interface:
public static <T> Collector<T, ?, CustomResult> toCustomResult() {
    return Collector.of(
        CustomAccumulator::new,          // Supplier: creates the accumulator
        CustomAccumulator::add,          // Accumulator: adds an element
        CustomAccumulator::combine,      // Combiner: combines two accumulators
        CustomAccumulator::finishResult  // Finisher: final transformation
    );
}

Alternatively, you can use Collectors.collectingAndThen() to transform the result of an existing collector.
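As a concrete sketch of the same idea (illustrative only; this simple collector joins strings into a StringBuilder and is not part of the original example):

public static Collector<String, StringBuilder, String> toJoinedString() {
    return Collector.of(
        StringBuilder::new,                  // Supplier: new empty builder
        (sb, s) -> sb.append(s).append(' '), // Accumulator: append one element
        StringBuilder::append,               // Combiner: merge two builders
        sb -> sb.toString().trim()           // Finisher: produce the final String
    );
}

// Usage
String joined = Stream.of("alpha", "beta", "gamma").collect(toJoinedString());
// "alpha beta gamma"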
8. What is the difference between reduce and collect operations?
Answer:
- reduce combines elements into a single result by repeatedly applying a combining operation. It’s best for operations where the result type is the same as the element type or when combining elements in a simple way.
- collect is more versatile and designed for mutable reduction operations that accumulate elements into a mutable container. It’s better for complex aggregations, especially when the result type differs from the element type.
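An illustrative side-by-side sketch (assuming a simple list of integers):

List<Integer> values = Arrays.asList(1, 2, 3, 4, 5);

// reduce: immutable reduction - each step combines two values into a new one
int total = values.stream()
    .reduce(0, Integer::sum);

// collect: mutable reduction - elements are accumulated into a container
List<Integer> doubled = values.stream()
    .map(n -> n * 2)
    .collect(Collectors.toList());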
9. How can you avoid ConcurrentModificationException when using streams with collections?
Answer: Streams over most non-concurrent collections (such as ArrayList) are backed by fail-fast spliterators, so structurally modifying the source collection while the pipeline is executing may throw a ConcurrentModificationException, just as it would during iteration. If you need to modify a collection based on stream results, do so after the pipeline has finished:
// Incorrect - may cause ConcurrentModificationException
list.forEach(item -> {
    if (someCondition(item)) {
        list.remove(item); // Modifying the collection during iteration
    }
});
// Correct - collect elements to remove first, then remove them
List<Item> itemsToRemove = list.stream()
    .filter(this::someCondition)
    .collect(Collectors.toList());
list.removeAll(itemsToRemove);
// Alternatively, use removeIf (Java 8+)
list.removeIf(this::someCondition);

10. How would you debug a stream pipeline?
Answer: The peek() operation is useful for debugging streams as it allows you to perform an action on each element without modifying the stream:
List<String> result = people.stream()
    .filter(p -> p.getAge() > 30)
    .peek(p -> System.out.println("After filter: " + p))
    .map(Person::getName)
    .peek(name -> System.out.println("After map: " + name))
    .collect(Collectors.toList());

Other debugging approaches include:
- Breaking complex pipelines into smaller parts
- Using logging instead of System.out.println in peek
- Using a debugger with conditional breakpoints
- Converting intermediate results to collections for inspection