collect(): Gathering Stream Elements into a Collection | Terminal Operations in the Stream API

You are already familiar with terminal operations and have used them in previous examples and exercises. Now it's time to take a closer look at how they work. First up is the collect() method, one of the key terminal operations in the Stream API.

The collect() Method

It is one of the most powerful tools when working with streams, allowing us to accumulate results into a List, Set, or Map, as well as perform complex groupings and statistical calculations.

The collect() method has two overloads. Let's explore both.

Using collect() with Functional Interfaces

The collect() method in Stream API can be used with three functional interfaces to give full control over data collection:

Supplier<R> supplier – creates an empty collection (R) where elements will be stored. For example, ArrayList::new initializes a new list;
BiConsumer<R, ? super T> accumulator – adds stream elements (T) to the collection (R). For instance, List::add appends items to a list;
BiConsumer<R, R> combiner – merges two collections when parallel processing is used. For example, List::addAll combines lists into one.

All three components work together to provide flexibility in data collection. First, the supplier creates an empty collection that will be used to accumulate elements from the stream. Then, the accumulator adds each element as the stream processes it. This flow remains straightforward in a sequential stream.

However, when working with parallel streams (parallelStream()), things get more complex.

The data processing is split across multiple threads, with each thread creating its own separate collection. Once the processing is complete, these individual collections need to be merged into a single result. This is where the combiner comes in, efficiently combining the separate parts into one unified collection.
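The flow described above can be traced by hand. The following is a minimal sketch, not what the Stream API runs internally: it assumes the data was split into exactly two chunks, and the class name CollectStepsSketch and the method collectInTwoChunks are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.Supplier;

class CollectStepsSketch {

    // Simulates supplier -> accumulator -> combiner for two chunks of data.
    // The two-chunk split is an assumption; a real parallel stream decides
    // on its own how to split the work.
    static List<String> collectInTwoChunks(List<String> firstChunk, List<String> secondChunk) {
        Supplier<List<String>> supplier = ArrayList::new;               // creates an empty container
        BiConsumer<List<String>, String> accumulator = List::add;       // adds one element
        BiConsumer<List<String>, List<String>> combiner = List::addAll; // merges two containers

        // Each chunk gets its own container from the supplier
        List<String> left = supplier.get();
        for (String item : firstChunk) {
            accumulator.accept(left, item);
        }

        List<String> right = supplier.get();
        for (String item : secondChunk) {
            accumulator.accept(right, item);
        }

        // The combiner folds the second container into the first
        combiner.accept(left, right);
        return left;
    }

    public static void main(String[] args) {
        System.out.println(collectInTwoChunks(List.of("Laptop", "Phone"), List.of("Tablet")));
    }
}
```

In a sequential stream only the first two steps are needed, because there is a single container and nothing to merge.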

Practical Example

You work for an online store and have a list of products. Your task is to collect only the products that cost more than $500 using the collect() method with three parameters.

Main.java


package com.example;

import java.util.ArrayList;
import java.util.List;

public class Main {
    public static void main(String[] args) {
        // Initial list of products
        List<Product> productList = List.of(
                new Product("Laptop", 1200.99),
                new Product("Phone", 599.49),
                new Product("Headphones", 199.99),
                new Product("Monitor", 299.99),
                new Product("Tablet", 699.99)
        );

        // Filtering and collecting products over $500 using `collect()`
        List<Product> expensiveProducts = productList.parallelStream()
                .filter(product -> product.getPrice() > 500) // Keep only expensive products
                .collect(
                        ArrayList::new,                      // Create a new list
                        (list, product) -> list.add(product), // Add each product to the list
                        ArrayList::addAll                     // Merge lists (if the stream is parallel)
                );

        // Print the result
        System.out.print("Products over $500: " + expensiveProducts);
    }
}

class Product {
    private String name;
    private double price;

    Product(String name, double price) {
        this.name = name;
        this.price = price;
    }

    public String getName() {
        return name;
    }

    public double getPrice() {
        return price;
    }

    @Override
    public String toString() {
        return name + " ($" + price + ")";
    }
}

The collect() method takes three arguments, each defining a different step in collecting elements into a list:

ArrayList::new (Supplier) → creates an empty ArrayList<Product> to store the results;
(list, product) -> list.add(product) (BiConsumer) → adds each Product that passed the filter (price > 500) to the list; the accumulator itself performs no check;
ArrayList::addAll (BiConsumer) → merges multiple lists when using parallel streams, ensuring all filtered products are combined into a single list.

Even though the third parameter is mainly for parallel processing, it’s required by collect().

Using collect() with the Collector Interface

In addition to working with three functional interfaces, the collect() method in Stream API can also be used with predefined implementations of the Collector interface.

This approach is more flexible and convenient, since the Collectors utility class provides ready-made collectors such as Collectors.toList(), Collectors.toSet(), and Collectors.groupingBy().
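As an illustration of predefined collectors, here is a short sketch that gathers results into a List, a Set, and a Map, and computes an average. The word list and the class name BuiltInCollectorsSketch are made up for this example; the collectors themselves come from the standard java.util.stream.Collectors class.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class BuiltInCollectorsSketch {

    // Hypothetical sample data, not taken from the lesson
    static final List<String> WORDS = List.of("apple", "kiwi", "plum", "apple");

    // Collects words longer than four characters into a List
    static List<String> longWords() {
        return WORDS.stream()
                .filter(word -> word.length() > 4)
                .collect(Collectors.toList());
    }

    // Collects into a Set, which drops duplicates
    static Set<String> uniqueWords() {
        return WORDS.stream().collect(Collectors.toSet());
    }

    // Groups words by length and counts how many fall into each group
    static Map<Integer, Long> countByLength() {
        return WORDS.stream()
                .collect(Collectors.groupingBy(String::length, Collectors.counting()));
    }

    // Computes the average word length
    static double averageLength() {
        return WORDS.stream().collect(Collectors.averagingInt(String::length));
    }

    public static void main(String[] args) {
        System.out.println("Long words: " + longWords());
        System.out.println("Unique words: " + uniqueWords());
        System.out.println("Count by length: " + countByLength());
        System.out.println("Average length: " + averageLength());
    }
}
```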

The Collector<T, A, R> interface consists of several key methods:

Supplier<A> supplier() – creates an empty container for accumulating elements;
BiConsumer<A, T> accumulator() – defines how elements are added to the container;
BinaryOperator<A> combiner() – merges two containers when parallel processing is used;
Function<A, R> finisher() – transforms the container into the final result.

As you can see, this structure is similar to the three-argument collect() method, but it introduces the finisher() method. This additional step allows for extra processing on the collected data before returning the final result, for example, sorting the list before returning it.
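To make the role of the finisher concrete, here is a small sketch built with the standard Collector.of factory method. The class name FinisherSketch and the method toSortedUnmodifiableList are hypothetical; the finisher sorts the accumulated list and wraps it in an unmodifiable view before it is returned.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Stream;

class FinisherSketch {

    // Builds a collector whose finisher sorts the container before returning it
    static Collector<String, List<String>, List<String>> toSortedUnmodifiableList() {
        return Collector.<String, List<String>, List<String>>of(
                ArrayList::new,                 // supplier: empty container
                List::add,                      // accumulator: add one element
                (left, right) -> {              // combiner: merge two containers
                    left.addAll(right);
                    return left;
                },
                list -> {                       // finisher: sort, then make read-only
                    Collections.sort(list);
                    return Collections.unmodifiableList(list);
                }
        );
    }

    public static void main(String[] args) {
        List<String> sorted = Stream.of("pear", "apple", "fig")
                .collect(toSortedUnmodifiableList());
        System.out.println(sorted);
    }
}
```

The three-argument collect() has no place for this last step, so the same sorting would have to happen after the stream pipeline has finished.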

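Additionally, the Collector interface provides the characteristics() method, which returns a set of Collector.Characteristics values that help optimize stream execution:

CONCURRENT – the accumulator can be called from multiple threads on the same container;
UNORDERED – the result does not depend on the encounter order of the elements;
IDENTITY_FINISH – the finisher is the identity function, so it can be skipped.

These characteristics help the Stream API optimize performance. For example, if the result container is inherently unordered, such as a Set, specifying UNORDERED tells the stream that it does not have to preserve encounter order, which can make parallel collection more efficient.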
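One way to get a feel for characteristics is to inspect what the built-in collectors report. The sketch below simply prints the sets returned by characteristics(); the class name CharacteristicsSketch and its helper methods are hypothetical. The exact contents can differ between JDK versions, so treat the printed output as something to observe rather than rely on.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collector;
import java.util.stream.Collector.Characteristics;
import java.util.stream.Collectors;

class CharacteristicsSketch {

    // Characteristics reported by Collectors.toList()
    static Set<Characteristics> ofToList() {
        return Collectors.<String>toList().characteristics();
    }

    // Characteristics reported by Collectors.toSet(), documented as an unordered collector
    static Set<Characteristics> ofToSet() {
        return Collectors.<String>toSet().characteristics();
    }

    // Characteristics reported by Collectors.toConcurrentMap(),
    // documented as a concurrent and unordered collector
    static Set<Characteristics> ofToConcurrentMap() {
        Collector<String, ?, ConcurrentMap<String, Integer>> collector =
                Collectors.toConcurrentMap(word -> word, String::length);
        return collector.characteristics();
    }

    public static void main(String[] args) {
        System.out.println("toList: " + ofToList());
        System.out.println("toSet: " + ofToSet());
        System.out.println("toConcurrentMap: " + ofToConcurrentMap());
    }
}
```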

Practical Example

Imagine you run an online store and need to process product prices before collecting them. For instance, you want to round each price to the nearest whole number, remove duplicates, and sort the final list.

Main.java


package com.example;

import java.util.*;
import java.util.function.*;
import java.util.stream.Collector;

public class Main {
    public static void main(String[] args) {
        List<Double> prices = List.of(1200.99, 599.49, 199.99, 599.49, 1200.49, 200.0);

        // Using a custom `Collector` to round prices, remove duplicates, and sort
        List<Integer> processedPrices = prices.parallelStream()
                                .collect(new RoundedSortedCollector());

        System.out.println("Processed product prices: " + processedPrices);
    }
}

// Custom `Collector` that rounds prices, removes duplicates, and sorts them
class RoundedSortedCollector implements Collector<Double, Set<Integer>, List<Integer>> {

    @Override
    public Supplier<Set<Integer>> supplier() {
        // Creates a `HashSet` to store unique rounded values
        return HashSet::new;
    }

    @Override
    public BiConsumer<Set<Integer>, Double> accumulator() {
        // Rounds price and adds to the set
        return (set, price) -> set.add((int) Math.round(price)); 
    }

    @Override
    public BinaryOperator<Set<Integer>> combiner() {
        return (set1, set2) -> {
            set1.addAll(set2); // Merges two sets
            return set1;
        };
    }

    @Override
    public Function<Set<Integer>, List<Integer>> finisher() {
        return set -> set.stream()
                .sorted() // Sorts the final list
                .toList();
    }

    @Override
    public Set<Characteristics> characteristics() {
        // Order is not important during accumulation
        return Set.of(Characteristics.UNORDERED);
    }
}

You start processing the data by passing it into a custom Collector called RoundedSortedCollector.

This collector first accumulates all prices into a Set<Integer>, ensuring that duplicates are automatically removed. Before adding each value, it rounds the price using Math.round(price) and converts it to an int. For example, 1200.99 becomes 1201 and 1200.49 becomes 1200, while 199.99 rounds up to 200 and coincides with the rounded 200.0. The two 599.49 entries both become 599 and are stored only once.

If the stream runs in parallel mode, the combiner() method merges two sets by adding all elements from one set into another. This step is crucial for multi-threaded environments.

In the final stage, after all prices are collected, the finisher() method transforms the set into a sorted list. It converts the Set<Integer> into a stream, applies sorted() to arrange values in ascending order, and then collects them into a List<Integer>.

As a result, you get a sorted list of unique, rounded prices that can be used for further calculations or display purposes.
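For comparison, the same rounding, de-duplication, and sorting can be sketched with intermediate operations and a predefined collector instead of a custom Collector. This is an alternative formulation rather than the lesson's approach, and the class name RoundedPricesSketch and the method roundDedupSort are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;

class RoundedPricesSketch {

    // Rounds each price, drops duplicates, sorts ascending, and collects into a List
    static List<Integer> roundDedupSort(List<Double> prices) {
        return prices.stream()
                .map(price -> (int) Math.round(price)) // round to the nearest whole number
                .distinct()                            // remove duplicate rounded values
                .sorted()                              // ascending order
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Double> prices = List.of(1200.99, 599.49, 199.99, 599.49, 1200.49, 200.0);
        System.out.println("Processed product prices: " + roundDedupSort(prices));
        // prints: Processed product prices: [200, 599, 1200, 1201]
    }
}
```

A custom Collector is worth the extra code mainly when the same accumulation logic has to be reused across several pipelines.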

1. What does the `collect()` method do in Stream API?

2. What additional capability does the `Collector` interface provide compared to `collect()` with functional interfaces?
