Multithreading in Java
Parallel Stream API
You're probably already familiar with the Stream API, its methods, and how it works (if not, study that topic first and then come back to this chapter).
A regular stream is sequential, not parallel. However convenient and readable it may be in code, processing a large amount of data through the Stream API without the parallelStream() method runs on a single thread, which can noticeably hurt performance.
There is also a parallel() method that can be called on a stream that already exists.
The difference is that parallelStream() creates a parallel stream directly from the collection, whereas parallel() converts an existing sequential stream into a parallel stream.
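To make the difference concrete, here is a minimal sketch that builds the same computation both ways. The class and method names (ParallelVsParallelStream, viaParallelStream, viaParallel, viaRange) are made up for this illustration. Note that the Javadoc describes parallelStream() as returning a "possibly parallel" stream; for standard collections such as the list below it is parallel in practice.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.IntStream;

public class ParallelVsParallelStream {

    // Parallel stream created directly from the collection
    static int viaParallelStream(List<Integer> numbers) {
        return numbers.parallelStream().mapToInt(n -> n * n).sum();
    }

    // Sequential stream switched to parallel mode with parallel()
    static int viaParallel(List<Integer> numbers) {
        return numbers.stream().parallel().mapToInt(n -> n * n).sum();
    }

    // parallel() is the only option when there is no collection behind the stream
    static int viaRange(int from, int to) {
        return IntStream.rangeClosed(from, to).parallel().map(n -> n * n).sum();
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);

        System.out.println(numbers.stream().isParallel());            // false
        System.out.println(numbers.parallelStream().isParallel());    // true
        System.out.println(numbers.stream().parallel().isParallel()); // true

        System.out.println(viaParallelStream(numbers)); // 55
        System.out.println(viaParallel(numbers));       // 55
        System.out.println(viaRange(1, 5));             // 55
    }
}
```

Both approaches produce the same result; which one to use mostly depends on whether you start from a collection or from a stream you already have.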
Note
Best of all, we as programmers need to do almost nothing except change the stream() method to parallelStream(). The Stream API then takes care of distributing the work across threads by itself.
Example: Processing a List of Numbers
Suppose we have a list of numbers and we want to find the sum of squares of all numbers in the list.
Main
package com.example;

import java.util.Arrays;
import java.util.List;

public class Main {
    public static void main(String[] args) {
        // Create a list of integers
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

        // Sequential stream to sum the squares of numbers
        int sumSequential = numbers.stream() // Create a sequential stream from the list
                .mapToInt(n -> n * n)        // Map each number to its square
                .sum();                      // Sum the squares

        // Print the result of the sequential sum
        System.out.println("Sum of squares (sequential): " + sumSequential);

        // Parallel stream to sum the squares of numbers
        int sumParallel = numbers.parallelStream() // Create a parallel stream from the list
                .mapToInt(n -> n * n)              // Map each number to its square
                .sum();                            // Sum the squares

        // Print the result of the parallel sum
        System.out.println("Sum of squares (parallel): " + sumParallel);
    }
}
As you can see, we just replaced stream() with parallelStream(), and that's all. In this example it won't give any gain: a list of 10 numbers is processed faster on a single thread, because the parallel implementation of the Stream API has to do a lot of extra work to distribute the task between threads.
Note
The Stream API also decides by itself how many threads to use for the task, based on the number of available processor cores.
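If you are curious what that number is on your machine, the following small sketch prints it. The class name CommonPoolInfo and its helper methods are made up for this illustration; Runtime.availableProcessors() and ForkJoinPool.getCommonPoolParallelism() are standard JDK calls. By default the common pool's parallelism is typically one less than the number of cores, because the thread that starts the stream operation also takes part in the work, and the value can be overridden through the java.util.concurrent.ForkJoinPool.common.parallelism system property.

```java
import java.util.concurrent.ForkJoinPool;

public class CommonPoolInfo {

    // Number of processor cores visible to the JVM
    static int availableCores() {
        return Runtime.getRuntime().availableProcessors();
    }

    // Target parallelism of the common pool that parallel streams use by default
    static int commonPoolParallelism() {
        return ForkJoinPool.getCommonPoolParallelism();
    }

    public static void main(String[] args) {
        System.out.println("Available cores: " + availableCores());
        System.out.println("Common pool parallelism: " + commonPoolParallelism());
    }
}
```

The output depends on the machine, so no expected values are shown here.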
How it Works Under the Hood:
1. Creating a parallel stream: When you call parallelStream(), Java creates a parallel stream based on the original data source;
2. Using ForkJoinPool (we'll explore it later): Parallel streams use a common thread pool, ForkJoinPool.commonPool(), which manages a group of worker threads;
3. Splitting: Data in a parallel stream is divided into parts using the Spliterator interface;
4. Processing: Each worker thread in ForkJoinPool processes its part of the data;
5. Merging: After the data is processed, the partial results from the worker threads are merged into the final result.
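The splitting, processing, and merging steps above can be imitated by hand. The sketch below is only an illustration under simplifying assumptions: it splits the data once and processes both parts on the current thread, whereas a real parallel stream splits repeatedly and hands the parts to ForkJoinPool workers. The class and method names (ManualSplitSketch, sumSquares, splitProcessMerge) are made up for this example; spliterator(), trySplit(), and forEachRemaining() are the real Spliterator API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;

public class ManualSplitSketch {

    // Processing: sum the squares of the elements left in one part
    static int sumSquares(Spliterator<Integer> part) {
        int[] total = {0}; // one-element array so the lambda can update it
        part.forEachRemaining(n -> total[0] += n * n);
        return total[0];
    }

    static int splitProcessMerge(List<Integer> numbers) {
        // Splitting: trySplit() hands off a prefix of the data as a new
        // spliterator (or returns null if the source cannot be split further)
        Spliterator<Integer> rest = numbers.spliterator();
        Spliterator<Integer> prefix = rest.trySplit();

        int prefixSum = (prefix == null) ? 0 : sumSquares(prefix);
        int restSum = sumSquares(rest);

        // Merging: combine the partial results
        return prefixSum + restSum;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        System.out.println("Sum of squares: " + splitProcessMerge(numbers)); // 385
    }
}
```

However the source decides to split, the merged result is the same as the one produced by the sequential stream.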
Advantages of Parallel Streams
Increased performance is one of the key benefits of parallel streams: they distribute work across multiple threads, which can result in faster processing of large data sets on multi-core processors.
Additionally, the parallel streams API is easy to use, which makes it simple to integrate into existing code and removes the need for manual thread management.
Furthermore, scalability is a significant advantage, as parallel streams automatically adjust to the number of available processor cores.
1. Which class is used by parallel streams to manage threads?
2. Which method is used to create a parallel stream?
3. What does the Spliterator interface do in the context of parallel streams?