Eliminating Duplicates with the distinct() Method
In real-world development, you often encounter situations where data contains duplicates that need to be removed. For example, imagine you're compiling a list of conference attendees, but due to system errors, some names have been recorded twice.
The distinct() method solves this problem: it returns a stream containing only the unique elements of the original stream and drops any duplicates.
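The attendee scenario above can be sketched in a few lines. This is a minimal illustration, not part of the lesson's own code: the class name DistinctNamesDemo, the helper uniqueNames, and the sample names are all made up for the example.

```java
import java.util.List;
import java.util.stream.Collectors;

class DistinctNamesDemo {

    // Hypothetical helper: returns the names with duplicates removed,
    // keeping the first occurrence of each name.
    static List<String> uniqueNames(List<String> names) {
        return names.stream()
                .distinct() // String already implements equals() and hashCode()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Sample data invented for this sketch
        List<String> attendees = List.of("Alice", "Bob", "Alice", "Carol", "Bob");
        System.out.println(uniqueNames(attendees)); // prints [Alice, Bob, Carol]
    }
}
```

Because the source list is ordered, the result keeps the order in which each name first appeared.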
How It Works
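The distinct() method treats two elements as duplicates when equals() says they are equal. Typical implementations keep the elements already seen in a hash-based set, so hashCode() is used first to quickly detect potential duplicates and equals() is then used to confirm whether they are truly identical. If two objects have different hash codes, they are treated as unique. If the hash codes match, equals() is called to verify their equality based on the fields it compares.
Together, these methods must follow the hashCode() and equals() contract: objects that are equal according to equals() must return the same hash code. Respecting the contract ensures proper comparison and duplicate removal.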
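To make the roles of the two methods concrete, here is a small sketch under stated assumptions. The classes PlainTag and CollidingTag are hypothetical and exist only for this illustration. The first class overrides neither method, so its instances are compared by identity. The second deliberately returns the same hash code for every instance, which forces equals() to make the final decision.

```java
import java.util.stream.Stream;

class EqualityDemo {

    // No equals()/hashCode() overrides: two instances are never equal
    // unless they are the very same object.
    static class PlainTag {
        final String code;
        PlainTag(String code) { this.code = code; }
    }

    // equals() compares the code; hashCode() is a constant, so every
    // instance collides. This is legal but slow in hash-based collections.
    static class CollidingTag {
        final String code;
        CollidingTag(String code) { this.code = code; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof CollidingTag)) return false;
            return code.equals(((CollidingTag) o).code);
        }

        @Override
        public int hashCode() {
            return 42; // same value for all instances, on purpose
        }
    }

    static long distinctPlain() {
        return Stream.of(new PlainTag("A"), new PlainTag("A"))
                .distinct()
                .count();
    }

    static long distinctColliding() {
        return Stream.of(new CollidingTag("A"), new CollidingTag("A"), new CollidingTag("B"))
                .distinct()
                .count();
    }

    public static void main(String[] args) {
        System.out.println(distinctPlain());     // prints 2: no override, nothing is removed
        System.out.println(distinctColliding()); // prints 2: "A" is deduplicated, "B" survives the collision
    }
}
```

A matching hash code alone never makes two elements duplicates; equals() has the last word.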
Instead of writing these methods by hand, you can have IntelliJ IDEA generate them. Open the code generation menu (Alt + Insert on Windows/Linux, Cmd + N on Mac) and select equals() and hashCode(). After you choose the fields to include in the comparison, IDEA generates the necessary methods automatically.
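Generated methods usually look roughly like the sketch below. Treat it as an approximation: the exact output depends on the IDE version and the template selected in the dialog. The Attendee class, its fields, and the choice of email as the only compared field are assumptions made for this example.

```java
import java.util.List;
import java.util.Objects;

class AttendeeDemo {

    // Hypothetical class; only the email field was selected for comparison.
    static class Attendee {
        private final String name;
        private final String email;

        Attendee(String name, String email) {
            this.name = name;
            this.email = email;
        }

        String getName() { return name; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            Attendee attendee = (Attendee) o;
            return Objects.equals(email, attendee.email);
        }

        @Override
        public int hashCode() {
            return Objects.hash(email);
        }
    }

    static long countUnique(List<Attendee> attendees) {
        return attendees.stream().distinct().count();
    }

    public static void main(String[] args) {
        List<Attendee> attendees = List.of(
                new Attendee("Alice", "alice@example.com"),
                new Attendee("Alice A.", "alice@example.com"), // same email, so a duplicate
                new Attendee("Bob", "bob@example.com"));
        System.out.println(countUnique(attendees)); // prints 2
    }
}
```

Both methods use the same field, which keeps them consistent with each other: attendees that are equal always produce the same hash code.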
Practical Example
A factory keeps track of produced parts, but the report contains duplicates as well as defective parts labeled DEFECT. The goal is to clean the list, keeping only unique, non-defective parts, sort them by name, and display them in the format: Name - Serial Number.
Main.java

package com.example;

import java.util.List;

public class Main {
    public static void main(String[] args) {
        List<Part> parts = List.of(
                new Part("SN001", "Gear"),
                new Part("SN002", "Bolt"),
                new Part("SN003", "Nut"),
                new Part("SN001", "Gear"),
                new Part("SN004", "DEFECT Shaft"),
                new Part("SN005", "Screw"),
                new Part("SN002", "Bolt"),
                new Part("SN006", "DEFECT Washer")
        );

        List<String> processedParts = parts.stream()
                .distinct() // uses Part.equals() and Part.hashCode()
                .filter(part -> !part.getName().contains("DEFECT"))
                .sorted((p1, p2) -> p1.getName().compareToIgnoreCase(p2.getName()))
                .map(part -> part.getName() + " - " + part.getSerialNumber())
                .toList(); // Stream.toList() requires Java 16 or later

        System.out.println(processedParts);
    }
}

class Part {
    private final String serialNumber;
    private final String name;

    public Part(String serialNumber, String name) {
        this.serialNumber = serialNumber;
        this.name = name;
    }

    public String getSerialNumber() {
        return serialNumber;
    }

    public String getName() {
        return name;
    }

    // Two parts are equal when their serial numbers match.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Part)) return false;
        Part part = (Part) o;
        return serialNumber.equals(part.serialNumber);
    }

    // Based on the same field as equals(), as the contract requires.
    @Override
    public int hashCode() {
        return serialNumber.hashCode();
    }

    @Override
    public String toString() {
        return name + " - " + serialNumber;
    }
}
To remove duplicates correctly, the equals() and hashCode() methods must be implemented so that parts are compared by serial number.
With that in place, you use distinct() to eliminate duplicates, filter() to remove defective parts containing DEFECT in their name, sorted() to order the remaining parts by name (ignoring case), and map() to format them as strings. The program prints [Bolt - SN002, Gear - SN001, Nut - SN003, Screw - SN005].
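One consequence of comparing by serial number only is worth seeing in isolation: when two entries share a serial number but differ in name, distinct() on an ordered stream keeps the one that appears first. The sketch below illustrates this with a hypothetical Item class standing in for Part and with sample data invented for the example.

```java
import java.util.List;
import java.util.stream.Collectors;

class FirstOccurrenceDemo {

    // Hypothetical stand-in for Part: equality is based on the serial number only.
    static class Item {
        final String serialNumber;
        final String name;

        Item(String serialNumber, String name) {
            this.serialNumber = serialNumber;
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Item)) return false;
            return serialNumber.equals(((Item) o).serialNumber);
        }

        @Override
        public int hashCode() {
            return serialNumber.hashCode();
        }
    }

    // Removes duplicates, then returns the names of the surviving items.
    static List<String> distinctNames(List<Item> items) {
        return items.stream()
                .distinct()
                .map(item -> item.name)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Item> items = List.of(
                new Item("SN001", "Gear"),
                new Item("SN002", "Bolt"),
                new Item("SN001", "Gear (relabeled)")); // same serial number as the first entry
        System.out.println(distinctNames(items)); // prints [Gear, Bolt]
    }
}
```

If the later entry should win instead, distinct() is not the right tool and the data would need to be reordered or merged some other way first.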
1. What is used to compare elements in the distinct() method?
2. If two objects have the same hashCode(), but their equals() method returns false, will they be considered the same in the distinct() method?