Notice: This page requires JavaScript to function properly.
Please enable JavaScript in your browser settings or update your browser.
Sets and Tuples | Efficient Use of Data Structures
Optimization Techniques in Python
course content

Course Content

Optimization Techniques in Python

Optimization Techniques in Python

1. Understanding and Measuring Performance
2. Efficient Use of Data Structures
3. Enhancing Performance with Built-in Tools

bookSets and Tuples

Before we proceed with sets and tuples, it is important to mention that we won't discuss dictionaries here.

Set

A set offers an average O(1) time complexity for insertions, deletions, and lookups, meaning these operations are performed in constant time, regardless of the set’s size. This makes sets much faster than lists for membership testing and operations like adding or removing elements, where lists require O(n) time complexity (time grows linearly with the size of the list) in the worst case.

When to use:

  • You need unique elements, ensuring that there are no duplicates in your collection;
  • Fast membership testing is required, making sets ideal for tasks like checking for the existence of an item;
  • You are performing operations such as set unions, intersections, or differences, which sets support with optimized methods;
  • Order does not matter, as sets are inherently unordered, and there’s no need to maintain insertion order.
1234567891011121314151617181920
# Removing duplicates from a list using a set numbers = [1, 3, 2, 3, 5, 4, 5] unique_numbers = set(numbers) print(f'Unique Numbers: {unique_numbers}') # Fast membership testing names = {'Alice', 'Bob', 'Charlie'} print(f'Is Alice in the set? {"Alice" in names}') print(f'Is Eve in the set? {"Eve" in names}') # Set operations: union, intersection, and difference set_a = {1, 4, 3, 2} set_b = {3, 5, 4, 6} print(f'Union: {set_a.union(set_b)}') print(f'Intersection: {set_a.intersection(set_b)}') print(f'Difference: {set_a.difference(set_b)}') # Removing elements from a set safely with discard names.discard('Alice') # Safe removal, no error if the element doesn't exist print(f'Names after removal: {names}')
copy

Let's now compare the performance of a set to a list in membership testing:

12345678910111213141516171819
import os os.system('wget https://codefinity-content-media-v2.s3.eu-west-1.amazonaws.com/courses/8d21890f-d960-4129-bc88-096e24211d53/section_1/chapter_3/decorators.py 2>/dev/null') from decorators import timeit_decorator # Create a large list and set with the same elements large_list = list(range(10000000)) large_set = set(large_list) # Test membership for an element at the end element_to_find = 9999999 @timeit_decorator(number=50) def test_membership(element, collection): return element in collection print('List:') print(test_membership(element_to_find, large_list)) print('Set:') print(test_membership(element_to_find, large_set))
copy

Tuple

Tuples are typically used when you need to ensure that the data cannot be changed or as a key in a dict or element in set (because tuples are hashable).

  • Better than lists: when you need immutable data, want to use the collection as a dictionary key or set element, or when you need memory-efficient storage for a fixed-size collection of items;

  • Better than NumPy arrays: when your data is non-numerical or when immutability is crucial. While NumPy arrays are designed for numerical computations and are mutable by default, tuples provide safety for non-numerical data or small, structured collections that must remain constant.

12345678
# Each tuple in the list represents an immutable student record students = [ (1834, 'James', 'Johnson'), (2749, 'Alice', 'Smith'), (4923, 'Bob', 'Brown') ] # Attempting to modify a tuple will raise a TypeError students[0][1] = 'Fred'
copy

Since the students list contains student records (ID, first name, last name) that need to be read-only, it is better to use tuples for each record instead of lists. Additionally, as mentioned above, tuples are slightly more memory-efficient compared to lists.

1. Which of the following scenarios is best suited for using a set instead of a list?
2. You have a dataset with millions of records and need to frequently check if specific values exist within it. Which data structure is the most efficient for this purpose?
3. You are creating a record for each student that includes a unique ID, first name, and last name. The data should not be altered once created. Which data structure would be the most appropriate?
Which of the following scenarios is best suited for using a set instead of a list?

Which of the following scenarios is best suited for using a set instead of a list?

Select the correct answer

You have a dataset with millions of records and need to frequently check if specific values exist within it. Which data structure is the most efficient for this purpose?

You have a dataset with millions of records and need to frequently check if specific values exist within it. Which data structure is the most efficient for this purpose?

Select the correct answer

You are creating a record for each student that includes a unique ID, first name, and last name. The data should not be altered once created. Which data structure would be the most appropriate?

You are creating a record for each student that includes a unique ID, first name, and last name. The data should not be altered once created. Which data structure would be the most appropriate?

Select the correct answer

Everything was clear?

How can we improve it?

Thanks for your feedback!

Section 2. Chapter 2
We're sorry to hear that something went wrong. What happened?
some-alt