Vertical and Horizontal Partitioning | Design Patterns for Scalability and Flexibility
Database Design Patterns

Partitioning is a fundamental strategy in database design that helps you manage large datasets by breaking a table into smaller, more manageable pieces. Two primary partitioning techniques are vertical partitioning and horizontal partitioning. Each has its own use cases and benefits, and both are widely used in real-world systems to improve performance and scalability.

Horizontal partitioning splits a table into multiple tables that each hold a subset of the rows. It is closely related to sharding, which distributes such row-based partitions across separate servers. For example, in an e-commerce application with millions of orders, you might partition the orders table by year so that all orders from 2023 are stored separately from those in 2024. This makes queries targeting a specific year faster and makes maintenance tasks, like archiving old data, simpler.

Vertical partitioning splits a table into multiple tables, each containing a subset of the columns. This is useful when certain columns are rarely accessed together or contain large, infrequently used data, such as images or large text fields. For instance, you might separate frequently queried columns (like order_id, customer_id, and order_date) from less-used columns (like extra_info) into different tables, reducing the size of the main table and speeding up common queries.

-- Horizontal partitioning: Partitioning the orders table by order_date (already present in schema)
-- The following demonstrates creating a new yearly partition for 2025 orders

CREATE TABLE orders_2025 PARTITION OF orders
    FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');

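-- Vertical partitioning: Separating rarely used columns into a new table
-- Suppose we want to move 'extra_info' from orders to a separate table.
-- Because orders is partitioned by order_date, PostgreSQL requires every unique
-- constraint on it to include order_date, so a foreign key cannot reference
-- order_id alone. Assumption: orders has PRIMARY KEY (order_id, order_date)
-- and order_date is of type DATE.

CREATE TABLE order_extra_info (
    order_id INT NOT NULL,
    order_date DATE NOT NULL,
    extra_info JSONB,
    PRIMARY KEY (order_id, order_date),
    FOREIGN KEY (order_id, order_date) REFERENCES orders (order_id, order_date)
);

-- To populate the new table with existing data:
INSERT INTO order_extra_info (order_id, order_date, extra_info)
SELECT order_id, order_date, extra_info FROM orders WHERE extra_info IS NOT NULL;

-- Optionally, once the copy is verified, remove 'extra_info' from the main orders table:
-- ALTER TABLE orders DROP COLUMN extra_info;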
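
The CREATE TABLE ... PARTITION OF statement above presupposes a parent table that was declared as partitioned. The lesson does not show that definition, so the following is only a sketch of what it might look like in PostgreSQL: the column names come from the lesson, while the column types, the composite primary key, and the orders_default partition are assumptions. It illustrates the existing schema and is not meant to be run on top of it.

```sql
-- Hypothetical parent table; column types and key are assumptions.
CREATE TABLE orders (
    order_id    INT  NOT NULL,
    customer_id INT  NOT NULL,
    product_id  INT,
    order_date  DATE NOT NULL,
    extra_info  JSONB,
    -- On a partitioned table, a primary key must include the partition key column.
    PRIMARY KEY (order_id, order_date)
) PARTITION BY RANGE (order_date);

-- Optional catch-all partition for rows outside every declared range (assumed name).
CREATE TABLE orders_default PARTITION OF orders DEFAULT;
```

Under this assumed key, any foreign key into orders has to reference (order_id, order_date) together, since no unique constraint on order_id alone can exist on a table partitioned by order_date.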

Each partitioning strategy offers distinct benefits and trade-offs. Horizontal partitioning improves query performance for large tables when queries filter on the partition key, because the database can skip partitions that cannot contain matching rows. It can also simplify data management tasks such as backups and archiving, and it lays the groundwork for scaling out to multiple servers if needed. However, queries that do not filter on the partition key, or that span many partitions, must touch every relevant partition, and the partitioning scheme needs careful planning to avoid data hotspots.
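
As an illustration of the archiving benefit, an old year can be removed from the live table without deleting rows one by one. This is a minimal sketch that assumes a partition named orders_2023 exists, following the naming pattern of orders_2025; that name is hypothetical.

```sql
-- Detach the 2023 partition (hypothetical name) from the parent table.
-- Note: if other tables hold foreign keys into orders, PostgreSQL rejects
-- the detach while rows in this partition are still referenced.
ALTER TABLE orders DETACH PARTITION orders_2023;

-- orders_2023 is now an ordinary standalone table. It can be dumped to an
-- archive and then dropped when it is no longer needed:
-- DROP TABLE orders_2023;
```

Detaching is a metadata change on the parent rather than a bulk DELETE, which is why archiving by partition tends to be cheaper than deleting old rows from one large table.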

Vertical partitioning reduces the size of frequently accessed tables, resulting in faster queries for common operations. By separating infrequently used or large columns, you minimize the amount of data loaded into memory during queries, which can be critical for performance. The downside is that queries needing data from both partitions now require joins, potentially increasing complexity and execution time.
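
One way to limit the added complexity of joins is to hide the join behind a view, so that callers needing the full row do not have to repeat it. The sketch below assumes the orders and order_extra_info tables from this lesson; the view name orders_with_extra is hypothetical.

```sql
-- Recombine the two vertical partitions for the queries that need every column.
CREATE VIEW orders_with_extra AS
SELECT o.order_id, o.customer_id, o.order_date, e.extra_info
FROM orders o
LEFT JOIN order_extra_info e ON o.order_id = e.order_id;

-- Callers that need extra_info read from the view; everything else keeps
-- querying the narrower orders table directly.
SELECT * FROM orders_with_extra WHERE order_id = 1;
```

The LEFT JOIN keeps orders that have no row in order_extra_info, returning NULL for extra_info in that case.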

Choosing the right partitioning strategy depends on your application's access patterns and growth expectations. For example, if most queries filter on a date range, such as recent orders, horizontal partitioning by date is usually effective. If your table has a few large columns that are rarely needed, vertical partitioning can help keep your main table lean and fast. The two strategies are not mutually exclusive, and a table can use both.
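
Before splitting a table vertically, it can help to measure how much a candidate column actually contributes. The query below is a rough sketch that assumes extra_info is still a column of orders; pg_column_size is a built-in PostgreSQL function that reports the stored size of a value in bytes.

```sql
-- How many rows carry extra_info, and how large is it on average?
SELECT
    count(*)                        AS total_rows,
    count(extra_info)               AS rows_with_extra_info,
    avg(pg_column_size(extra_info)) AS avg_extra_info_bytes
FROM orders;
```

If only a small share of rows have a value, or the average size is small, moving the column out may not be worth the extra join.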

-- Efficient query on horizontally partitioned orders: get all 2024 orders for customer_id 1
SELECT * FROM orders
WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01'
  AND customer_id = 1;

-- Efficient query on vertically partitioned data: join orders with order_extra_info for detailed info
SELECT o.order_id, o.customer_id, o.product_id, o.order_date, e.extra_info
FROM orders o
LEFT JOIN order_extra_info e ON o.order_id = e.order_id
WHERE o.order_id = 1;
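
To check that the date filter really limits the scan to one partition, the first query can be prefixed with EXPLAIN. This is a sketch assuming yearly partitions exist for the years involved; the index name orders_customer_id_idx is hypothetical.

```sql
-- Show the plan: with partition pruning, only the partition covering 2024
-- should appear among the scanned relations.
EXPLAIN
SELECT * FROM orders
WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01'
  AND customer_id = 1;

-- Optional: an index declared on the parent is created on every partition,
-- which helps the customer_id filter within the selected partition.
CREATE INDEX orders_customer_id_idx ON orders (customer_id);
```

If the plan lists every partition instead, the filter is probably not expressed directly on the partition key column.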

1. What is horizontal partitioning?

2. How does vertical partitioning differ from horizontal partitioning?

3. What is a benefit of partitioning large tables?

