Transitioning from Oracle to PostgreSQL: Partitioning

As databases grow, managing large tables becomes more challenging. Table partitioning is a tried-and-tested approach that helps break down large tables into smaller, more manageable segments, enhancing performance, maintainability, and scalability.

What is Table Partitioning?

Table partitioning is a database design technique that divides a large table into smaller, more manageable sub-tables called partitions. Each partition holds a subset of the data based on specific criteria, such as date ranges, categories, or hash values. While partitioning makes it seem like you’re working with a single large table, behind the scenes, queries and operations are distributed across multiple partitions.

This approach serves several key purposes:

  1. Performance Improvement: Partitioning allows databases to focus operations (like SELECT, UPDATE, or DELETE) on relevant partitions instead of scanning the entire table. For instance, when querying a sales table for a specific month, only the partition corresponding to that month is accessed, significantly reducing the I/O load and boosting performance.
  2. Better Manageability: By splitting large tables into smaller segments, maintenance tasks such as indexing, backups, and archiving can be performed on individual partitions. This keeps operations manageable, even for tables with billions of rows.
  3. Efficient Data Retention and Archiving: Data retention policies are easier to enforce when using partitioning. For example, old partitions can be quickly archived or dropped when data is no longer needed, without affecting the rest of the table.
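
As a minimal sketch of the retention case in PostgreSQL syntax, assuming a hypothetical sales table with one partition per year (sales_2022 is an illustrative name, not taken from a real schema), an expired partition can be detached and dropped without touching the remaining data:

```sql
-- Assumes a partitioned table "sales" with a yearly partition "sales_2022" (hypothetical names).
-- Detach first so the data can still be archived, for example with pg_dump, before it is removed.
ALTER TABLE sales DETACH PARTITION sales_2022;

-- Once archived, dropping the detached table removes the old rows.
DROP TABLE sales_2022;
```

Dropping a partition avoids the row-by-row DELETE that the same cleanup would need on an unpartitioned table.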

In both Oracle and PostgreSQL, partitioning is a crucial feature for DBAs managing high-volume databases. Although both systems offer range, list, and hash partitioning methods, the implementation and management vary, which is why understanding the nuances is critical for a seamless transition.

Overview of Partitioning in Oracle and PostgreSQL

Oracle’s partitioning is advanced and flexible, supporting a variety of methods, including range, list, hash, reference, and composite partitioning. One standout feature is interval partitioning, which automatically creates partitions when new data arrives outside existing partitions. This reduces manual effort and ensures optimal performance for growing datasets.

PostgreSQL introduced declarative partitioning in version 10 with range and list methods, added hash partitioning in version 11, and supports composite schemes by sub-partitioning individual partitions. Although PostgreSQL has no built-in automatic partition creation, it remains a highly adaptable system, and extensions can supply this functionality.

Detailed Comparison of Partitioning Methods

Range Partitioning

Range partitioning is a method of dividing a table into smaller, more manageable partitions based on a specified range of values in one or more columns. It is particularly useful for sequential data, such as time-series or financial transactions, where rows are grouped into partitions according to defined boundaries (e.g., months or years). This approach helps optimize query performance and makes data management more efficient by narrowing down operations to relevant partitions.

Both Oracle and PostgreSQL support range partitioning. PostgreSQL requires manual partition creation, while Oracle supports interval partitioning for automated growth.

PostgreSQL Example

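CREATE TABLE sales (
    sale_id SERIAL,
    sale_date DATE NOT NULL,
    amount NUMERIC,
    -- A primary key on a partitioned table must include the partition key
    PRIMARY KEY (sale_id, sale_date)
)
PARTITION BY RANGE (sale_date);

-- The upper bound is exclusive, so this partition covers all of 2024
CREATE TABLE sales_2024 PARTITION OF sales
    FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');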
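
Because PostgreSQL does not create partitions on demand, an insert whose sale_date falls outside every existing range fails with an error. One common safeguard, sketched here against the sales table above, is a default partition plus the next range created ahead of time. The names sales_default and sales_2025 are illustrative, and default partitions need PostgreSQL 11 or later.

```sql
-- Catch-all for rows that match no other partition (PostgreSQL 11+).
CREATE TABLE sales_default PARTITION OF sales DEFAULT;

-- Create next year's partition before its data arrives; the upper bound is exclusive.
CREATE TABLE sales_2025 PARTITION OF sales
    FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');
```

Rows that land in the default partition have to be moved out before a partition covering their range can be created, so the default is a safety net rather than a substitute for planning.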

Oracle Example

CREATE TABLE sales (
    sale_id NUMBER,
    sale_date DATE,
    amount NUMBER
)
PARTITION BY RANGE (sale_date)
INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))
(
    PARTITION p1 VALUES LESS THAN (TO_DATE('01-01-2024', 'DD-MM-YYYY'))
);

List Partitioning

List partitioning divides a table into smaller partitions based on discrete, predefined values in one or more columns. Unlike range partitioning, which groups data by ranges, list partitioning assigns rows to partitions based on specific values (e.g., region, department, or category). This method is ideal for organizing categorical data, making it easy to manage and query data that falls into distinct groups.

List partitioning works similarly in both systems. Oracle declares sub-partitions inline with SUBPARTITION BY, while PostgreSQL reaches the same granularity by partitioning an individual partition.

PostgreSQL Example

CREATE TABLE customers (
    customer_id SERIAL,
    customer_name TEXT,
    region TEXT NOT NULL,
    -- A primary key on a partitioned table must include the partition key
    PRIMARY KEY (customer_id, region)
) PARTITION BY LIST (region);

CREATE TABLE customers_north PARTITION OF customers
    FOR VALUES IN ('North');

CREATE TABLE customers_south PARTITION OF customers
    FOR VALUES IN ('South');

CREATE TABLE customers_west PARTITION OF customers
    FOR VALUES IN ('West');

Oracle Example

CREATE TABLE customers (
    customer_id NUMBER,
    customer_name VARCHAR2(50),
    region VARCHAR2(20)
)
PARTITION BY LIST (region) (
    PARTITION north VALUES ('North'),
    PARTITION south VALUES ('South'),
    PARTITION west VALUES ('West')
);

Hash Partitioning

Hash partitioning divides a table into smaller partitions by applying a hash function to the values in one or more columns, distributing rows evenly across partitions. Unlike range or list partitioning, hash partitioning ensures balanced data distribution, making it ideal for scenarios where data does not naturally group into ranges or categories. This method helps prevent data skew, optimizing performance for workloads that require uniform access across partitions.

PostgreSQL added hash partitioning in version 11. Oracle’s hybrid options, which combine hash with other partitioning methods in a single statement, offer more flexibility.

PostgreSQL Example

CREATE TABLE orders (
    order_id SERIAL,
    customer_id INT NOT NULL,
    order_date DATE,
    amount NUMERIC,
    -- A primary key on a partitioned table must include the partition key
    PRIMARY KEY (order_id, customer_id)
) PARTITION BY HASH (customer_id);

CREATE TABLE orders_p0 PARTITION OF orders
    FOR VALUES WITH (MODULUS 4, REMAINDER 0);

CREATE TABLE orders_p1 PARTITION OF orders
    FOR VALUES WITH (MODULUS 4, REMAINDER 1);

CREATE TABLE orders_p2 PARTITION OF orders
    FOR VALUES WITH (MODULUS 4, REMAINDER 2);

CREATE TABLE orders_p3 PARTITION OF orders
    FOR VALUES WITH (MODULUS 4, REMAINDER 3);

In this example, the orders table is hash partitioned by the customer_id column, with four partitions created using a modulus of 4.
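
To check how evenly the hash function spreads rows in practice, one option is to count rows per partition. This is a small sketch against the orders table above and relies on the tableoid system column that PostgreSQL exposes on every row:

```sql
-- tableoid identifies the partition that physically stores each row.
SELECT tableoid::regclass AS partition_name,
       count(*)           AS row_count
FROM orders
GROUP BY tableoid
ORDER BY row_count DESC;
```

A few customers with very many orders can still make one partition noticeably larger than the others, since all rows for a given customer_id hash to the same partition.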

Oracle Example

CREATE TABLE orders (
    order_id NUMBER,
    customer_id NUMBER,
    order_date DATE,
    amount NUMBER
)
PARTITION BY HASH (customer_id)
PARTITIONS 4;

Composite Partitioning

Composite partitioning combines two or more partitioning methods to create a multi-level partitioning scheme within a table. This approach allows for greater flexibility in managing data, as it segments rows based on multiple criteria. For example, you could first apply range partitioning by date, and then further divide each partition by list or hash partitioning based on categories or hash values. Composite partitioning is particularly useful for handling complex datasets where data needs to be organized by both time and another attribute, like region or customer type.

Both systems support combining range, list, or hash partitions, catering to datasets that require segmentation based on multiple criteria.

PostgreSQL Example

CREATE TABLE sales (
    sale_id SERIAL,
    sale_date DATE NOT NULL,
    region TEXT NOT NULL,
    amount NUMERIC,
    -- The primary key must include the partition keys of every level
    PRIMARY KEY (sale_id, sale_date, region)
) PARTITION BY RANGE (sale_date);

-- Create a range partition for 2024 (the upper bound is exclusive)
CREATE TABLE sales_2024 PARTITION OF sales
    FOR VALUES FROM ('2024-01-01') TO ('2025-01-01')
    PARTITION BY LIST (region);

-- Create list partitions within the 2024 range partition
CREATE TABLE sales_2024_north PARTITION OF sales_2024
    FOR VALUES IN ('North');

CREATE TABLE sales_2024_south PARTITION OF sales_2024
    FOR VALUES IN ('South');


Oracle Example

CREATE TABLE sales (
    sale_id NUMBER,
    sale_date DATE,
    region VARCHAR2(20),
    amount NUMBER
)
PARTITION BY RANGE (sale_date)
SUBPARTITION BY LIST (region) (
    PARTITION sales_2024 VALUES LESS THAN (TO_DATE('01-01-2025', 'DD-MM-YYYY'))
    (
        SUBPARTITION sales_2024_north VALUES ('North'),
        SUBPARTITION sales_2024_south VALUES ('South')
    )
);

Implementation and Management Differences

Declarative Partitioning in PostgreSQL 

PostgreSQL’s approach requires manual partition management. Each partition is a separate table under a parent table, making it necessary to create new partitions as data grows. Although this adds some administrative overhead, it provides greater flexibility and control.

Note that you can achieve automatic partition creation in PostgreSQL using tools like pg_partman, as explained later in this article.

Native Partitioning in Oracle 

Oracle’s interval partitioning simplifies management by automatically creating partitions when needed. Additionally, Oracle’s tools, like the Partition Advisor, help identify tables that would benefit from partitioning, making management easier for DBAs.

Automating Partition Management in PostgreSQL

While PostgreSQL lacks built-in automation, the pg_partman extension can create partitions ahead of time and remove expired ones, which cuts down on manual tasks.

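-- Example of pg_partman setup for automatic partition creation.
-- Argument names and order vary by pg_partman version; this form follows 5.x.
CREATE EXTENSION pg_partman;

SELECT create_parent(p_parent_table := 'public.sales', p_control := 'sale_date', p_interval := '1 month');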

By using tools like these, DBAs can maintain efficient partitioning without constant manual intervention, making PostgreSQL more user-friendly.
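
pg_partman creates future partitions only when its maintenance routine runs. Here is a minimal sketch, assuming the extension's functions are reachable on the search path as in the setup above, and that a scheduler of your choice (cron, pg_cron, or the extension's background worker) invokes the call at regular intervals:

```sql
-- Creates any partitions that are due for the tables registered with pg_partman.
-- Qualify the call with the extension's schema (often "partman") if it is installed in one.
SELECT run_maintenance();
```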

Partition Pruning and Optimization Techniques

Partition pruning ensures that only relevant partitions are accessed during queries, improving performance. Here is how each database approaches pruning:

PostgreSQL

PostgreSQL supports pruning during both planning and execution, making it efficient for parameterized queries.

Example: If a query filters by a date range, PostgreSQL will automatically prune irrelevant partitions.
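
A quick way to confirm pruning is to inspect the plan. The sketch below assumes the range-partitioned sales table from the earlier examples; the exact plan text differs between versions, so no output is shown:

```sql
-- Pruning is on by default; this setting exists in PostgreSQL 11 and later.
SHOW enable_partition_pruning;

-- The plan should list only partitions whose ranges can contain March 2024.
EXPLAIN
SELECT sum(amount)
FROM sales
WHERE sale_date >= DATE '2024-03-01'
  AND sale_date <  DATE '2024-04-01';
```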

Oracle

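Oracle also prunes partitions, statically when the statement is parsed and dynamically at execution time. On top of pruning, it uses partition-wise joins and aggregation, which break an operation on similarly partitioned tables into smaller operations on matching partitions. PostgreSQL has comparable partitionwise join and aggregate features, but they are disabled by default.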

Best Practices for Migrating from Oracle to PostgreSQL

Manual Partition Management

Since PostgreSQL doesn’t have automatic partitioning, DBAs should plan for manual management or leverage tools like pg_partman. Scheduling scripts or cron jobs to create partitions can help reduce manual overhead.
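
A scheduled script of this kind could look like the following sketch. It assumes a sales table partitioned by month on sale_date and a hypothetical sales_YYYY_MM naming convention. Neither comes from a real schema, so adjust both to your own layout:

```sql
-- Creates next month's partition of "sales" if it does not exist yet.
-- Assumes monthly partitions named sales_YYYY_MM (a hypothetical convention).
DO $$
DECLARE
    start_date date := date_trunc('month', current_date + interval '1 month')::date;
    end_date   date := (start_date + interval '1 month')::date;
    part_name  text := 'sales_' || to_char(start_date, 'YYYY_MM');
BEGIN
    -- %I quotes the identifier and %L quotes the date literals safely.
    EXECUTE format(
        'CREATE TABLE IF NOT EXISTS %I PARTITION OF sales FOR VALUES FROM (%L) TO (%L)',
        part_name, start_date, end_date
    );
END
$$;
```

Running it daily from cron through psql is harmless, because IF NOT EXISTS turns repeat runs into no-ops.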

Adapt Indexing Strategy

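PostgreSQL has no global indexes, so Oracle global indexes must be replaced with local or partial ones. An index created on a partitioned table is built as a separate index on each partition, and a primary key or unique constraint must include the partition key. Carefully analyze query patterns to decide which indexes to create for each partitioned table.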
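
As an illustration, assuming the sales table and its sales_2024 partition from the earlier examples (the index names and the amount threshold are made up for this sketch):

```sql
-- Creating an index on the partitioned parent cascades to every existing
-- and future partition (PostgreSQL 11+).
CREATE INDEX sales_sale_date_idx ON sales (sale_date);

-- A partial index on one partition, covering only large sales (illustrative threshold).
CREATE INDEX sales_2024_large_idx ON sales_2024 (sale_date)
    WHERE amount > 10000;
```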

Optimize Pruning

To maximize pruning, ensure that partition keys are used consistently in WHERE clauses, JOINs, and subqueries. PostgreSQL’s runtime partition pruning is effective for handling parameterized queries.
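
The contrast below sketches why the form of the predicate matters, again assuming the range-partitioned sales table. PostgreSQL can prune only when the partition key itself is compared against a value:

```sql
-- Prunable: the partition key is compared directly to constants.
EXPLAIN
SELECT * FROM sales
WHERE sale_date >= DATE '2024-01-01'
  AND sale_date <  DATE '2025-01-01';

-- Not prunable: the key is wrapped in an expression, so every partition is scanned.
EXPLAIN
SELECT * FROM sales
WHERE extract(year FROM sale_date) = 2024;
```

Oracle code that filters with TO_CHAR or TRUNC on the partition column is worth rewriting into plain range comparisons during migration for the same reason.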

Conclusion

While both Oracle and PostgreSQL offer robust partitioning, they require different approaches. PostgreSQL’s partitioning may be more manual, but its flexibility makes it powerful. By adapting to PostgreSQL’s declarative partitioning and optimizing query patterns, DBAs can ensure a successful migration with minimal performance impact.
