PostgreSQL Partition Manager — ‘pg_partman’ — is an open source extension freely available, widely supported, and actively maintained by the PostgreSQL community.

pg_partman creates and manages both time-based and serial-based table partition sets. Users use it to automate and create partitions by breaking large tables into smaller partitions for better performance.

In this post, initially, we will discuss the creation of the pg_partman extension following partitioning for newly created tables. First, we will need to install the OS with the necessary control files for the appropriate PostgreSQL version for installing pg_partman on the database.

Installation of pg_partman on the database

I recommend creating the schema before creating the extension to install the extension’s objects. This step is not mandatory, but after installation, you cannot change it. By default, the extension creates objects in the ‘public’ schema.

Establishing a dedicated schema for an extension serves the purpose of storing its objects, including tables, functions, procedures, and more. This approach facilitates smoother management and simplifies tasks like handling backups, upgrades, and migrations.

Below, I present the steps and failures encountered when attempting to change the schema after creating the extension.

Step one: As an example,  let us try to create an extension without Qualifying SCHEMA.

Step two: Next, we demonstrate a failure to set up the schema for an already created extension.

Eventually, here are the steps for Installation of pg_partman along with schema.

Step 1: Create schema for installation of the extension

Step 2: Create extension pg_partman in schema partman

Implementation of native partitioning

Native partitioning is also known as declarative partitioning. PostgreSQL allows you to declare that a table is divided into partitions with a list of columns or expressions to be used as the partition key.

However, you can only implement range partitioning using pg_partman, either with time-based or serial-based methods. This is because you can only predict when a new partition needs to be created in the case of range partitioning on dates or timestamps.

Time-based partitioning by range with template

Unique constraints on the partitioned table must include all partitioning columns. Extension pg_partman helps to manage this by using a template table to manage properties that currently are not supported by native partitioning.

Properties like INDEX and unique constraint creations on the template table are created on all partitions.

1. First, create a table with native partitioning type by range using DATE typed column

2. Next, create a template table

3. Add constraints to the template table needed for all partitioned tables

4. Create parent to create initial child partitioned tables

Additionally, I have listed below the various options that can be used with the ‘create_parent’ function and their respective values:

Argument Data TypeDefinitionValues
p_parent_tabletextParent Partitioned table ( Original table to have existed before creation & MUST be schema-qualified even if it is a public schema.)schema.tablename
p_controltextYou should partition the column, either integer or time-based.Column name
p_typetextYou can use the Partman type based on your performance improvement and flexibility requirements.native,partman
p_intervaltextInteger range or time interval for each partition. (The generic intervals of “yearly” to  “quarter-hour”) (yearly, quarterly, monthly, weekly, daily, hourly, half-hour, quarter-hour )Any other interval that validates time.For ID based integer ranges.
p_template_tabletextTemplate table name, if not provided, will create one in the schema the extension is installed.schema.tablename
p_premakeintegerAdditional partitions to stay ahead of the current partition.We create a default of 4 (resulting in the creation of 5 partitions, including the current one).
p_start_partitiontextThis allows specifying the first partition of a set instead of it being automatically determined. It must be a valid timestamp (for time-based) or a positive integer (for id-based) value.CURRENT_TIMESTAMP /(CURRENT_TIMESTAMP + ‘1 hour’::interval)::text)
p_automatic_maintenancetextThe parameter determines whether maintenance is automatically managed when run_maintenance() is called without a table parameter or by a background worker process.DEFAULT ‘on’::text

 

Because we specified ‘2’ for premake in the create_parent function, you can observe that two partitions ahead have been created, as shown in the table definition below.

To illustrate, we insert around 10,000 rows to see the data flush to its appropriate partitions if it exists or moves to the default partition.

By utilizing functions, we can seamlessly create child partitions and efficiently transfer data from the default table to these child tables.

As an illustration, the usage of partman.partition_data_proc shown below, which can move data batch-wise to partitioned tables.

As a result, we can see below the newly created table structure along with the new child partition.

Consequently, after using the function, you will notice that the data from the default table has been moved to the newly created partitioned table.

Creating the partitioned child tables ahead

Subsequently, we can modify the table partman.part_config settings to make changes by modifying the premake, which controls the partitions ahead.

Automatic maintenance of partitions

You can plan to create partitions ahead of the INSERTS to prevent data from moving into the default_partition or the main table. You can achieve this by using functions to create newly partitioned tables and move the data.

For instance, we can use the function run_maintenance() to create new partitions in advance, and there are two ways to do it.

1. Using background worker
2. Cron Scheduler

1. Setting up background worker

To start with, we can set “shared_preload_libraries” for partition maintenance operations to run automatically. You can set the parameter either by using ALTER SYSTEM or by manually modifying the config file.

Afterward, perform a restart of the cluster either using systemctl or pg_ctl.

Configuration options background worker

1. pg_partman_bgw.interval: Number of seconds between calls to run_maintenance(). The default is 3600 (1 hour).
– For time-based partitions, intervals can be chosen based on the partition interval at least daily or twice daily.
– For serial-based partitions, you can choose the partition interval based on the data inserted on a daily basis.
2. pg_partman_bgw.dbname: Required. The database(s) that run_maintenance() will run on. If multiple, use a comma-separated list. If not set, BGW will do nothing.
3. pg_partman_bgw.role: The role that run_maintenance() will run as. The default is “postgres”. You are allowed to use only a single role name.
4. pg_partman_bgw.analyze: Optional argument, by default whenever a new child table is created, an analyze is run on the parent table of the partition set to ensure constraint exclusion works. This analyze can be skipped by setting this to false and help increase the speed of moving large amounts of data. If you set this to false, we strongly recommend that you perform a manual analyze of the partition upon completion to ensure the statistics are updated properly.

2. Cron scheduler

   00 01,12 * * * psql -c “SELECT pg_partman.run_maintenance()”

Advantages of pg_partman

  • Using pg_partman, you can define data retention to automatically move data into archived partitions or drop the partition.
  • Query Performance will improve when dealing with partitioned data specifically.
  • Vacuum efficiency will improve as data resides in smaller partitions, reducing the time required for table maintenance compared to dealing with large sets of data.
  • You can attach or detach partitions without downtime concurrently, and it will not impact ongoing database operations.

Limitations of pg_partman

  • pg_partman only supports partitioning of data types that are integer or date/timestamp.
  • Generally, when opting for native partitioning, the parent table should already be created as range partitioned but not list partitioned.
  • Unique constraints on partitioned tables must include all partitioning columns.

Conclusion

To summarize, we discussed creating and managing partitions effectively using PostgreSQL Partition Manager(pg_partman) along with the creation of the partition ahead of Inserting data using time-based. I will provide a more detailed usage of pg_partman in upcoming blog posts.

Please review the blogs below for reference:

Performing ETL Using Inheritance in PostgreSQL

PostgreSQL Partitioning Using Traditional Methods

Percona Distribution for PostgreSQL provides the best and most critical enterprise components from the open-source community, in a single distribution, designed and tested to work together.

 

Download Percona Distribution for PostgreSQL Today!

Subscribe
Notify of
guest

1 Comment
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Vasu

Hi Neha , Thanks for sharing this blog .One small suggestion is , please provide the package from where you can get , downloading , installing that packages …links , steps too… I know this is open source and can get from Google .. But I am sure you also encounters issues while installing this package , gathering link and while setting up this partman packages etc…this is what lagging so much from the blogs … But it you refer my blog , pre steps , like downloading , intslaling , errors , troubleshooting also is there like that pls maintain from your end also.Thank you .