PostgreSQL allows its users to set parameters at different scopes, and the same parameter can be specified at different places and using different methods. And there could be conflicts. Someone might be wondering why certain changes are not coming into effect, so it is important to understand/recollect the scope and priority of settings.

In this blog, I am trying to list the options available for users and list them in increasing order of priority. The purpose is to give a high-level view to users.

1. Compile time parameter settings

These are the set of parameters that are set at the time of compilation. This acts as the default value for PostgreSQL. We can check these values in the boot_val field of pg_settings.

These compile time settings have the least priority and can be overridden in any other levels. However, some of these parameters cannot be modified by any other means. Changing these values at compile time is not intended for common use. If a PostgreSQL user wants to change these values, they need to recompile the PostgreSQL from the source code. Some are exposed through the configure command line option. Some such configuration options are:  --with-blocksize=<BLOCKSIZE>    This sets table block size in kB. The default is 8kb.  --with-segsize=<SEGSIZE>  This sets table segment size in GB. The default is 1GB. This means PostgreSQL creates a new file in the data directory as the table size exceeds 1GB.   --with-wal-blocksize=<BLOCKSIZE>   sets WAL block size in kB, and the default is 8kB.

Most of the parameters have compile time defaults. That is the reason why we can start running PostgreSQL by specifying a very minimal number of parameter values.

2. Data directory/initialization-specific parameter settings

Parameters can also be specified at the data directory initialization time.  Some of these parameters cannot be changed by other means or are difficult to change.

For example, the wal_segment_size, which determines the WAL segment file, is such a parameter. PostgreSQL generates WAL segment files of 16MB by default, and it can be specified at the time of initialization only. This is the level at which decisions on whether to use data_checksums  need to be taken. This can be changed later using the pg_checksums utility, but that will be a painful exercise on a big database.

The default character encoding and locale settings to be used can be specified at this level. But this can be specified at the subsequent levels also. You may refer to initdb options for more information:  https://www.postgresql.org/docs/current/app-initdb.html.

Those parameters taken from the specific data directory initialization, which overrides the built-in parameters, can be checked like this:

This override includes some of the calculated auto-tune values for that environment.

3. PostgreSQL parameters set by environment variables

PostgreSQL executables, including the postmaster, are honoring many environment variables. But they are generally used by client tools. The most common parameter used by the PostgreSQL server (postmaster) will be PGDATA, which sets the parameter data_directory.  These parameters can be specified by the service managers like systemd.

For many automation/scripting, this will be handy. Here is an example:

As we can see, PostgreSQL took the port as 5434.

4. Configuration files

Probably, this is the method every novice user will be aware of. The fundamental configuration file is postgresql.conf, and it is the most common place to have global settings. PostgreSQL looks for a configuration file in the PGDATA by default, but an alternate location can be specified using the command line parameter config_file  of postmaster.  The parameter specifications can be split into multiple files and directories because Postgresql supports  include and  include_dir directives in the configuration files. So, there can be nested/cascaded configurations.

PostgreSQL rereads all its configuration files if it receives a SIGHUP signal. If the same parameter is set in multiple locations, the last to read will be will be considered. Among all configuration files, postgresql.auto.conf gets the highest priority because that is the file to read the last. That is where all “ALTER SYSTEM SET/RESET” commands keep the information.

5. Command line argument to postmaster

The postmaster, aka Postgres, has a feature to set parameters as command-line arguments (it has features to get the values also). This is one of the most reliable methods used by many of the external tools to manage PostgreSQL service. For example, the high availability solution Patroni passes some of the most critical parameters as a command line argument.  Here is how the Postgres process with command-line options looks in a Patroni environment

So Patroni can ensure that there is no local configuration mistake that can adversely affect the availability and stability of the cluster. But changing this is possible only at the server startup. Obviously, this has higher precedence over the values from the configuration files. The scope will be at the instance level. This answers many of the Patroni user’s questions on why they cannot change some of the parameters directly in the parameter file. PostgreSQL users can check those parameters which came as command line arguments like:

Parameters specifications up to this level can have “postmaster” context.  The concept of “context” is discussed as part of the next section.

6. Database level setting

All options discussed so far have a global scope. Meaning they are applicable for the entire instance. But there could be reasons why a PostgreSQL user wants to change that at a specific database level. For example, one of the databases might be handling an OLTP workload where query parallelism may not be really needed and may have an adverse impact. But another database might be an OLAP system.

The concept of context

At this stage, we should recollect another concept called the “context” of a parameter. For example, the network port at which PostgreSQL listens cannot be changed at the individual database level. A change of such parameters requires a PostgreSQL restart. So we say that the context of parameter “ port” is postmaster. A change to such parameters requires the postmaster — the main process of the PostgreSQL — to restart. Please refer to the documentation: https://www.postgresql.org/docs/current/view-pg-settings.html to understand different contexts of PostgreSQL parameters.  We won’t be allowed to change a set of parameters at this level onwards; any attempt will be prevented.

The max_connections is something to specify at the global (instance) level by the postmaster, and it requires restart.

There are other sets of parameters that need to be communicated through postmaster only, even though they can be changed without restarting the server. That context is called sighup. Because we can signal the postmaster, and it will re-read such parameters and propagate the same to all its child processes, changing them at the database level will be prevented.

You may even consider looking at the PostgreSQL source code:

https://github.com/postgres/postgres/blob/6fde2d9a005a5bc04aa059d3faeb865c8dd322ce/src/backend/utils/misc/guc.c#L3376

for a much deeper understanding of the logic of “context” and what is allowed in which level.

7. User-level settings

Each user can have their preferred parameter settings so that all sessions created by that user will have that setting in place.  Please remember that this user-level setting has a higher preference than database-level settings. Users can check their own user-level settings, like this:

8. Database – user combination

PostgreSQL allows us to have parameter settings that will be applicable when a particular user/role connects to a particular database.

For example:

Setting at this level has even higher priority than everything mentioned before.

9. Parameters by the client connection request

There is an option to specify parameters while making a new connection. It can be passed to PostgreSQL as part of the connection string.

For example, I want to connect to the database to perform some bulk data loading and manipulation (ETL), and I don’t want to wait for any WAL writing. If, at all, there is any crash in between, I am OK to perform the ETL again. So, I am going to request a connection with synchronous_commit  off.

10. Session-level setting

Each session can decide on the settings for that session at that point in time or execution. The sessions are allowed to modify this session-level setting as and when required.

A good use case is that, suppose we are going to rebuild a big index. We know that it is going to use considerable maintenance_work_mem.  Setting this at the session level simplifies our life without affecting other sessions.

11. Transaction-level settings

PostgreSQL allows us to specify parameters at a very small scope, like transaction level.

Here is an example of discouraging sequential scans in a particular transaction block.

I prefer transaction-level settings because the changes are very local to the transaction, and they will be reverted back once the transaction is completed. This is the most preferred place to set the work_mem to minimize the impact.

12. Object-level settings

PostgreSQL allows us to specify the parameter specific to a program block, like a PL/pgSQL function. So, the setting goes as part of the function definition.

Here is an example of the function definition to test the function-level settings.

PostgreSQL parameters summary

PostgreSQL parameter specification is very flexible and powerful, so understanding the scope, context, and priority is important for every user. A rule of thumb could be the broader the scope, the lower the priority.

Percona Distribution for PostgreSQL provides the best and most critical enterprise components from the open-source community in a single distribution, designed and tested to work together.

 

Download Percona Distribution for PostgreSQL Today!

Subscribe
Notify of
guest

0 Comments
Inline Feedbacks
View all comments