While working as a DBA, we perform many regular tasks, and one of them is upgrading our database systems. There are some techniques to perform a PostgreSQL database upgrade, such as data dump and import, logical replication, or in-site upgrade using pg_upgrade.

The last is a good option when you can afford some downtime and desire to reuse the same server where your current instance is running.

The process using pg_upgrade is well documented, and you can easily find the instructions with little googling.

However, there are a few occasions when you face some unusual conditions that require additional research. In this blog post, I will show you a particular example I faced when working with an upgrade exercise using pg_upgrade.

As a regular procedure, the following steps were followed:

  • Install the same PostgreSQL packages you already have installed in the current version for the new one.
  • Run pg_upgrade with the –check flag to verify the compatibility between the current and the new version clusters.
  • If all is good, perform the upgrade removing the –check flag.

You might consider some extra steps, such as verifying the existence of gin/gist indexes, unknown datatypes, or planning the upgrade of any standby server using the rsync approach. These are not in this blog’s scope to keep this example simple.

Ideally, all the three above steps execute with no issues, and you get your new PostgreSQL version up and running.

There are some situations where the second step, the –check one, fails, and you need to investigate and fix it before moving forward. A “regular” case is you missed installing a required package in the new version. You will end with something like the following:

In this case, the pg_upgrade –check command will provide a hint:

For example, if checking the content of the loadable_libraries.txt file, you see something like the next:

You can immediately identify the missing package as the pg_repack, so you need to install it for the new version.

Well, sometimes, this information is not quite “evident.” Let’s review the case.

Example case

When working with the upgrade exercise, the goal was to move from PostgreSQL 11 to PostgreSQL 12. This was a specific requirement, but the situation could also happen when upgrading to a different version.

Following the main process, I installed the same packages as the existing version 11 to version 12 and verified.

Looking good, then we can execute the second step and verify the cluster’s compatibility.

verify the cluster's compatibility

Mmmh, something is missing. Let’s check what could be.

Ok, a “randomness” library doesn’t sound like a usual library, but we can review some details to find it.

I could miss some packages for the new version; let’s check again.

postgresql packages

The same packages are installed for both versions.

The next idea is to verify the extensions we have installed in the source database.

verify the extensions postgresql

Here, nothing looks “related” to the randomness library.

Just to double-check check, we can even try to find the related libraries from the PostgreSQL $libdir directory. For example, with the following bash snippet.

Checked for version 11:

And the same for version 12:

Everything seems OK, so where is the randomness library coming from?

There is something I am missing. I tried pg_dump with the –binary-upgrade flag to review some other details regarding the extensions

From the PostgreSQL documentation we can see the following statement about this flag:

–binary-upgrade

This option is for use by in-place upgrade utilities. Its use for other purposes is not recommended or supported. The behavior of the option may change in future releases without notice.

But the intention is not to restore. Just take a look at the extra information we can get.

The –binary-upgrade option will produce a “verbose” output related to the extensions. The CREATE EXTENSION IF NOT EXISTS commands will be replaced with DROP EXTENSION IF EXISTS …, SELECT pg_catalog.binary_upgrade_create_empty_extension() calls.

Also, you will see a section with the explicit creation of the C language functions for the extensions. An example is the next for the pg_stat_statements extension:

Looking at the following line:

You can identify the physical library ($libdir/pg_stat_statements) used to build the Postgres function and the C function (‘pg_stat_statements_reset‘) within that library.

For some extra good material about writing extensions in PostgreSQL, you can take a look at the blog post Writing PostgreSQL Extensions is Fun – C Language, it really worth it.

Well, while doing this review about the extension’s definition, I saw the next one:

Aha! There is the randomness library call! A custom C function is using it.

We should remember PostgreSQL includes support for some procedural languages with its base distribution, such as  PL/pgSQL, PL/Tcl, PL/Perl, and PL/Python. So, having C functions is perfectly supported; however, it is not what we might say the usual.

As described in the PostgreSQL documentation, there are a few paths where the libraries can be placed:

The following algorithm is used to locate the shared object file based on the name given in the CREATE FUNCTION command:

1. If the name is an absolute path, the given file is loaded.
2. If the name starts with the string $libdir, that part is replaced by the PostgreSQL package library directory name, which is determined at build time.
3. If the name does not contain a directory part, the file is searched for in the path specified by the configuration variable dynamic_library_path.
4. Otherwise (the file was not found in the path, or it contains a non-absolute directory part), the dynamic loader will try to take the name as given, which will most likely fail. (It is unreliable to depend on the current working directory.)

For this case, I searched all the paths and did not find the randomness. No library. It seems somebody was playing around to build a C function and at some point, removed the library but left the Postgres function behind (this was a DEV environment).

After confirming this “public“, “bytea_size” function is not used, the solution was to DROP it, then retry the pg_upgrade –check command.

bytea_size postgresql

After this point, the pg_upgrade was successful.

Conclusion

As we all already know, PostgreSQL is really powerful, and its core capabilities can be extended with custom functions and extensions. Writing your own in C language functions requires some extra knowledge, but you can definitely get very interesting results.

Remember, every time a custom C function is created, PostgreSQL itself doesn’t compile the C code, which should be done beforehand, but the CREATE FUNCTION command is recorded literally in the system catalogs. So it will be referenced for any other load calls, including the pg_upgrade command.

Subscribe
Notify of
guest

1 Comment
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Abdul Sayeed

Hi David,

This is something DBAs usually misses while performing upgrade. Your blog post will be helpful for proper preparation before upgrading.

Thank you for writing it.

Regards,
Abdul Sayeed