PostgreSQL is a fantastic database, but if you’re storing images, video, audio files, or other large data objects, you need to “toast” them to get optimal performance. This post will look at using The Oversized-Attribute Storage Technique (TOAST) to improve performance and scalability. 

PostgreSQL uses fixed-sized pages, which makes it challenging to store huge data values directly. To solve this problem, large data values are compressed and broken into multiple smaller chunks. This process is done automatically and does not significantly impact how the database is used. This technique, called TOAST, improves how large data values are stored and used in the database.

Therefore, TOAST is a storage technique used in PostgreSQL to handle large data objects such as images, videos, and audio files. The TOAST technique allows for the efficient storage of large data objects by breaking them into smaller chunks and storing them separately from the main table. This can improve the performance of queries and indexing and reduce the amount of disk space required to store the data.

TOAST tables are created automatically by PostgreSQL when a table contains a column of type OID, bytea, or any other data type with the TOASTable storage class. The TOAST table is then used to store the large data objects, while the main table stores a reference to the TOAST table.

Here’s an example of using the TOAST technique in PostgreSQL:

  • Create a table with a large data column:

  • Insert a large image into the table:

  • Query the table to see that the large data object is stored in a TOAST table:

In the example, the images table contains a column named data of type bytea, which can hold extensive binary data. When a large image is inserted into the table, PostgreSQL automatically creates a TOAST table to store the image data separately from the main table. The pg_class system catalog table is then queried to show that a TOAST table has been created.

It’s important to note that while TOAST tables help store large data objects, they can add complexity to the database and should be used with care. Also, in some cases, the query performance will decrease when the data is spread among different tables, depending on the query conditions.

If you have a lot of large data, you don’t need to query/index; another option you could consider is to store it outside the DB in the file system and store a reference to it in DB, similar to how a TOAST table works.

In PostgreSQL, you can use the different TOAST storage strategies by setting the “storage” attribute on a column.

Where “data_type” is the data type of the column (e.g., text, bytea), and “strategy” is one of the four TOAST storage strategies (PLAIN, EXTENDED, EXTERNAL, MAIN).

You can also use the pg_attribute table in the system catalog to check which strategy a column uses.

It’s worth noting that the default storage strategy for most columns is ‘EXTENDED’ (compression and out-of-line storage) and that you can change the storage strategy of a table column at any time. However, it’s important to note that changing a column’s storage strategy may affect your queries’ performance and table size. Thus, testing your specific use case with different storage strategies is advisable to determine which provides the best performance.

  1. PLAIN strategy: This strategy disables both compression and out-of-line storage and also disables using single-byte headers for varlena types. This is the only strategy available for non-TOAST-able data types, such as integers or booleans. Example: If you have a table with a column of integers, you don’t want to keep them in line.
  2. EXTENDED strategy: This strategy allows both compression and out-of-line storage. This is the default strategy for most TOAST-able data types, such as text or bytea. The system will first attempt to compress the data; if the row is too large, it will store it out-of-line. Example: You have a table with a large column of text and want to reduce its size on the disk; this strategy will try to compress it first. If it doesn’t fit, the row will be stored out of line.
  3. EXTERNAL strategy: This strategy allows out-of-line storage but disables compression. This strategy is useful for text and bytea columns frequently accessed with substring operations. Accessing these columns will be faster because the system only needs to fetch the required parts of the out-of-line value. Example: You have a table with a large column of text and want to improve performance when substring operations are needed; this strategy will store it out of line and avoid compression
  4. MAIN strategy: This strategy allows compression but disables out-of-line storage. Out-of-line storage will still be performed, but only as a last resort when there is no other way to make the row small enough to fit on a page. Example: You have a table with a large column of data not accessed frequently, and you want to compress it to save disk space; this strategy will compress it but will avoid storing it out of line.

Navigating the pitfalls of using TOAST in PostgreSQL

While the TOAST technique can be useful to handle large data objects in PostgreSQL, you might hit a few problems. Here are a few common issues and ways to address them:

  1. Increased storage space: Because TOAST tables store large data objects separately from the main table, they can increase the amount of disk space required to store the data. This can be especially problematic if the table contains many large data objects. To address this issue, consider compressing the data before storing it in the TOAST table or using a storage solution optimized for handling large data objects such as file systems or object storage.
  2. Query performance: Queries involving large data objects stored in TOAST tables can be slower than those with smaller data objects. This is because the database needs to fetch the data from the TOAST table before it can be used in the query. To address this issue, try creating indexes on the TOAST table or consider using a caching layer to reduce the number of times data needs to be fetched from the TOAST table.
  3. Vacuum performance: PostgreSQL runs a process called “vacuum,” which reclaims disk space from deleted or updated rows to maintain the database’s performance. The vacuum process could be slow when there are a lot of large data objects stored in TOAST tables. To address this issue, try running the vacuum process during periods of low database activity, or consider using a storage solution optimized for handling large data objects such as file systems or object storage.
  4. Limited data types: The TOAST table is created only for the columns defined as oid, bytea, or any other data type with the TOASTable storage class. You can’t use a TOAST table for data types such as text or varchar, which can also be large.

Tackling the TOAST table growth issue in PostgreSQL: Strategies and solutions

A common issue with the TOAST system in PostgreSQL is that the size of the TOAST table can grow out of control. This can happen when large amounts of data are inserted into the table, causing the table to become larger than the available disk space. There are several ways to address this issue:

  1. Increasing disk space: The simplest solution is to increase the amount of disk space available to the PostgreSQL instance. This will allow the TOAST table to continue growing and should be considered a temporary solution.
  2. VACUUM and ANALYZE: Running the VACUUM and ANALYZE commands can help to reclaim space in the TOAST table that is no longer needed. Vacuum reclaims space occupied by dead rows, and ANALYZE will help the query planner to make more accurate decisions.
  3. Set a size limit to the TOAST table: You can use the max_toast_size configuration parameter to set a maximum size limit for the TOAST table. Once the table reaches this size, any additional data will be rejected.
  4. Choose a more appropriate storage strategy: As explained previously, choosing a more appropriate storage strategy for your data type and the access pattern can help to avoid unnecessary growth of the TOAST table.
  5. Archiving old data: Removing old or rarely accessed data from your table will help reduce the table’s size. Also, consider archiving old data to a different storage location, such as tape or cloud storage.
  6. Compressing the data: If you are using plain or external storage, you can also consider compressing your data before storing it in the table. Thus, you can use less disk space.

Conclusion

In conclusion, the TOAST system in PostgreSQL is a powerful feature that allows the database to handle large column values that would not fit in a single database block. The system uses a variety of strategies for storing these columns, including PLAIN, EXTENDED, EXTERNAL, and MAIN. Each strategy has its advantages and uses cases, and the appropriate strategy will depend on the specific requirements of your application.

For example, you may use the EXTERNAL strategy if you have a table with a large column of text and want to improve performance when substring operations are needed. When designing a table, consider the size and type of data stored in the columns and choose an appropriate storage strategy that will meet your application’s performance and space requirements. It’s also possible to change the storage strategy of a column at any time, although this may affect the performance of your queries and the size of your table. Thus, testing different strategies before settling on the best one is highly recommended.

Ultimately, understanding the TOAST system and how to use it effectively can significantly improve the performance and scalability of your PostgreSQL applications.

Percona Distribution for PostgreSQL provides the best and most critical enterprise components from the open-source community, in a single distribution, designed and tested to work together.

Download Percona Distribution for PostgreSQL Today!

Subscribe
Notify of
guest

3 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Richard O'Riordan

Hi there, very good article, just one question, you have the following line “You can’t use a TOAST table for data types such as text or varchar, which can also be large.” my understanding is that columns of type varchar and text will get toasted if over 2kb. Are you saying that these data types do not get stored in a TOAST table?
Thanks.

Gurmokh

Contradiction in the article.
Extended strategy it states:

This is the default strategy for most TOAST-able data types, such as text or bytea.

But then in the pitfalls of TOAST you state.

You can’t use a TOAST table for data types such as text or varchar,

Tobias

yes contradiction, thought the same.