The book PostgreSQL 14 Internals has been available in PDF format for quite a while, but recently, the ability to order a printed copy became available (https://postgrespro.com/community/books/internals).

This 548-page tome from PostgresPro covers the spectrum from data organization to details on the many available indexing options.  This book has five major sections on MVCC and Isolation (108 pages), Buffer Cache and WAL (53 pages), Locks (42 pages), Query Execution (154 pages), and the types of indexes (127 pages). I mentioned that this material in PDF format was available in from a recent talk on PostgreSQL at Percona Live, and many folks wanted more details.  The book arrived a few days after I placed my order (under $30). 

I had been reading sections of the PDF version but immediately ordered a printed version when I found it was available.  Books like this often require referring to another page for details while reading. I find that it is easier to do this with a book than on a screen. Plus, I tend to write notes in the margins, leave Post-It notes in certain areas, and dog ear pages for future reference, which is hard to do with PDF files.

Is this book an easy read?

Is this an easy read?  The subject matter is well-written and communicates the material very well. The illustrations are clear, and there are many reference pointers to the documentation or source code where needed. If you are new to the internals of PostgreSQL, this book, along with the source code and some caffeine, should give you a great start on the subject. But you do not need to refer to the source code as the author covers the concepts with easy-to-read text and clear examples.  This is a great tool for learning the intricacies of PostgreSQL.

How to read this book

The best way to read this book is from the first page to the last, as the author builds on previous definitions and explanations.  For instance, chapter eight covers the situation where tuple vacuuming (removing the outdated copies of rows in a table) can not return the reclaimed space to the operating system. Having previously covered the structure of tables and rows, this chapter explores how VACUUM FULL rebuilds tables and indexes. Now, this operation blocks all reads and writes during the operation. The author shows how using the pgstatttuple extension can show an estimated percentage of space filled with useful data. The author has done an excellent job of presenting all this information in a clear, concise fashion in small, easy-to-consume tidbits so that the reader does not feel swamped with details. 

Version 14?

Some of you may be asking, “PostgreSQL 14? Isn’t 16 the current release?”  The good news is that PostgreSQL’s once-a-year release cycle does not include that much change to the vast majority of the code, and changes are highlighted in the release notes. The material in this book teaches the concepts at a level that you will not be thrown off by minor changes.

For example, Chapter 10 covered the Write-Ahead Log or WAL, which is an area of great concern and some mysticism. In eighteen pages, the subject of the WAL is covered with an overview of how it works. In case of a power failure or other catastrophic calamity, the server needs a way to contain all the essential information needed to repeat operations at the time of the problem, and that is the purpose of the WAL. Since disk flushing is going to lag behind what is in memory, those changes in the data are written to the WAL.

The author starts with an overview of the operations performed by the WAL and proceeds into the logical structure. The WAL buffers are set up as a ring buffer where newer entries overwrite the older, which comes with a proviso to make sure you are getting the data synched to disk at an appropriate rate. The book has a wonderful example of how to track the position of the WAL by showing you how it changes as the data changes. Next, the author delves into the physical structure of the WAL and how to investigate that information. The rest of the chapter covers checkpointing and configuration. There is a lot of material in Chapter 10, but it is covered in a concise, clear fashion that illustrates very well even the most complex subjects.

Do you need a copy?

So, do you need a copy of the hardcopy book? If you are happy with the PDF version, then you probably do not spend the money.  But if you are like me and tend to access sections randomly, scribble notes on pages, or just prefer the feel of a book in your hands, then the investment is a good one. Buy this book if you are looking for a deeper understanding of how the PostgreSQL server really works, and you will not be disappointed. 

If you are new to PostgreSQL, possibly coming from another database, this book covers all the terminology you stumble across, such as WAL. Chapter 10 fully explains the whys and hows the Write Ahead Log is implemented with wonderful examples of how it all works. This should be required reading for anyone interested in the fine points of PostgreSQL. So, if you are in this group, you need this book, either the hard copy version or the PDF.  

Yes, you need a copy of this book if for no other reason than someone asks a specific question about PostgreSQL that you cannot answer (correctly) off the top of your head. In these cases, the book is invaluable for articulating what the database server is doing.

And a big thank you to Egor Rogov for writing this book.

Subscribe
Notify of
guest

0 Comments
Inline Feedbacks
View all comments