PostgreSQL Buffer Cache: A Practical Guide

Introduction

Before adding indexes or application-level caching to optimize PostgreSQL performance, it’s worth understanding how a relational database like PostgreSQL manages memory. While we’ll focus on PostgreSQL’s implementation, the concepts discussed here are fundamental to understanding memory management in most relational database systems.

PostgreSQL keeps frequently accessed data in memory, sometimes providing the performance boost we need without additional complexity of introducing more fined-grained caches in application level. Let’s try to come up with a basic idea of this mechanism through practical examples to hopefully better inform our optimization decisions.

Understanding the Fundamentals

Before diving into practical examples, let’s clarify some key PostgreSQL concepts:

  • Relation: Any database object that contains rows. Tables, indexes, and sequences are all relations in PostgreSQL.
  • Page: The basic unit of storage in PostgreSQL (typically 8KB). Each relation is stored as a collection of pages on disk.
  • Buffer: When a page is loaded into memory, it becomes a buffer. Think of buffers as the in-memory representation of disk pages.
  • Buffer Cache: A shared memory area where PostgreSQL keeps frequently accessed pages.

Our Toolkit: PostgreSQL Buffer Cache Monitoring with pg_buffercache and pg_class

To explore buffer cache behavior, we’ll use two main PostgreSQL tools:

  1. pg_buffercache extension: Provides real-time visibility into shared buffer cache content, allowing us to track which pages are in memory and their current state. See the PostgreSQL documentation for more details.
  2. pg_class system catalog: Contains metadata about database objects (tables, indexes, etc.)

Let’s start by installing the pg_buffercache extension:

1
CREATE EXTENSION IF NOT EXISTS pg_buffercache;

Practical Exploration

We’ll follow these steps to understand buffer cache behavior:

  1. Create a test table with predictable data size
  2. Observe how data is stored in pages
  3. Use pg_buffercache to observe how pages are loaded into memory during queries
  4. Add an index to see how it affects page loading patterns
  5. Track how buffer cache state changes when we modify data
  6. See how system processes handle dirty (modified) pages
  7. Compare query performance for cached vs uncached data

Setting Up Our Test Environment

Let’s create a test table without any indexes:

1
2
3
4
5
6
7
8
9
CREATE TABLE buffer_test (
    id      SERIAL,
    data    TEXT
);

-- Insert 100k rows with 1KB data each
INSERT INTO buffer_test (data)
SELECT repeat('x', 1024)
FROM generate_series(1, 100000);

Understanding Table Size

First, let’s examine our table’s size using pg_class:

1
2
3
SELECT relpages, reltuples 
FROM   pg_class 
WHERE  relname = 'buffer_test';
1
2
3
| relpages | reltuples |
|----------|-----------|
| 0        | 0         |

The zero values indicate that table statistics haven’t been updated. Let’s fix that:

1
2
3
4
5
ANALYZE buffer_test;

SELECT relpages, reltuples 
FROM   pg_class 
WHERE  relname = 'buffer_test';
1
2
3
| relpages | reltuples |
|----------|-----------|
| 14,286   | 100,000   |

Here:

  • relpages: Number of disk pages the table uses
  • reltuples: Estimated number of rows

Monitoring Cache Behavior

Let’s verify our cache is empty:

1
2
3
4
5
SELECT    COUNT(*) 
FROM      pg_buffercache b
JOIN      pg_class c 
  ON      b.relfilenode = c.relfilenode
WHERE     c.relname = 'buffer_test';
1
2
3
| count |
|-------|
| 0     |

The relfilenode column in pg_buffercache helps us identify which buffers belong to our table. Now, let’s query a specific row:

1
2
3
4
5
6
7
8
SELECT * FROM buffer_test WHERE id = 70000;

-- Check cache after the query
SELECT    COUNT(*) 
FROM      pg_buffercache b
JOIN      pg_class c 
  ON      b.relfilenode = c.relfilenode
WHERE     c.relname = 'buffer_test';
1
2
3
| count |
|-------|
| 32    |

You may have expected to see only one page loaded into memory since the row we queried earlier belongs to one page. But it’s not the case. Since we have not introduced any indexes on the id column yet, database cannot efficiently find that one page and it has to do a sequential scan which causes PostgreSQL to read through all table pages from the beginning until it finds our target row. The number of pages loaded depends on various factors including the database’s buffer replacement strategy.

Adding an Index

Let’s add an index:

1
CREATE INDEX ON buffer_test(id);

Now restart PostgreSQL server to clear the cache. As an example, this is how I did it on my installed version on macOS using pg_ctl:

1
sudo -u postgres pg_ctl restart -D /Library/PostgreSQL/17/data

After restart, query the same row:

1
2
3
SELECT    ctid, id 
FROM      buffer_test 
WHERE     id = 70000;
1
2
3
| ctid      | id    |
|-----------|-------|
| (9999,7)  | 70000 |

The ctid (Tuple ID) is a special system column in PostgreSQL that represents the physical location of a row version within its table. Every row in a PostgreSQL table has a Tuple ID that consists of two numbers: the block number (or page number) and the tuple index within that block. Here ctid shows our row is on page 9999. Let’s check the buffer cache:

1
2
3
4
5
6
7
SELECT    bufferid, 
          relblocknumber, 
          isdirty 
FROM      pg_buffercache b  
JOIN      pg_class c 
  ON      b.relfilenode = c.relfilenode  
WHERE     c.relname = 'buffer_test';
1
2
3
| bufferid | relblocknumber | isdirty |
|----------|----------------|---------|
| 149      | 9999           | false   |

With the index, PostgreSQL loaded only the needed page. Key columns used here:

  • bufferid: Unique identifier for the buffer in shared memory
  • relblocknumber: Page number within the relation
  • isdirty: Indicates if the page has been modified

Observing Dirty Pages

Let’s modify our row:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
UPDATE    buffer_test 
SET       data = 'modified data' 
WHERE     id = 70000;

SELECT    bufferid, 
          relblocknumber, 
          isdirty
FROM      pg_buffercache b
JOIN      pg_class c 
  ON      b.relfilenode = c.relfilenode
WHERE     c.relname = 'buffer_test';
1
2
3
| bufferid | relblocknumber | isdirty |
|----------|----------------|---------|
| 149      | 9999           | true    |

The page is now marked dirty, indicating pending changes.

Understanding Checkpoints

By calling CHECKPOINT we can force writing dirty pages to disk. By default, PostgreSQL runs automatic checkpoints:

  1. Every checkpoint_timeout seconds (default: 5 minutes)
  2. When WAL or Write-Ahead Logging reaches max_wal_size (default: 1 GB)

Now let’s force a checkpoint to write dirty pages to disk:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
CHECKPOINT;

-- Check cache state
SELECT    bufferid, 
          relblocknumber, 
          isdirty
FROM      pg_buffercache b
JOIN      pg_class c 
  ON      b.relfilenode = c.relfilenode
WHERE     c.relname = 'buffer_test';
1
2
3
| bufferid | relblocknumber | isdirty |
| -------- | -------------- | ------- |
| 149      | 9999           | false   |

Not dirty anymore but still in cache!

Comparing Cached vs Uncached Access

Let’s demonstrate the performance benefit of the buffer cache by comparing access times for cached and uncached data. First we query the page number for other rows around the row with id 70000 that we have been working with so far:

1
2
3
SELECT    ctid, id 
FROM      buffer_test 
WHERE     id between 69997 and 70003;
1
2
3
4
5
6
7
8
9
| ctid      | id    |  
|-----------|-------|  
| (9999,4)  | 69997 |  
| (9999,5)  | 69998 |  
| (9999,6)  | 69999 |  
| (9999,8)  | 70000 |  
| (10000,1) | 70001 |  
| (10000,2) | 70002 |  
| (10000,3) | 70003 |

Now restart the database server once more to make sure we continue with a clean cache. Then query the id 70000 as we did before to load the page 9999 into buffer cache. Now we are ready to perform our comparison using EXPLAIN ANALYSE:

1
2
3
4
5
EXPLAIN ANALYSE
SELECT    ctid, id 
FROM      buffer_test 
WHERE     id = 69997;
-- Execution Time: 0.046 ms
1
2
3
4
5
EXPLAIN ANALYSE
SELECT    ctid, id 
FROM      buffer_test 
WHERE     id = 10;
-- Execution Time: 0.604 ms

As you see the execution time for accessing the cached data (row with id 69997 belongs the cached page 9999) is about 13 times smaller than the uncached access!

Conclusion

We explored PostgreSQL’s buffer cache through practical examples that demonstrated its basic memory management behavior. By creating a test table and using monitoring tools, we observed how data pages move between disk and memory during different operations. Our experiments showed how queries without indexes lead to sequential scans that load multiple pages into memory, while adding an index allowed PostgreSQL to load only the specific page needed. We also saw how pages get marked as “dirty” when modified and remain in cache even after a checkpoint writes them to disk. Finally, we demonstrated how PostgreSQL’s buffer cache optimization works in practice by comparing query times between accessing rows from previously loaded pages versus pages that required fresh disk reads.

Built with Hugo
Theme Stack designed by Jimmy