aster.cloud aster.cloud
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
aster.cloud aster.cloud
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
  • Design
  • Engineering
  • Practices

Using VACUUM To Accelerate Transaction ID Freezing In Cloud SQL For PostgreSQL

  • aster.cloud
  • March 15, 2022
  • 7 minute read

PostgreSQL uses transaction IDs (TXIDs, XIDs) to implement Multi-Version Concurrency Control semantics (MVCC). The PostgreSQL documentation explains the role of TXIDs as follows:

PostgreSQL’s MVCC transaction semantics depend on being able to compare transaction ID (XID) numbers: a row version with an insertion XID greater than the current transaction’s XID is “in the future” and should not be visible to the current transaction. But since transaction IDs have limited size (32 bits), a cluster that runs for a long time (more than 4 billion transactions) would suffer transaction ID wraparound: the XID counter wraps around to zero, and all of a sudden transactions that were in the past appear to be in the future – which means their output become invisible. In short, catastrophic data loss.


Partner with aster.cloud
for your next big idea.
Let us know here.



From our partners:

CITI.IO :: Business. Institutions. Society. Global Political Economy.
CYBERPOGO.COM :: For the Arts, Sciences, and Technology.
DADAHACKS.COM :: Parenting For The Rest Of Us.
ZEDISTA.COM :: Entertainment. Sports. Culture. Escape.
TAKUMAKU.COM :: For The Hearth And Home.
ASTER.CLOUD :: From The Cloud And Beyond.
LIWAIWAI.COM :: Intelligence, Inside and Outside.
GLOBALCLOUDPLATFORMS.COM :: For The World's Computing Needs.
FIREGULAMAN.COM :: For The Fire In The Belly Of The Coder.
ASTERCASTER.COM :: Supra Astra. Beyond The Stars.
BARTDAY.COM :: Prosperity For Everyone.

To overcome the transaction ID wraparound problem, PostgreSQL uses a vacuum mechanism that freezes committed transaction IDs and releases them for further use. You can think of this mechanism as “recycling” of transaction IDs that keeps the database operating despite using a finite number to store the ID.

Vacuum can operate as a background task called autovacuum (enabled by default), and it can also be invoked manually using the VACUUM command. Autovacuum is designed as a low-priority task that yields to regular workload, so the speed and effectiveness of the process depends on database activity. In some cases, autovacuum might not be able to recycle transaction IDs quickly enough, reaching a point where PostgreSQL initiates a special type of vacuum called “aggressive vacuum” or “anti-wraparound vacuum”.

If TXID utilization continues to increase despite autovacuum’s best efforts, the database stops accepting commands to prevent transaction ID wraparound and consequently data loss.

This blog post demonstrates ways of accelerating the vacuum process on your Cloud SQL PostgreSQL instance to avoid the problems described above.

Note: This post uses vacuum features introduced in PostgreSQL 12. In order to follow the instructions, your database must be running PostgreSQL version 12 or later.

Autovacuum versus manual vacuum for TXID freezing

The following are the main stages  of a vacuum operation, as shown in the pg_stat_progress_vacuum view:

  1. Scanning heap: Vacuum is scanning the heap. It will prune and defragment each page if required, and possibly perform TXID freezing activity.
  2. Vacuuming indexes: Vacuum is removing the dead row versions from indexes.
  3. Vacuuming heap: Vacuum is removing dead row versions from the heap (table).
  4. Truncating heap: Vacuum is truncating the heap to return empty pages at the end of the table to the operating system.

The task of freezing old transaction IDs requires the “scanning heap” and “vacuuming heap” phases, but not the “vacuuming indexes” and “truncating heap” phases. Consequently, when running a vacuum for the sole purpose of freezing transaction IDs, it can be beneficial to skip the optional phases. Large tables with multiple indexes are particularly susceptible to spending a considerable amount of time in the “vacuuming indexes” stage, and skipping that stage can considerably reduce vacuum time.

Read More  Anthos Config Management: Enforcing The CIS Benchmark With Policy Controller

The optional phases are not skipped during automatic vacuuming. You can however cancel an autovacuum operation already in progress, and run a customized manual vacuum instead. You can also use the manual vacuum technique proactively to prevent PostgreSQL from initiating an aggressive autovacuum later.

Note: PostgreSQL 14 introduced a vacuum_failsafe_age parameter that provides equivalent functionality as part of autovacuum. PostgreSQL 14 users can still follow the manual procedure to become familiar with vacuum performance concepts, but the built-in functionality might be a better long-term choice.

Manual vacuum guide

Overview

This section will walk you through the following steps:

  1. Check transaction ID utilization in each database.
  2. Identify the autovacuum operations that can be canceled and re-run manually using an optimized VACUUM command.
  3. Run an optimized VACUUM command for each table and monitor its progress.
  4. Optionally, rebuild indexes on each table that was vacuumed manually.
  5. Check database metrics again to verify that transaction ID utilization went down after the manual vacuum.

Note that although vacuum operations do not cause downtime, they do introduce additional load in the database, and their run time is difficult to predict. You may want to use a clone to test and familiarize yourself with the procedure before attempting it on production databases.

Check transaction ID utilization before vacuuming

To determine transaction ID utilization for each database, connect to your Cloud SQL instance as the user postgres and run the following query:

 

SELECT 
  datname, 
  age(datfrozenxid) AS frozen_xid_age, 
  ROUND(
    100 *(
      age(datfrozenxid)/ 2146483647.0 :: float
    )
  ) consumed_txid_pct, 
  current_setting('autovacuum_freeze_max_age'):: int - age(datfrozenxid) AS remaining_aggressive_vacuum 
FROM 
  pg_database 
WHERE 
  datname NOT IN (
    'cloudsqladmin', 'template0', 'template1'
  );

 

The output will look like this:

 

datname        | frozen_xid_age | consumed_txid_pct | remaining_aggressive_vacuum 
---------------+----------------+-------------------+-----------------------------
 test_db       |     2146483648 |               100 |                 -1946483648
 postgres      |     2146483648 |               100 |                 -1946483648
 ..

 

A value of >80% in the consumed_txid_pct column indicates that tables in that database are in need of vacuuming to recycle transaction IDs. It is likely that autovacuum is already running on those tables, and it can be sped up as described earlier.

Follow the steps in the next section for each database that needs vacuuming.

Choose tables to vacuum manually

Connect to the database and run the following query to list tables that are currently processed by the autovacuum daemon:

 

SELECT p.pid,
      p.datname,
      p.query,
      p.backend_type,
      a.phase,
      a.heap_blks_scanned / a.heap_blks_total::float * 100 AS "% scanned",
      a.heap_blks_vacuumed / a.heap_blks_total::float * 100 AS "% vacuumed",
      pg_size_pretty(pg_table_size(a.relid)) AS "table size",
pg_size_pretty(pg_indexes_size(a.relid)) AS "indexes size",
      pg_get_userbyid(c.relowner) AS owner
 FROM pg_stat_activity p
 JOIN pg_stat_progress_vacuum a ON a.pid = p.pid
 JOIN pg_class c ON c.oid = a.relid
WHERE p.query LIKE 'autovacuum%';

 

Read More  The New Google Cloud Region In Dallas, Texas Is Now Open

The output will look like this:

 

-[ RECORD 1 ]+-------------------------------------------
pid          | 278394
datname      | test_db
query        | autovacuum: VACUUM public.sample2
backend_type | autovacuum worker
phase        | scanning heap
% scanned    | 80.52268841262791
% vacuumed   | 0
table size   | XX
indexes size | XX
owner        | test_user
-[ RECORD 2 ]+-------------------------------------------
pid          | 286964
datname      | test_db
query        | autovacuum: VACUUM public.pgbench_accounts
backend_type | autovacuum worker
phase        | vacuuming indexes
% scanned    | 100
% vacuumed   | 0
table size   | XX
indexes size | XX
owner        | test_user
-[ RECORD 3 ]+-------------------------------------------
pid          | 271948
datname      | test_db
query        | autovacuum: VACUUM ANALYZE public.sample1
backend_type | autovacuum worker
phase        | scanning heap
% scanned    | 22.971656445210044
% vacuumed   | 0
table size   | XX
indexes size | XX
owner        | test_user

 

Each record in the output corresponds to one autovacuum operation running in the database. Review the output and identify any records where the vacuum is in the “vacuuming indexes” phase as shown in the phase field. This indicates an operation that can potentially be sped up by canceling the autovacuum and performing a manual vacuum instead.

If there are multiple tables eligible for manual vacuuming, focus on the largest ones first. The larger the table, the longer the autovacuum process can take. Therefore, applying this procedure on the largest tables first can produce the biggest gains.

Follow the steps in the next section for each table you identified. Subsequent examples will use information from RECORD 2 highlighted above.

Vacuum a table manually

Ensure that these prerequisites are met before proceeding:

  1. In order to run a VACUUM command on a table, you must be logged in as the table owner.
  2. In order to cancel an ongoing autovacuum, you need permission  to execute the pg_cancel_backend function. The user postgres holds this permission by default, but other users must be granted access explicitly. To grant access, connect to the instance as postgres and run the following command:

 

GRANT pg_signal_backend TO <user_name>;

 

Once the prerequisites are met, execute the following commands to cancel the ongoing autovacuum and issue a manual vacuum:

 

SELECT 
  pg_cancel_backend(< pid >) 
FROM 
  pg_stat_progress_vacuum 
WHERE 
  relid = '<schema_name>.<table_name>' :: regclass;
VACUUM (
  TRUNCATE off, INDEX_CLEANUP false, 
  VERBOSE, FREEZE
) < schema_name >.< table_name >;

 

For example:

 

SELECT 
  pg_cancel_backend(286964) 
FROM 
  pg_stat_progress_vacuum 
WHERE 
  relid = 'public.pgbench_accounts' :: regclass;
VACUUM (
  TRUNCATE off, INDEX_CLEANUP false, 
  VERBOSE, FREEZE
) public.pgbench_accounts;

 

You can monitor the progress of your manual vacuum using the pg_stat_progress_vacuum view:

 

SELECT 
  *, 
  relid :: regclass, 
  heap_blks_scanned / heap_blks_total :: float * 100 "% scanned", 
  heap_blks_vacuumed / heap_blks_total :: float * 100 "% vacuumed" 
FROM 
  pg_stat_progress_vacuum;

 

After the vacuum completes, you can optionally reindex the table. Our optimized VACUUM command contained the INDEX_CLEANUP false clause, which skips the index optimization stage. Bypassing index optimization doesn’t cause any immediate issues, but if you frequently vacuum the same tables with INDEX_CLEANUP false, it can lead to index bloat in the long term. You may want to REINDEX your table periodically if index bloat becomes a problem. You can read more about index bloat here.

Read More  Force Terraform Resource Recreation

Use the REINDEX command to reindex a table:

 

REINDEX (VERBOSE) TABLE CONCURRENTLY <schema_name>.<table_name>;

 

For example:

 

REINDEX (VERBOSE) TABLE CONCURRENTLY public.pgbench_accounts;

 

Vacuuming tables proactively

The above procedure focuses on tables that are actively processed by autovacuum. You can also vacuum your tables proactively, based on what you know about their size and the volume of workload they receive.

Even when you vacuum your tables routinely and proactively, continue to monitor transaction ID utilization and try to keep it as low as possible across all databases. For context: by default, Postgres starts an aggressive vacuum when transaction ID utilization reaches 10%. This threshold can be configured using the autovacuum_freeze_max_age setting.

If you find that the default autovacuum behavior is not sufficient for your workload (e.g. it’s routinely unable to reclaim transaction IDs quickly enough), you should consider tuning the autovacuum parameters. Finding optimal configuration can involve some trial and error, but a well-tuned autovacuum can reduce or even eliminate the need for proactive manual vacuuming. Note that a more aggressive autovacuum can potentially impact the performance of regular workloads, so the related settings are best changed and validated in small increments.

Checking transaction ID utilization after vacuuming

After you’ve completed manual vacuuming, check transaction ID utilization again:

 

SELECT 
  datname, 
  age(datfrozenxid) AS frozen_xid_age, 
  ROUND(
    100 *(
      age(datfrozenxid)/ 2146483647.0 :: float
    )
  ) consumed_txid_pct, 
  current_setting('autovacuum_freeze_max_age'):: int - age(datfrozenxid) AS remaining_aggressive_vacuum 
FROM 
  pg_database 
WHERE 
  datname NOT IN (
    'cloudsqladmin', 'template0', 'template1'
  );

 

You should see that the value in consumed_txid_pct field is now lower than it was before:

 

datname        | frozen_xid_age | consumed_txid_pct | remaining_aggressive_vacuum 
---------------+----------------+-------------------+-----------------------------
 test_db       |     2146483648 |                XX |                 -1946483648
 postgres      |     2146483648 |                XX |                 -1946483648
(5 rows)

 

If the value is still very high, it might be driven by certain tables that weren’t covered by the prior procedure. You can obtain TXID information at table level to identify tables that still need vacuuming.

For example, this query will show top ten tables ordered by transaction ID utilization:

 

SELECT c.relname AS table_name,
      age(c.relfrozenxid) AS frozen_xid_age,
      ROUND(100 * (age(c.relfrozenxid) / 2146483647)) AS consumed_txid_pct,
      pg_size_pretty(pg_total_relation_size(c.oid)) AS table_size
 FROM pg_class c
 JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE c.relkind IN ('r', 't', 'm')
  AND n.nspname NOT IN ('pg_toast')
ORDER BY 2 DESC
LIMIT 10;

 

Learn More

In this blog post, we have demonstrated the ways of accelerating the vacuum process on your Cloud SQL PostgreSQL instance to avoid the transaction ID wraparound. Here is the link that describes what you can do when your database runs into Transaction ID Wraparound protection in PostgreSQL.

 

By: Naresh Gandi (PostgreSQL Database Engineer, Google Cloud)
Source: Google Cloud Blog


For enquiries, product placements, sponsorships, and collaborations, connect with us at [email protected]. We'd love to hear from you!

Our humans need coffee too! Your support is highly appreciated, thank you!

aster.cloud

Related Topics
  • Cloud SQL
  • Databases
  • Google Cloud
  • PostgreSQL
  • Tutorial
You May Also Like
View Post
  • Engineering
  • Technology

Guide: Our top four AI Hypercomputer use cases, reference architectures and tutorials

  • March 9, 2025
View Post
  • Computing
  • Engineering

Why a decades old architecture decision is impeding the power of AI computing

  • February 19, 2025
View Post
  • Engineering
  • Software Engineering

This Month in Julia World

  • January 17, 2025
View Post
  • Engineering
  • Software Engineering

Google Summer of Code 2025 is here!

  • January 17, 2025
View Post
  • Data
  • Engineering

Hiding in Plain Site: Attackers Sneaking Malware into Images on Websites

  • January 16, 2025
View Post
  • Computing
  • Design
  • Engineering
  • Technology

Here’s why it’s important to build long-term cryptographic resilience

  • December 24, 2024
IBM and Ferrari Premium Partner
View Post
  • Data
  • Engineering

IBM Selected as Official Fan Engagement and Data Analytics Partner for Scuderia Ferrari HP

  • November 7, 2024
View Post
  • Engineering

Transforming the Developer Experience for Every Engineering Role

  • July 14, 2024

Stay Connected!
LATEST
  • college-of-cardinals-2025 1
    The Definitive Who’s Who of the 2025 Papal Conclave
    • May 7, 2025
  • conclave-poster-black-smoke 2
    The World Is Revalidating Itself
    • May 6, 2025
  • 3
    Conclave: How A New Pope Is Chosen
    • April 25, 2025
  • Getting things done makes her feel amazing 4
    Nurturing Minds in the Digital Revolution
    • April 25, 2025
  • 5
    AI is automating our jobs – but values need to change if we are to be liberated by it
    • April 17, 2025
  • 6
    Canonical Releases Ubuntu 25.04 Plucky Puffin
    • April 17, 2025
  • 7
    United States Army Enterprise Cloud Management Agency Expands its Oracle Defense Cloud Services
    • April 15, 2025
  • 8
    Tokyo Electron and IBM Renew Collaboration for Advanced Semiconductor Technology
    • April 2, 2025
  • 9
    IBM Accelerates Momentum in the as a Service Space with Growing Portfolio of Tools Simplifying Infrastructure Management
    • March 27, 2025
  • 10
    Tariffs, Trump, and Other Things That Start With T – They’re Not The Problem, It’s How We Use Them
    • March 25, 2025
about
Hello World!

We are aster.cloud. We’re created by programmers for programmers.

Our site aims to provide guides, programming tips, reviews, and interesting materials for tech people and those who want to learn in general.

We would like to hear from you.

If you have any feedback, enquiries, or sponsorship request, kindly reach out to us at:

[email protected]
Most Popular
  • 1
    IBM contributes key open-source projects to Linux Foundation to advance AI community participation
    • March 22, 2025
  • 2
    Co-op mode: New partners driving the future of gaming with AI
    • March 22, 2025
  • 3
    Mitsubishi Motors Canada Launches AI-Powered “Intelligent Companion” to Transform the 2025 Outlander Buying Experience
    • March 10, 2025
  • PiPiPi 4
    The Unexpected Pi-Fect Deals This March 14
    • March 13, 2025
  • Nintendo Switch Deals on Amazon 5
    10 Physical Nintendo Switch Game Deals on MAR10 Day!
    • March 9, 2025
  • /
  • Technology
  • Tools
  • About
  • Contact Us

Input your search keywords and press Enter.