A comprehensive collection of production-ready PostgreSQL DBA scripts for operational monitoring, performance analysis, and maintenance. Built from the perspective of a 14+ year veteran managing 1000+ databases with 10TB+ workloads in banking and fintech environments.
This repository contains 40+ production-grade SQL scripts organized by operational category. Each script is extensively documented with:
- Real-world context from managing 1000+ databases in banking/fintech
- Practical thresholds and alerting guidelines
- Troubleshooting paths for common issues
- Remediation commands with safety considerations
- Runbooks for incident response
The scripts address the critical DBA patterns encountered at scale:
- Connection pool exhaustion and idle-in-transaction cascades
- Cache hit ratio optimization for 10TB+ workloads
- Replication lag detection in high-availability systems
- Long-running query identification and termination
- Dead tuple accumulation and bloat prevention
- XID wraparound risk monitoring
- Performance optimization for multi-tenant environments
Identify and resolve query performance issues before they impact users.
| Script | Purpose | Key Metrics |
|---|---|---|
active_queries.sql |
Monitor currently executing queries | Duration, wait events, state |
slow_queries.sql |
Top 20 slowest queries by mean execution time | Execution time, row count, statistics |
cache_hit_ratio.sql |
Buffer pool effectiveness (target: >95%) | Cache hits, disk reads, ratios |
index_usage.sql |
Identify unused and inefficient indexes | Scan count, cache hit ratio, size |
table_seq_scans.sql |
Find tables with excessive sequential scans | Scan ratio, table size, missing indexes |
query_plan_analysis.sql |
EXPLAIN ANALYZE helper with interpretation guide | Planning, execution, node costs |
Use these when: Queries are slow, users report timeouts, CPU/I/O spikes
Manage VACUUM operations and prevent table bloat in high-churn environments.
| Script | Purpose | Key Metrics |
|---|---|---|
dead_tuples.sql |
Tables with high dead tuple counts | Dead ratio, tuples needing cleanup |
autovacuum_status.sql |
Monitor active autovacuum workers | Phase, progress, duration |
vacuum_stats.sql |
Vacuum and analyze history by table | Last VACUUM times, frequencies |
bloat_check.sql |
Estimate table and index bloat | Dead space percentage, bloat ratio |
xid_wraparound.sql |
Transaction ID age and wraparound risk | XID age, days until wraparound |
Use these when: Performance degrades gradually, tables grow despite deletes, VACUUM takes hours
Monitor streaming and logical replication in HA configurations.
| Script | Purpose | Key Metrics |
|---|---|---|
streaming_status.sql |
Physical replication lag (standby status) | Write/flush/replay lag, sync state |
logical_replication.sql |
Logical replication slots and lag | Slot status, WAL retention, lag |
wal_stats.sql |
WAL generation and archiving status | WAL size, generation rate, archive status |
replication_slots.sql |
All replication slots with lag analysis | Slot lag, WAL retention, inactive slots |
Use these when: Disaster recovery is concern, standby falls behind, WAL archiving fails
Prevent connection pool exhaustion and manage connection lifecycle.
| Script | Purpose | Key Metrics |
|---|---|---|
active_connections.sql |
Real-time connection breakdown | By state, user, database, client |
idle_transactions.sql |
CRITICAL: Idle-in-transaction sessions | Duration, blocking impact, termination |
connection_limits.sql |
Current vs max connections analysis | Per-user, per-database limits |
blocking_locks.sql |
Lock chains and blocking relationships | Blocked/blocking queries, wait time |
deadlock_detection.sql |
Recent deadlocks and conflict analysis | Deadlock frequency, patterns |
Use these when: "too many connections" errors, connection pool leaks, users complain of hangs
Ensure compliance and audit database access controls.
| Script | Purpose | Key Metrics |
|---|---|---|
role_audit.sql |
All users with privileges and attributes | Superusers, expiry dates, membership |
privilege_check.sql |
Schema and table privilege audit | Grants by user, excessive privileges |
pg_hba_review.sql |
Authentication rules review | Connection types, auth methods, SSL |
password_expiry.sql |
Password expiry monitoring | Expired/expiring accounts, compliance |
Use these when: Compliance audit, security review, password policy enforcement needed
Plan maintenance and manage database growth.
| Script | Purpose | Key Metrics |
|---|---|---|
database_sizes.sql |
Database and table sizes with trends | Growth rate, capacity planning |
tablespace_usage.sql |
Disk usage per tablespace | Space utilization, rebalancing |
extension_inventory.sql |
Installed extensions across cluster | Versions, dependencies, availability |
config_diff.sql |
Non-default PostgreSQL settings | Current vs default configuration |
long_running_queries.sql |
Queries exceeding time thresholds | Duration, termination commands |
Use these when: Capacity planning, configuration reviews, storage approaching limits
Overall cluster health and I/O performance analysis.
| Script | Purpose | Key Metrics |
|---|---|---|
quick_health_check.sql |
Single-query comprehensive health dashboard | Status of all critical metrics |
io_stats.sql |
I/O statistics and cache effectiveness | Reads, hits, cache hit ratio |
checkpoint_stats.sql |
Checkpoint frequency and impact | Checkpoint I/O, duration, effectiveness |
bgwriter_stats.sql |
Background writer effectiveness | Write distribution, memory pressure |
Use these when: General health assessment, hourly monitoring, incident triage
Install these extensions in your PostgreSQL cluster for full functionality:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements; -- CRITICAL for slow_queries.sql
CREATE EXTENSION IF NOT EXISTS pgstattuple; -- For bloat_check.sql
CREATE EXTENSION IF NOT EXISTS pg_trgm; -- For full-text search in queriesAdd to postgresql.conf:
# Essential for query analysis
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.max = 10000
pg_stat_statements.track = 'all'
# Logging for troubleshooting
log_connections = on
log_disconnections = on
log_min_duration_statement = 1000 # Log queries > 1 second
log_lock_waits = onReload configuration:
systemctl reload postgresqlScripts require SUPERUSER or appropriate role privileges:
-- Connect as superuser
psql -U postgres -d postgres -f script_name.sql
-- Or create a DBA role with necessary permissions
CREATE ROLE dba_monitor WITH LOGIN CREATEROLE;
GRANT pg_monitor TO dba_monitor; -- Grants read-only access to pg_stat_*
GRANT EXECUTE ON FUNCTION pg_stat_statements() TO dba_monitor;Run a single script:
psql -U postgres -d mydb -f performance/active_queries.sqlRun all performance scripts:
for script in performance/*.sql; do
echo "=== $(basename $script) ==="
psql -U postgres -d mydb -f "$script"
echo ""
doneSave output to file:
psql -U postgres -d mydb -f performance/slow_queries.sql > slow_queries_report.csvRun health check every hour:
# Add to crontab
0 * * * * psql -U postgres -d postgres -f /path/to/health/quick_health_check.sql >> /var/log/postgres_health.logCopy scripts to pgAdmin custom queries:
- Login to pgAdmin
- Server → Connect
- Tools → Custom queries
- Paste script contents
- Save with descriptive name
Each script includes recommended thresholds. Customize for your environment:
| Metric | Default | Banking/FinTech | Analytics |
|---|---|---|---|
| Cache Hit Ratio | >95% | >98% (strict) | >90% (acceptable) |
| Idle in Transaction | 0 max | Kill after 5 min | Tolerate longer |
| Replication Lag | <1s | <100ms (critical) | <10s |
| XID Age | <50% | <20% | <50% |
| Connection Usage | <80% | <70% (strict) | <85% |
| Long Query | >5 min | >1 min (alert) | >1 hour (ok) |
-
Run:
quick_health_check.sql- Identify bottleneck -
If CRITICAL cache hit ratio:
- Run:
cache_hit_ratio.sql- Confirm - Run:
slow_queries.sql- Find expensive queries - Fix: Add indexes or increase
shared_buffers
- Run:
-
If CRITICAL dead tuples:
- Run:
dead_tuples.sql- Find affected tables - Fix:
VACUUM ANALYZEthe tables - Review:
autovacuum_status.sqlto ensure autovacuum is working
- Run:
- Run:
connection_limits.sql- Check current usage - Run:
idle_transactions.sql- Find leaked connections - Run:
active_queries.sql- See what's executing - Fix:
SELECT pg_terminate_backend(pid)for idle-in-transaction - Review: Application connection handling code
- Run:
streaming_status.sql- Check lag metrics - Run:
slow_queries.sql- Find standby-slow queries - Check:
wal_stats.sql- WAL generation rate - Optimize:
query_plan_analysis.sqlon slow queries - Monitor: Increase
max_wal_sendersif needed
- Run:
database_sizes.sql- Identify largest tables - Run:
bloat_check.sql- Check for wasted space - Options:
- Archive old data and DELETE
- Add table partitioning
- Increase tablespace disk allocation
- Run:
VACUUM FULLafter cleanup
This is a reference implementation. Contribute improvements by:
- Adding new scripts for uncovered scenarios
- Improving existing scripts with better logic
- Sharing real-world use cases and thresholds
- Testing compatibility with newer PostgreSQL versions
MIT License - See LICENSE file for details
Copyright (c) 2024-2026 Vijayakumar Poura / AMVANA Software
Permission is granted to use, modify, and distribute these scripts in any way, provided the MIT license terms are followed.
- PostgreSQL community for comprehensive system views
- Banking/fintech industry standards for performance requirements
- DBA colleagues who shared patterns from large-scale deployments
- Open-source PostgreSQL tools that inspired these scripts
- PostgreSQL Documentation
- PostgreSQL Wiki - Performance Optimization
- Postgres.fm Blog
- Depesz's PostgreSQL Knowledge
Author: Vijayakumar Poura (@pouradba)
Organization: AMVANA Software
Experience: 14+ years managing 1000+ databases, 10TB+ workloads in banking/fintech
Repository: pouradba/DBA-daytoday-scripts