postgresql-performance-optimization.md 2025-06-24
PostgreSQL Performance Optimization Report
JUHI Production Database Analysis
Database: juhi_prod_latest
PostgreSQL Version: 16.2
Analysis Date: June 24, 2025
Database Size: ~750MB (total data + indexes)
🔴 Critical Issues - Immediate Action Required
1. Unused Indexes (47+ MB wasted space)
The statistics show numerous indexes with zero recorded scans, which consume disk space and slow down write operations. Zero scans alone does not make an index safe to drop: primary key indexes (named _pkey by convention) and unique indexes enforce constraints even when no query reads them, and DROP INDEX fails on an index that backs a constraint. Those entries are commented out below and should stay in place unless the constraint itself is being retired:
-- Large indexes with 0 scans (8MB+ each) - HIGH PRIORITY
-- DROP INDEX IF EXISTS helper_ranking_events_pkey; -- 9.3MB, 0 scans (primary key: keep)
-- DROP INDEX IF EXISTS client_helper_distances_pkey; -- 8.2MB, 0 scans (primary key: keep)
-- DROP INDEX IF EXISTS unique_index_client_id_helper_id; -- 8.2MB, 0 scans (name suggests a unique index: verify first)
-- Medium indexes with 0 scans (2-5MB each) - MEDIUM PRIORITY
DROP INDEX IF EXISTS visits_helper_invitation_id_created_at; -- 5MB, 0 scans
-- DROP INDEX IF EXISTS type_client_id_date_unique; -- 3.9MB, 0 scans (name suggests a unique index: verify first)
-- DROP INDEX IF EXISTS visits_pkey; -- 3.6MB, 0 scans (primary key: keep)
DROP INDEX IF EXISTS visits_created_at; -- 3.6MB, 0 scans
DROP INDEX IF EXISTS helper_ranking_events_helper_id; -- 3MB, 0 scans
-- DROP INDEX IF EXISTS budgets_pkey; -- 2.8MB, 0 scans (primary key: keep)
-- Smaller indexes with 0 scans (about 2MB each) - LOW PRIORITY
DROP INDEX IF EXISTS helper_invitations_status_accepted_at; -- 2.2MB, 0 scans
DROP INDEX IF EXISTS helper_invitations_invitation_id; -- 2.2MB, 0 scans
-- DROP INDEX IF EXISTS helper_services_pkey; -- 2.1MB, 0 scans (primary key: keep)
-- DROP INDEX IF EXISTS emails_pkey; -- 2.1MB, 0 scans (primary key: keep)
DROP INDEX IF EXISTS helper_invitations_accepted_at; -- 2.1MB, 0 scans
DROP INDEX IF EXISTS helper_invitations_deleted_at; -- 1.9MB, 0 scans
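Expected Benefits:
Free up disk space (up to the 47+ MB identified, less for any constraint-backed indexes that are kept)
Reduce write operation overhead
Improve INSERT/UPDATE performance (estimated 20-30%; confirm by measuring before and after)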
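Before dropping anything, it may help to check how long the scan counters have been accumulating and whether a zero-scan index enforces uniqueness or backs a constraint. The following is a sketch that relies only on standard catalog views (pg_stat_database, pg_stat_user_indexes, pg_index, pg_constraint); the 1 MB size cutoff is an arbitrary assumption carried over from the monitoring queries in this report.

```sql
-- How long have the counters been accumulating?
-- A recent reset makes idx_scan = 0 far less meaningful.
SELECT datname, stats_reset
FROM pg_stat_database
WHERE datname = current_database();

-- Zero-scan indexes, flagged when they enforce uniqueness or back a constraint.
SELECT s.schemaname,
       s.relname,
       s.indexrelname,
       pg_size_pretty(pg_relation_size(s.indexrelid)) AS size,
       i.indisprimary AS is_primary_key,
       i.indisunique  AS is_unique,
       c.conname      AS dependent_constraint  -- NULL when no constraint uses the index
FROM pg_stat_user_indexes s
JOIN pg_index i ON i.indexrelid = s.indexrelid
LEFT JOIN pg_constraint c ON c.conindid = s.indexrelid
WHERE s.idx_scan = 0
  AND pg_relation_size(s.indexrelid) > 1000000  -- assumed cutoff: ignore small indexes
ORDER BY pg_relation_size(s.indexrelid) DESC;
```

An index can appear more than once in the second result when foreign keys on other tables reference it. Rows where is_primary_key, is_unique, or dependent_constraint is set are candidates to keep rather than drop.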
2. Excessive Sequential Scans - Missing Critical Indexes
Several large tables have extremely high sequential scan ratios, indicating missing indexes:
Table Analysis:
| Table | Rows | Sequential Reads | Avg per Scan | Issue |
|---|---|---|---|---|
| visits | 161K | 3.5M | 104K | Missing status/client indexes |
| documents | 196K | 3.5M | 176K | Missing client/type indexes |
| helper_invitations | 274K | 3.3M | 182K | Under-indexed |
| helper_ranking_events | 424K | 1.7M | 282K | No useful indexes |
| notifications | 532K | 1.6M | 399K | Missing user/date indexes |
Critical Missing Indexes:
-- visits table optimization
CREATE INDEX CONCURRENTLY idx_visits_status_deleted_at ON visits(status,
deleted_at);
CREATE INDEX CONCURRENTLY idx_visits_client_id ON visits(client_id);
CREATE INDEX CONCURRENTLY idx_visits_helper_id_status ON visits(helper_id,
status);
-- documents table optimization
CREATE INDEX CONCURRENTLY idx_documents_client_id ON documents(client_id);
CREATE INDEX CONCURRENTLY idx_documents_type_status ON documents(type,
status);
CREATE INDEX CONCURRENTLY idx_documents_created_at ON
documents(created_at);
-- helper_ranking_events optimization
CREATE INDEX CONCURRENTLY idx_helper_ranking_events_event_date ON
helper_ranking_events(event_date);
CREATE INDEX CONCURRENTLY idx_helper_ranking_events_ranking_type ON
helper_ranking_events(ranking_type);
CREATE INDEX CONCURRENTLY idx_helper_ranking_events_helper_event ON
helper_ranking_events(helper_id, event_date);
-- notifications optimization
CREATE INDEX CONCURRENTLY idx_notifications_user_type_read_at ON
notifications(user_type, read_at);
CREATE INDEX CONCURRENTLY idx_notifications_created_at ON
notifications(created_at);
CREATE INDEX CONCURRENTLY idx_notifications_recipient_id ON
notifications(recipient_id);
-- client_helper_distances optimization
CREATE INDEX CONCURRENTLY idx_client_helper_distances_distance ON
client_helper_distances(distance);
CREATE INDEX CONCURRENTLY idx_client_helper_distances_client_id ON
client_helper_distances(client_id);
-- emails optimization
CREATE INDEX CONCURRENTLY idx_emails_recipient_sent_at ON
emails(recipient, sent_at);
CREATE INDEX CONCURRENTLY idx_emails_type_status ON emails(type, status);
-- budgets optimization
CREATE INDEX CONCURRENTLY idx_budgets_client_date ON budgets(client_id,
date);
CREATE INDEX CONCURRENTLY idx_budgets_type_date ON budgets(type, date);
Expected Benefits:
Reduce query response times by 50-80%
Eliminate most sequential scans
Improve application responsiveness
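Two checks may be worth running after the index builds above. CREATE INDEX CONCURRENTLY cannot be run inside a transaction block, and a build that fails or is cancelled leaves an invalid index behind that still costs write overhead. The sketch below lists such leftovers and then inspects one query plan; the WHERE clause and the value 42 are hypothetical placeholders, not taken from the application.

```sql
-- List indexes left INVALID by a failed or cancelled concurrent build.
-- These should be dropped and rebuilt.
SELECT n.nspname AS schema_name,
       c.relname AS index_name
FROM pg_index i
JOIN pg_class c     ON c.oid = i.indexrelid
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE NOT i.indisvalid;

-- Confirm that a representative query uses the new index.
-- Hypothetical predicate: substitute a real application query.
-- Note that EXPLAIN ANALYZE actually executes the statement.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM visits WHERE client_id = 42;
```

If the plan still shows a sequential scan on visits, the index may not match the predicates the application really uses, which is a reason to derive the column lists from logged slow queries rather than from table names alone.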
🟡 Medium Priority Optimizations
3. Database Configuration Tuning
Current settings are sub-optimal for production workload:
-- Memory Configuration (of these, only shared_buffers requires a PostgreSQL restart)
ALTER SYSTEM SET shared_buffers = '256MB'; -- Currently: 128MB
ALTER SYSTEM SET effective_cache_size = '2GB'; -- Currently: 4GB (too high)
ALTER SYSTEM SET work_mem = '8MB'; -- Currently: 4MB
ALTER SYSTEM SET maintenance_work_mem = '128MB'; -- Currently: 64MB
-- I/O Optimization (for SSD storage)
ALTER SYSTEM SET effective_io_concurrency = 200; -- Currently: 1
ALTER SYSTEM SET random_page_cost = 1.1; -- Currently: 4 (HDD setting)
ALTER SYSTEM SET seq_page_cost = 1.0; -- Default: 1.0
-- Query Performance
ALTER SYSTEM SET default_statistics_target = 500; -- Currently: 100
ALTER SYSTEM SET checkpoint_completion_target = 0.9; -- Already optimal
-- Monitoring & Logging
ALTER SYSTEM SET log_min_duration_statement = 1000; -- Log queries > 1s (value is in ms)
ALTER SYSTEM SET log_checkpoints = on;
ALTER SYSTEM SET log_connections = on;
ALTER SYSTEM SET log_disconnections = on;
-- Connection Management
ALTER SYSTEM SET max_connections = 150; -- Currently: 200 (reduce overhead)
-- Apply changes (except those requiring restart)
SELECT pg_reload_conf();
Restart Required Settings:
shared_buffers
max_connections
effective_io_concurrency does not require a restart; it takes effect after pg_reload_conf().
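Which settings are still waiting for a restart can be read from the server itself rather than from a list. The query below is a small sketch against the standard pg_settings view; the set of names is simply the parameters changed in this section.

```sql
-- context = 'postmaster' means the parameter only changes at server start.
-- pending_restart = true means a new value is in the configuration files
-- but the running server has not picked it up yet.
SELECT name, setting, unit, context, pending_restart
FROM pg_settings
WHERE name IN (
    'shared_buffers', 'effective_cache_size', 'work_mem',
    'maintenance_work_mem', 'effective_io_concurrency',
    'random_page_cost', 'default_statistics_target',
    'log_min_duration_statement', 'max_connections'
)
ORDER BY context, name;
```

Running it once after pg_reload_conf() and again after the maintenance-window restart gives a simple confirmation that each change has taken effect.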
4. Cache Hit Ratio Improvement
Current Status: 77.6% (Target: >95%)
Analysis:
Heap reads: 340,703
Heap hits: 1,183,172
Action needed: Increase buffer pool size
Solution:
-- Increase shared_buffers as shown above
-- Monitor with:
SELECT
sum(heap_blks_read) as heap_read,
sum(heap_blks_hit) as heap_hit,
sum(heap_blks_hit) / (sum(heap_blks_hit) + sum(heap_blks_read)) * 100
as cache_hit_ratio
FROM pg_statio_user_tables;
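The reported 77.6% follows directly from the two counters above, and a per-table breakdown can show which relations account for most of the misses. This is a sketch using only pg_statio_user_tables; the LIMIT of 10 is an arbitrary choice.

```sql
-- Reproduce the headline figure from the reported counters:
-- 1,183,172 hits out of 1,523,875 total heap block requests.
SELECT round(100.0 * 1183172 / (1183172 + 340703), 1) AS reported_hit_pct;  -- 77.6

-- Per-table breakdown, worst offenders by blocks read from disk first.
-- nullif() avoids a division-by-zero error on tables with no activity yet.
SELECT relname,
       heap_blks_read,
       heap_blks_hit,
       round(100.0 * heap_blks_hit
             / nullif(heap_blks_hit + heap_blks_read, 0), 1) AS hit_pct
FROM pg_statio_user_tables
ORDER BY heap_blks_read DESC
LIMIT 10;
```

If a handful of tables dominate heap_blks_read, the sequential scans discussed in section 2 are a likely contributor alongside the buffer size.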
5. Vacuum and Statistics Strategy
-- Enhanced autovacuum settings
ALTER SYSTEM SET autovacuum_vacuum_scale_factor = 0.1; -- More frequent vacuum
ALTER SYSTEM SET autovacuum_analyze_scale_factor = 0.05; -- More frequent analyze
ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 2000; -- Faster vacuum
ALTER SYSTEM SET autovacuum_max_workers = 4; -- More parallel workers (requires restart)
-- Manual maintenance for large tables (run weekly)
VACUUM ANALYZE emails;
VACUUM ANALYZE helper_invitations;
VACUUM ANALYZE notifications;
VACUUM ANALYZE visits;
VACUUM ANALYZE client_helper_distances;
VACUUM ANALYZE helper_ranking_events;
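Autovacuum starts on a table once its dead tuples exceed autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor × (estimated row count). With a scale factor of 0.1 and the default threshold of 50, a table of 532K rows such as notifications would qualify at roughly 53,250 dead tuples. The query below is a sketch of how that trigger point could be compared with the current dead-tuple count; it uses the global settings only and ignores any per-table storage parameters.

```sql
-- Compare dead tuples with the approximate autovacuum trigger point.
-- reltuples is the planner's row estimate; it is -1 for never-analyzed tables,
-- so greatest() clamps it to 0.
SELECT s.relname,
       s.n_live_tup,
       s.n_dead_tup,
       round(current_setting('autovacuum_vacuum_threshold')::numeric
             + current_setting('autovacuum_vacuum_scale_factor')::numeric
               * greatest(c.reltuples, 0)::numeric) AS vacuum_trigger_at,
       s.last_autovacuum,
       s.last_autoanalyze
FROM pg_stat_user_tables s
JOIN pg_class c ON c.oid = s.relid
ORDER BY s.n_dead_tup DESC
LIMIT 10;
```

Tables whose n_dead_tup regularly sits far above vacuum_trigger_at while last_autovacuum stays old suggest that the workers are not keeping up, which is the situation the weekly manual VACUUM ANALYZE is meant to cover.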
🟢 Long-term Strategic Improvements
6. Query Monitoring Setup
-- Step 1: add to postgresql.conf (requires restart):
-- shared_preload_libraries = 'pg_stat_statements'
-- pg_stat_statements.max = 10000
-- pg_stat_statements.track = all
-- Step 2: after the restart, enable pg_stat_statements for query analysis
-- (the view cannot be queried until the library has been preloaded)
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
-- Monitor slow queries:
SELECT
query,
calls,
total_exec_time,
mean_exec_time,
stddev_exec_time,
rows,
100.0 * shared_blks_hit / nullif(shared_blks_hit + shared_blks_read,
0) AS hit_percent
FROM pg_stat_statements
WHERE calls > 10
ORDER BY total_exec_time DESC
LIMIT 20;
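To judge where optimization effort pays off, it can help to see each statement's share of total execution time and to start a clean measurement window after a change. The following is a sketch that assumes the extension is loaded as described above; the 60-character truncation and the LIMIT are arbitrary.

```sql
-- Share of total execution time per statement.
-- total_exec_time is double precision, so it is cast to numeric for round().
SELECT left(query, 60) AS query_start,
       calls,
       round(total_exec_time::numeric, 1) AS total_ms,
       round((100.0 * total_exec_time
              / nullif(sum(total_exec_time) OVER (), 0))::numeric, 1) AS pct_of_total
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

-- Start a fresh measurement window, for example after deploying new indexes.
SELECT pg_stat_statements_reset();
```

Resetting before and after each phase of the roadmap makes the before/after comparison less dependent on workload that predates the change.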
7. Table Partitioning Strategy
Consider partitioning for largest, time-based tables:
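emails table (224MB)
-- Partition by month.
-- INCLUDING ALL would also copy the primary key, and a primary key on a
-- partitioned table must contain the partition column, so the key is redefined
-- here. The column name id is an assumption; adjust it to the real key.
-- Secondary indexes are not copied and must be recreated as needed.
CREATE TABLE emails_partitioned (
LIKE emails INCLUDING DEFAULTS INCLUDING CONSTRAINTS,
PRIMARY KEY (id, created_at)
) PARTITION BY RANGE (created_at);
-- Create monthly partitions
CREATE TABLE emails_2024_01 PARTITION OF emails_partitioned
FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
-- ... continue for each month
notifications table (66MB, 532K rows)
-- Partition by created_at (same primary key caveat; id is assumed)
CREATE TABLE notifications_partitioned (
LIKE notifications INCLUDING DEFAULTS INCLUDING CONSTRAINTS,
PRIMARY KEY (id, created_at)
) PARTITION BY RANGE (created_at);
helper_invitations table (103MB, 274K rows)
-- Partition by status (same primary key caveat; id is assumed)
CREATE TABLE helper_invitations_partitioned (
LIKE helper_invitations INCLUDING DEFAULTS INCLUDING CONSTRAINTS,
PRIMARY KEY (id, status)
) PARTITION BY LIST (status);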
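Writing one CREATE TABLE per month by hand is error-prone, so the monthly emails partitions could be generated in a loop instead. The block below is a sketch that assumes emails_partitioned already exists as defined above; the 2024 date range and the emails_YYYY_MM naming are illustrative choices, not requirements.

```sql
DO $$
DECLARE
    month_start date;
BEGIN
    -- One iteration per month in the assumed range 2024-01 .. 2024-12.
    FOR month_start IN
        SELECT generate_series(timestamp '2024-01-01',
                               timestamp '2024-12-01',
                               interval '1 month')::date
    LOOP
        -- %I quotes the partition name as an identifier,
        -- %L quotes the range bounds as literals.
        EXECUTE format(
            'CREATE TABLE IF NOT EXISTS %I PARTITION OF emails_partitioned
             FOR VALUES FROM (%L) TO (%L)',
            'emails_' || to_char(month_start, 'YYYY_MM'),
            month_start,
            (month_start + interval '1 month')::date
        );
    END LOOP;
END
$$;
```

Rows whose created_at falls outside every defined range are rejected on insert, so the range should cover all existing data before any rows are copied over, or a DEFAULT partition should be added to catch the remainder.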
8. Connection Pooling
Current: 200 max_connections (high overhead)
Recommendation: Implement PgBouncer
# pgbouncer.ini
[databases]
juhi_prod = host=localhost port=5432 dbname=juhi_prod_latest
[pgbouncer]
pool_mode = transaction
max_client_conn = 200
default_pool_size = 25
reserve_pool_size = 5
📊 Performance Monitoring Dashboard
Key Metrics to Track:
-- 1. Cache Hit Ratio (Target: >95%)
SELECT
sum(heap_blks_hit) / (sum(heap_blks_hit) + sum(heap_blks_read)) * 100
as cache_ratio
FROM pg_statio_user_tables;
-- 2. Index Usage (Target: idx_scan > seq_scan for large tables)
SELECT schemaname, relname, seq_scan, idx_scan, n_live_tup
FROM pg_stat_user_tables
WHERE n_live_tup > 1000
ORDER BY seq_scan DESC;
-- 3. Table Sizes
-- relid is used instead of schemaname||'.'||relname, which fails for
-- mixed-case or otherwise quoted table names
SELECT
schemaname, relname,
pg_size_pretty(pg_total_relation_size(relid)) as size,
n_live_tup as rows
FROM pg_stat_user_tables
ORDER BY pg_total_relation_size(relid) DESC;
-- 4. Unused Indexes
SELECT
schemaname, relname, indexrelname, idx_scan,
pg_size_pretty(pg_relation_size(indexrelid)) as size
FROM pg_stat_user_indexes
WHERE idx_scan = 0 AND pg_relation_size(indexrelid) > 1000000
ORDER BY pg_relation_size(indexrelid) DESC;
🚀 Implementation Roadmap
Phase 1: Quick Wins (Week 1)
Remove top 10 unused indexes (saves 35+ MB)
Add critical missing indexes for visits and documents tables
Update basic configuration settings (non-restart)
Expected Impact: 40-60% query performance improvement
Phase 2: Configuration Optimization (Week 2)
Schedule maintenance window for PostgreSQL restart
Apply memory and I/O configuration changes
Implement enhanced autovacuum settings
Expected Impact: 95%+ cache hit ratio, better memory utilization
Phase 3: Advanced Optimization (Week 3-4)
Add remaining missing indexes
Enable pg_stat_statements monitoring
Implement weekly vacuum schedule
Set up performance monitoring dashboard
Expected Impact: Complete elimination of performance bottlenecks
Phase 4: Long-term Strategy (Month 2)
Evaluate partitioning for largest tables
Implement connection pooling
Consider read replicas for reporting workloads
Expected Impact: Scalability for future growth
📈 Expected Performance Gains
| Metric | Current | Target | Improvement |
|---|---|---|---|
| Cache Hit Ratio | 77.6% | >95% | +17.4 percentage points or more |
| Query Response Time | Baseline | 50-80% lower | Major improvement |
| Disk Space | 750MB | ~700MB | 47+ MB saved |
| Sequential Scans | High | Minimal | 90%+ reduction |
| Index Efficiency | Poor | Excellent | Dramatic improvement |
⚠ Important Notes
1. Test First: Run all changes on a staging environment first
2. Backup: Ensure full database backup before major changes
3. Monitor: Track performance metrics before and after changes
4. Maintenance Windows: Some changes require downtime
5. Rollback Plan: Have rollback procedures ready for each change
🔧 Maintenance Scripts
Weekly Maintenance
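#!/bin/bash
# weekly_maintenance.sh
# Abort on the first failure so a broken run is not mistaken for a clean one.
set -euo pipefail
# ON_ERROR_STOP makes psql exit non-zero when a statement fails;
# the quoted 'EOF' keeps the shell from expanding anything inside the SQL.
psql -v ON_ERROR_STOP=1 -d juhi_prod_latest << 'EOF'
VACUUM ANALYZE emails;
VACUUM ANALYZE helper_invitations;
VACUUM ANALYZE notifications;
VACUUM ANALYZE visits;
VACUUM ANALYZE client_helper_distances;
VACUUM ANALYZE helper_ranking_events;
REINDEX INDEX CONCURRENTLY idx_visits_status_deleted_at;
EOF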
Performance Check
-- performance_check.sql
\echo 'Cache Hit Ratio:'
SELECT sum(heap_blks_hit) / (sum(heap_blks_hit) + sum(heap_blks_read)) *
100 as cache_ratio FROM pg_statio_user_tables;
\echo '\nLargest Tables:'
SELECT schemaname, relname,
pg_size_pretty(pg_total_relation_size(relid)) as size
FROM pg_stat_user_tables
ORDER BY pg_total_relation_size(relid) DESC LIMIT 10;
\echo '\nUnused Indexes:'
SELECT indexrelname, pg_size_pretty(pg_relation_size(indexrelid)) FROM
pg_stat_user_indexes WHERE idx_scan = 0 ORDER BY
pg_relation_size(indexrelid) DESC LIMIT 5;
Report generated on June 24, 2025
Next review recommended: July 24, 2025