UpstreamAds: Fixing Write Performance with WAL Optimization
About the Author
Rob Schoenaker
Managing Partner at UpstreamAds and Partner at Ludulicious B.V. with over 20 years of experience in software development, specializing in .NET Core, ServiceStack, C# and database design.
The Problem: Write Performance Killing User Experience
In 2022, UpstreamAds faced a critical write performance issue. Advertisers creating new campaigns were waiting 500ms for their ad creatives to save. For a write-heavy application, this was completely unacceptable.
The Challenge:
- Write-heavy workload with frequent ad creative updates
- Advertisers expecting instant save confirmations
- WAL configuration optimized for reliability, not performance
- No dedicated hardware for write operations
The Numbers:
```sql
-- This INSERT was taking 500+ milliseconds
INSERT INTO ad_creatives (title, description, status, created_at)
VALUES ('New Diving Equipment Campaign', 'Amazing diving gear for professionals', 'active', NOW());
-- Execution time: 500ms per insert
```
The Root Cause: Poor WAL Configuration
The problem was clear from our monitoring:
What was happening:
- Default WAL settings optimized for reliability, not performance
- No dedicated WAL disk
- Checkpoint settings causing I/O spikes
- WAL writes constantly falling behind during peak usage
The Solution: Optimized WAL Configuration
Step 1: Optimize WAL Settings
The first breakthrough came with optimized WAL configuration:
```ini
# Production WAL settings that solved our problem (postgresql.conf)
wal_level = replica
wal_buffers = 16MB                    # Pinned at 16MB instead of the small auto-tuned default
checkpoint_completion_target = 0.9    # Smoother checkpoints
checkpoint_timeout = 15min            # Less frequent checkpoints
max_wal_size = 4GB                    # More room before a forced checkpoint
min_wal_size = 1GB                    # Keep recycled WAL segments on hand for write bursts
wal_compression = on                  # Smaller WAL, less disk I/O
```
Why These Settings Work:
- wal_buffers = 16MB: WAL data is buffered in memory before being written to disk, reducing I/O operations
- checkpoint_completion_target = 0.9: Spreads checkpoint I/O over 90% of the checkpoint interval, preventing I/O spikes
- checkpoint_timeout = 15min: Longer intervals mean fewer checkpoints, reducing overall I/O overhead
- max_wal_size = 4GB: More WAL headroom lets PostgreSQL delay checkpoints when the system is busy
- wal_compression = on: Compresses full-page images in the WAL, reducing disk I/O and storage requirements
- wal_level = replica: Logs enough for physical replication without the extra detail logical decoding needs, balancing performance and functionality
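Before and after changing these, it's worth verifying what the running cluster actually uses, because several of the parameters only take effect after a reload or a restart. A minimal sketch, assuming superuser access (ALTER SYSTEM writes to postgresql.auto.conf):

```sql
-- Check the values currently in effect and which context (reload vs restart) applies
SELECT name, setting, unit, context
FROM pg_settings
WHERE name IN ('wal_level', 'wal_buffers', 'checkpoint_completion_target',
               'checkpoint_timeout', 'max_wal_size', 'min_wal_size', 'wal_compression');

-- Apply the reloadable ones without a restart
ALTER SYSTEM SET checkpoint_completion_target = 0.9;
ALTER SYSTEM SET checkpoint_timeout = '15min';
ALTER SYSTEM SET max_wal_size = '4GB';
SELECT pg_reload_conf();  -- wal_level and wal_buffers still require a restart
```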
Immediate Result: Write performance improved from 500ms to 200ms (2.5x improvement)
Step 2: Hardware Investment for WAL Performance
The configuration changes helped, but hardware optimization was crucial:
The Hardware Investment That Paid Off:
- Battery-backed RAID controller: Essential for WAL performance
- Separate WAL disk: Dedicated SSD for WAL files
- RAID 1 for WAL: Mirroring for reliability
- Fast SSD storage: NVMe SSDs for WAL operations
Why This Hardware Matters:
- Battery-backed RAID ensures data integrity during power failures
- Dedicated WAL disk eliminates I/O contention with data files
- RAID 1 provides redundancy without performance penalty
- NVMe SSDs provide low-latency, high-throughput storage
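Whether the WAL disk really is the bottleneck is measurable rather than a matter of opinion. A hedged sketch, assuming PostgreSQL 14 or newer with track_wal_io_timing enabled (on the newest releases some of these counters have moved to pg_stat_io):

```sql
-- Cumulative WAL activity since the last stats reset
SELECT wal_records,
       wal_bytes,
       wal_buffers_full,             -- times wal_buffers overflowed and forced an early write
       wal_write_time AS write_ms,   -- time spent writing WAL out of the buffers
       wal_sync_time  AS sync_ms     -- time spent in fsync; this is what fast WAL storage shrinks
FROM pg_stat_wal;
```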
Result: Write performance improved to 150ms (3.3x improvement)
Step 3: Connection Pooling Optimization
With better WAL performance, connection management became the bottleneck:
```ini
# Connection settings in postgresql.conf
max_connections = 200
shared_preload_libraries = 'pg_stat_statements'
track_activity_query_size = 2048
```
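With pg_stat_statements preloaded, you can also confirm which write statements actually dominate before tuning further. A minimal sketch, assuming the extension has been created in the database (column names are the PostgreSQL 13+ ones; older versions use total_time and mean_time):

```sql
-- Top write statements by total time spent
SELECT left(query, 60)                   AS query,
       calls,
       round(mean_exec_time::numeric, 1) AS mean_ms,
       round(total_exec_time::numeric)   AS total_ms
FROM pg_stat_statements
WHERE query ILIKE 'insert%' OR query ILIKE 'update%'
ORDER BY total_exec_time DESC
LIMIT 10;
```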
Using PgBouncer for Connection Pooling:
```ini
# pgbouncer.ini
[databases]
upstreamads = host=localhost port=5432 dbname=upstreamads pool_size=100

[pgbouncer]
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 25
```
Why This Works:
- pool_mode = transaction: Reuses server connections across transactions instead of holding one per client session
- pool_size = 100: Maintains up to 100 server connections to PostgreSQL for the upstreamads database
- max_client_conn = 1000: Handles up to 1000 client connections
- Eliminates connection overhead for frequent writes
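To check that the pool is sized sensibly under real load, PgBouncer's admin console exposes live counters. A quick sketch, assuming the usual admin console setup on port 6432:

```sql
-- Run against PgBouncer's admin console, not PostgreSQL itself,
-- e.g. psql -h localhost -p 6432 -U pgbouncer pgbouncer
SHOW POOLS;   -- active and waiting clients, plus server connections, per pool
SHOW STATS;   -- transaction/query throughput and average times per database
```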
Result: Write performance improved to 120ms (4.2x improvement)
The Game Changer: Write-Ahead Logging Monitoring
The Problem: WAL Performance Degradation
Even with optimization, WAL performance was inconsistent:
```sql
-- Problem: WAL performance varied based on system load
SELECT
    pg_current_wal_lsn(),
    pg_walfile_name(pg_current_wal_lsn()),
    pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), '0/0'));
```
The Solution: Automated WAL Monitoring
We implemented comprehensive WAL monitoring:
```sql
-- WAL performance monitoring view
CREATE VIEW wal_performance AS
SELECT
    pg_current_wal_lsn() as current_lsn,
    pg_walfile_name(pg_current_wal_lsn()) as current_wal_file,
    -- total WAL generated since cluster initialisation, not the on-disk pg_wal size
    pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), '0/0')) as wal_size,
    (SELECT count(*) FROM pg_stat_activity WHERE state = 'active') as active_connections,
    (SELECT count(*) FROM pg_stat_activity WHERE state = 'idle in transaction') as idle_transactions;

-- Automated WAL maintenance function
CREATE OR REPLACE FUNCTION auto_wal_maintenance()
RETURNS void AS $$
BEGIN
    -- Force a WAL segment switch when more than 2 GB of WAL has accumulated
    -- since the last checkpoint's redo point
    IF pg_wal_lsn_diff(pg_current_wal_lsn(),
                       (SELECT redo_lsn FROM pg_control_checkpoint())) > 2 * 1024 * 1024 * 1024 THEN
        PERFORM pg_switch_wal();
    END IF;
    -- Reset cumulative statistics so the monitoring numbers stay current
    PERFORM pg_stat_reset();
END;
$$ LANGUAGE plpgsql;

-- Schedule WAL maintenance every 5 minutes (requires the pg_cron extension)
SELECT cron.schedule('wal-maintenance', '*/5 * * * *', 'SELECT auto_wal_maintenance();');
```
Why This Works:
- Monitors WAL size and performance metrics
- Automatically switches WAL files when they get too large
- Resets statistics to maintain accurate monitoring
- Runs every five minutes, so problems are caught before they build up during peak traffic
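Checkpoint behaviour is worth watching alongside the view above, because checkpoints forced by WAL volume are exactly what causes the I/O spikes described earlier. A hedged sketch against pg_stat_bgwriter (on PostgreSQL 17+ these counters live in pg_stat_checkpointer instead):

```sql
-- If checkpoints_req dominates checkpoints_timed, max_wal_size is still too small
SELECT checkpoints_timed,
       checkpoints_req,
       checkpoint_write_time AS write_ms,
       checkpoint_sync_time  AS sync_ms
FROM pg_stat_bgwriter;
```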
Result: Consistent write performance of 100ms (5x improvement)
The Final Optimization: Write Batching Strategy
The Problem: Individual Write Operations
Even with optimized WAL, individual writes were still slow:
```csharp
// Problem: Individual writes for each ad creative
foreach (var creative in adCreatives)
{
    await _database.ExecuteAsync(
        "INSERT INTO ad_creatives (title, description, status) VALUES (@title, @description, @status)",
        creative
    );
    // Each insert takes 100ms
}
```
The Solution: Batch Write Operations
We implemented batch writing:
```csharp
// Solution: Batch writes for multiple ad creatives using PostgreSQL COPY (Npgsql)
public async Task<int> BatchInsertAdCreatives(List<AdCreative> creatives)
{
    using var connection = _connectionPool.GetConnection();
    using var transaction = connection.BeginTransaction();

    try
    {
        // Use COPY for bulk inserts; the writer must be completed and disposed
        // before the transaction is committed
        using (var writer = connection.BeginBinaryImport(
            "COPY ad_creatives (title, description, status, created_at) FROM STDIN (FORMAT BINARY)"))
        {
            foreach (var creative in creatives)
            {
                writer.StartRow();
                writer.Write(creative.Title);
                writer.Write(creative.Description);
                writer.Write(creative.Status);
                writer.Write(DateTime.UtcNow);
            }

            // Without Complete(), the COPY is rolled back when the writer is disposed
            writer.Complete();
        }

        transaction.Commit();
        return creatives.Count;
    }
    catch
    {
        transaction.Rollback();
        throw;
    }
}
```
Why This Works:
- The COPY command is much faster than individual INSERTs
- Binary format reduces data transfer overhead
- Single transaction reduces WAL overhead
- Batch processing reduces connection overhead
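If COPY is awkward to reach from your data access layer, a single multi-row INSERT is a reasonable middle ground. This is an illustration of the same batching idea rather than what we run in production; the table is the one from the examples above and the second and third rows are made up:

```sql
-- One statement, one transaction, one WAL flush for all three rows
INSERT INTO ad_creatives (title, description, status, created_at)
VALUES
    ('New Diving Equipment Campaign', 'Amazing diving gear for professionals', 'active', NOW()),
    ('Winter Wetsuit Promotion', 'Hypothetical second creative', 'active', NOW()),
    ('Dive Computer Clearance', 'Hypothetical third creative', 'draft', NOW());
```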
Result: Batch writes improved to 20ms per creative (25x improvement for batches)
Performance Results Summary
| Optimization Step | Write Time | Improvement |
|---|---|---|
| Original (Default WAL) | 500ms | Baseline |
| Optimized WAL Settings | 200ms | 2.5x faster |
| Hardware Optimization | 150ms | 3.3x faster |
| Connection Pooling | 120ms | 4.2x faster |
| WAL Monitoring | 100ms | 5x faster |
| Batch Writing | 20ms | 25x faster |
Key Lessons Learned
1. WAL Configuration Is Critical for Write Performance
- Default settings are optimized for reliability, not performance
- Proper WAL configuration can dramatically improve write performance
- Balance reliability with performance based on your needs
2. Hardware Investment Pays Off
- Battery-backed RAID controllers are essential for WAL performance
- Dedicated WAL disks eliminate I/O contention
- Fast SSDs provide the low latency needed for write operations
3. Connection Pooling Reduces Overhead
- PgBouncer eliminates connection overhead for frequent writes
- Transaction-level pooling is ideal for write-heavy workloads
- Proper connection limits prevent resource exhaustion
4. Monitoring Prevents Performance Degradation
- WAL performance can degrade over time
- Automated monitoring and maintenance prevent issues
- Regular statistics updates maintain accurate monitoring
5. Batch Operations Scale Better
- Individual writes don't scale well
- Batch operations reduce WAL overhead
- COPY command is much faster than individual INSERTs
Implementation Checklist
If you're facing similar write performance issues:
- Optimize WAL settings: Configure for your workload
- Invest in hardware: Battery-backed RAID, dedicated WAL disk
- Implement connection pooling: Use PgBouncer for write-heavy workloads
- Add WAL monitoring: Track performance and automate maintenance
- Consider batch operations: Use COPY for bulk inserts
- Monitor write performance: Track response times and resource usage
- Test under load: Ensure performance holds during peak usage
Summary
Optimizing write performance in PostgreSQL requires a comprehensive approach. By combining optimized WAL configuration, hardware investment, connection pooling, automated monitoring, and batch operations, we achieved a 25x performance improvement for UpstreamAds write operations.
The key was understanding that write performance isn't just about database configuration—it's about creating a complete system optimized for write-heavy workloads, from hardware to application code.
If this article helped you understand write performance optimization, we can help you implement these techniques in your own applications. At Ludulicious, we specialize in:
- Write Performance Optimization: WAL tuning and hardware optimization
- Database Performance Optimization: From slow queries to indexing strategies
- Custom Development: Tailored solutions for your specific use case
Ready to optimize your write performance?
Contact us for a free consultation, or check out our other optimization guides:
- PostgreSQL Performance Tuning: Strategic Lessons from Production
- Duikersgids: How I Made Spatial Search 55x Faster
- Rijmwoordenboek: Solving the 3-Second Phonetic Search Problem
- Rijmwoordenboek: Serving Pages Under 15ms with Better Caching
- UpstreamAds: From 1.2s to 35ms Full-Text Search
- PostgreSQL Configuration: The Settings That Matter
This optimization case study is based on real production experience with UpstreamAds. All performance numbers are from actual production systems.