Queue Architecture Patterns for Reliable Background Processing

How to design queue architectures that handle failures gracefully, scale predictably, and keep your application responsive.

Queues Are Infrastructure, Not an Afterthought

Most teams add queues when something is "too slow for a web request." This reactive approach leads to fragile queue setups that fail unpredictably under load. Treating your queue architecture as infrastructure from the start prevents a category of problems that are painful to fix later.

Choosing a Queue Backend

Redis

The most common choice for Laravel applications. Fast, simple, and works well for moderate volumes.

Best for: Applications processing up to ~10,000 jobs per minute. Most web applications fall comfortably within this range.

Watch out for: Redis stores everything in memory. If your queue backs up with millions of jobs, you will run out of memory. Redis also does not guarantee message persistence through restarts unless AOF persistence is configured.
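If you need queued jobs to survive a Redis restart, AOF persistence is a redis.conf change; a minimal sketch (the values are illustrative, not recommendations for your workload):

```conf
# redis.conf — enable append-only persistence so queued jobs
# survive a restart; fsync once per second is a common middle ground
appendonly yes
appendfsync everysec

# Optional: cap memory and refuse writes rather than silently
# evicting queue data when the instance fills up
maxmemory 2gb
maxmemory-policy noeviction
```

With `noeviction`, a backed-up queue causes writes to fail loudly instead of Redis evicting keys, which surfaces the problem instead of losing jobs.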

Amazon SQS / Google Cloud Pub/Sub

Managed queue services that handle scaling, persistence, and availability automatically.

Best for: Applications where operational simplicity matters more than latency. SQS handles millions of messages without you managing infrastructure.

Watch out for: Higher latency than Redis (SQS long polling adds up to 20 seconds). FIFO ordering requires specific configuration. Message size limits (256 KB for SQS) may require storing payloads elsewhere.
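In Laravel, switching to SQS is mostly configuration; a sketch of the connection entry in config/queue.php (the queue URL, account ID, and region are placeholders):

```php
// config/queue.php — 'sqs' connection (values are placeholders)
'sqs' => [
    'driver' => 'sqs',
    'key' => env('AWS_ACCESS_KEY_ID'),
    'secret' => env('AWS_SECRET_ACCESS_KEY'),
    'prefix' => env('SQS_PREFIX', 'https://sqs.us-east-1.amazonaws.com/123456789012'),
    'queue' => env('SQS_QUEUE', 'default'),
    'region' => env('AWS_DEFAULT_REGION', 'us-east-1'),
],
```

Because application code dispatches jobs through the same queue abstraction, backends can usually be swapped without touching job classes.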

RabbitMQ

A full-featured message broker with routing, exchanges, and advanced delivery guarantees.

Best for: Complex messaging patterns (fan-out, topic-based routing, priority queues) and systems requiring strong delivery guarantees.

Watch out for: Operational complexity. RabbitMQ requires monitoring, cluster management, and capacity planning. Overkill for simple job queues.

Job Design Patterns

Idempotent Jobs

Every job should be safe to run multiple times with the same input. Queues guarantee at-least-once delivery, not exactly-once. Network failures, worker crashes, and timeouts can all cause a job to be retried.

class ProcessPaymentJob implements ShouldQueue
{
    public function __construct(
        private Invoice $invoice,
        private string $idempotencyKey,
    ) {}

    public function handle(): void
    {
        // Check if already processed before doing anything
        if ($this->invoice->isPaid()) {
            return;
        }

        // Use a database transaction with a unique constraint
        // on idempotency_key to prevent double processing
        DB::transaction(function () {
            $payment = Payment::create([
                'invoice_id' => $this->invoice->id,
                'idempotency_key' => $this->idempotencyKey,
                'amount' => $this->invoice->total,
            ]);

            $this->invoice->markPaid($payment);
        });
    }
}
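The transaction only prevents double processing if the database actually enforces uniqueness on the idempotency key. A minimal migration sketch (table and column names assume the example above):

```php
// Migration: enforce one payment per idempotency key at the database level
Schema::create('payments', function (Blueprint $table) {
    $table->id();
    $table->foreignId('invoice_id')->constrained();
    $table->string('idempotency_key')->unique(); // a second insert with the same key throws
    $table->unsignedBigInteger('amount');
    $table->timestamps();
});
```

If a retried job races its first run, the unique index turns the second insert into an exception instead of a duplicate payment.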

Small, Focused Jobs

A job that sends an email, updates a database, calls an external API, and generates a PDF is a job that fails in four different ways. Break it into four jobs:

// Instead of one monolithic job:
class ProcessOrderJob { /* does everything */ }

// Use a chain or event-driven approach:
class CreateOrderRecord implements ShouldQueue { ... }
class ChargePaymentMethod implements ShouldQueue { ... }
class SendOrderConfirmation implements ShouldQueue { ... }
class GenerateInvoicePdf implements ShouldQueue { ... }

Each job can be retried independently. If PDF generation fails, the payment does not retry.
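With Laravel's job chaining, the split looks roughly like this (each job runs only after the previous one succeeds; the constructor arguments are assumptions):

```php
use Illuminate\Support\Facades\Bus;

Bus::chain([
    new CreateOrderRecord($order),
    new ChargePaymentMethod($order),
    new SendOrderConfirmation($order),
    new GenerateInvoicePdf($order),
])->catch(function (Throwable $e) use ($order) {
    // A failed job stops the chain; later jobs never run
    Log::warning('Order pipeline halted', [
        'order' => $order->id,
        'error' => $e->getMessage(),
    ]);
})->dispatch();
```

A chain enforces ordering; if the steps are independent, dispatching them individually (or from events) avoids the ordering constraint entirely.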

Job Batching

When processing a collection of related items, use Laravel's job batching:

$batch = Bus::batch([
    new ProcessInvoiceLine($line1),
    new ProcessInvoiceLine($line2),
    new ProcessInvoiceLine($line3),
])->then(function (Batch $batch) use ($invoiceId) {
    // All jobs completed successfully
    FinalizeInvoice::dispatch($invoiceId);
})->catch(function (Batch $batch, Throwable $e) use ($invoiceId) {
    // First failure in the batch
    NotifyAdminOfFailure::dispatch($invoiceId, $e->getMessage());
})->dispatch();

Batches give you completion callbacks, failure handling, and progress tracking for free.
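Progress can also be polled later by batch ID, for example to drive a progress bar; a brief sketch using Bus::findBatch (storing `$batchId` from the dispatched batch is an assumption):

```php
use Illuminate\Support\Facades\Bus;

$batch = Bus::findBatch($batchId);

if ($batch !== null) {
    $total     = $batch->totalJobs;       // jobs in the batch
    $processed = $batch->processedJobs(); // completed so far
    $percent   = $batch->progress();      // 0–100
    $done      = $batch->finished();      // all jobs completed?
}
```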

Failure Handling

Retry Strategy

Not all failures are equal. Differentiate between transient and permanent failures:

class CallExternalApiJob implements ShouldQueue
{
    public int $tries = 5;

    public function backoff(): array
    {
        // Exponential backoff: 10s, 30s, 90s, 270s, 810s
        return [10, 30, 90, 270, 810];
    }

    public function handle(): void
    {
        $response = Http::post('https://api.partner.com/sync', $this->data);

        if ($response->status() === 429) {
            // Rate limited: retry after the delay the API specifies
            $delay = (int) ($response->header('Retry-After') ?: 60);
            $this->release($delay);
            return;
        }

        if ($response->status() === 400) {
            // Bad request: our data is wrong, retrying will not help
            $this->fail(new InvalidPayloadException($response->body()));
            return;
        }

        $response->throw(); // Other errors: let the retry mechanism handle it
    }
}

Dead Letter Queues

Jobs that exhaust all retries should go to a dead letter queue for investigation, not disappear silently:

class CallExternalApiJob implements ShouldQueue
{
    public function failed(Throwable $exception): void
    {
        // Log for monitoring
        Log::error('External API sync failed permanently', [
            'job' => static::class,
            'data' => $this->data,
            'exception' => $exception->getMessage(),
        ]);

        // Store for manual review and replay
        FailedJobRecord::create([
            'job_class' => static::class,
            'payload' => serialize($this->data),
            'exception' => $exception->getMessage(),
            'failed_at' => now(),
        ]);
    }
}
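A dead letter table is only useful if jobs can be replayed from it. A hypothetical artisan command sketch (FailedJobRecord and its columns follow the example above; the command name is an assumption):

```php
// app/Console/Commands/ReplayFailedJobs.php — hypothetical replay command
class ReplayFailedJobs extends Command
{
    protected $signature = 'queue:replay-failed {--limit=50}';

    public function handle(): void
    {
        FailedJobRecord::oldest('failed_at')
            ->limit((int) $this->option('limit'))
            ->each(function (FailedJobRecord $record) {
                $jobClass = $record->job_class;
                $jobClass::dispatch(unserialize($record->payload));
                $record->delete(); // remove once re-dispatched
            });
    }
}
```

Replaying through a command keeps a human in the loop: someone fixes the underlying cause first, then re-dispatches deliberately.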

Queue Priority and Segregation

Separate Queues for Different Priorities

// Critical: payment processing, user-facing notifications
class ProcessPayment implements ShouldQueue
{
    // Untyped to match $queue on the Queueable trait
    public $queue = 'critical';
}

// Default: most application jobs
class SyncInventory implements ShouldQueue
{
    public $queue = 'default';
}

// Low priority: analytics, reporting, cleanup
class GenerateMonthlyReport implements ShouldQueue
{
    public $queue = 'low';
}

Run workers with priority ordering:

php artisan queue:work --queue=critical,default,low

The worker processes all critical jobs before touching default jobs, and all default jobs before low-priority ones.

Dedicated Workers for Isolation

For jobs that are resource-intensive or interact with unreliable external services, run dedicated workers:

# Worker for payment processing (dedicated resources, strict monitoring)
php artisan queue:work --queue=payments --max-jobs=100

# Worker for email sending (isolated from payment processing)
php artisan queue:work --queue=emails --max-jobs=500

# Worker for everything else
php artisan queue:work --queue=default,low --max-jobs=1000

If the email provider has an outage, the email queue backs up but payment processing continues unaffected.
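Under Supervisor, each dedicated worker pool is its own program block; an illustrative fragment (paths, process counts, and program names are placeholders):

```conf
; /etc/supervisor/conf.d/queue-workers.conf (paths are placeholders)
[program:payments-worker]
command=php /var/www/app/artisan queue:work --queue=payments --max-jobs=100
numprocs=2
autostart=true
autorestart=true
process_name=%(program_name)s_%(process_num)02d

[program:emails-worker]
command=php /var/www/app/artisan queue:work --queue=emails --max-jobs=500
numprocs=4
autostart=true
autorestart=true
process_name=%(program_name)s_%(process_num)02d
```

`autorestart=true` also covers the intended exit after `--max-jobs`: the worker shuts down cleanly and Supervisor starts a fresh process, which keeps long-running memory leaks in check.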

Monitoring

Queue monitoring is not optional in production:

  • Queue depth: How many jobs are waiting? Rising queue depth indicates workers cannot keep up.
  • Processing time: How long does each job type take? Sudden increases indicate performance regression.
  • Failure rate: What percentage of jobs fail? Spikes indicate bugs or external service issues.
  • Worker health: Are workers running? Have they crashed? Use process managers (Supervisor, systemd) with automatic restart.

Set alerts on queue depth thresholds. If the critical queue exceeds 100 pending jobs, someone should know immediately.
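A scheduled check against Queue::size is one low-effort way to wire that alert (the threshold and the notification mechanism are assumptions):

```php
use Illuminate\Support\Facades\Queue;

// Run from the scheduler every minute: alert when the critical queue backs up
$depth = Queue::size('critical');

if ($depth > 100) {
    Log::critical('Critical queue depth exceeded threshold', ['depth' => $depth]);
    // e.g. Notification::route('slack', config('alerts.webhook'))
    //     ->notify(new QueueBackedUp($depth));
}
```

Dedicated tools (Laravel Horizon for Redis, CloudWatch for SQS) give you these metrics with less plumbing, but a scheduled check works with any backend.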
