How to design queue architectures that handle failures gracefully, scale predictably, and keep your application responsive.
Most teams add queues when something is "too slow for a web request." This reactive approach leads to fragile queue setups that fail unpredictably under load. Treating your queue architecture as infrastructure from the start prevents a category of problems that are painful to fix later.
Redis is the most common choice for Laravel applications. It is fast, simple to run, and works well for moderate volumes.
Best for: Applications processing up to ~10,000 jobs per minute. Most web applications fall comfortably within this range.
Watch out for: Redis stores everything in memory. If your queue backs up with millions of jobs, you can run out of memory. Redis also does not guarantee that queued jobs survive a restart unless persistence is configured (RDB snapshots, or AOF for stronger durability).
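For orientation, the fragment below sketches the Redis entry in a Laravel `config/queue.php`. The keys follow the stock configuration file; the values (90 and 5 seconds) are assumptions for illustration, not recommendations.

```php
<?php
// config/queue.php (fragment). Values are illustrative assumptions.
return [
    'default' => env('QUEUE_CONNECTION', 'redis'),

    'connections' => [
        'redis' => [
            'driver' => 'redis',
            'connection' => 'default', // Redis connection name from config/database.php
            'queue' => env('REDIS_QUEUE', 'default'),
            'retry_after' => 90, // seconds before a reserved job is assumed lost and released again
            'block_for' => 5,    // seconds a worker blocks waiting for a job before polling again
        ],
    ],
];
```

Keep `retry_after` longer than your slowest job's timeout. Otherwise a job that is still running can be handed to a second worker.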
Managed queue services such as Amazon SQS handle scaling, persistence, and availability automatically.
Best for: Applications where operational simplicity matters more than latency. SQS handles millions of messages without you managing infrastructure.
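Watch out for: Higher latency than Redis. Each SQS operation is a network round trip to AWS, and long polling can hold a receive request open for up to 20 seconds when the queue is empty. FIFO ordering requires a FIFO queue and message group IDs. Message size limits (256 KB for SQS for most of its history; check the current quota) may require storing large payloads elsewhere and queueing a reference.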
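One way to handle oversized payloads is to store the body elsewhere and queue only a reference. The sketch below is framework-free and uses an in-memory `PayloadStore` as a hypothetical stand-in for an object store such as S3. The function names and the size limit parameter are assumptions for illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory stand-in for an object store such as S3.
final class PayloadStore
{
    /** @var array<string, string> */
    private array $objects = [];

    public function put(string $body): string
    {
        $key = 'payloads/' . hash('sha256', $body);
        $this->objects[$key] = $body;

        return $key;
    }

    public function get(string $key): string
    {
        return $this->objects[$key];
    }
}

// Build the message that actually goes on the queue.
// Small payloads travel inline; large ones are replaced by a storage key.
function buildMessage(array $payload, PayloadStore $store, int $maxBytes): array
{
    $body = json_encode($payload, JSON_THROW_ON_ERROR);

    if (strlen($body) <= $maxBytes) {
        return ['inline' => $body];
    }

    return ['ref' => $store->put($body)];
}

// Reverse step on the worker side: resolve the reference if there is one.
function readMessage(array $message, PayloadStore $store): array
{
    $body = $message['inline'] ?? $store->get($message['ref']);

    return json_decode($body, true, 512, JSON_THROW_ON_ERROR);
}
```

In practice, set `$maxBytes` somewhat below the broker's limit, because the queue envelope (job class, attempts, metadata) also counts toward the message size.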
RabbitMQ is a full-featured message broker with routing, exchanges, and advanced delivery guarantees.
Best for: Complex messaging patterns (fan-out, topic-based routing, priority queues) and systems requiring strong delivery guarantees.
Watch out for: Operational complexity. RabbitMQ requires monitoring, cluster management, and capacity planning. Overkill for simple job queues.
Every job should be safe to run multiple times with the same input. Queue systems typically guarantee at-least-once delivery, not exactly-once. Network failures, worker crashes, and timeouts can all cause a job to run more than once.
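use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Support\Facades\DB;

class ProcessPaymentJob implements ShouldQueue
{
    // The invoice and idempotency key are supplied at dispatch time
    public function __construct(
        public Invoice $invoice,
        public string $idempotencyKey,
    ) {}

    public function handle(): void
    {
        // Check if already processed before doing anything
        if ($this->invoice->isPaid()) {
            return;
        }

        // Use a database transaction with a unique constraint
        // on the idempotency_key column to prevent double processing
        DB::transaction(function () {
            $payment = Payment::create([
                'invoice_id' => $this->invoice->id,
                'idempotency_key' => $this->idempotencyKey,
                'amount' => $this->invoice->total,
            ]);

            $this->invoice->markPaid($payment);
        });
    }
}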
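To see the idempotency guarantee in isolation, here is a minimal framework-free sketch. `PaymentLedger` and `handlePayment` are hypothetical names, and the in-memory array stands in for a database table with a unique index on the idempotency key.

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory ledger; in a real system the uniqueness check
// would be a unique index on the idempotency key column.
final class PaymentLedger
{
    /** @var array<string, int> idempotency key => amount charged (cents) */
    private array $payments = [];

    // Returns true if the payment was recorded, false if the key was already used.
    public function recordOnce(string $idempotencyKey, int $amountCents): bool
    {
        if (array_key_exists($idempotencyKey, $this->payments)) {
            return false;
        }

        $this->payments[$idempotencyKey] = $amountCents;

        return true;
    }

    public function totalCharged(): int
    {
        return array_sum($this->payments);
    }
}

// Simulates the job handler being invoked for one delivery of the job.
function handlePayment(PaymentLedger $ledger, string $key, int $amountCents): string
{
    return $ledger->recordOnce($key, $amountCents) ? 'charged' : 'skipped';
}

$ledger = new PaymentLedger();
$first = handlePayment($ledger, 'invoice-42', 1999);
$second = handlePayment($ledger, 'invoice-42', 1999); // duplicate delivery of the same job
```

The key should be derived from the business operation (here, the invoice) rather than generated fresh on each dispatch. A random key per dispatch would let two dispatches for the same invoice both succeed.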
A job that sends an email, updates a database, calls an external API, and generates a PDF is a job that fails in four different ways. Break it into four jobs:
// Instead of one monolithic job:
class ProcessOrderJob { /* does everything */ }
// Use a chain or event-driven approach:
class CreateOrderRecord implements ShouldQueue { ... }
class ChargePaymentMethod implements ShouldQueue { ... }
class SendOrderConfirmation implements ShouldQueue { ... }
class GenerateInvoicePdf implements ShouldQueue { ... }
Each job can be retried independently. If PDF generation fails, the payment does not retry.
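The benefit of independent retries can be simulated without a queue. In this sketch, `runSteps` is a hypothetical helper that runs named steps in order and remembers which ones completed. In Laravel, each step would be its own queued job. The PDF step is rigged to fail once.

```php
<?php
declare(strict_types=1);

// Runs named steps in order, skipping any that already completed.
// Stops at the first failure and returns the list of completed step names.
function runSteps(array $steps, array $completed): array
{
    foreach ($steps as $name => $step) {
        if (in_array($name, $completed, true)) {
            continue; // already done on an earlier attempt
        }

        try {
            $step();
        } catch (RuntimeException $e) {
            return $completed; // leave the remaining steps for a later retry
        }

        $completed[] = $name;
    }

    return $completed;
}

$charges = 0;
$pdfAttempts = 0;

$steps = [
    'create_order' => function (): void {},
    'charge_payment' => function () use (&$charges): void {
        $charges++;
    },
    'send_confirmation' => function (): void {},
    'generate_pdf' => function () use (&$pdfAttempts): void {
        // Fails on the first attempt to simulate a transient error.
        if (++$pdfAttempts === 1) {
            throw new RuntimeException('PDF renderer unavailable');
        }
    },
];

$afterFirstRun = runSteps($steps, []);
$afterRetry = runSteps($steps, $afterFirstRun);
```

After the retry, the payment step has run exactly once even though PDF generation ran twice. A single monolithic job would have charged again on retry unless it was also idempotent.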
When processing a collection of related items, use Laravel's job batching:
use Illuminate\Bus\Batch;
use Illuminate\Support\Facades\Bus;

$batch = Bus::batch([
    new ProcessInvoiceLine($line1),
    new ProcessInvoiceLine($line2),
    new ProcessInvoiceLine($line3),
])->then(function (Batch $batch) use ($invoiceId) {
    // All jobs completed successfully
    FinalizeInvoice::dispatch($invoiceId);
})->catch(function (Batch $batch, Throwable $e) use ($invoiceId) {
    // First failure in the batch
    NotifyAdminOfFailure::dispatch($invoiceId, $e->getMessage());
})->dispatch();
Batches give you completion callbacks, failure handling, and progress tracking for free.
Not all failures are equal. Differentiate between transient and permanent failures:
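use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Support\Facades\Http;

class CallExternalApiJob implements ShouldQueue
{
    use InteractsWithQueue; // provides release() and fail()

    public int $tries = 5;

    public function backoff(): array
    {
        // Exponential backoff: 10s, 30s, 90s, 270s, 810s
        // With $tries = 5 there are at most four retries, so the last value is spare
        return [10, 30, 90, 270, 810];
    }

    public function handle(): void
    {
        $response = Http::post('https://api.partner.com/sync', $this->data);

        if ($response->status() === 429) {
            // Rate limited: retry after the specified delay (default 60s
            // when the header is missing or not a number of seconds)
            $retryAfter = $response->header('Retry-After');
            $this->release(is_numeric($retryAfter) ? (int) $retryAfter : 60);
            return;
        }

        if ($response->status() === 400) {
            // Bad request: our data is wrong, retrying will not help
            $this->fail(new InvalidPayloadException($response->body()));
            return;
        }

        $response->throw(); // Other errors: let the retry mechanism handle it
    }
}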
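The decision logic in that job can be pulled into pure functions and checked on its own. The sketch below is framework-free. The function names and the action labels (`done`, `release`, `fail`, `retry`) are assumptions for illustration, and the status mapping mirrors the job above rather than any general rule.

```php
<?php
declare(strict_types=1);

// Delays in seconds: base, base * factor, base * factor^2, ...
function backoffSchedule(int $baseSeconds, int $factor, int $count): array
{
    $delays = [];
    for ($i = 0; $i < $count; $i++) {
        $delays[] = $baseSeconds * ($factor ** $i);
    }

    return $delays;
}

// Maps an HTTP status to an action label, mirroring the job above:
// 429 waits and tries again, 400 is permanent, other non-2xx use normal backoff.
function classifyStatus(int $status): string
{
    if ($status >= 200 && $status < 300) {
        return 'done';
    }
    if ($status === 429) {
        return 'release';
    }
    if ($status === 400) {
        return 'fail';
    }

    return 'retry';
}

// Parses a Retry-After header given in seconds; falls back to a default
// when the header is empty or uses the HTTP-date form.
function retryAfterSeconds(string $header, int $default): int
{
    return preg_match('/^\d+$/', $header) === 1 ? (int) $header : $default;
}
```

For example, `backoffSchedule(10, 3, 5)` reproduces the 10, 30, 90, 270, 810 second schedule used in the job. You may want to treat more 4xx codes as permanent, depending on the API you call.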
Jobs that exhaust all retries should go to a dead letter queue for investigation, not disappear silently:
class CallExternalApiJob implements ShouldQueue
{
    public function failed(Throwable $exception): void
    {
        // Log for monitoring
        Log::error('External API sync failed permanently', [
            'job' => static::class,
            'data' => $this->data,
            'exception' => $exception->getMessage(),
        ]);

        // Store for manual review and replay
        FailedJobRecord::create([
            'job_class' => static::class,
            'payload' => serialize($this->data),
            'exception' => $exception->getMessage(),
            'failed_at' => now(),
        ]);
    }
}
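Storing failures is only half of a dead letter queue; the other half is replaying them once the cause is fixed. The sketch below is framework-free. `DeadLetterStore` is a hypothetical in-memory stand-in for the `FailedJobRecord` table, and the `$dispatch` callback stands in for whatever puts the job back on a queue.

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory dead letter store.
final class DeadLetterStore
{
    /** @var list<array{job_class: string, payload: string, exception: string}> */
    private array $records = [];

    public function add(string $jobClass, array $data, string $exception): void
    {
        $this->records[] = [
            'job_class' => $jobClass,
            'payload' => serialize($data),
            'exception' => $exception,
        ];
    }

    // Hands every stored payload to $dispatch and removes the ones it accepts.
    // Returns the number of records replayed.
    public function replay(callable $dispatch): int
    {
        $replayed = 0;
        foreach ($this->records as $index => $record) {
            // Refuse to instantiate objects from stored payloads.
            $data = unserialize($record['payload'], ['allowed_classes' => false]);
            if ($dispatch($record['job_class'], $data) === true) {
                unset($this->records[$index]);
                $replayed++;
            }
        }
        $this->records = array_values($this->records);

        return $replayed;
    }

    public function count(): int
    {
        return count($this->records);
    }
}

$store = new DeadLetterStore();
$store->add('CallExternalApiJob', ['sku' => 'A-1'], 'HTTP 503');
$store->add('CallExternalApiJob', ['sku' => 'B-2'], 'HTTP 503');

$redispatched = [];
$replayed = $store->replay(function (string $class, array $data) use (&$redispatched): bool {
    $redispatched[] = $data['sku'];

    return true;
});
```

Laravel also records exhausted jobs in its own failed jobs table, which you can inspect with `php artisan queue:failed` and replay with `php artisan queue:retry`. A custom store like this is mainly useful when you want application-specific review tooling.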
// Note: if a job uses Laravel's Queueable trait, which already declares $queue,
// call $this->onQueue('critical') in the constructor instead of redeclaring the property.

// Critical: payment processing, user-facing notifications
class ProcessPayment implements ShouldQueue
{
    public string $queue = 'critical';
}

// Default: most application jobs
class SyncInventory implements ShouldQueue
{
    public string $queue = 'default';
}

// Low priority: analytics, reporting, cleanup
class GenerateMonthlyReport implements ShouldQueue
{
    public string $queue = 'low';
}
Run workers with priority ordering:
php artisan queue:work --queue=critical,default,low
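Each time the worker is ready for a job, it checks the queues in the order given. It takes a default job only when critical is empty, and a low job only when both critical and default are empty. Under sustained load on the higher-priority queues, low-priority jobs can wait indefinitely.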
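The ordering behaviour is easy to model. This framework-free sketch uses plain arrays as queues; `nextJob` is a hypothetical helper that picks from the first non-empty queue in priority order, which is the selection rule described above.

```php
<?php
declare(strict_types=1);

// Takes the next job from the first non-empty queue in $order,
// or returns null if all are empty. $queues is modified in place.
function nextJob(array &$queues, array $order): ?string
{
    foreach ($order as $name) {
        if (!empty($queues[$name])) {
            return array_shift($queues[$name]);
        }
    }

    return null;
}

$order = ['critical', 'default', 'low'];
$queues = [
    'low' => ['report'],
    'default' => ['sync-1', 'sync-2'],
    'critical' => ['payment'],
];

$processed = [];
$processed[] = nextJob($queues, $order);

// A critical job arriving mid-run jumps ahead of waiting default and low jobs.
$queues['critical'][] = 'refund';

while (($job = nextJob($queues, $order)) !== null) {
    $processed[] = $job;
}
```

The `report` job runs last here, and it would never run if critical or default jobs kept arriving faster than they were processed. That is one reason to give low-priority work its own worker.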
For jobs that are resource-intensive or interact with unreliable external services, run dedicated workers:
# Worker for payment processing (dedicated resources, strict monitoring)
php artisan queue:work --queue=payments --max-jobs=100
# Worker for email sending (isolated from payment processing)
php artisan queue:work --queue=emails --max-jobs=500
# Worker for everything else
php artisan queue:work --queue=default,low --max-jobs=1000
If the email provider has an outage, the email queue backs up but payment processing continues unaffected.
Queue monitoring is not optional in production.
Set alerts on queue depth thresholds. If the critical queue exceeds 100 pending jobs, someone should know immediately.
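A depth check can be as simple as comparing pending counts against per-queue limits. The sketch below is framework-free. `depthAlerts` is a hypothetical helper, and every threshold other than the 100 for `critical` is an assumption. In Laravel, the depths could come from `Queue::size($name)`.

```php
<?php
declare(strict_types=1);

// Returns one alert message per queue whose depth exceeds its threshold.
// $depths: queue name => pending job count.
// $thresholds: queue name => maximum acceptable pending jobs.
function depthAlerts(array $depths, array $thresholds): array
{
    $alerts = [];
    foreach ($thresholds as $queue => $limit) {
        $depth = $depths[$queue] ?? 0;
        if ($depth > $limit) {
            $alerts[] = sprintf('%s queue has %d pending jobs (threshold %d)', $queue, $depth, $limit);
        }
    }

    return $alerts;
}

$alerts = depthAlerts(
    ['critical' => 140, 'default' => 900, 'low' => 12000],
    ['critical' => 100, 'default' => 1000, 'low' => 50000],
);
```

Running a check like this on a schedule and routing the messages to your paging or chat tool gives you a basic early warning before users notice delays.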