How to correctly order messages in distributed systems

I've debugged a lot of weird bugs in distributed systems, but few are as disorienting as this one: the invoice was issued before the customer clicked "order." In a monolith, a single database transaction would have prevented it. Once you move to an event-driven architecture, ordering is something you have to actively design for - it doesn't come for free.

The illusion of the FIFO queue

Most engineers early in their distributed systems journey assume picking a broker with "FIFO" in its name will solve the problem. It won't. The network is unreliable by nature. AWS SQS FIFO guarantees ordering at ingestion, but only for the order in which the broker actually received the messages. A single network blip, a client-library retry, or a small TCP delay on the producer side is enough to make your carefully ordered messages arrive at the broker in a different order than you sent them. The problem isn't keeping order inside the pipe - it's building a system that survives the moments when the pipe leaks.

The multiple-consumer trap

Scaling makes things worse. One consumer processing messages sequentially preserves order, but kills throughput. Add more consumers and you're in "Competing Consumers" territory. Here's a concrete example: message A (create order) and message B (pay order) arrive almost simultaneously. Consumer 1 picks up message A, then hits a GC pause. Consumer 2 picks up message B and processes it instantly. Your code now tries to mark an order as paid before it exists in the database.

I've seen this exact bug cause phantom payments in production. It's not hypothetical.
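
The race above can be reproduced without any broker at all. The following is a minimal sketch that assumes an in-memory store; NaiveOrderStore and its methods are hypothetical names used only for illustration, and the "GC pause" is simulated by simply running the pay step first.

```ruby
# Hypothetical in-memory stand-in for the orders table.
class NaiveOrderStore
  def initialize
    @orders = {}
  end

  # Message A: create the order.
  def create(order_id)
    @orders[order_id] = { status: "created" }
    :created
  end

  # Message B: mark the order as paid. Fails if the order is not there yet.
  def pay(order_id)
    order = @orders[order_id]
    return :order_missing if order.nil?

    order[:status] = "paid"
    :paid
  end

  def status(order_id)
    @orders.dig(order_id, :status)
  end
end

store = NaiveOrderStore.new

# Consumer 2 finishes message B while consumer 1 is still paused on message A,
# so the effective processing order is B, then A.
results = [store.pay("order-1"), store.create("order-1")]
```

After this run the payment has been dropped and the order is left in the "created" state, which is the kind of silent inconsistency the rest of this post is about.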

Idempotency is your real safety net

Stop trying to guarantee delivery order. Prepare your code for messages arriving twice, or too early. Idempotency means each operation produces the same result no matter how many times it runs. In Ruby, a message log works well as a gatekeeper:

# The logic handles incoming events in a safe, idempotent way
def process_order_payment(event)
  # Check if we have already processed this specific message ID
  return if MessageLog.exists?(message_key: event.message_id)

  ActiveRecord::Base.transaction do
    order = Order.find_by(uuid: event.order_uuid)

    if order.nil?
      # If the order hasn't arrived yet, we might want to retry later 
      # or move this message to a specialized "waiting" queue
      raise "Order not found for UUID: #{event.order_uuid}"
    end

    # Business logic: updating the payment status
    order.update!(status: 'paid', paid_at: Time.now)

    # Logging the message key to prevent duplicate processing
    MessageLog.create!(
      message_key: event.message_id,
      processed_at: Time.now
    )
  end
rescue ActiveRecord::RecordNotUnique
  # This handles race conditions where two threads try to log the same message_id
  Rails.logger.warn "Message #{event.message_id} is already being handled elsewhere"
end

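MessageLog blocks duplicate processing. The ActiveRecord::RecordNotUnique rescue is the fallback when two threads race to insert the same message_id at the same moment, and it only works if message_key has a unique index in the database. When it fires, the transaction rolls back, so the losing thread's order update is discarded too. The window is narrower than you'd think, but it happens.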
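
The comment in the handler mentions moving an early message to a "waiting" queue instead of failing. One possible shape for that idea is sketched below in plain Ruby, with in-memory hashes standing in for the database and the queue. PaymentConsumer, the event hash layout, and the :order_created / :order_paid types are assumptions made for this example, not part of any library.

```ruby
require "set"

# Hypothetical consumer that is idempotent and parks events that arrive too early.
class PaymentConsumer
  def initialize
    @orders = {}                                 # order_uuid => status
    @processed = Set.new                         # message ids already applied
    @waiting = Hash.new { |h, key| h[key] = [] } # order_uuid => parked events
  end

  def handle(event)
    # Idempotency gate: a message id is applied at most once.
    return :duplicate if @processed.include?(event[:message_id])

    case event[:type]
    when :order_created
      @orders[event[:order_uuid]] = "created"
      @processed << event[:message_id]
      # Replay anything that arrived before the order existed.
      @waiting.delete(event[:order_uuid])&.each { |parked| handle(parked) }
      :applied
    when :order_paid
      unless @orders.key?(event[:order_uuid])
        # Too early: park the event instead of raising.
        @waiting[event[:order_uuid]] << event
        return :parked
      end
      @orders[event[:order_uuid]] = "paid"
      @processed << event[:message_id]
      :applied
    else
      :ignored
    end
  end

  def status(order_uuid)
    @orders[order_uuid]
  end
end

consumer = PaymentConsumer.new
create = { message_id: "m-1", type: :order_created, order_uuid: "o-1" }
pay    = { message_id: "m-2", type: :order_paid,    order_uuid: "o-1" }

early  = consumer.handle(pay)    # payment arrives before the order exists
first  = consumer.handle(create) # order arrives, parked payment is replayed
repeat = consumer.handle(pay)    # broker redelivers the payment
```

In a real system the parked events would need to live somewhere durable, and they would need a timeout so that a payment for an order that never arrives does not wait forever.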

Partition keys when you absolutely need ordering

If ordering matters for a specific entity, use a partition key (user_id, order_id, etc.). All events for the same order land in the same partition, which is read by one consumer at a time, so sequentiality is preserved where it counts. Other orders can process in parallel without interference.
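
To make the routing concrete, here is a small sketch of how a key can be mapped to a partition. It assumes a CRC32 hash and four partitions purely for illustration; real brokers apply their own hash function, and partition_for and PARTITION_COUNT are hypothetical names.

```ruby
require "zlib"

PARTITION_COUNT = 4 # assumed value for this example

# Maps a partition key to a partition index. The same key always yields
# the same index, which is what keeps one entity's events together.
def partition_for(key, partition_count = PARTITION_COUNT)
  Zlib.crc32(key.to_s) % partition_count
end

events = [
  { order_id: "order-1", type: "created" },
  { order_id: "order-2", type: "created" },
  { order_id: "order-1", type: "paid" }
]

# group_by keeps the original relative order inside each partition.
partitions = events.group_by { |event| partition_for(event[:order_id]) }

order_1_events = partitions[partition_for("order-1")]
                 .select { |event| event[:order_id] == "order-1" }
                 .map { |event| event[:type] }
```

Two different orders may still share a partition. That is fine: the guarantee you need is that one order never spans two partitions, not that every order gets its own.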

Optimistic locking as a version guard

Idempotency handles duplicates. Versioning handles late arrivals. If message version 3 shows up after version 5 is already applied, discard it:

-- Only update the record if the incoming version (5) is newer than
-- the version currently stored in the database
UPDATE orders
SET status = 'processing', 
    version = 5, 
    updated_at = NOW()
WHERE id = '123e4567-e89b-12d3-a456-426614174000' 
  AND version < 5;

Zero rows updated means the message is a duplicate or a stale echo. Either way, ignore it and move on.
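
The same guard can be expressed in application code. The sketch below mirrors the "version < incoming" condition from the SQL using an in-memory object; VersionedOrder and the event hash layout are hypothetical and stand in for whatever model and payload you actually have.

```ruby
# Hypothetical in-memory order that only accepts strictly newer versions.
class VersionedOrder
  attr_reader :status, :version

  def initialize
    @status = "new"
    @version = 0
  end

  # Returns true if the event was applied, false if it was stale or a duplicate.
  # This plays the role of checking the affected-row count after the UPDATE.
  def apply(event)
    return false unless event[:version] > @version

    @status = event[:status]
    @version = event[:version]
    true
  end
end

order = VersionedOrder.new

applied   = order.apply({ version: 5, status: "processing" }) # newer, accepted
stale     = order.apply({ version: 3, status: "created" })    # late arrival, rejected
duplicate = order.apply({ version: 5, status: "processing" }) # redelivery, rejected
```

Note that this guard accepts gaps: version 5 is applied even if version 4 never showed up. If every intermediate step matters, you would need a stricter check or a way to park the event until the missing version arrives.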

Where to go from here

These patterns - idempotency, partition keys, version guards - don't require exotic infrastructure. They're application-level decisions you can make today. If you want to go deeper, look at the Inbox/Outbox pattern, which solves the "at-least-once delivery" problem at the database level and removes a whole class of race conditions you probably haven't hit yet, but will.

Happy ordering!