From monoliths to microservices: How to survive in the world of event-driven systems
In large systems, monoliths are giving way to distributed architectures built from microservices. This shift brings new challenges, especially in communication between components. Instead of simple database transactions, we are dealing with an asynchronous stream of events, which creates the risk of data loss, out-of-order delivery, or duplication. In this article, we will look at several design patterns that help ensure your events reach their destination, even if something goes wrong along the way.
In the beginning, there was asynchrony
In monolithic systems, where everything was stored in a single database, changes in multiple modules were easy to manage - all it took was a single, consistent transaction. In the world of distributed systems, this luxury no longer exists. The network can be unreliable, and messages can get lost or arrive in a strange order. However, this does not mean that every event must be delivered at all costs. Sometimes a "fire and forget" approach is sufficient, especially for events of lesser business importance. Take, for example, sending a welcome message to a new user. If it fails once, it's not the end of the world. It would be different if payment information were lost.
Outbox pattern: Your event will never get lost
One of the most popular ways to guarantee event delivery is the outbox pattern. It involves writing the event to a special table, called an outbox, in the same transaction that changes the business state. Then, a separate process, such as an outbox-poller, regularly scans this table and sends events to a message broker, such as RabbitMQ or Kafka.
# Simplified example in Ruby on Rails
class AccountService
  def create_account(user_data)
    ActiveRecord::Base.transaction do
      # 1. Save the business change
      user = User.create!(user_data)
      # 2. Save the event in the outbox table, in the same transaction
      OutboxEvent.create!(
        event_type: "user_created",
        payload: { user_id: user.id }
      )
    end
  end
end

# Next, a separate worker polls the outbox table and sends events to the message broker
class OutboxPollerWorker
  include Sidekiq::Worker

  def perform
    # Only pick up events that have not been sent yet; this assumes a
    # `status` column that defaults to "pending". (Note: `find_each`
    # ignores a custom `order`, so we iterate with `each` instead.)
    OutboxEvent.where(status: "pending").order(:created_at).each do |event|
      MessageBroker.publish(event.event_type, event.payload)
      event.update!(status: "sent")
    rescue => e
      # Leave the event as "pending" so a later run can retry it
      Rails.logger.error("Failed to send event: #{event.id}. Error: #{e.message}")
    end
  end
end
However, this pattern has its drawbacks: it introduces additional latency in delivering events and can create problems with scalability and message ordering.
Change data capture: Events straight from the database
Another option is Change Data Capture (CDC). Instead of explicitly writing events, you simply listen to the database transaction logs. Changes in rows are captured by a separate process and converted into business events. This approach is more elegant in that it separates the logic of sending events from the business logic of the application. However, it requires additional infrastructure and tools, such as Debezium.
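To make the idea concrete, here is a minimal sketch of the consuming side. The envelope fields ("op", "before", "after") follow Debezium's common change-event format, but the `map_change_to_event` helper and the business event shapes are our own illustrative assumptions, not part of Debezium or any specific connector.

```ruby
require "json"

# Hypothetical translator from a Debezium-style row-change event (as JSON
# from the broker) into a business event our services understand.
def map_change_to_event(raw_message)
  change = JSON.parse(raw_message)
  case change["op"]
  when "c" # row inserted -> a "created" business event
    { event_type: "user_created", payload: { user_id: change.dig("after", "id") } }
  when "u" # row updated
    { event_type: "user_updated", payload: { user_id: change.dig("after", "id") } }
  when "d" # row deleted -> only the old row ("before") is available
    { event_type: "user_deleted", payload: { user_id: change.dig("before", "id") } }
  end
end

# Example Debezium-like envelope for an INSERT into the users table
raw = '{"op":"c","after":{"id":42,"email":"a@example.com"},"source":{"table":"users"}}'
map_change_to_event(raw)
# => { event_type: "user_created", payload: { user_id: 42 } }
```

Keeping this translation in one place means the rest of the system never sees raw row changes, only business events.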
Order and duplicates – the two most common problems
Regardless of which pattern you choose, in distributed systems you must be prepared for duplicates and ordering issues. How to deal with this?
- Add a correlation ID: attach a unique identifier to each message so that the recipient can remember which IDs it has already processed and filter out duplicates.
- Use sequence numbers: for events that must be processed in a specific order, add a monotonically increasing sequence number. The recipient can then reject (or buffer) events that arrive out of order.
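The two ideas above can be sketched in a single consumer. This is a minimal in-memory illustration; in production the set of seen IDs and the sequence counter would live in a durable store (for example, a database table checked in the same transaction as the business change).

```ruby
require "set"

# Illustrative consumer that deduplicates by correlation ID and enforces
# strict ordering via sequence numbers. All names here are hypothetical.
class IdempotentConsumer
  def initialize
    @seen_ids = Set.new # correlation IDs already processed
    @last_sequence = 0  # highest sequence number applied so far
  end

  # Returns :processed, :duplicate, or :out_of_order
  def handle(correlation_id:, sequence:, payload:)
    return :duplicate if @seen_ids.include?(correlation_id)
    return :out_of_order unless sequence == @last_sequence + 1

    # ... apply the business effect of `payload` here ...
    @seen_ids << correlation_id
    @last_sequence = sequence
    :processed
  end
end

consumer = IdempotentConsumer.new
consumer.handle(correlation_id: "a1", sequence: 1, payload: {}) # => :processed
consumer.handle(correlation_id: "a1", sequence: 1, payload: {}) # => :duplicate
consumer.handle(correlation_id: "b2", sequence: 3, payload: {}) # => :out_of_order
```

Whether to reject, buffer, or dead-letter out-of-order events is a business decision; rejection is just the simplest policy to show.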
Time for testing
Before you implement your system, conduct a little "chaos engineering" on paper. Draw out the architecture and ask yourself: "What will happen if this event is lost? Or if it arrives twice?" Such thought experiments can reveal weaknesses in the design before they cause real problems. Because, as the old rule of distributed systems says: "Don't build them if you don't have to."
Happy delivering