Survey: Choreography vs Orchestration in Microservice Architecture

 (The content below is mostly summarized and quoted from the materials listed under Reference, except for the “My Thoughts” parts.)

In a microservices architecture, it is not uncommon to encounter services that are long running and stretch across the boundaries of individual microservices. There are two major architectures for implementing such services: orchestration and choreography.


Orchestration

Main idea: A single god service controls the workflow by calling other services’ APIs.

Other characteristics:

The business workflow lives in one place, which makes it easy to maintain, manage, and optimize.

The business workflow is easy to monitor and debug through the orchestrator’s log.

Error handling is straightforward. Each service can simply propagate errors through the orchestrator.

It is possible to apply distributed transactions such as two-phase commit (2PC).

Even though the orchestrator can call microservices’ APIs asynchronously, the orchestrating function still has to wait for all of the calls to finish, which can tie up system resources such as memory and threads for a long time.

The orchestrator centrally manages the resiliency of the workflow and it can become a single point of failure.
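The orchestration idea above can be sketched in a few lines of Python. This is only a minimal illustration, not a reference implementation: the class OrderOrchestrator, the exception PaymentDeclined, and the three callables standing in for remote service APIs are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


class PaymentDeclined(Exception):
    """Error raised by the (hypothetical) payment service."""


@dataclass
class OrderOrchestrator:
    # Each callable stands in for a remote service API call (assumption).
    reserve_stock: Callable[[str], None]
    charge_payment: Callable[[str], None]
    release_stock: Callable[[str], None]
    log: List[str] = field(default_factory=list)

    def place_order(self, order_id: str) -> str:
        # The whole workflow is readable in one place.
        self.log.append(f"{order_id}: reserving stock")
        self.reserve_stock(order_id)
        try:
            self.log.append(f"{order_id}: charging payment")
            self.charge_payment(order_id)
        except PaymentDeclined:
            # Errors surface in the orchestrator, which undoes earlier steps.
            self.log.append(f"{order_id}: payment declined, releasing stock")
            self.release_stock(order_id)
            return "cancelled"
        self.log.append(f"{order_id}: completed")
        return "completed"
```

Note that place_order blocks until every step returns, which is the resource cost mentioned above, and that the orchestrator's log alone tells the full story of each order.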


Choreography

Main idea: There is no single god service, only a central event bus. Each service listens to the event bus, handles the events it cares about, and publishes new events that other services can handle in turn.
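A minimal sketch of this idea, assuming an in-memory bus with synchronous delivery in place of a real message broker. The EventBus class, the wire_order_flow function, and the event names OrderPlaced, PaymentCompleted, and OrderShipped are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class EventBus:
    """In-memory stand-in for a real broker; delivery is synchronous here."""

    def __init__(self):
        self._handlers = defaultdict(list)
        self.history = []  # event types in the order they were published

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        self.history.append(event_type)
        for handler in list(self._handlers[event_type]):
            handler(payload)


def wire_order_flow(bus):
    # "Payment service": reacts to OrderPlaced, emits PaymentCompleted.
    def on_order_placed(event):
        bus.publish("PaymentCompleted", {"order_id": event["order_id"]})

    # "Shipping service": reacts to PaymentCompleted, emits OrderShipped.
    def on_payment_completed(event):
        bus.publish("OrderShipped", {"order_id": event["order_id"]})

    bus.subscribe("OrderPlaced", on_order_placed)
    bus.subscribe("PaymentCompleted", on_payment_completed)


bus = EventBus()
wire_order_flow(bus)
bus.publish("OrderPlaced", {"order_id": "o1"})
```

No function here describes the whole order flow. It only emerges from the subscriptions, which is why the workflow is harder to see and trace than in the orchestrated version.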

Other characteristics:

Most companies adopt an event-driven architecture (EDA) as part of their evolution from a monolith to microservices, and they need the scaling that EDA provides.

Having well-defined boundaries in your system is very important for EDA. Events are used to communicate outside your domain boundary. Poorly drawn boundaries can force each microservice to support too many events. And because it is hard to control or assume event order, complexity can grow exponentially with the number of events.

Idempotency of events is critical. In production, you don't want duplicate data or transactions to occur. It also facilitates testing, as you can replay events into a staging environment.

This pattern can reduce coupling. Each microservice only needs to handle and publish events, and there is no god service coupling them together.

Avoiding god services and central controllers is a question of taking the responsibilities and autonomy of the teams seriously.

The ownership for the process and the needed flow logic can be distributed. How much will primarily depend on your organizational structure which should also be reflected in your service landscape (see Conway’s Law).

(My thoughts: Even with choreography, we need a high-level design of each team’s boundary and of the event handling process for each business case. So both orchestration and choreography require some high-level design across multiple teams’ domains. This design can be created and owned by experienced engineers who have knowledge across multiple domains and teams. These engineers can come from all of the related sub-teams and form a special committee or team for the high-level design. This is similar to a government structure in which a president has advisers who coordinate among different departments.)

If a service fails to complete a business operation, it can be difficult to recover from that failure. 

It is hard to control event order.

The choreography pattern becomes a challenge if the number of services grows rapidly. Given the high number of independent moving parts, the workflow between services tends to get complex. Also, distributed tracing becomes difficult.

The role is distributed between all services and resiliency becomes less robust. Each service isn't only responsible for the resiliency of its operation but also the workflow. This responsibility can be burdensome for the service and hard to implement. Each service must retry transient, nontransient, and time-out failures, so that the request terminates gracefully, if needed. Also, the service must be diligent about communicating the success or failure of the operation so that other services can act accordingly.

Since each microservice handles and publishes events asynchronously, it is hard to monitor and debug the overall business workflow.
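The idempotency point made earlier in this list can be illustrated with a small consumer that ignores events it has already processed. This is a sketch under assumptions: it presumes every event carries a unique event_id field, and it keeps the set of seen ids in memory, whereas a real service would need a durable store. The class name and event fields are hypothetical.

```python
class IdempotentPaymentHandler:
    """Applies each event at most once, keyed by its event_id (assumed unique)."""

    def __init__(self):
        self._seen = set()  # in-memory only; a real service needs durable storage
        self.charges = []   # side effects actually applied

    def handle(self, event):
        event_id = event["event_id"]
        if event_id in self._seen:
            # Duplicate delivery or replay: skip without side effects.
            return False
        self.charges.append((event["order_id"], event["amount"]))
        self._seen.add(event_id)
        return True


handler = IdempotentPaymentHandler()
event = {"event_id": "e-1", "order_id": "o1", "amount": 25}
first = handler.handle(event)
second = handler.handle(event)  # redelivered by the bus
```

Here first is True and second is False, and only one charge is recorded, so replaying the same events into a staging environment would not duplicate data.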

 

When to use what:

Use the choreography pattern if you expect to update, remove, or add new services frequently. The entire app can be modified with less effort and minimal disruption to existing services.

Consider the choreography pattern if you experience performance bottlenecks in the central orchestrator.

The choreography pattern is a natural model for serverless architecture, where all services can be short lived or event driven. Services can spin up because of an event, do their task, and be removed when the task is finished.

A good rule of thumb is to use orchestration when you're coordinating events within your bounded context, and to use event-driven choreography for interactions across domains.

It is better to start with orchestration and use choreography only when really necessary.

(My thoughts: Choreography suits systems that are long running, have few participants, and span varied contexts. It does not make sense for an orchestrator API call to wait on other services for a long time, and having few participants limits the event complexity.

Some processes, such as order fulfillment (from payment to shipment), are long running because they can take weeks to finish. Such a process can be divided into microservices such as payment, order, checkout, and order delivery.

From an organizational perspective, choreography suits coordination among distant teams whose domains and contexts are loosely coupled and very different. Orchestration, on the other hand, suits teams that work closely together.)

 

Reference:

https://www.infoq.com/news/2008/09/Orchestration/ 

https://stackoverflow.com/questions/4127241/orchestration-vs-choreography 

https://docs.microsoft.com/en-us/azure/architecture/patterns/choreography

https://www.infoq.com/podcasts/event-driven-architectures-scale/

https://www.infoq.com/articles/microservice-event-choreographies/

https://theburningmonk.com/2020/08/choreography-vs-orchestration-in-the-land-of-serverless/

 

 
