Skip to main content

· 20 min read

Seata, short for Simple Extensible Autonomous Transaction Architecture, is an all-in-one distributed transaction solution. It provides AT, TCC, Saga, and XA transaction modes. This article provides a detailed explanation of the Saga mode within Seata, with the project hosted on GitHub.

Author: Yiyuan (Chen Long), Core Developer of Distributed Transactions at Ant Financial.

Pain Points in Financial Distributed Application Development

Distributed systems face a prominent challenge where a business process requires a composition of various services. This challenge becomes even more pronounced in a microservices architecture, as it necessitates consistency guarantees at the business level. In other words, if a step fails, it either needs to roll back to the previous service invocation or continuously retry to ensure the success of all steps. - From "Left Ear Wind - Resilient Design: Compensation Transaction"

In the domain of financial microservices architecture, business processes are often more complex. Processes are lengthy, such as a typical internet microloan business process involving calls to more than ten services. When combined with exception handling processes, the complexity increases further. Developers with experience in financial business development can relate to these challenges.

During the development of financial distributed applications, we encounter several pain points:

  • Difficulty Ensuring Business Consistency

    In many of the systems we encounter (e.g., in channel layers, product layers, and integration layers), ensuring eventual business consistency often involves adopting a "compensation" approach. Without a coordinator to support this, the development difficulty is significant. Each step requires handling "rollback" operations in catch blocks, resulting in a code structure resembling an "arrow," with poor readability and maintainability. Alternatively, retrying exceptional operations, if unsuccessful, might lead to asynchronous retries or even manual intervention. These challenges impose a significant burden on developers, reducing development efficiency and increasing the likelihood of errors.

  • Difficulty Managing Business State

    With numerous business entities and their corresponding states, developers often update the entity's state in the database after completing a business activity. Lack of a state machine to manage the entire state transition process results in a lack of intuitiveness, increases the likelihood of errors, and causes the business to enter an incorrect state.

  • Difficulty Ensuring Idempotence

    Idempotence of services is a fundamental requirement in a distributed environment. Ensuring the idempotence of services often requires developers to design each service individually, using unique keys in databases or distributed caches. There is no unified solution, creating a significant burden on developers and increasing the chances of oversight, leading to financial losses.

  • Challenges in Business Monitoring and Operations; Lack of Unified Error Guardian Capability

    Monitoring the execution of business operations is usually done by logging, and monitoring platforms are based on log analysis. While this is generally sufficient, in the case of business errors, these monitors lack immediate access to the business context and require additional database queries. Additionally, the reliance on developers for log printing makes it prone to omissions. For compensatory transactions, there is often a need for "error guardian triggering compensation" and "worker-triggered compensation" operations. The lack of a unified error guardian and processing standard requires developers to implement these individually, resulting in a heavy development burden.

Theoretical Foundation

In certain scenarios where strong consistency is required for data, we may adopt distributed transaction schemes like "Two-Phase Commit" at the business layer. However, in other scenarios, where such strong consistency is not necessary, ensuring eventual consistency is sufficient.

For example, Ant Financial currently employs the TCC (Try, Confirm, Cancel) pattern in its financial core systems. The characteristics of financial core systems include high consistency requirements (business isolation), short processes, and high concurrency.

On the other hand, in many business systems above the financial core (e.g., systems in the channel layer, product layer, and integration layer), the emphasis is on achieving eventual consistency. These systems typically have complex processes, long flows, and may need to call services from other companies (such as financial networks). Developing Try, Confirm, Cancel methods for each service in these scenarios incurs high costs. Additionally, when there are services from other companies in the transaction, it is impractical to require those services to follow the TCC development model. Long processes can negatively impact performance if transaction boundaries are too extensive.

When it comes to transactions, we are familiar with ACID, and we are also acquainted with the CAP theorem, which states that at most two out of three—Consistency (C), Availability (A), and Partition Tolerance (P)—can be achieved simultaneously. To enhance performance, a variant of ACID known as BASE emerged. While ACID emphasizes consistency (C in CAP), BASE emphasizes availability (A in CAP). Achieving strong consistency (ACID) is often challenging, especially when dealing with multiple systems that are not provided by a single company. BASE systems are designed to create more resilient systems. In many situations, particularly when dealing with multiple systems and providers, BASE systems acknowledge the risk of data inconsistency in the short term. This allows new transactions to occur, with potentially problematic transactions addressed later through compensatory means to ensure eventual consistency.

Therefore, in practical development, we make trade-offs. For many business systems above the financial core, compensatory transactions can be adopted. The concept of compensatory transactions has been proposed for about 30 years, with the Saga theory emerging as a solution for long transactions. With the recent rise of microservices, Saga has gradually gained attention in recent years. Currently, the industry generally recognizes Saga as a solution for handling long transactions.

https://github.com/aphyr/dist-sagas/blob/master/sagas.pdf[1] > http://microservices.io/patterns/data/saga.html[2]

Community and Industry Solutions

Apache Camel Saga

Camel is an open-source product that implements Enterprise Integration Patterns (EIP). It is based on an event-driven architecture and offers good performance and throughput. In version 2.21, Camel introduced the Saga EIP.

The Saga EIP provides a way to define a series of related actions through Camel routes. These actions either all succeed or all roll back. Saga can coordinate distributed services or local services using any communication protocol, achieving global eventual consistency. Saga does not require the entire process to be completed in a short time because it does not occupy any database locks. It can support requests that require long processing times, ranging from seconds to days. Camel's Saga EIP is based on MicroProfile's LRA[3] (Long Running Action). It also supports the coordination of distributed services implemented in any language using any communication protocol.

The implementation of Saga does not lock data. Instead, it defines "compensating operations" for each operation. When an error occurs during the normal process execution, the "compensating operations" for the operations that have already been executed are triggered to roll back the process. "Compensating operations" can be defined on Camel routes using Java or XML DSL (Definition Specific Language).

Here is an example of Java DSL:

// Java DSL example goes here

// action
from("direct:reserveCredit")
.bean(idService, "generateCustomId") // generate a custom Id and set it in the body
.to("direct:creditReservation")

// delegate action
from("direct:creditReservation")
.saga()
.propagation(SagaPropagation.SUPPORTS)
.option("CreditId", body()) // mark the current body as needed in the compensating action
.compensation("direct:creditRefund")
.bean(creditService, "reserveCredit")
.log("Credit ${header.amount} reserved. Custom Id used is ${body}");

// called only if the saga is cancelled
from("direct:creditRefund")
.transform(header("CreditId")) // retrieve the CreditId option from headers
.bean(creditService, "refundCredit")
.log("Credit for Custom Id ${body} refunded");

XML DSL sample:

<route>
<from uri="direct:start"/>
<saga>
<compensation uri="direct:compensation" />
<completion uri="direct:completion" />
<option optionName="myOptionKey">
<constant>myOptionValue</constant>
</option>
<option optionName="myOptionKey2">
<constant>myOptionValue2</constant>
</option>
</saga>
<to uri="direct:action1" />
<to uri="direct:action2" />
</route>

Eventuate Tram Saga

Eventuate Tram Saga[4] The framework is a Saga framework for Java microservices using JDBC/JPA. Similar to Camel Saga, it also adopts Java DSL to define compensating operations:

public class CreateOrderSaga implements SimpleSaga<CreateOrderSagaData> {

private SagaDefinition<CreateOrderSagaData> sagaDefinition =
step()
.withCompensation(this::reject)
.step()
.invokeParticipant(this::reserveCredit)
.step()
.invokeParticipant(this::approve)
.build();


@Override
public SagaDefinition<CreateOrderSagaData> getSagaDefinition() {
return this.sagaDefinition;
}


private CommandWithDestination reserveCredit(CreateOrderSagaData data) {
long orderId = data.getOrderId();
Long customerId = data.getOrderDetails().getCustomerId();
Money orderTotal = data.getOrderDetails().getOrderTotal();
return send(new ReserveCreditCommand(customerId, orderId, orderTotal))
.to("customerService")
.build();

...

Apache ServiceComb Saga

ServiceComb Saga[5] is also a solution for achieving data eventual consistency in microservices applications. In contrast to TCC, Saga directly commits transactions in the try phase, and the subsequent rollback phase is completed through compensating operations in reverse. What sets it apart is the use of Java annotations and interceptors to define "compensating" services.

Architecture:

Saga consists of alpha and omega, where:

  • Alpha acts as the coordinator, primarily responsible for managing and coordinating transactions;
  • Omega is an embedded agent in microservices, responsible for intercepting network requests and reporting transaction events to alpha;

The diagram below illustrates the relationship between alpha, omega, and microservices:

ServiceComb Saga

sample:

public class ServiceA extends AbsService implements IServiceA {

private static final Logger LOG = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());

@Autowired
private IServiceB serviceB;

@Autowired
private IServiceC serviceC;

@Override
public String getServiceName() {
return "servicea";
}

@Override
public String getTableName() {
return "testa";
}

@Override
@SagaStart
@Compensable(compensationMethod = "cancelRun")
@Transactional(rollbackFor = Exception.class)
public Object run(InvokeContext invokeContext) throws Exception {
LOG.info("A.run called");
doRunBusi();
if (invokeContext.isInvokeB(getServiceName())) {
serviceB.run(invokeContext);
}
if (invokeContext.isInvokeC(getServiceName())) {
serviceC.run(invokeContext);
}
if (invokeContext.isException(getServiceName())) {
LOG.info("A.run exception");
throw new Exception("A.run exception");
}
return null;
}

public void cancelRun(InvokeContext invokeContext) {
LOG.info("A.cancel called");
doCancelBusi();
}

Ant Financial's Practice

Ant Financial extensively uses the TCC mode for distributed transactions, mainly in scenarios where high consistency and performance are required, such as in financial core systems. In upper-level business systems with complex and lengthy processes, developing TCC can be costly. In such cases, most businesses opt for the Saga mode to achieve eventual business consistency. Due to historical reasons, different business units have their own set of "compensating" transaction solutions, basically falling into two categories:

  1. When a service needs to "retry" or "compensate" in case of failure, a record is inserted into the database with the status before executing the service. When an exception occurs, a scheduled task queries the database record and performs "retry" or "compensation." If the business process is successful, the record is deleted.

  2. Designing a state machine engine and a simple DSL to orchestrate business processes and record business states. The state machine engine can define "compensating services." In case of an exception, the state machine engine invokes "compensating services" in reverse. There is also an "error guardian" platform that monitors failed or uncompensated business transactions and continuously performs "compensation" or "retry."

Solution Comparison

Generally, there are two common solutions in the community and industry: one is based on a state machine or a process engine that orchestrates processes and defines compensation through DSL; the other is based on Java annotations and interceptors to implement compensation. What are the advantages and disadvantages of these two approaches?

ApproachProsCons
State Machine + DSL
- Business processes can be defined using visual tools, standardized, readable, and can achieve service orchestration functionality
- Improves communication efficiency between business analysts and developers
- Business state management: Processes are essentially state machines, reflecting the flow of business states
- Enhances flexibility in exception handling: Can implement "forward retry" or "backward compensation" after recovery from a crash
- Naturally supports asynchronous processing engines such as Actor model or SEDA architecture, improving overall throughput

- Business processes are composed of JAVA programs and DSL configurations, making development relatively cumbersome
- High intrusiveness into existing business if it is a transformation
- High implementation cost of the engine
Interceptor + Java Annotation
- Programs and annotations are integrated, simple development, low learning curve
- Easy integration into existing businesses
- Low framework implementation cost

- The framework cannot provide asynchronous processing modes such as the Actor model or SEDA architecture to improve system throughput
- The framework cannot provide business state management
- Difficult to achieve "forward retry" after crash recovery due to the inability to restore thread context

Seata Saga Approach

The introduction of Seata Saga can be found in Seata Saga Official Documentation[6].

Seata Saga adopts the state machine + DSL approach for the following reasons:

  • The state machine + DSL approach is more widely used in practical production scenarios.
  • Can use asynchronous processing engines such as the Actor model or SEDA architecture to improve overall throughput.
  • Typically, business systems above the core system have "service orchestration" requirements, and service orchestration has transactional eventual consistency requirements. These two are challenging to separate. The state machine + DSL approach can simultaneously meet these two requirements.
  • Because Saga mode theoretically does not guarantee isolation, in extreme cases, it may not complete the rollback operation due to dirty writing. For example, in a distributed transaction, if you recharge user A first and then deduct the balance from user B, if A user consumes the balance before the transaction is committed, and the transaction is rolled back, there is no way to compensate. Some business scenarios may allow the business to eventually succeed, and in cases where rollback is impossible, it can continue to retry the subsequent process. The state machine + DSL approach can achieve the ability to "forward" recover context and continue execution, making the business eventually successful and achieving eventual consistency.

In cases where isolation is not guaranteed: When designing business processes, follow the principle of "prefer long 款, not short 款." Long 款 means fewer funds for customers and more funds for institutions. Institutions can refund customers based on their credibility. Conversely, short 款 means less funding for institutions, and the funds may not be recovered. Therefore, in business process design, deduction should be done first.

State Definition Language (Seata State Language)

  1. Define the service call process through a state diagram and generate a JSON state language definition file.

  2. In the state diagram, a node can be a service call, and the node can configure its compensating node.

  3. The JSON state diagram is driven by the state machine engine. When an exception occurs, the state engine executes the compensating node corresponding to the successfully executed node to roll back the transaction.

    Note: Whether to compensate when an exception occurs can also be user-defined.

  4. It can meet service orchestration requirements, supporting one-way selection, concurrency, asynchronous, sub-state machine, parameter conversion, parameter mapping, service execution status judgment, exception capture, and other functions.

Assuming a business process calls two services, deducting inventory (InventoryService) and deducting balance (BalanceService), to ensure that in a distributed scenario, either both succeed or both roll back. Both participant services have a reduce method for inventory deduction or balance deduction, and a compensateReduce method for compensating deduction operations. Let's take a look at the interface definition of InventoryService:

public interface InventoryService {

/**
* reduce
* @param businessKey
* @param amount
* @param params
* @return
*/
boolean reduce(String businessKey, BigDecimal amount, Map<String, Object> params);

/**
* compensateReduce
* @param businessKey
* @param params
* @return
*/
boolean compensateReduce(String businessKey, Map<String, Object> params);
}

This is the state diagram corresponding to the business process:

Example State Diagram
Corresponding JSON

{
"Name": "reduceInventoryAndBalance",
"Comment": "reduce inventory then reduce balance in a transaction",
"StartState": "ReduceInventory",
"Version": "0.0.1",
"States": {
"ReduceInventory": {
"Type": "ServiceTask",
"ServiceName": "inventoryAction",
"ServiceMethod": "reduce",
"CompensateState": "CompensateReduceInventory",
"Next": "ChoiceState",
"Input": ["$.[businessKey]", "$.[count]"],
"Output": {
"reduceInventoryResult": "$.#root"
},
"Status": {
"#root == true": "SU",
"#root == false": "FA",
"$Exception{java.lang.Throwable}": "UN"
}
},
"ChoiceState": {
"Type": "Choice",
"Choices": [
{
"Expression": "[reduceInventoryResult] == true",
"Next": "ReduceBalance"
}
],
"Default": "Fail"
},
"ReduceBalance": {
"Type": "ServiceTask",
"ServiceName": "balanceAction",
"ServiceMethod": "reduce",
"CompensateState": "CompensateReduceBalance",
"Input": [
"$.[businessKey]",
"$.[amount]",
{
"throwException": "$.[mockReduceBalanceFail]"
}
],
"Output": {
"compensateReduceBalanceResult": "$.#root"
},
"Status": {
"#root == true": "SU",
"#root == false": "FA",
"$Exception{java.lang.Throwable}": "UN"
},
"Catch": [
{
"Exceptions": ["java.lang.Throwable"],
"Next": "CompensationTrigger"
}
],
"Next": "Succeed"
},
"CompensateReduceInventory": {
"Type": "ServiceTask",
"ServiceName": "inventoryAction",
"ServiceMethod": "compensateReduce",
"Input": ["$.[businessKey]"]
},
"CompensateReduceBalance": {
"Type": "ServiceTask",
"ServiceName": "balanceAction",
"ServiceMethod": "compensateReduce",
"Input": ["$.[businessKey]"]
},
"CompensationTrigger": {
"Type": "CompensationTrigger",
"Next": "Fail"
},
"Succeed": {
"Type": "Succeed"
},
"Fail": {
"Type": "Fail",
"ErrorCode": "PURCHASE_FAILED",
"Message": "purchase failed"
}
}
}

This is the state language to some extent referring to AWS Step Functions[7].

Introduction to "State Machine" Attributes:

  • Name: Represents the name of the state machine, must be unique;
  • Comment: Description of the state machine;
  • Version: Version of the state machine definition;
  • StartState: The first "state" to run when starting;
  • States: List of states, a map structure, where the key is the name of the "state," which must be unique within the state machine;

Introduction to "State" Attributes:

  • Type: The type of the "state," such as:
    • ServiceTask: Executes the service task;
    • Choice: Single conditional choice route;
    • CompensationTrigger: Triggers the compensation process;
    • Succeed: Normal end of the state machine;
    • Fail: Exceptional end of the state machine;
    • SubStateMachine: Calls a sub-state machine;
  • ServiceName: Service name, usually the beanId of the service;
  • ServiceMethod: Service method name;
  • CompensateState: Compensatory "state" for this state;
  • Input: List of input parameters for the service call, an array corresponding to the parameter list of the service method, $. represents using an expression to retrieve parameters from the state machine context. The expression uses SpringEL[8], and if it is a constant, write the value directly;
  • Output: Assigns the parameters returned by the service to the state machine context, a map structure, where the key is the key when placing it in the state machine context (the state machine context is also a map), and the value uses $. as a SpringEL expression, indicating the value is taken from the return parameters of the service, #root represents the entire return parameters of the service;
  • Status: Mapping of the service execution status, the framework defines three statuses, SU success, FA failure, UN unknown. We need to map the execution status of the service into these three statuses, helping the framework judge the overall consistency of the transaction. It is a map structure, where the key is a condition expression, usually based on the return value of the service or the exception thrown for judgment. The default is a SpringEL expression to judge the return parameters of the service. Those starting with $Exception{ indicate judging the exception type, and the value is mapped to this value when this condition expression is true;
  • Catch: Route after catching an exception;
  • Next: The next "state" to execute after the service is completed;
  • Choices: List of optional branches in the Choice type "state," where Expression is a SpringEL expression, and Next is the next "state" to execute when the expression is true;
  • ErrorCode: Error code for the Fail type "state";
  • Message: Error message for the Fail type "state";

For more detailed explanations of the state language, please refer to Seata Saga Official Documentation[6http://seata.io/zh-cn/docs/user/saga.html].

State Machine Engine Principle:

State Machine Engine Principle

  • The state diagram in the image first executes stateA, then executes stateB, and then executes stateC;
  • The execution of "states" is based on an event-driven model. After stateA is executed, a routing message is generated and placed in the EventQueue. The event consumer takes the message from the EventQueue and executes stateB;
  • When the entire state machine is started, Seata Server is called to start a distributed transaction, and the xid is generated. Then, the start event of the "state machine instance" is recorded in the local database;
  • When a "state" is executed, Seata Server is called to register a branch transaction, and the branchId is generated. Then, the start event of the "state instance" is recorded in the local database;
  • After a "state" is executed, the end event of the "state instance" is recorded in the local database, and Seata Server is called to report the status of the branch transaction;
  • When the entire state machine is executed, the completion event of the "state machine instance" is recorded in the local database, and Seata Server is called to commit or roll back the distributed transaction;

Design of State Machine Engine:

Design of State Machine Engine

The design of the state machine engine is mainly divided into three layers, with the upper layer depending on the lower layer. From bottom to top, they are:

  • Eventing Layer:

    • Implements an event-driven architecture that can push events and be consumed by a consumer. This layer does not care about what the event is or what the consumer executes; it is implemented by the upper layer.
  • ProcessController Layer:

    • Driven by the above Eventing to execute a "empty" process. The behavior and routing of "states" are not implemented. It is implemented by the upper layer.

      Based on the above two layers, theoretically, any "process" engine can be customly extended. The design of these two layers is based on the internal design of the financial network platform.

  • StateMachineEngine Layer:

    • Implements the behavior and routing logic of each type of state in the state machine engine;
    • Provides API and state machine language repository;

Practical Experience in Service Design under Saga Mode

Below are some practical experiences summarized in the design of microservices under Saga mode. Of course, these are recommended practices, not necessarily to be followed 100%. There are "workaround" solutions even if not followed.

Good news: Seata Saga mode has no specific requirements for the interface parameters of microservices, making Saga mode suitable for integrating legacy systems or services from external institutions.

Allow Empty Compensation

  • Empty Compensation: The original service was not executed, but the compensation service was executed;
  • Reasons:
    • Timeout (packet loss) of the original service;
    • Saga transaction triggers a rollback;
    • The request of the original service is not received, but the compensation request is received first;

Therefore, when designing services, it is necessary to allow empty compensation, that is, if the business primary key to be compensated is not found, return compensation success and record the original business primary key.

Hang Prevention Control

  • Hang: Compensation service is executed before the original service;
  • Reasons:
    • Timeout (congestion) of the original service;
    • Saga transaction rollback triggers a rollback;
    • Congested original service arrives;

Therefore, check whether the current business primary key already exists in the business primary keys recorded by empty compensation. If it exists, reject the execution of the service.

Idempotent Control

  • Both the original service and the compensation service need to ensure idempotence. Due to possible network timeouts, a retry strategy can be set. When a retry occurs, idempotent control should be used to avoid duplicate updates to business data.

Summary

Many times, we don't need to emphasize strong consistency. We design more resilient systems based on the BASE and Saga theories to achieve better performance and fault tolerance in distributed architecture. There is no silver bullet in distributed architecture, only solutions suitable for specific scenarios. In fact, Seata Saga is a product with the capabilities of "service orchestration" and "Saga distributed transactions." Summarizing, its applicable scenarios are:

  • Suitable for handling "long transactions" in a microservices architecture;
  • Suitable for "service orchestration" requirements in a microservices architecture;
  • Suitable for business systems with a large number of composite services above the financial core system (such as systems in the channel layer, product layer, integration layer);
  • Suitable for scenarios where integration with services provided by legacy systems or external institutions is required (these services are immutable and cannot be required to be modified).

Related Links Mentioned in the Article

[1]https://github.com/aphyr/dist-sagas/blob/master/sagas.pdf
[2]http://microservices.io/patterns/data/saga.html
[3]Microprofile 的 LRAhttps://github.com/eclipse/microprofile-sandbox/tree/master/proposals/0009-LRA
[4]Eventuate Tram Sagahttps://github.com/eventuate-tram/eventuate-tram-sagas
[5]ServiceComb Sagahttps://github.com/apache/servicecomb-pack
[6]Seata Saga 官网文档http://seata.io/zh-cn/docs/user/saga.html
[7]AWS Step Functionshttps://docs.aws.amazon.com/zh_cn/step-functions/latest/dg/tutorial-creating-lambda-state-machine.html
[8]SpringELhttps://docs.spring.io/spring/docs/4.3.10.RELEASE/spring-framework-reference/html/expressions.html

· 19 min read

Author: Yi Yuan (Chen Long), Ant Gold Services distributed transaction framework core development.
This article is based on the topic of "Distributed Transaction Seata and its Three Patterns" shared at SOFA Meetup#3 on 11 August in Guangzhou, focusing on the background and theoretical foundation of distributed transaction, as well as the principle of Seata distributed transaction and the implementation of distributed transaction in three patterns (AT, TCC, and Saga).

The video and PPT are at the end of this article.

3 Distributed Transaction Seata Three Modes Explained-Eiyuan.jpg

I. Background of the emergence of distributed transactions

1.1 Distributed Architecture Evolution - Horizontal Splitting of Database

AntGold's business database was initially a single database with a single table, but with the rapid development of the business data scale, the data volume is getting bigger and bigger, and the single database with a single table is gradually becoming a bottleneck. So we split the database horizontally, splitting the original single database and single table into database slices.

As shown in the figure below, after splitting the database and table, the original write operation that can be completed on a database may be across multiple databases, which gives rise to cross-database transaction problems.

image.png

1.2 Distributed Architecture Evolution - Business Service Splitting

In the early stage of business development, the single business system architecture of "one piece of cake" can meet the basic business needs. However, with the rapid development of the business, the system's access and business complexity are growing rapidly, single-system architecture has gradually become the bottleneck of business development, to solve the problem of high coupling and scalability of the business system demand is becoming stronger and stronger.

As shown in the figure below, Ant Financial Services splits the single business system into multiple business systems in accordance with the design principles of Service Oriented Architecture (SOA), which reduces the coupling between the systems and enables different business systems to focus on their own business, which is more conducive to the development of the business and the scaling of the system capacity.

image.png

After the business system is split according to services, a complete business often needs to call multiple services, how to ensure data consistency between multiple services becomes a difficult problem.

II. Theoretical foundation of distributed transaction

2.1 Two-stage commit protocols

16_16_18__08_13_2019.jpg

Two phase commit protocol: transaction manager coordinates resource manager in two phases, the first phase prepares resources, that is, reserve the resources needed for the transaction, if every resource manager resource reservation succeeds, the second phase resource commit is performed, otherwise the coordinated resource manager rolls back the resources.

2.2 TCC

16_16_51__08_13_2019.jpg

TCC (Try-Confirm-Cancel) is actually a two-phase commit protocol for servitisation, business developers need to implement these three service interfaces, the first phase of the service is choreographed by the business code to call the Try interface for resource reservation, the Try interface for all participants is successful, the transaction manager will commit the transaction and call the Confirm interface for each participant The transaction manager will commit the transaction and call the Confirm interface of each participant to actually commit the business operation, otherwise the Cancel interface of each participant will be called to rollback the transaction.

2.3 Saga

3 Distributed Transactions Seata Three Patterns Explained - Yi Yuan-9.jpg

Saga is a compensation protocol. In Saga mode, there are multiple participants within a distributed transaction, and each participant is an offsetting compensation service that requires the user to implement its forward and reverse rollback operations according to the business scenario.

During the execution of a distributed transaction, the forward operations of each participant are executed sequentially, and if all forward operations are executed successfully, the distributed transaction commits. If any of the forward operations fails, the distributed transaction backs out and performs a reverse rollback on the previous participants, rolling back the committed participants and returning the distributed transaction to its initial state.

Saga theory is from the paper Sagas published by Hector & Kenneth in 1987.

Saga Positive Service and Compensation Service also need to be implemented by business developers.

III. Seata and its three patterns explained in detail

3.1 Distributed transaction Seata introduction

Seata (Simple Extensible Autonomous Transaction Architecture) is a distributed transaction solution jointly open-sourced by Ant Financial Services and Alibaba in January 2019.Seata has been open-sourced for about half a year, and currently has more than 11,000 stars. Seata has been open source for about half a year, and now has more than 11,000 stars and a very active community. We warmly welcome you to participate in the Seata community construction, together will Seata become the open source distributed transaction benchmark product.

Seata: https://[github.com/apache/incubator-seata](https://github.com/apache/incubator -seata)

image.png

3.2 Distributed Transactions Seata Product Module

As shown in the figure below, there are three major modules in Seata, namely TM, RM and TC. TM and RM are integrated with the business system as clients of Seata, and TC is deployed independently as the server of Seata.

TC is deployed independently as a Seata server. image.png

The execution flow of a distributed transaction in Seata:

  • TM opens distributed transaction (TM registers global transaction record with TC);
  • According to the business scenario, arrange the resources in the transaction such as database and service (RM reports the resource readiness status to TC);
  • TM ends the distributed transaction, and the transaction phase ends (TM notifies TC to commit/rollback the distributed transaction);
  • TC aggregates the transaction information and decides whether the distributed transaction should be committed or rolled back;
  • TC notifies all RMs to commit/rollback resources, transaction phase 2 ends;

3.3 Distributed Transactions Seata Solution

Seata has four distributed transaction solutions, AT mode, TCC mode, Saga mode and XA mode.

15_49_23__08_13_2019.jpg

2.3.1 AT Mode

In January, Seata open sourced AT Mode, a non-intrusive distributed transaction solution. In AT mode, users only need to focus on their own "business SQL", the user's "business SQL" as a phase, Seata framework will automatically generate the transaction of the two-phase commit and rollback operations.

image.png

How the AT model is non-intrusive to business :
  • Phase I:

In phase 1, Seata intercepts the "business SQL", first parses the semantics of the SQL, finds the business data to be updated by the "business SQL", and then saves it as a "before image" before updating the business data. Before the business data is updated, it will save it as "before image", then execute "business SQL" to update the business data, and after the business data is updated, it will save it as "after image", and finally generate row locks. The above operations are all done within a single database transaction, which ensures the atomicity of one phase of operation.

This ensures the atomicity of a phase of operations. image3.png

  • Second-phase commit:

If the second phase is a commit, since the "business SQL" has already been committed to the database in the first phase, the Seata framework only needs to delete the snapshot data and row locks saved in the first phase to complete the data cleanup.

image 4.png

  • Phase 2 rollback:

If the second phase is a rollback, Seata needs to rollback the "business SQL" that has been executed in the first phase to restore the business data. The way to rollback is to use "before image" to restore the business data; however, before restoring, we must first verify the dirty writing, compare the "current business data in the database" and the "after image", if the two data are not in the same state, then we will use the "after image" to restore the business data. However, before restoring, we should first check the dirty writing, compare the "current business data in database" and "after image", if the two data are completely consistent, it means there is no dirty writing, and we can restore the business data, if it is inconsistent, it means there is dirty writing, and we need to transfer the dirty writing to manual processing.

image 5.png

AT mode one phase, two phase commit and rollback are automatically generated by Seata framework, user only need to write "business SQL", then can easily access distributed transaction, AT mode is a kind of distributed transaction solution without any intrusion to business.

2.3.2 TCC Mode

In March 2019, Seata open-sourced the TCC pattern, which is contributed by Ant Gold. the TCC pattern requires users to implement Try, Confirm and Cancel operations according to their business scenarios; the transaction initiator executes the Try method in the first stage, the Confirm method in the second-stage commit, and the Cancel method in the second-stage rollback.

The transaction initiator performs Try in the first stage, Confirm in the second stage, and Cancel in the second stage. image 6.png

TCC Three method descriptions:

  • Try: detection and reservation of resources;
  • Confirm: the execution of the business operation submitted; require Try success Confirm must be successful;
  • Cancel: the release of the reserved resources;

Ant Gold's practical experience in TCC
**
16_48_02__08_13_2019.jpg

1 TCC Design - Business model is designed in 2 phases:

The most important thing for users to consider when accessing TCC is how to split their business model into two phases.

Take the "debit" scenario as an example, before accessing TCC, the debit of account A can be completed with a single SQL for updating the account balance; however, after accessing TCC, the user needs to consider how to split the original one-step debit operation into two phases and implement it into three methods, and to ensure that the first-phase Try will be successful and the second-phase Confirm will be successful if Try is successful. If Try succeeds in the first stage, Confirm will definitely succeed in the second stage.

image 7.png

As shown above, the

Try method as a one-stage preparation method needs to do resource checking and reservation. In the deduction scenario, what Try has to do is to check whether the account balance is sufficient and reserve funds for transfer, and the way to reserve is to freeze the transfer funds of account A. After the execution of the Try method, although the balance of account A is still 100, but $30 of it has been frozen and cannot be used by other transactions.

The second stage, the Confirm method, performs the real debit operation; Confirm will use the funds frozen in the Try stage to perform the debit operation; after the Confirm method is executed, the $30 frozen in the first stage has been deducted from account A, and the balance of account A becomes $70.

If the second stage is a rollback, you need to release the $30 frozen in the first stage of Try in the Cancel method, so that account A is back to the initial state, and all $100 is available.

The most important thing for users to access TCC mode is to consider how to split the business model into 2 phases, implement it into 3 methods of TCC, and ensure that Try succeeds and Confirm succeeds. Compared to AT mode, TCC mode is somewhat intrusive to the business code, but TCC mode does not have the global line locks of AT mode, and the performance of TCC will be much higher than AT mode.

2 TCC Design - Allow Null Rollback:
**
16_51_44__08_13_2019.jpg

The Cancel interface needs to be designed to allow null rollbacks. When the Try interface is not received due to packet loss, the transaction manager triggers a rollback, which triggers the Cancel interface, which needs to return to the success of the rollback when it finds that there is no corresponding transaction xid or primary key during the execution of Cancel. If the transaction service manager thinks it has been rolled back, otherwise it will keep retrying, and Cancel has no corresponding business data to roll back.

3 TCC Design - Anti-Suspension Control:
**
16_51_56__08_13_2019.jpg

The implication of the suspension is that the Cancel is executed before the Try interface, which occurs because the Try times out due to network congestion, the transaction manager generates a rollback that triggers the Cancel interface, and the Try interface call is eventually received, but the Cancel arrives before the Try. According to the previous logic of allowing empty rollback, the rollback will return successfully, the transaction manager thinks the transaction has been rolled back successfully, then the Try interface should not be executed at this time, otherwise it will generate data inconsistency, so we record the transaction xid or business key before the Cancel empty rollback returns successfully, marking this record has been rolled back, the Try interface checks the transaction xid or business key first. The Try interface first checks the transaction xid or business key to identify that the record has been rolled back, and then does not perform the business operation of Try if it has already been marked as rolled back successfully.

4 TCC Design - Power Control:
**
16_52_07__08_13_2019.jpg

Idempotence means that for the same system, using the same conditions, a single request and repeated multiple requests have the same impact on system resources. Because network jitter or congestion may timeout, transaction manager will retry operation on resources, so it is very likely that a business operation will be called repeatedly, in order not to occupy resources many times because of repeated calls, it is necessary to control idempotency when designing the service, usually we can use the transaction xid or the business primary key to judge the weight to control.

2.3.3 Saga Patterns

Saga mode is Seata's upcoming open source solution for long transactions, which will be mainly contributed by Ant Gold. In Saga mode, there are multiple participants within a distributed transaction, and each participant is an offsetting compensation service that requires users to implement its forward and reverse rollback operations according to business scenarios.

During the execution of a distributed transaction, the forward operations of each participant are executed sequentially, and if all forward operations are executed successfully, the distributed transaction commits. If any of the forward operations fails, the distributed transaction will go back and execute the reverse rollback operations of the previous participants to roll back the committed participants and bring the distributed transaction back to the initial state.

image 8.png

Saga Pattern Distributed transactions are usually event-driven and executed asynchronously between the various participants, Saga Pattern is a long transaction solution.

1 Saga pattern usage scenario
**
16_44_58__08_13_2019.jpg

Saga pattern is suitable for business systems with long business processes and the need to ensure the final consistency of transactions. Saga pattern commits local transactions at one stage, and performance can be guaranteed in the case of lock-free and long processes.

Transaction participants may be services from other companies or legacy systems that cannot be transformed and provide the interfaces required by TCC, and can use the Saga pattern.

The advantages of the Saga pattern are:

  • One-stage commit of local database transactions, lock-free, high performance;
  • Participants can use transaction-driven asynchronous execution, high throughput;
  • The compensation service is the "reverse" of the forward service, which is easy to understand and implement;

Disadvantages: The Saga pattern does not guarantee isolation because the local database transaction has already been committed in the first phase and no "reservation" action has been performed. Later we will talk about the lack of isolation of the countermeasures.
2 Saga implementation based on a state machine engine*
2 Saga implementation based on a state machine engine*
**3

17_13_19__08_13_2019.jpg

Currently there are generally two types of Saga implementations, one is achieved through event-driven architecture, and the other is based on annotations plus interceptors to intercept the business of the positive service implementation.Seata is currently implemented using an event-driven mechanism, Seata implements a state machine, which can orchestrate the call flow of the service and the compensation service of the positive service, generating a state diagram defined by a json file, and the state machine The state machine engine is driven to the operation of this map, when an exception occurs, the state machine triggers a rollback and executes the compensation services one by one. Of course, it is up to the user to decide when to trigger the rollback. The state machine can achieve the needs of service orchestration, it supports single selection, concurrency, asynchrony, sub-state machine call, parameter conversion, parameter mapping, service execution state judgement, exception catching and other functions.

3 State Machine Engine Principles

16_45_32__08_13_2019.jpg

The basic principle of this state machine engine is that it is based on an event-driven architecture, where each step is executed asynchronously, and steps flow through an event queue between steps,
greatly improving system throughput. Transaction logs are recorded at the time of execution of each step for use when rolling back in the event of an exception. Transaction logs are recorded in the database where the business tables are located to improve performance.

**4 State Machine Engine Design

16_45_46__08_13_2019.jpg

The state machine engine is divided into a three-tier architecture design, the bottom layer is the "event-driven" layer, the implementation of the EventBus and the consumption of events in the thread pool, is a Pub-Sub architecture. The second layer is the "process controller" layer, which implements a minimalist process engine framework that drives an "empty" process execution. node does, it just executes the process method of each node and then executes the route method to flow to the next node. This is a generic framework, based on these two layers, developers can implement any process engine. The top layer is the "state machine engine" layer, which implements the "behaviour" and "route" logic code of each state node, provides APIs and statechart repositories, and has some other components, such as expression languages, logic languages, and so on. There are also a number of other components, such as expression languages, logic calculators, flow generators, interceptors, configuration management, transaction logging, and so on.

5 The Saga Service Design Experience

Similar to TCC, Saga's forward and reverse services need to follow the following design principles:

1) Saga Service Design - Allow Null Compensation
**
16_52_22__08_13_2019.jpg

2) Saga Service Design - Anti-Suspension Control
**
16_52_52__08_13_2019.jpg

3) Saga Service Design - Power Control
**
3 Distributed Transactions Seata Three Patterns Explained - Yi Yuan-31.jpg

4) Saga Design - Custom Transaction Recovery Strategies
**
16_53_07__08_13_2019.jpg

As mentioned earlier, the Saga pattern does not guarantee transaction isolation, and dirty writes can occur in extreme cases. For example, in the case of a distributed transaction is not committed, the data of the previous service was modified, and the service behind the anomaly needs to be rolled back, may not be able to compensate for the operation due to the data of the previous service was modified. One way to deal with this situation is to "retry" and continue forward to complete the distributed transaction. Since the entire business process is arranged by the state machine, even after the recovery can continue to retry. So you can configure the transaction policy of the process according to the business characteristics, whether to give priority to "rollback" or "retry", when the transaction timeout, the Server side will continue to retry according to this policy.

Since Saga does not guarantee isolation, we need to achieve the principle of "long money rather than short money" in business design. Long money refers to the situation when there is a mistake and the money is too much from our point of view, and the money is too little, because if the money is too long, we can refund the money to the customer, but if it is too short, the money may not be recovered, which means that in the business design, we must give priority to "rollback" or "retry". That is, when the business is designed, it must be deducted from the customer's account before crediting the account, and if the override update is caused by the isolation problem, there will not be a case of less money.

6 Annotation and Interceptor Based Saga Implementation
**
17_13_37__08_13_2019.jpg

There is another implementation of Saga that is based on annotations + interceptors, which Seata does not currently implement. You can look at the pseudo-code above to understand it, the @SagaCompensable annotation is defined on the one method, and the compensation method used to define the one method is the compensateOne method. Then the @SagaTransactional annotation is defined on the processA method of the business process code, which starts a Saga distributed transaction, intercepts each forward method with an interceptor, and triggers a rollback operation when an exception occurs, calling the compensation method of the forward method.

**7 Comparison of Advantages and Disadvantages of the Two Saga Implementations

The following table compares the advantages and disadvantages of the two Saga implementations:

17_13_49__08_13_2019.jpg

The biggest advantage of the state machine engine is that it can be executed asynchronously through an event-driven approach to improve system throughput, service scheduling requirements can be achieved, and in the absence of isolation in the Saga model, there can be an additional "retry forward" strategy to recover from things. The biggest advantage of annotations and interceptors is that they are easy to develop and low cost to learn.

Summary

This article first reviewed the background and theoretical basis of distributed transactions, and then focused on the principles of Seata distributed transactions and three patterns (AT, TCC, Saga) of distributed transaction implementation.

Seata's positioning is a full-scenario solution for distributed transactions, and in the future there will also be XA mode of distributed transaction implementation, each mode has its own application scenarios, AT mode is a non-intrusive distributed transaction solution for scenes that do not want to transform the business, with almost zero learning cost. TCC mode is a high-performance distributed transaction solution for core systems and other scenes that have a high demand for performance. Saga mode is a long transaction solution for business systems that have long business processes and need to ensure the ultimate consistency of transactions. Saga mode submits local transactions at one stage, with no locks, and can ensure performance in the case of long processes, and is mostly used in the channel layer and integration layer of business systems. Transaction participants may be services from other companies or legacy systems that can't be transformed to provide the interfaces required by TCC, Saga mode can also be used.

The video review and PPT of this sharing can be viewed at: [https://tech.antfin.com/community/activities/779/review](https://tech.antfin.com/community/activities/779/ review)

· 12 min read

Under the microservices architecture system, we can layered design according to business modules, deployed separately, reducing the pressure of service deployment, but also decoupled from the business coupling, to avoid the application gradually become a monster, so that it can be easily scaled up, and in the case of failure of some services will not affect the normal operation of other services. In short, microservices in the rapid development of business brings us more and more advantages, but microservices are not perfect, so we can not blindly over-abuse, it has a lot of shortcomings, and will bring a certain degree of complexity to the system, which is accompanied by distributed transactions, is a microservices architectural system is bound to need to deal with a pain point, but also the industry has always been concerned about a field, and therefore there is a Theories such as CAP and BASE have emerged.

At the beginning of this year, Ali open source a distributed transaction middleware, initially named Fescar, later renamed Seata, in the beginning of its open source, I know it must be fire, because this is an open source project to solve the pain points, Seata began to rush to the business of non-intrusive and high-performance direction to go, which is exactly the solution to the problem of distributed transactions of the urgent needs of us. Because several companies have stayed with the microservices architecture, but in solving the problem of distributed transactions are not very elegant, so I have been concerned about the development of Seata, today it is briefly about some of its design principles, followed by the various modules I will be in-depth analysis of the source code , interested in can continue to pay attention to my public number or blog, do not lose with.

What are the solutions for distributed transaction resolution?

Currently distributed transaction solutions mainly have no invasion of business and invasive solutions, no invasive solutions are mainly based on the database XA protocol two-part submission (2PC) scheme, its advantage is no invasion of business code, but its shortcomings are also very obvious: the database must be required to support the XA protocol, and because of the characteristics of the XA protocol itself, it will result in a long period of time without the release of transactional resources, the locking cycle is long, and in the case of the XA protocol, it will cause a long period of time, but it will not be released. release, locking cycle is long, and in the application layer can not intervene, so it is very poor performance, its existence is equivalent to the fist of seven injuries as "hurt seven points, the loss of their own three points", so in the Internet project is not very popular this solution.

In order to make up for the low performance of this solution, the big boys have come up with a variety of solutions to solve the problem, but this invariably need to be done through the application layer, that is, invasion of the business approach, such as the well-known TCC programme, based on the TCC, there are also many mature frameworks, such as ByteTCC, tcc-transaction and so on. As well as based on the ultimate consistency of reliable messages to achieve, such as RocketMQ transaction messages.

Invasive code solutions are based on the existing situation of "last resort" solution, in fact, they are very inelegant to implement, a transaction call is usually accompanied by a series of reverse operations on the transaction interface to add a series of, for example, TCC three-stage commit, the logic of the inevitable rollback of the logic of the logic of the logic of the commit, so that the code will make the project very bloated. code will make the project very bloated, high maintenance costs.

Relationships between Seata modules

In response to the above pain points of distributed transaction solutions, it is clear that our ideal distributed transaction solution must be good performance and no intrusion into the business, the business layer does not need to care about the constraints of the distributed transaction mechanism, Seata is precisely in this direction, so it is very much worth looking forward to, it will bring qualitative improvements to our microservices architecture.

So how does Seata do it? Here's how its modules relate to each other.

Seata's design idea is that a distributed transaction can be understood as a global transaction, under which a number of branch transactions are hung, and a branch transaction is a local transaction that meets the ACID, so we can operate the distributed transaction as if it were a local transaction.

Seata internally defines three modules to deal with the relationship and processing of global and branch transactions, these three components are:

  • Transaction Coordinator (TC): The transaction coordinator maintains the state of the global transaction and is responsible for coordinating and driving the commit or rollback of the global transaction.
  • Transaction Manager (TM): Controls the boundaries of the global transaction and is responsible for opening a global transaction and ultimately initiating a global commit or global rollback resolution.
  • Resource Manager (RM): Controls branch transactions and is responsible for branch registration, status reporting, and receiving instructions from the Transaction Coordinator to drive the commit and rollback of branch (local) transactions.

Briefly describe the execution steps of the whole global transaction:

  1. TM requests TC to open a global transaction. TC creates the global transaction and returns a globally unique XID, which is propagated in the context of the global transaction;
  2. the RM registers a branch transaction with the TC, which is attributed to the global transaction with the same XID;
  3. the TM initiates a global commit or rollback to the TC;
  4. TC schedules the branch transaction under the XID to complete the commit or rollback.

How is it different from the XA scheme?

Seata's transaction commit method is basically the same as the XA protocol's two-stage commit in general, so what is the difference between them?

We all know that the XA protocol relies on the database level to ensure the consistency of transactions, that is, XA branch transactions are driven at the database level, because XA branch transactions need to have XA drivers, on the one hand, it will lead to the database and the XA driver coupling, on the other hand, it will lead to a long period of locking the resources of the transaction of the various branches, which is not popular in the Internet company! This is also an important factor that it is not popular in Internet companies.

Based on the above problems of the XA protocol, Seata another way, since the dependence on the database layer will lead to so many problems, then I'll do from the application layer to do the trick, which also has to start from Seata's RM module, the previous also said that the main role of RM, in fact, RM in the database operation of the internal agent layer, as follows:

Seata in the data source to do a layer of proxy layer, so we use Seata, we use the data source is actually using Seata's own data source proxy DataSourceProxy, Seata in this layer of the proxy to add a lot of logic, mainly parsing SQL, business data before and after the update of the data mirror organised into a rollback log, and insert the undo log log into the undo_log table to ensure that every business sql that updates data has a corresponding rollback log.

The advantage of doing this is that after the local transaction is executed, the resources locked by the local transaction can be released immediately, and then the branch status can be reported to the TC. When the TM decides to commit globally, there is no need for synchronous coordination, the TC will asynchronously schedule each RM branch transaction to delete the corresponding undo log, which is a very fast step; when the TM decides to roll back globally, the RM receives a rollback request from the TC, and then finds the corresponding undo log through the XID, and then executes the log to complete the rollback operation. operation.

RM will find the corresponding undo log through XID and execute the rollback log to complete the rollback operation.

As shown in the above figure, the RM of the XA scheme is placed in the database layer, and it relies on the XA driver of the database.

The XA scenario RM is placed at the database level as shown in the figure above.

As shown above, Seata's RM is actually placed in the application layer as middleware, and does not rely on the database for protocol support, completely stripping out the protocol support requirements of the database for distributed transaction scenarios.

How are branching transactions committed and rolled back?

Here is a detailed description of how branching transactions are committed and rolled back:

  • Stage one:

Branching transactions make use of the JDBC data source proxy in the RM module to join several processes, interpret business SQL, organise the data mirroring of business data before and after updates into a rollback log, generate an undo log log, check global transaction locks, and register branching transactions, etc., and make use of the ACID feature of the local transaction to write the business SQL and the undo log into the same The local transaction ACID feature is used to write the business SQL and undo log into the same thing and submit them to the database together to ensure that the corresponding rollback log must exist for the business SQL, and finally the branch transaction status is reported to the TC.

  • Phase II:

TM resolution global commit:

When the TM resolution is committed, there is no need for synchronous orchestration, the TC will asynchronously schedule each RM branch transaction to delete the corresponding undo logs, and this step can be completed very quickly. This mechanism is critical for performance improvement. We know that the success rate of transaction execution is very high during normal business operation, so it is possible to commit directly in the local transaction, which is a very significant step for performance improvement.

This step is very significant for performance improvement.

TM resolution global rollback:

When TM resolves to rollback, RM receives the rollback request from TC, RM finds the corresponding undo log through XID, then uses the ACID feature of the local transaction to execute the rollback log to complete the rollback operation and delete the undo log, and finally reports the rollback result to TC.

The last step is to report the rollback result to the TC.

The business is not aware of all the above processes, the business does not care about the specific global transaction commit and rollback, and the most important point is that Seata will be two-stage commit synchronisation coordination is decomposed into various branch transactions, branch transactions and ordinary local transactions are not any different, which means that after we use Seata, distributed transactions like the use of local transactions, the database layer of transaction coordination mechanism to the middleware layer. transaction coordination mechanism to the middleware layer Seata to do , so that although the transaction coordination moved to the application layer , but still can do zero intrusion into the business , thus stripping the distributed transaction scheme on the database in the protocol support requirements , and Seata in the branch transaction is completed directly after the release of resources , greatly reducing the branch transaction on the resources of the locking time , perfectly avoiding the XA protocol needs to be The problem of long resource locking time due to synchronous coordination of XA protocol is perfectly avoided.

Supplementation of other solutions

The above is actually the default mode of Seata, also known as AT mode, which is similar to the XA scheme of the two-stage submission scheme, and is non-intrusive on the business, but this mechanism still needs to rely on the database local transaction ACID characteristics, have you noticed that I have stressed in the above chart must be to support the ACID characteristics of relational databases, then the problem is, non-relational or databases that do not support ACID can not use Seata, do not panic, Seata is now prepared for us another mode, called MT mode, which is a business invasive solution, commit rollback and other operations need to be defined by us, the business logic needs to be broken down into Prepare/Commit/Rollback 3 parts, forming a MT branch The purpose of the MT model is to reach more scenarios for Seata by adding global transactions.

The point of this is to reach more scenarios for Seata.

Only, it is not Seata's "main" model, it exists only as a complementary solution, from the above official development vision can be seen, Seata's goal is to always be a non-invasive solution to the business.

Note: The design of the pictures in this article refers to the official Seata diagram.

Author Bio:

Zhang Chenghui, currently working in the technology platform department of Zhongtong Technology Information Centre as a Java engineer, mainly responsible for the development of Zhongtong messaging platform and all-links pressure testing project, love to share technology, WeChat public number "back-end progression" author, technology blog (https://objcoding.com/) Blogger, Seata Contributor, GitHub ID: objcoding.

· 5 min read

Preface

TaaS is a high-availability implementation of the Seata server (TC, Transaction Coordinator), written in Golang. Taas has been contributed to the Seata open-source community by InfiniVision (http://infinivision.cn) and is now officially open source.

Before Seata was open-sourced, we began to reference GTS and some open-source projects to implement the distributed transaction solution TaaS (Transaction as a Service).

After we completed the development of the TaaS server, Seata (then called Fescar) was open-sourced and attracted widespread attention from the open-source community. With Alibaba's platform influence and community activity, we believe that Seata will become the standard for open-source distributed transactions in the future. Therefore, we decided to make TaaS compatible with Seata.

Upon discovering that Seata's server implementation was single-node and lacked high availability, we contacted the Seata community leaders and decided to open-source TaaS to contribute to the open-source community. We will also maintain it in the long term and keep it synchronized with Seata versions.

Currently, the official Java high-availability version of Seata is also under development. TaaS and this high-availability version have different design philosophies and will coexist in the future.

TaaS has been open-sourced on GitHub (https://github.com/apache/incubator-seata-go-server). We welcome everyone to try it out.

Design Principles

  1. High Performance: Performance scales linearly with the number of machines. Adding new machines to the cluster can improve performance.
  2. High Availability: If a machine fails, the system can still provide services externally, or the service can be restored externally in a short time (the time it takes to switch leaders).
  3. Auto-Rebalance: When new machines are added to the cluster or machines are offline, the system can automatically perform load balancing.
  4. Strong Consistency: The system's metadata is stored consistently in multiple replicas.

Design

TaaS Design

High Performance

TaaS's performance scales linearly with the number of machines. To support this feature, TaaS handles the smallest unit of global transactions called a Fragment. The system sets the maximum concurrency of active global transactions supported by each Fragment upon startup. The system also samples each Fragment, and when it becomes overloaded, it generates new Fragments to handle more concurrency.

High Availability

Each Fragment has multiple replicas and one leader to handle requests. When the leader fails, the system generates a new leader to handle requests. During the election process of the new leader, the Fragment does not provide services externally, typically for a few seconds.

Strong Consistency

TaaS itself does not store the metadata of global transactions. The metadata is stored in Elasticell (https://github.com/deepfabric/elasticell), a distributed KV storage compatible with the Redis protocol. Elasticell ensures data consistency based on the Raft protocol.

Auto-Rebalance

As the system runs, there will be many Fragments and their replicas, resulting in uneven distribution of Fragments on each machine, especially when old machines are offline or new machines come online. When TaaS starts, it selects three nodes as schedulers, responsible for scheduling these Fragments to ensure that the number of Fragments and the number of leaders on each machine are roughly equal. It also ensures that the number of replicas for each Fragment remains at the specified number.

Fragment Replication Creation

Fragment Replication Creation

  1. At time t0, Fragment1 is created on machine Seata-TC1.
  2. At time t1, a replica of Fragment1, Fragment1', is created on machine Seata-TC2.
  3. At time t2, another replica of Fragment1, Fragment1", is created on machine Seata-TC3.

By time t2, all three replicas of Fragment1 are created.

Fragment Replication Migration

Fragment Replication Migration

  1. At time t0, the system has four Fragments, each existing on machines Seata-TC1, Seata-TC2, and Seata-TC3.
  2. At time t1, a new machine, Seata-TC4, is added.
  3. At time t2, replicas of three Fragments are migrated to machine Seata-TC4.

Online Quick Experience

We have set up an experience environment on the public network:

Local Quick Experience

Quickly experience TaaS functionality using docker-compose.

git clone https://github.com/seata/taas.git
docker-compse up -d

Due to the many component dependencies, the docker-compose takes about 30 seconds to start and become available for external services.

Seata Server Address

The service listens on the default port 8091. Modify the Seata server address accordingly to experience.

Seata UI

Access the WEB UI at http://127.0.0.1:8084/ui/index.html

About InfiniVision

InfiniVision is a technology-driven enterprise service provider dedicated to assisting traditional enterprises in digital transformation and upgrading using technologies such as artificial intelligence, cloud computing, blockchain, big data, and IoT edge computing. InfiniVision actively embraces open source culture and open sources core algorithms and architectures. Notable open-source products include the facial recognition software InsightFace (https://github.com/deepinsight/insightface), which has repeatedly won large-scale facial recognition challenges, and the distributed storage engine Elasticell (https://github.com/deepfabric/elasticell).

About the Author

The author, Zhang Xu, is the creator of the open-source Gateway (https://github.com/fagongzi/gateway) and currently works at InfiniVision, focusing on infrastructure-related development.

· 17 min read

Fescar

Common distributed transaction approaches include XA based on 2PC (e.g., Atomikos), TCC (e.g., ByteTCC) focusing on the business layer, and transactional messaging (e.g., RocketMQ Half Message). XA is a protocol for distributed transactions that requires support from local databases. However, the resource locking at the database level can lead to poor performance. On the other hand, TCC, introduced by Alibaba as a preacher, requires a significant amount of business code to ensure transactional consistency, resulting in higher development and maintenance costs.

Distributed transactions are a widely discussed topic in the industry, and this is one of the reasons why Fescar has gained 6k stars in a short period of time. The name "Fescar" stands for Fast & Easy Commit And Rollback. In simple terms, Fescar drives global transactions by coordinating local RDBMS branch transactions. It is a middleware that operates at the application layer. The main advantages of Fescar are better performance compared to XA, as it does not occupy connection resources for a long time, and lower development cost and business invasiveness compared to TCC.

Similar to XA, Fescar divides roles into TC (Transaction Coordinator), RM (Resource Manager), and TM (Transaction Manager). The overall transaction process model of Fescar is as follows:

Fescar事务过程

1.The TM (Transaction Manager) requests the TC (Transaction Coordinator) to start a global transaction. The global transaction is successfully created, and a globally unique XID (Transaction ID) is generated.
2.The XID is propagated in the context of the microservice invocation chain.
3.The RM (Resource Manager) registers the branch transaction with the TC, bringing it under the jurisdiction of the global transaction corresponding to the XID.
4.The TM initiates a global commit or rollback resolution for the XID with the TC.
5.The TC schedules the completion of commit or rollback requests for all branch transactions under the jurisdiction of the XID.

In the current implementation version, the TC (Transaction Coordinator) is deployed as a separate process. It is responsible for maintaining the operation records and global lock records of the global transaction, as well as coordinating and driving the global transaction's commit or rollback. On the other hand, the TM (Transaction Manager) and RM (Resource Manager) work in the same application process as the application.

The RM manages the underlying database through proxying the JDBC data source. It uses syntax parsing to retain snapshots and generate undo logs during transaction execution. This ensures that the transaction can be rolled back to its previous state if needed.

This covers the general flow and model division of Fescar. Now, let's proceed with the analysis of Fescar's transaction propagation mechanism.

Fescar Transaction Propagation Mechanism

The transaction propagation in Fescar includes both nested transaction calls within an application and transaction propagation across different services. So, how does Fescar propagate transactions in a microservices call chain? Fescar provides a transaction API that allows users to manually bind a transaction's XID and join it to the global transaction. Therefore, depending on the specific service framework mechanism, we can propagate the XID in the call chain to achieve transaction propagation.

The RPC request process consists of two parts: the caller and the callee. We need to handle the XID during the request and response. The general process is as follows: the caller (or the requester) retrieves the XID from the current transaction context and passes it to the callee through the RPC protocol. The callee extracts the XID from the request and binds it to its own transaction context, thereby participating in the global transaction. Common microservices frameworks usually provide corresponding Filter and Interceptor mechanisms. Now, let's analyze the integration process of Spring Cloud and Fescar in more detail.

Partial Source Code Analysis of Fescar Integration with Spring Cloud Alibaba

This section of the source code is entirely from spring-cloud-alibaba-fescar. The source code analysis mainly includes three parts: AutoConfiguration, the microservice provider, and the microservice consumer. Regarding the microservice consumer, it can be further divided into two specific approaches: RestTemplate and Feign. For the Feign request approach, it is further categorized into usage patterns that integrate with Hystrix and Sentine.

Fescar AutoConfiguration

For the AutoConfiguration analysis, this section will only cover the parts related to the startup of Fescar. The analysis of other parts will be interspersed in the 'Microservice Provider' and 'Microservice Consumer' sections.

The startup of Fescar requires the configuration of GlobalTransactionScanner. The GlobalTransactionScanner is responsible for initializing Fescar's RM client, TM client, and automatically proxying classes annotated with the GlobalTransactional annotation. The startup of the GlobalTransactionScanner bean is loaded and injected through GlobalTransactionAutoConfiguration, which also injects FescarProperties.

FescarProperties contains important properties of Fescar, such as txServiceGroup. The value of this property can be read from the application.properties file using the key 'spring.cloud.alibaba.fescar.txServiceGroup', with a default value of '${spring.application.name}-fescar-service-group'. txServiceGroup represents the logical transaction group name in Fescar. This group name is obtained from the configuration center (currently supporting file and Apollo) to retrieve the TC cluster name corresponding to the logical transaction group name. The TC cluster's service name is then constructed based on the cluster name. The RM client, TM client, and TC interact through RPC by using the registry center (currently supporting Nacos, Redis, ZooKeeper, and Eureka) and the service name to find available TC service nodes.

Microservice Provider

Since the logic of the consumer is a bit more complex, let's first analyze the logic of the provider. For Spring Cloud projects, the default RPC transport protocol is HTTP, so the HandlerInterceptor mechanism is used to intercept HTTP requests.

HandlerInterceptor is an interface provided by Spring, and it has three methods that can be overridden.

    /**
* Intercept the execution of a handler. Called after HandlerMapping determined
* an appropriate handler object, but before HandlerAdapter invokes the handler.
*/
default boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler)
throws Exception {

return true;
}

/**
* Intercept the execution of a handler. Called after HandlerAdapter actually
* invoked the handler, but before the DispatcherServlet renders the view.
* Can expose additional model objects to the view via the given ModelAndView.
*/
default void postHandle(HttpServletRequest request, HttpServletResponse response, Object handler,
@Nullable ModelAndView modelAndView) throws Exception {
}

/**
* Callback after completion of request processing, that is, after rendering
* the view. Will be called on any outcome of handler execution, thus allows
* for proper resource cleanup.
*/
default void afterCompletion(HttpServletRequest request, HttpServletResponse response, Object handler,
@Nullable Exception ex) throws Exception {
}

According to the comments, we can clearly see the timing and common use cases of each method. For Fescar integration, it overrides the preHandle and afterCompletion methods as needed.

The purpose of FescarHandlerInterceptor is to bind the XID passed from the service chain to the transaction context of the service node and clean up related resources after the request is completed. FescarHandlerInterceptorConfiguration is responsible for configuring the interception of all URLs. This interceptor will be executed for all incoming requests to perform XID conversion and transaction binding.

/**
* @author xiaojing
*
* Fescar HandlerInterceptor, Convert Fescar information into
* @see com.alibaba.fescar.core.context.RootContext from http request's header in
* {@link org.springframework.web.servlet.HandlerInterceptor#preHandle(HttpServletRequest , HttpServletResponse , Object )},
* And clean up Fescar information after servlet method invocation in
* {@link org.springframework.web.servlet.HandlerInterceptor#afterCompletion(HttpServletRequest, HttpServletResponse, Object, Exception)}
*/
public class FescarHandlerInterceptor implements HandlerInterceptor {

private static final Logger log = LoggerFactory
.getLogger(FescarHandlerInterceptor.class);

@Override
public boolean preHandle(HttpServletRequest request, HttpServletResponse response,
Object handler) throws Exception {

String xid = RootContext.getXID();
String rpcXid = request.getHeader(RootContext.KEY_XID);
if (log.isDebugEnabled()) {
log.debug("xid in RootContext {} xid in RpcContext {}", xid, rpcXid);
}

if (xid == null && rpcXid != null) {
RootContext.bind(rpcXid);
if (log.isDebugEnabled()) {
log.debug("bind {} to RootContext", rpcXid);
}
}
return true;
}

@Override
public void afterCompletion(HttpServletRequest request, HttpServletResponse response,
Object handler, Exception e) throws Exception {

String rpcXid = request.getHeader(RootContext.KEY_XID);

if (StringUtils.isEmpty(rpcXid)) {
return;
}

String unbindXid = RootContext.unbind();
if (log.isDebugEnabled()) {
log.debug("unbind {} from RootContext", unbindXid);
}
if (!rpcXid.equalsIgnoreCase(unbindXid)) {
log.warn("xid in change during RPC from {} to {}", rpcXid, unbindXid);
if (unbindXid != null) {
RootContext.bind(unbindXid);
log.warn("bind {} back to RootContext", unbindXid);
}
}
}

}

The preHandle method is called before the request is executed. The xid parameter represents the unique identifier of the global transaction already bound to the current transaction context, while rpcXid represents the global transaction identifier that needs to be bound to the request and is passed through the HTTP header. In the preHandle method, it checks if there is no XID in the current transaction context and if rpcXid is not empty. If so, it binds rpcXid to the current transaction context.

The afterCompletion method is called after the request is completed and is used to perform resource cleanup actions. Fescar uses the RootContext.unbind() method to unbind the XID involved in the transaction context. The logic in the if statement is for code robustness. If rpcXid and unbindXid are not equal, it rebinds unbindXid.

For Spring Cloud, the default RPC method is HTTP. Therefore, for the provider, there is no need to differentiate the request interception method. It only needs to extract the XID from the header and bind it to its own transaction context. However, for the consumer, due to the variety of request components, including circuit breakers and isolation mechanisms, different situations need to be distinguished and handled. We will analyze this in more detail later.

Microservice Consumer

Fescar categorizes the request methods into RestTemplate, Feign, Feign+Hystrix, and Feign+Sentinel. Different components are automatically configured through Spring Boot's Auto Configuration. The specific configuration classes can be found in the spring.factories file, and we will also discuss the relevant configuration classes later in this document.

RestTemplate

Let's take a look at how Fescar passes XID if the consumer is using RestTemplate for requests.

public class FescarRestTemplateInterceptor implements ClientHttpRequestInterceptor {
@Override
public ClientHttpResponse intercept(HttpRequest httpRequest, byte[] bytes,
ClientHttpRequestExecution clientHttpRequestExecution) throws IOException {
HttpRequestWrapper requestWrapper = new HttpRequestWrapper(httpRequest);

String xid = RootContext.getXID();

if (!StringUtils.isEmpty(xid)) {
requestWrapper.getHeaders().add(RootContext.KEY_XID, xid);
}
return clientHttpRequestExecution.execute(requestWrapper, bytes);
}
}

The FescarRestTemplateInterceptor implements the intercept method of the ClientHttpRequestInterceptor interface. It wraps the outgoing request and, if there is an existing Fescar transaction context XID, retrieves it and adds it to the HTTP headers of the request.

FescarRestTemplateInterceptor is configured in RestTemplate through FescarRestTemplateAutoConfiguration.

@Configuration
public class FescarRestTemplateAutoConfiguration {

@Bean
public FescarRestTemplateInterceptor fescarRestTemplateInterceptor() {
return new FescarRestTemplateInterceptor();
}

@Autowired(required = false)
private Collection<RestTemplate> restTemplates;

@Autowired
private FescarRestTemplateInterceptor fescarRestTemplateInterceptor;

@PostConstruct
public void init() {
if (this.restTemplates != null) {
for (RestTemplate restTemplate : restTemplates) {
List<ClientHttpRequestInterceptor> interceptors = new ArrayList<ClientHttpRequestInterceptor>(
restTemplate.getInterceptors());
interceptors.add(this.fescarRestTemplateInterceptor);
restTemplate.setInterceptors(interceptors);
}
}
}

}

The init method iterates through all the RestTemplate instances, retrieves the original interceptors from each RestTemplate, adds the fescarRestTemplateInterceptor, and then reorders the interceptors.

Feign

Feign 类关系图

Next, let's take a look at the code related to Feign. There are quite a few classes in this package, so let's start with its AutoConfiguration.

@Configuration
@ConditionalOnClass(Client.class)
@AutoConfigureBefore(FeignAutoConfiguration.class)
public class FescarFeignClientAutoConfiguration {

@Bean
@Scope("prototype")
@ConditionalOnClass(name = "com.netflix.hystrix.HystrixCommand")
@ConditionalOnProperty(name = "feign.hystrix.enabled", havingValue = "true")
Feign.Builder feignHystrixBuilder(BeanFactory beanFactory) {
return FescarHystrixFeignBuilder.builder(beanFactory);
}

@Bean
@Scope("prototype")
@ConditionalOnClass(name = "com.alibaba.csp.sentinel.SphU")
@ConditionalOnProperty(name = "feign.sentinel.enabled", havingValue = "true")
Feign.Builder feignSentinelBuilder(BeanFactory beanFactory) {
return FescarSentinelFeignBuilder.builder(beanFactory);
}

@Bean
@ConditionalOnMissingBean
@Scope("prototype")
Feign.Builder feignBuilder(BeanFactory beanFactory) {
return FescarFeignBuilder.builder(beanFactory);
}

@Configuration
protected static class FeignBeanPostProcessorConfiguration {

@Bean
FescarBeanPostProcessor fescarBeanPostProcessor(
FescarFeignObjectWrapper fescarFeignObjectWrapper) {
return new FescarBeanPostProcessor(fescarFeignObjectWrapper);
}

@Bean
FescarContextBeanPostProcessor fescarContextBeanPostProcessor(
BeanFactory beanFactory) {
return new FescarContextBeanPostProcessor(beanFactory);
}

@Bean
FescarFeignObjectWrapper fescarFeignObjectWrapper(BeanFactory beanFactory) {
return new FescarFeignObjectWrapper(beanFactory);
}
}

}

The FescarFeignClientAutoConfiguration is enabled when the Client.class exists and requires it to be applied before FeignAutoConfiguration. Since FeignClientsConfiguration is responsible for generating the FeignContext and is enabled by FeignAutoConfiguration, based on the dependency relationship, FescarFeignClientAutoConfiguration is also applied before FeignClientsConfiguration.

FescarFeignClientAutoConfiguration customizes the Feign.Builder and adapts it for feign.sentinel, feign.hystrix, and regular feign cases. The purpose is to customize the actual implementation of the Client in Feign to be FescarFeignClient.

HystrixFeign.builder().retryer(Retryer.NEVER_RETRY)
.client(new FescarFeignClient(beanFactory))
SentinelFeign.builder().retryer(Retryer.NEVER_RETRY)
.client(new FescarFeignClient(beanFactory));
Feign.builder().client(new FescarFeignClient(beanFactory));

FescarFeignClient is an enhancement of the original Feign client proxy.

public class FescarFeignClient implements Client {

private final Client delegate;
private final BeanFactory beanFactory;

FescarFeignClient(BeanFactory beanFactory) {
this.beanFactory = beanFactory;
this.delegate = new Client.Default(null, null);
}

FescarFeignClient(BeanFactory beanFactory, Client delegate) {
this.delegate = delegate;
this.beanFactory = beanFactory;
}

@Override
public Response execute(Request request, Request.Options options) throws IOException {

Request modifiedRequest = getModifyRequest(request);

try {
return this.delegate.execute(modifiedRequest, options);
}
finally {

}
}

private Request getModifyRequest(Request request) {

String xid = RootContext.getXID();

if (StringUtils.isEmpty(xid)) {
return request;
}

Map<String, Collection<String>> headers = new HashMap<>();
headers.putAll(request.headers());

List<String> fescarXid = new ArrayList<>();
fescarXid.add(xid);
headers.put(RootContext.KEY_XID, fescarXid);

return Request.create(request.method(), request.url(), headers, request.body(),
request.charset());
}

In the above process, we can see that FescarFeignClient modifies the original Request. It first retrieves the XID from the current transaction context and, if the XID is not empty, adds it to the request's header.

FeignBeanPostProcessorConfiguration defines three beans: FescarContextBeanPostProcessor, FescarBeanPostProcessor, and FescarFeignObjectWrapper. FescarContextBeanPostProcessor and FescarBeanPostProcessor both implement the Spring BeanPostProcessor interface.

Here is the implementation of FescarContextBeanPostProcessor

    @Override
public Object postProcessBeforeInitialization(Object bean, String beanName)
throws BeansException {
if (bean instanceof FeignContext && !(bean instanceof FescarFeignContext)) {
return new FescarFeignContext(getFescarFeignObjectWrapper(),
(FeignContext) bean);
}
return bean;
}

@Override
public Object postProcessAfterInitialization(Object bean, String beanName)
throws BeansException {
return bean;
}

The two methods in BeanPostProcessor allow for pre- and post-processing of beans in the Spring container. The postProcessBeforeInitialization method is called before initialization, while the postProcessAfterInitialization method is called after initialization. The return value of these methods can be the original bean instance or a wrapped instance using a wrapper.

FescarContextBeanPostProcessor wraps FeignContext into FescarFeignContext. FescarBeanPostProcessor wraps FeignClient into FescarLoadBalancerFeignClient and FescarFeignClient, depending on whether it inherits from LoadBalancerFeignClient.

In FeignAutoConfiguration, the FeignContext does not have any ConditionalOnXXX conditions. Therefore, Fescar uses a pre-processing approach to wrap FeignContext into FescarFeignContext.

    @Bean
public FeignContext feignContext() {
FeignContext context = new FeignContext();
context.setConfigurations(this.configurations);
return context;
}

For Feign Clients, the FeignClientFactoryBean retrieves an instance of FeignContext. For custom Feign Client objects configured by developers using the @Configuration annotation, they are configured into the builder, which causes the enhanced FescarFeignClient in FescarFeignBuilder to become ineffective. The key code in FeignClientFactoryBean is as follows

	/**
* @param <T> the target type of the Feign client
* @return a {@link Feign} client created with the specified data and the context information
*/
<T> T getTarget() {
FeignContext context = applicationContext.getBean(FeignContext.class);
Feign.Builder builder = feign(context);

if (!StringUtils.hasText(this.url)) {
if (!this.name.startsWith("http")) {
url = "http://" + this.name;
}
else {
url = this.name;
}
url += cleanPath();
return (T) loadBalance(builder, context, new HardCodedTarget<>(this.type,
this.name, url));
}
if (StringUtils.hasText(this.url) && !this.url.startsWith("http")) {
this.url = "http://" + this.url;
}
String url = this.url + cleanPath();
Client client = getOptional(context, Client.class);
if (client != null) {
if (client instanceof LoadBalancerFeignClient) {
// not load balancing because we have a url,
// but ribbon is on the classpath, so unwrap
client = ((LoadBalancerFeignClient)client).getDelegate();
}
builder.client(client);
}
Targeter targeter = get(context, Targeter.class);
return (T) targeter.target(this, builder, context, new HardCodedTarget<>(
this.type, this.name, url));
}

The above code determines whether to make a direct call to the specified URL or use load balancing based on whether the URL parameter is specified in the annotation. The targeter.target method creates the object through dynamic proxy. The general process is as follows: the parsed Feign methods are stored in a map, and then passed as a parameter to generate the InvocationHandler, which in turn generates the dynamic proxy object.

The presence of FescarContextBeanPostProcessor ensures that even if developers customize operations on FeignClient, the enhancement of global transactions required by Fescar can still be achieved.

As for FescarFeignObjectWrapper, let's focus on the Wrapper method:

	Object wrap(Object bean) {
if (bean instanceof Client && !(bean instanceof FescarFeignClient)) {
if (bean instanceof LoadBalancerFeignClient) {
LoadBalancerFeignClient client = ((LoadBalancerFeignClient) bean);
return new FescarLoadBalancerFeignClient(client.getDelegate(), factory(),
clientFactory(), this.beanFactory);
}
return new FescarFeignClient(this.beanFactory, (Client) bean);
}
return bean;
}

In the wrap method, if the bean is an instance of LoadBalancerFeignClient, it first retrieves the actual Client object that the LoadBalancerFeignClient proxies using the client.getDelegate() method. It then wraps the Client object into FescarFeignClient and generates a subclass of LoadBalancerFeignClient called FescarLoadBalancerFeignClient. If the bean is an instance of Client and not FescarFeignClient or LoadBalancerFeignClient, it is directly wrapped and transformed into FescarFeignClient.

The above process design is quite clever. It controls the order of configuration based on Spring Boot's Auto Configuration and customizes the Feign Builder bean to ensure that all Clients are enhanced with FescarFeignClient. It also wraps the beans in the Spring container using BeanPostProcessor, ensuring that all beans in the container are enhanced with FescarFeignClient, thus avoiding the replacement action in the getTarget method of FeignClientFactoryBean.

Hystrix Isolation

Now let's take a look at the Hystrix part. Why do we separate Hystrix and implement a separate strategy class in Fescar? Currently, the default implementation of the transaction context RootContext is based on ThreadLocal, which means the context is bound to the thread. Hystrix itself has two isolation modes: semaphore-based isolation and thread pool-based isolation. Hystrix officially recommends using thread pool isolation for better separation, which is the commonly used mode:

Thread or Semaphore
The default, and the recommended setting, is to run HystrixCommands using thread isolation (THREAD) and HystrixObservableCommands using semaphore isolation (SEMAPHORE).

Commands executed in threads have an extra layer of protection against latencies beyond what network timeouts can offer.

Generally the only time you should use semaphore isolation for HystrixCommands is when the call is so high volume (hundreds per second, per instance) that the overhead of separate threads is too high; this typically only applies to non-network calls.

You are correct that the service layer's business code and the thread that sends the request are not the same. Therefore, the ThreadLocal approach cannot pass the XID to the Hystrix thread and subsequently to the callee. To address this issue, Hystrix provides a mechanism for developers to customize the concurrency strategy. This can be done by extending the HystrixConcurrencyStrategy class and overriding the wrapCallable method:

public class FescarHystrixConcurrencyStrategy extends HystrixConcurrencyStrategy {

private HystrixConcurrencyStrategy delegate;

public FescarHystrixConcurrencyStrategy() {
this.delegate = HystrixPlugins.getInstance().getConcurrencyStrategy();
HystrixPlugins.reset();
HystrixPlugins.getInstance().registerConcurrencyStrategy(this);
}

@Override
public <K> Callable<K> wrapCallable(Callable<K> c) {
if (c instanceof FescarContextCallable) {
return c;
}

Callable<K> wrappedCallable;
if (this.delegate != null) {
wrappedCallable = this.delegate.wrapCallable(c);
}
else {
wrappedCallable = c;
}
if (wrappedCallable instanceof FescarContextCallable) {
return wrappedCallable;
}

return new FescarContextCallable<>(wrappedCallable);
}

private static class FescarContextCallable<K> implements Callable<K> {

private final Callable<K> actual;
private final String xid;

FescarContextCallable(Callable<K> actual) {
this.actual = actual;
this.xid = RootContext.getXID();
}

@Override
public K call() throws Exception {
try {
RootContext.bind(xid);
return actual.call();
}
finally {
RootContext.unbind();
}
}

}
}

Fescar also provides a FescarHystrixAutoConfiguration, which generates the FescarHystrixConcurrencyStrategy when HystrixCommand is present.

@Configuration
@ConditionalOnClass(HystrixCommand.class)
public class FescarHystrixAutoConfiguration {

@Bean
FescarHystrixConcurrencyStrategy fescarHystrixConcurrencyStrategy() {
return new FescarHystrixConcurrencyStrategy();
}

}

reference

author

kangshu.guo,Community nickname ywind, formerly employed at Huawei Terminal Cloud, currently a Java engineer at Sohu Intelligent Media Center. Mainly responsible for development related to Sohu accounts. Has a strong interest in distributed transactions, distributed systems, and microservices architecture. min.ji(qinming),Community nickname slievrly, Fescar project leader, core developer of Alibaba middleware TXC/GTS. Engaged in core research and development work in distributed middleware for a long time. Has extensive technical expertise in the field of distributed transactions.

· 6 min read

Many developers are already familiar with Fescar. However, Fescar has now transformed into Seata. If you're not aware of Seata, please check the following link.

SEATA GITHUB: [https://github.com/apache/incubator-seata]

We extend our sincere thanks and greetings to the Alibaba team for their contributions in bringing numerous open-source software to developers.

Today, I will share my insights on integrating Seata with Spring Cloud, aiming to help more developers avoid common pitfalls and streamline their setup process.

2. Project Overview

The setup process is as follows: client -> gateway -> service consumer -> service provider.

Technical Framework: spring cloud gateway
spring cloud fegin
nacos1.0.RC2
fescar-server0.4.1 (Seata)

For instructions on starting Nacos, please refer to: Nacos Startup Guide

Seata supports various service registration methods. In the fescar-server-0.4.1\conf directory, you will find:

file.conf
logback.xml
nacos-config.sh
nacos-config.text
registry.conf

There are a total of five files. Among them, file.conf and registry.conf are needed in the code segments for both service consumers and providers. Note: file.conf and registry.conf must be included in the current applications in use, i.e., both service consumer and provider applications must include them. If you are using a configuration center like Nacos or ZK, file.cnf can be ignored. However, if type="file" is specified, then file.cnf must be used.

Below is the configuration information in the registry.conf file. The registry section is for the service registration center configuration, and the config section is for the configuration center.

As shown below, Seata currently supports nacos, file, eureka, redis, zookeeper, etc., for registration and configuration. The default downloaded configuration type is file. The choice of method depends on your project’s actual requirements. Here, I chose nacos, but eureka can also be used. Both versions have been tested and work fine.

Note: If you are integrating with eureka, please use the latest official version.

3. Core Configuration

registry {
# file, nacos, eureka, redis, zk
type = "nacos"

nacos {
serverAddr = "localhost"
namespace = "public"
cluster = "default"
}
eureka {
serviceUrl = "http://localhost:1001/eureka"
application = "default"
weight = "1"
}
redis {
serverAddr = "localhost:6379"
db = "0"
}
zk {
cluster = "default"
serverAddr = "127.0.0.1:2181"
session.timeout = 6000
connect.timeout = 2000
}
file {
name = "file.conf"
}
}

config {
# file, nacos, apollo, zk
type = "nacos"

nacos {
serverAddr = "localhost"
namespace = "public"
cluster = "default"
}
apollo {
app.id = "fescar-server"
apollo.meta = "http://192.168.1.204:8801"
}
zk {
serverAddr = "127.0.0.1:2181"
session.timeout = 6000
connect.timeout = 2000
}
file {
name = "file.conf"
}
}

Note that nacos-config.sh is a script that needs to be executed if using Nacos as the configuration center. It initializes some default settings for Nacos.

Refer to the official guide for SEATA startup: Note that the official startup command separates parameters with spaces, so be careful. The IP is an optional parameter. Due to DNS resolution, sometimes when registering with Nacos, Fescar might obtain the address using the computer name, requiring you to specify the IP or configure the host to point to the IP. This issue has been fixed in the latest SEATA version.

sh fescar-server.sh 8091 /home/admin/fescar/data/ IP (optional)

As mentioned earlier, file.conf and registry.conf are needed in our code. The focus here is on file.conf. It is only loaded if registry is configured with file. If using ZK, Nacos, or other configuration centers, it can be ignored. However, service.localRgroup.grouplist and service.vgroupMapping need to be specified in the configuration center so that your client can automatically obtain the corresponding SEATA service and address from the configuration center upon startup. Failure to configure this will result in an error due to the inability to connect to the server. If using eureka, the config section should specify type="file". SEATA config currently does not support eureka.

transport {
# tcp, udt, unix-domain-socket
type = "TCP"
# NIO, NATIVE
server = "NIO"
# enable heartbeat
heartbeat = true
# thread factory for netty
thread-factory {
boss-thread-prefix = "NettyBoss"
worker-thread-prefix = "NettyServerNIOWorker"
server-executor-thread-prefix = "NettyServerBizHandler"
share-boss-worker = false
client-selector-thread-prefix = "NettyClientSelector"
client-selector-thread-size = 1
client-worker-thread-prefix = "NettyClientWorkerThread"
# netty boss thread size, will not be used for UDT
boss-thread-size = 1
# auto default pin or 8
worker-thread-size = 8
}
}
service {
# vgroup -> rgroup
vgroup_mapping.service-provider-fescar-service-group = "default"
# only support single node
localRgroup.grouplist = "127.0.0.1:8091"
# degrade current not support
enableDegrade = false
# disable
disable = false
}

client {
async.commit.buffer.limit = 10000
lock {
retry.internal = 10
retry.times = 30
}
}

4. Service Details

Two key points need attention:

grouplist IP: This is the IP and port of the current Fescar server.
vgroup_mapping configuration.

vgroup_mapping.service-provider-fescar-service-group: The service name here is actually the application name configured in the application.properties of your consumer or provider, e.g., spring.application.name=service-provider. In the source code, the application name is concatenated with fescar-service-group to form the key. Similarly, the value is the name of the current Fescar service. cluster = "default" / application = "default"

vgroup_mapping.service-provider-fescar-service-group = "default"
# only support single node
localRgroup.grouplist = "127.0.0.1:8091"

Both provider and consumer need to configure these two files.

If you use Nacos as the configuration center, you need to add the configuration in Nacos by adding the configuration manually.

5. Transaction Usage

In my code, the request is load-balanced through the gateway to the consumer. The consumer then requests the provider through Feign. The official example uses Feign, but here, the request is forwarded directly through the gateway, so the global transaction is handled in the controller layer, similar to the official demo.

@RestController
public class DemoController {
@Autowired
private DemoFeignClient demoFeignClient;

@Autowired
private DemoFeignClient2 demoFeignClient2;
@GlobalTransactional(timeoutMills = 300000, name = "spring-cloud-demo-tx")
@GetMapping("/getdemo")
public String demo() {

// Call service A and simply save
ResponseData<Integer> result = demoFeignClient.insertService("test", 1);
if(result.getStatus() == 400) {
System.out.println(result + "+++++++++++++++++++++++++++++++++++++++");
throw new RuntimeException("this is error1");
}

// Call service B and test rollback of service A upon error
ResponseData<Integer> result2 = demoFeignClient2.saveService();

if(result2.getStatus() == 400) {
System.out.println(result2 + "+++++++++++++++++++++++++++++++++++++++");
throw new RuntimeException("this is error2");
}

return "SUCCESS";
}
}

This concludes the core integration of transactions. Here, service A and B are both providers. When service B encounters an error, the global transaction rolls back. Each transaction can handle its local transactions independently.

SEATA uses a global XID to uniformly identify transactions. I will not list the database tables needed for SEATA here. For details, refer to: spring-cloud-fescar official DEMO

5.Data Proxy

Another important point is that in a distributed database service, each database needs an undo_log table to handle XID storage.

Additionally, each service project needs a database connection pool proxy. Currently, only Druid connection pool is supported. More will be supported in the future.

@Configuration
public class DatabaseConfiguration {

@Bean(destroyMethod = "close", initMethod = "init")
@ConfigurationProperties(prefix="spring.datasource")
public DruidDataSource druidDataSource() {
return new DruidDataSource();
}

@Bean
public DataSourceProxy dataSourceProxy(DruidDataSource druidDataSource) {
return new DataSourceProxy(druidDataSource);
}

@Bean
public SqlSessionFactory sqlSessionFactory(DataSourceProxy dataSourceProxy) throws Exception {
SqlSessionFactoryBean factoryBean = new SqlSessionFactoryBean();
factoryBean.setDataSource(dataSourceProxy);
return factoryBean.getObject();
}
}

Pay attention to the configuration file and data proxy. Without a data source proxy, undo_log will have no data, making XID management impossible.

Author: Da Fei

· 22 min read

Preface

In distributed systems, distributed transactions are a problem that must be solved. Currently, the most commonly used solution is eventual consistency. Since Alibaba open-sourced Fescar (renamed Seata in early April) earlier this year, the project has received great attention, currently nearing 8000 stars. Seata aims to solve the problem of distributed transactions in the microservices field with high performance and zero intrusion. It is currently undergoing rapid iteration, with a near-term goal of producing a production-ready MySQL version. For a comprehensive introduction to Seata, you can check the official WIKI for more detailed information.

This article is mainly based on the structure of spring cloud+spring jpa+spring cloud alibaba fescar+mysql+seata, building a distributed system demo, and analyzing and explaining its working process and principles from the perspective of the client (RM, TM) through seata's debug logs and source code. The code in the article is based on fescar-0.4.1. Since the project was just renamed to seata not long ago, some package names, class names, and jar package names are still named fescar, so the term fescar is still used in the following text. Sample project: https://github.com/fescar-group/fescar-samples/tree/master/springcloud-jpa-seata

  • XID: The unique identifier of a global transaction, composed of ip:port:sequence
  • Transaction Coordinator (TC): Maintains the running state of global transactions, responsible for coordinating and driving the commit or rollback of global transactions
  • Transaction Manager (TM): Controls the boundary of global transactions, responsible for starting a global transaction and finally initiating the decision of global commit or rollback
  • Resource Manager (RM): Controls branch transactions, responsible for branch registration, status reporting, and receiving instructions from the transaction coordinator to drive the commit and rollback of branch (local) transactions

Distributed Framework Support

Fescar uses XID to represent a distributed transaction. XID needs to be transmitted in the systems involved in a distributed transaction request, to send the processing status of branch transactions to the feacar-server, and to receive commit and rollback instructions from the feacar-server. Fescar officially supports all versions of the dubbo protocol, and the community also provides corresponding implementations for spring cloud (spring-boot) distributed projects.

<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-alibaba-fescar</artifactId>
<version>2.1.0.BUILD-SNAPSHOT</version>
</dependency>

This component implements the XID transmission function based on RestTemplate and Feign communication.

Business Logic

The business logic is the classic process of placing an order, deducting the balance, and reducing inventory. According to the module division, it is divided into three independent services, each connected to the corresponding database:

  • Order: order-server
  • Account: account-server
  • Inventory: storage-server

There is also a business system that initiates distributed transactions:

  • Business: business-server

The project structure is as shown in the figure below: Insert image description here

Normal Business

  1. The business initiates a purchase request
  2. Storage deducts inventory
  3. Order creates an order
  4. Account deducts balance

Abnormal Business

  1. The business initiates a purchase request
  2. Storage deducts inventory
  3. Order creates an order
  4. Account deduct balance exception

In the normal process, the data of steps 2, 3, and 4 is normally updated globally commit. In the abnormal process, the data is globally rolled back due to the exception error in step 4.

Configuration Files

The configuration entry file for fescar is registry.conf. Check the code ConfigurationFactory to find that currently, the configuration file name can only be registry.conf and cannot be specified otherwise.

private static final String REGISTRY_CONF = "registry.conf";
public static final Configuration FILE_INSTANCE = new FileConfiguration(REGISTRY_CONF);

In registry, you can specify the specific configuration form, the default is to use the file type. In file.conf, there are 3 parts of the configuration content:

  1. transport The transport part configuration corresponds to the NettyServerConfig class, used to define Netty-related parameters. TM and RM communicate with fescar-server using Netty.

  2. service

service {
#vgroup->rgroup
vgroup_mapping.my_test_tx_group = "default"
#Configure the address for the Client to connect to TC
default.grouplist = "127.0.0.1:8091"
#degrade current not support
enableDegrade = false
#disable
Whether to enable seata distributed transaction
disableGlobalTransaction = false
}
  1. client
client {
#RM receives the upper limit of buffer after TC's commit notification
async.commit.buffer.limit = 10000
lock {
retry.internal = 10
retry.times = 30
}
}

Data Source Proxy

In addition to the previous configuration files, fescar in AT mode has a bit of code volume, which is for the proxy of the data source, and currently can only be based on the proxy of DruidDataSource. Note: In the latest release of version 0.4.2, it has supported any data source type.

@Bean
@ConfigurationProperties(prefix = "spring.datasource")
public DruidDataSource druidDataSource() {
DruidDataSource druidDataSource = new DruidDataSource();
return druidDataSource;
}
@Primary
@Bean("dataSource")
public DataSourceProxy dataSource(DruidDataSource druidDataSource) {
return new DataSourceProxy(druidDataSource);
}

The purpose of using DataSourceProxy is to introduce ConnectionProxy. One aspect of fescar's non-intrusiveness is reflected in the implementation of ConnectionProxy. The cut-in point for branch transactions to join global transactions is at the local transaction's commit stage. This design ensures that business data and undo_log are in a local transaction. The undo_log table needs to be created in the business library, and fescar depends on this table to record the status of each branch transaction and the rollback data of the second stage. There is no need to worry about the table's data volume becoming a single point problem. In the case of a global transaction commit, the transaction's corresponding undo_log will be asynchronously deleted.

CREATE TABLE `undo_log` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`branch_id` bigint(20) NOT NULL,
`xid` varchar(100) NOT NULL,
`rollback_info` longblob NOT NULL,
`log_status` int(11) NOT NULL,
`log_created` datetime NOT NULL,
`log_modified` datetime NOT NULL,
`ext` varchar(100) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `ux_undo_log` (`xid`, `branch_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;

Start Server

Go to https://github.com/apache/incubator-seata/releases to download the fescar-server version corresponding to the Client version to avoid protocol inconsistencies caused by different versions. Enter the bin directory after decompression and execute

./fescar-server.sh 8091 ../data

If the startup is successful, the following output will be shown

2019-04-09 20:27:24.637 INFO [main] c.a.fescar.core.rpc.netty.AbstractRpcRemotingServer.start:152 -Server started ...

Start Client

The loading entry class of fescar is located in GlobalTransactionAutoConfiguration, which can be automatically loaded for spring boot-based projects, and of course, it can also be instantiated by other means.

@Configuration
@EnableConfigurationProperties({FescarProperties.class})
public class GlobalTransactionAutoConfiguration {
private final ApplicationContext applicationContext;
private final FescarProperties fescarProperties;
public GlobalTransactionAutoConfiguration(ApplicationContext applicationContext, FescarProperties fescarProperties) {


this.applicationContext = applicationContext;
this.fescarProperties = fescarProperties;
}
/**
* Instantiate GlobalTransactionScanner
* scanner is the initialization initiator for the client
*/
@Bean
public GlobalTransactionScanner globalTransactionScanner() {
String applicationName = this.applicationContext.getEnvironment().getProperty("spring.application.name");
String txServiceGroup = this.fescarProperties.getTxServiceGroup();
if (StringUtils.isEmpty(txServiceGroup)) {
txServiceGroup = applicationName + "-fescar-service-group";
this.fescarProperties.setTxServiceGroup(txServiceGroup);
}
return new GlobalTransactionScanner(applicationName, txServiceGroup);
}
}

You can see that it supports a configuration item FescarProperties, used to configure the transaction group name

spring.cloud.alibaba.fescar.tx-service-group=my_test_tx_group

If the service group is not specified, the default name generated is spring.application.name + "-fescar-service-group". Therefore, if spring.application.name is not specified, it will report an error when starting.

@ConfigurationProperties("spring.cloud.alibaba.fescar")
public class FescarProperties {
private String txServiceGroup;
public FescarProperties() {}
public String getTxServiceGroup() {
return this.txServiceGroup;
}
public void setTxServiceGroup(String txServiceGroup) {
this.txServiceGroup = txServiceGroup;
}
}

After obtaining the applicationId and txServiceGroup, create a GlobalTransactionScanner object, mainly seeing the initClient method in the class.

private void initClient() {
if (StringUtils.isNullOrEmpty(applicationId) || StringUtils.isNullOrEmpty(txServiceGroup)) {
throw new IllegalArgumentException("applicationId: " + applicationId + " txServiceGroup: " + txServiceGroup);
}
// init TM
TMClient.init(applicationId, txServiceGroup);
// init RM
RMClient.init(applicationId, txServiceGroup);
}

In the method, you can see that TMClient and RMClient are initialized. For a service, it can be both TM and RM roles. When it is TM or RM depends on whether the @GlobalTransactional annotation is marked in a global transaction. The result of the Client creation is a Netty connection with TC, so you can see two Netty Channels in the startup log, indicating that the transactionRole is TMROLE and RMROLE.

2019-04-09 13:42:57.417 INFO 93715 --- [imeoutChecker_1] c.a.f.c.rpc.netty.NettyPoolableFactory : NettyPool create channel to {"address":"127.0.0.1:8091", "message":{"applicationId":"business-service", "byteBuffer":{"char":"\u0000","direct":false,"double":0.0,"float":0.0,"int":0,"long":0,"readOnly":false,"short":0},"transactionServiceGroup":"my_test_tx_group","typeCode":101,"version":"0.4.1"},"transactionRole":"TMROLE"}
2019-04-09 13:42:57.505 INFO 93715 --- [imeoutChecker_1] c.a.f.c.rpc.netty.NettyPoolableFactory : NettyPool create channel to {"address":"127.0.0.1:8091", "message":{"applicationId":"business-service", "byteBuffer":{"char":"\u0000","direct":false,"double":0.0,"float":0.0,"int":0,"long":0,"readOnly":false,"short":0},"transactionServiceGroup":"my_test_tx_group","typeCode":103,"version":"0.4.1"},"transactionRole":"RMROLE"}
2019-04-09 13:42:57.629 DEBUG 93715 --- [lector_TMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler : Send:RegisterTMRequest{applicationId='business-service', transactionServiceGroup='my_test_tx_group'}
2019-04-09 13:42:57.629 DEBUG 93715 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler : Send:RegisterRMRequest{resourceIds='null', applicationId='business-service', transactionServiceGroup='my_test_tx_group'}
2019-04-09 13:42:57.699 DEBUG 93715 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler : Receive:version=0.4.1,extraData=null,identified=true,resultCode=null,msg=null,messageId:1
2019-04-09 13:42:57.699 DEBUG 93715 --- [lector_TMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler : Receive:version=0.4.1,extraData=null,identified=true,resultCode=null,msg=null,messageId:2
2019-04-09 13:42:57.701 DEBUG 93715 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.AbstractRpcRemoting : com.alibaba.fescar.core.rpc.netty.RmRpcClient@3b06d101 msgId:1 future :com.alibaba.fescar.core.protocol.MessageFuture@28bb1abd body:version=0.4.1,extraData=null,identified=true,resultCode=null,msg=null
2019-04-09 13:42:57.701 DEBUG 93715 --- [lector_TMROLE_1] c.a.f.c.rpc.netty.AbstractRpcRemoting : com.alibaba.fescar.core.rpc.netty.TmRpcClient@65fc3fb7 msgId:2 future :com.alibaba.fescar.core.protocol.MessageFuture@9a1e3df body:version=0.4.1,extraData=null,identified=true,resultCode=null,msg=null
2019-04-09 13:42:57.710 INFO 93715 --- [imeoutChecker_1] c.a.fescar.core.rpc.netty.RmRpcClient : register RM success. server version:0.4.1 channel:[id: 0xe6468995 L:/127.0.0.1:57397 - R:/127.0.0.1:8091]
2019-04-09 13:42:57.710 INFO 93715 --- [imeoutChecker_1] c.a.f.c.rpc.netty.NettyPoolableFactory : register success cost 114 ms version:0.4.1 role:TMROLE channel:[id: 0xd22fe0c5 L:/127.0.0.1:57398 - R:/127.0.0.1:8091]
2019-04-09 13:42:57.711 INFO 93715 --- [imeoutChecker_1] c.a.f.c.rpc.netty.NettyPoolableFactory : register success cost 125 ms version:0.4.1 role:RMROLE channel:[id: 0xe6468995 L:/127.0.0.1:57397 - R:/127.0.0.1:8091]

The log shows

  1. Create Netty connection
  2. Send registration request
  3. Receive response result
  4. RmRpcClient, TmRpcClient successfully instantiated

TM Process Flow

In this example, the TM role is business-service. The purchase method of BusinessService is marked with the @GlobalTransactional annotation.

@Service
public class BusinessService {
@Autowired
private StorageFeignClient storageFeignClient;
@Autowired
private OrderFeignClient orderFeignClient;
@GlobalTransactional
public void purchase(String userId, String commodityCode, int orderCount) {
storageFeignClient.deduct(commodityCode, orderCount);
orderFeignClient.create(userId, commodityCode, orderCount);
}
}

After the method is called, a global transaction will be created. First, pay attention to the function of the @GlobalTransactional annotation, which is intercepted and processed in the GlobalTransactionalInterceptor.

/**
* AOP intercepts method calls
*/
@Override
public Object invoke(final MethodInvocation methodInvocation) throws Throwable {
Class<?> targetClass = (methodInvocation.getThis() != null ? AopUtils.getTargetClass(methodInvocation.getThis()) : null);
Method specificMethod = ClassUtils.getMostSpecificMethod(methodInvocation.getMethod(), targetClass);
final Method method = BridgeMethodResolver.findBridgedMethod(specificMethod);
// Get the GlobalTransactional annotation of the method
final GlobalTransactional globalTransactionalAnnotation = getAnnotation(method, GlobalTransactional.class);
final GlobalLock globalLockAnnotation = getAnnotation(method, GlobalLock.class);
// If the method has the GlobalTransactional annotation, intercept the corresponding method processing
if (globalTransactionalAnnotation != null) {
return handleGlobalTransaction(methodInvocation, globalTransactionalAnnotation);
} else if (globalLockAnnotation != null) {
return handleGlobalLock(methodInvocation);
} else {
return methodInvocation.proceed();
}
}

The handleGlobalTransaction method calls the execute method of the [TransactionalTemplate](https://github.com/apache/incubator-seata/blob/develop/tm/src/main/java

/com/alibaba/fescar/tm/api/TransactionalTemplate.java). As the class name suggests, this is a standard template method that defines the standard steps for TM to handle global transactions. The comments are already quite clear.

public Object execute(TransactionalExecutor business) throws TransactionalExecutor.ExecutionException {
// 1. get or create a transaction
GlobalTransaction tx = GlobalTransactionContext.getCurrentOrCreate();
try {
// 2. begin transaction
try {
triggerBeforeBegin();
tx.begin(business.timeout(), business.name());
triggerAfterBegin();
} catch (TransactionException txe) {
throw new TransactionalExecutor.ExecutionException(tx, txe, TransactionalExecutor.Code.BeginFailure);
}
Object rs = null;
try {
// Do Your Business
rs = business.execute();
} catch (Throwable ex) {
// 3. any business exception rollback.
try {
triggerBeforeRollback();
tx.rollback();
triggerAfterRollback();
// 3.1 Successfully rolled back
throw new TransactionalExecutor.ExecutionException(tx, TransactionalExecutor.Code.RollbackDone, ex);
} catch (TransactionException txe) {
// 3.2 Failed to rollback
throw new TransactionalExecutor.ExecutionException(tx, txe, TransactionalExecutor.Code.RollbackFailure, ex);
}
}
// 4. everything is fine commit.
try {
triggerBeforeCommit();
tx.commit();
triggerAfterCommit();
} catch (TransactionException txe) {
// 4.1 Failed to commit
throw new TransactionalExecutor.ExecutionException(tx, txe, TransactionalExecutor.Code.CommitFailure);
}
return rs;
} finally {
// 5. clear
triggerAfterCompletion();
cleanUp();
}
}

The begin method of DefaultGlobalTransaction is called to start a global transaction.

public void begin(int timeout, String name) throws TransactionException {
if (role != GlobalTransactionRole.Launcher) {
check();
if (LOGGER.isDebugEnabled()) {
LOGGER.debug("Ignore Begin(): just involved in global transaction [" + xid + "]");
}
return;
}
if (xid != null) {
throw new IllegalStateException();
}
if (RootContext.getXID() != null) {
throw new IllegalStateException();
}
// Specific method to start the transaction, get the XID returned by TC
xid = transactionManager.begin(null, null, name, timeout);
status = GlobalStatus.Begin;
RootContext.bind(xid);
if (LOGGER.isDebugEnabled()) {
LOGGER.debug("Begin a NEW global transaction [" + xid + "]");
}
}

At the beginning of the method, if (role != GlobalTransactionRole.Launcher) checks the role to determine whether the current role is the initiator (Launcher) or the participant (Participant) of the global transaction. If the @GlobalTransactional annotation is also added to the downstream system methods in a distributed transaction, its role is Participant, and the subsequent begin will be ignored and directly returned. The determination of whether it is a Launcher or Participant is based on whether the current context already has an XID. The one without an XID is the Launcher, and the one with an XID is the Participant. Therefore, the creation of a global transaction can only be executed by the Launcher, and only one Launcher exists in a distributed transaction.

The DefaultTransactionManager is responsible for TM and TC communication, sending begin, commit, rollback instructions.

@Override
public String begin(String applicationId, String transactionServiceGroup, String name, int timeout) throws TransactionException {
GlobalBeginRequest request = new GlobalBeginRequest();
request.setTransactionName(name);
request.setTimeout(timeout);
GlobalBeginResponse response = (GlobalBeginResponse) syncCall(request);
return response.getXid();
}

At this point, the XID returned by fescar-server indicates that a global transaction has been successfully created. The log also reflects the above process.

2019-04-09 13:46:57.417 DEBUG 31326 --- [nio-8084-exec-1] c.a.f.c.rpc.netty.AbstractRpcRemoting : offer message: timeout=60000, transactionName=purchase(java.lang.String, java.lang.String, int)
2019-04-09 13:46:57.417 DEBUG 31326 --- [geSend_TMROLE_1] c.a.f.c.rpc.netty.AbstractRpcRemoting : write message:FescarMergeMessage, timeout=60000, transactionName=purchase(java.lang.String, java.lang.String, int) channel:[id: 0xa148545e L:/127.0.0.1:56120 - R:/127.0.0.1:8091] active?true, writable?true, isopen?true
2019-04-09 13:46:57.418 DEBUG 31326 --- [lector_TMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler : Send:FescarMergeMessage, timeout=60000, transactionName=purchase(java.lang.String, java.lang.String, int)
2019-04-09 13:46:57.421 DEBUG 31326 --- [lector_TMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler : Receive:MergeResultMessage, com.alibaba.fescar.core.protocol.transaction.GlobalBeginResponse@2dc480dc, messageId:1
2019-04-09 13:46:57.421 DEBUG 31326 --- [nio-8084-exec-1] c.a.fescar.core.context.RootContext : bind 192.168.224.93:8091:2008502699
2019-04-09 13:46:57.421 DEBUG 31326 --- [nio-8084-exec-1] c.a.f.tm.api.DefaultGlobalTransaction : Begin a NEW global transaction [192.168.224.93:8091:2008502699]

After the global transaction is created, business.execute() is executed, which is the business code storageFeignClient.deduct(commodityCode, orderCount) that enters the RM processing flow. The business logic here is to call the inventory service's deduct inventory interface.

RM Processing Flow

@GetMapping(path = "/deduct")
public Boolean deduct(String commodityCode, Integer count) {
storageService.deduct(commodityCode, count);
return true;
}
@Transactional
public void deduct(String commodityCode, int count) {
Storage storage = storageDAO.findByCommodityCode(commodityCode);
storage.setCount(storage.getCount() - count);
storageDAO.save(storage);
}

The interface and service method of storage do not appear to have fescar-related code and annotations, reflecting fescar's non-intrusiveness. How does it join this global transaction? The answer is in the ConnectionProxy, which is why the DataSourceProxy must be used. Through DataSourceProxy, fescar can register branch transactions and send RM's processing results to TC at the cut-in point of local transaction submission in business code. Since the local transaction submission of business code is implemented by the proxy of ConnectionProxy, the commit method of ConnectionProxy is actually executed during local transaction submission.

public void commit() throws SQLException {
// If it is a global transaction, perform global transaction submission
// Determine if it is a global transaction, just check if there is an XID in the current context
if (context.inGlobalTransaction()) {
processGlobalTransactionCommit();
} else if (context.isGlobalLockRequire()) {
processLocalCommitWithGlobalLocks();
} else {
targetConnection.commit();
}
}

private void processGlobalTransactionCommit() throws SQLException {
try {
// First register RM with TC and get the branchId assigned by TC
register();
} catch (TransactionException e) {
recognizeLockKeyConflictException(e);
}
try {
if (context.hasUndoLog()) {
// Write undo log
UndoLogManager.flushUndoLogs(this);
}
// Commit local transaction, write undo_log and business data in a local transaction
targetConnection.commit();
} catch (Throwable ex) {
// Notify TC that RM's transaction processing failed
report(false);
if (ex instanceof SQLException) {
throw new SQLException(ex);
}
}
// Notify TC that RM's transaction processing succeeded
report(true);
context.reset();
}

private void register() throws TransactionException {
// Register RM, build request to send registration instructions to TC via netty
Long branchId = DefaultResourceManager.get().branchRegister(BranchType.AT, getDataSourceProxy().getResourceId(), null, context.getXid(), null, context.buildLockKeys());
// Save the returned branchId in the context
context.setBranchId(branchId);
}

Verify the above process through logs.

2019-04-09 21:57:48.341 DEBUG 38933 --- [nio-8081-exec-1] o.s.c.a.f.web.FescarHandlerInterceptor : xid in Root

Context null xid in RpcContext 192.168.0.2:8091:2008546211
2019-04-09 21:57:48.341 DEBUG 38933 --- [nio-8081-exec-1] c.a.fescar.core.context.RootContext : bind 192.168.0.2:8091:2008546211
2019-04-09 21:57:48.341 DEBUG 38933 --- [nio-8081-exec-1] o.s.c.a.f.web.FescarHandlerInterceptor : bind 192.168.0.2:8091:2008546211 to RootContext
2019-04-09 21:57:48.386 INFO 38933 --- [nio-8081-exec-1] o.h.h.i.QueryTranslatorFactoryInitiator : HHH000397: Using ASTQueryTranslatorFactory
Hibernate: select storage0_.id as id1_0_, storage0_.commodity_code as commodit2_0_, storage0_.count as count3_0_ from storage_tbl storage0_ where storage0_.commodity_code=?
Hibernate: update storage_tbl set count=? where id=?
2019-04-09 21:57:48.673 INFO 38933 --- [nio-8081-exec-1] c.a.fescar.core.rpc.netty.RmRpcClient : will connect to 192.168.0.2:8091
2019-04-09 21:57:48.673 INFO 38933 --- [nio-8081-exec-1] c.a.fescar.core.rpc.netty.RmRpcClient : RM will register :jdbc:mysql://127.0.0.1:3306/db_storage?useSSL=false
2019-04-09 21:57:48.673 INFO 38933 --- [nio-8081-exec-1] c.a.f.c.rpc.netty.NettyPoolableFactory : NettyPool create channel to {"address":"192.168.0.2:8091", "message":{"applicationId":"storage-service", "byteBuffer":{"char":"\u0000","direct":false,"double":0.0,"float":0.0,"int":0,"long":0,"readOnly":false,"short":0},"resourceIds":"jdbc:mysql://127.0.0.1:3306/db_storage?useSSL=false","transactionServiceGroup":"hello-service-fescar-service-group","typeCode":103,"version":"0.4.0"},"transactionRole":"RMROLE"}
2019-04-09 21:57:48.677 DEBUG 38933 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler : Send:RegisterRMRequest{resourceIds='jdbc:mysql://127.0.0.1:3306/db_storage?useSSL=false', applicationId='storage-service', transactionServiceGroup='hello-service-fescar-service-group'}
2019-04-09 21:57:48.680 DEBUG 38933 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler : Receive:version=0.4.1,extraData=null,identified=true,resultCode=null,msg=null,messageId:9
2019-04-09 21:57:48.680 DEBUG 38933 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.AbstractRpcRemoting : com.alibaba.fescar.core.rpc.netty.RmRpcClient@7d61f5d4 msgId:9 future :com.alibaba.fescar.core.protocol.MessageFuture@186cd3e0 body:version=0.4.1,extraData=null,identified=true,resultCode=null,msg=null
2019-04-09 21:57:48.680 INFO 38933 --- [nio-8081-exec-1] c.a.fescar.core.rpc.netty.RmRpcClient : register RM success. server version:0.4.1 channel:[id: 0xd40718e3 L:/192.168.0.2:62607 - R:/192.168.0.2:8091]
2019-04-09 21:57:48.680 INFO 38933 --- [nio-8081-exec-1] c.a.f.c.rpc.netty.NettyPoolableFactory : register success cost 3 ms version:0.4.1 role:RMROLE channel:[id: 0xd40718e3 L:/192.168.0.2:62607 - R:/192.168.0.2:8091]
2019-04-09 21:57:48.680 DEBUG 38933 --- [nio-8081-exec-1] c.a.f.c.rpc.netty.AbstractRpcRemoting : offer message: transactionId=2008546211, branchType=AT, resourceId=jdbc:mysql://127.0.0.1:3306/db_storage?useSSL=false, lockKey=storage_tbl:1
2019-04-09 21:57:48.681 DEBUG 38933 --- [geSend_RMROLE_1] c.a.f.c.rpc.netty.AbstractRpcRemoting : write message:FescarMergeMessage, transactionId=2008546211, branchType=AT, resourceId=jdbc:mysql://127.0.0.1:3306/db_storage?useSSL=false, lockKey=storage_tbl:1 channel:[id: 0xd40718e3 L:/192.168.0.2:62607 - R:/192.168.0.2:8091] active?true, writable?true, isopen?true
2019-04-09 21:57:48.681 DEBUG 38933 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler : Send:FescarMergeMessage, transactionId=2008546211, branchType=AT, resourceId=jdbc:mysql://127.0.0.1:3306/db_storage?useSSL=false, lockKey=storage_tbl:1
2019-04-09 21:57:48.687 DEBUG 38933 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler : Receive:MergeResultMessage, BranchRegisterResponse: transactionId=2008546211, branchId=2008546212, result code=Success, getMsg=null, messageId:11
2019-04-09 21:57:48.702 DEBUG 38933 --- [nio-8081-exec-1] c.a.f.rm.datasource.undo.UndoLogManager : Flushing UNDO LOG: {"branchId":2008546212, "sqlUndoLogs":[{"afterImage":{"rows":[{"fields":[{"keyType":"PrimaryKey","name":"id","type":4,"value":1},{"keyType":"NULL","name":"count","type":4,"value":993}]}],"tableName":"storage_tbl"},"beforeImage":{"rows":[{"fields":[{"keyType":"PrimaryKey","name":"id","type":4,"value":1},{"keyType":"NULL","name":"count","type":4,"value":994}]}],"tableName":"storage_tbl"},"sqlType":"UPDATE","tableName":"storage_tbl"}],"xid":"192.168.0.2:8091:2008546211"}
2019-04-09 21:57:48.755 DEBUG 38933 --- [nio-8081-exec-1] c.a.f.c.rpc.netty.AbstractRpcRemoting : offer message: transactionId=2008546211, branchId=2008546212, resourceId=null, status=PhaseOne_Done, applicationData=null
2019-04-09 21:57:48.755 DEBUG 38933 --- [geSend_RMROLE_1] c.a.f.c.rpc.netty.AbstractRpcRemoting : write message:FescarMergeMessage, transactionId=2008546211, branchId=2008546212, resourceId=null, status=PhaseOne_Done, applicationData=null channel:[id: 0xd40718e3 L:/192.168.0.2:62607 - R:/192.168.0.2:8091] active?true, writable?true, isopen?true
2019-04-09 21:57:48.756 DEBUG 38933 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler : Send:FescarMergeMessage, transactionId=2008546211, branchId=2008546212, resourceId=null, status=PhaseOne_Done, applicationData=null
2019-04-09 21:57:48.758 DEBUG 38933 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler : Receive:MergeResultMessage, com.alibaba.fescar.core.protocol.transaction.BranchReportResponse@582a08cf, messageId:13
2019-04-09 21:57:48.799 DEBUG 38933 --- [nio-8081-exec-1] c.a.fescar.core.context.RootContext : unbind 192.168.0.2:8091:2008546211
2019-04-09 21:57:48.799 DEBUG 38933 --- [nio-8081-exec-1] o.s.c.a.f.web.FescarHandlerInterceptor : unbind 192.168.0.2:8091:2008546211 from RootContext
  1. Get the XID passed from business-service
  2. Bind X

ID to the current context 3. Execute business logic SQL 4. Create a Netty connection to TC 5. Send branch transaction information to TC 6. Get the branchId returned by TC 7. Record Undo Log data 8. Send the processing result of PhaseOne stage to TC 9. Unbind XID from the current context

Steps 1 and 9 are completed in the FescarHandlerInterceptor, which is not part of fescar, but implemented in spring-cloud-alibaba-fescar. It realizes the bind and unbind of xid to the current request context during feign and rest communication.

Here, RM completes the work of the PhaseOne stage, then look at the processing logic of the PhaseTwo stage.

Transaction Commit

After each branch transaction is completed, TC summarizes the reported results of each RM and sends commit or rollback instructions to each RM.

2019-04-09 21:57:49.813 DEBUG 38933 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler : Receive:xid=192.168.0.2:8091:2008546211, branchId=2008546212, branchType=AT, resourceId=jdbc:mysql://127.0.0.1:3306/db_storage?useSSL=false, applicationData=null, messageId:1
2019-04-09 21:57:49.813 DEBUG 38933 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.AbstractRpcRemoting : com.alibaba.fescar.core.rpc.netty.RmRpcClient@7d61f5d4 msgId:1 body:xid=192.168.0.2:8091:2008546211, branchId=2008546212, branchType=AT, resourceId=jdbc:mysql://127.0.0.1:3306/db_storage?useSSL=false, applicationData=null
2019-04-09 21:57:49.814 INFO 38933 --- [atch_RMROLE_1_8] c.a.f.core.rpc.netty.RmMessageListener : onMessage:xid=192.168.0.2:8091:2008546211, branchId=2008546212, branchType=AT, resourceId=jdbc:mysql://127.0.0.1:3306/db_storage?useSSL=false, applicationData=null
2019-04-09 21:57:49.816 INFO 38933 --- [atch_RMROLE_1_8] com.alibaba.fescar.rm.AbstractRMHandler : Branch committing: 192.168.0.2:8091:2008546211, 2008546212, jdbc:mysql://127.0.0.1:3306/db_storage?useSSL=false, null
2019-04-09 21:57:49.816 INFO 38933 --- [atch_RMROLE_1_8] com.alibaba.fescar.rm.AbstractRMHandler : Branch commit result: PhaseTwo_Committed
2019-04-09 21:57:49.817 INFO 38933 --- [atch_RMROLE_1_8] c.a.fescar.core.rpc.netty.RmRpcClient : RmRpcClient sendResponse branchStatus=PhaseTwo_Committed, result code=Success, getMsg=null
2019-04-09 21:57:49.817 DEBUG 38933 --- [atch_RMROLE_1_8] c.a.f.c.rpc.netty.AbstractRpcRemoting : send response:branchStatus=PhaseTwo_Committed, result code=Success, getMsg=null channel:[id: 0xd40718e3 L:/192.168.0.2:62607 - R:/192.168.0.2:8091]
2019-04-09 21:57:49.817 DEBUG 38933 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler : Send:branchStatus=PhaseTwo_Committed, result code=Success, getMsg=null

The log shows

  1. RM receives the commit notice of XID=192.168.0.2:8091:2008546211, branchId=2008546212
  2. Execute the commit action
  3. Send the commit result to TC, branchStatus is PhaseTwo_Committed

Specifically, see the execution process of the second stage commit in the doBranchCommit method of the AbstractRMHandler class.

/**
* Get the key parameters such as xid and branchId notified
* Then call the RM's branchCommit
*/
protected void doBranchCommit(BranchCommitRequest request, BranchCommitResponse response) throws TransactionException {
String xid = request.getXid();
long branchId = request.getBranchId();
String resourceId = request.getResourceId();
String applicationData = request.getApplicationData();
LOGGER.info("Branch committing: " + xid + " " + branchId + " " + resourceId + " " + applicationData);
BranchStatus status = getResourceManager().branchCommit(request.getBranchType(), xid, branchId, resourceId, applicationData);
response.setBranchStatus(status);
LOGGER.info("Branch commit result: " + status);
}

Eventually, the branchCommit request will be called to the branchCommit method of AsyncWorker. AsyncWorker's processing method is a key part of fescar's architecture. Since most transactions will be committed normally, they end in the PhaseOne stage. This way, locks can be released as quickly as possible. After receiving the commit instruction in the PhaseTwo stage, the asynchronous processing can be done. This excludes the time consumption of the PhaseTwo stage from a distributed transaction.

private static final List<Phase2Context> ASYNC_COMMIT_BUFFER = Collections.synchronizedList(new ArrayList<Phase2Context>());

/**
* Add the XID to be committed to the list
*/
@Override
public BranchStatus branchCommit(BranchType branchType, String xid, long branchId, String resourceId, String applicationData) throws TransactionException {
if (ASYNC_COMMIT_BUFFER.size() < ASYNC_COMMIT_BUFFER_LIMIT) {
ASYNC_COMMIT_BUFFER.add(new Phase2Context(branchType, xid, branchId, resourceId, applicationData));
} else {
LOGGER.warn("Async commit buffer is FULL. Rejected branch [" + branchId + "/" + xid + "] will be handled by housekeeping later.");
}
return BranchStatus.PhaseTwo_Committed;
}

/**
* Consume the XIDs in ASYNC_COMMIT_BUFFER through scheduled tasks
*/
public synchronized void init() {
LOGGER.info("Async Commit Buffer Limit: " + ASYNC_COMMIT_BUFFER_LIMIT);
timerExecutor = new ScheduledThreadPoolExecutor(1, new NamedThreadFactory("AsyncWorker", 1, true));
timerExecutor.scheduleAtFixedRate(new Runnable() {
@Override
public void run() {
try {
doBranchCommits();
} catch (Throwable e) {
LOGGER.info("Failed at async committing ... " + e.getMessage());
}
}
}, 10, 1000 * 1, TimeUnit.MILLISECONDS);
}

private void doBranchCommits() {
if (ASYNC_COMMIT_BUFFER.size() == 0) {
return;
}
Map<String, List<Phase2Context>> mappedContexts = new HashMap<>();
Iterator<Phase2Context> iterator = ASYNC_COMMIT_BUFFER.iterator();
// In a timed loop, take out all pending data in ASYNC_COMMIT_BUFFER
// Group the data to be committed with resourceId as the key. The resourceId is the database connection URL
// This can be seen in the previous log, the purpose is to cover the multiple data sources created by the application
while (iterator.hasNext()) {
Phase2Context commitContext = iterator.next();
List<Phase2Context> contextsGroupedByResourceId = mappedContexts.get(commitContext.resourceId);
if (contextsGroupedByResourceId == null) {
contextsGroupedByResourceId = new ArrayList<>();
mappedContexts.put(commitContext.resourceId, contextsGroupedByResourceId);
}
contextsGroupedByResourceId.add(commitContext);
iterator.remove();
}
for (Map.Entry<String, List<Phase2Context>> entry : mappedContexts.entrySet()) {
Connection conn = null;
try {
try {
// Get the data source and connection based on resourceId
DataSourceProxy dataSourceProxy = DataSourceManager.get().get(entry.getKey());
conn = dataSourceProxy.getPlainConnection();
} catch (SQLException sqle) {
LOGGER.warn("Failed to get connection for async committing on " + entry.getKey(), sqle);
continue;
}
List<Phase2Context> contextsGroupedByResourceId = entry.getValue();
for (Phase2Context commitContext : contextsGroupedByResourceId) {
try {
// Perform undolog processing, that is, delete the records corresponding to xid and branchId
UndoLogManager.deleteUndoLog(commitContext.xid, commitContext.branchId, conn);
} catch

(Exception ex) {
LOGGER.warn("Failed to delete undo log [" + commitContext.branchId + "/" + commitContext.xid + "]", ex);
}
}
} finally {
if (conn != null) {
try {
conn.close();
} catch (SQLException closeEx) {
LOGGER.warn("Failed to close JDBC resource while deleting undo_log ", closeEx);
}
}
}
}
}

So for the commit action, RM only needs to delete the undo_log corresponding to xid and branchId.

Transaction Rollback

There are two scenarios for triggering rollback:

  1. Branch transaction processing exception, that is the case of report(false) in the ConnectionProxy
  2. TM catches the exceptions thrown from the downstream system, that is the exception caught by the method marked with the @GlobalTransactional annotation in the initiated global transaction. In the previous template method execute of the TransactionalTemplate, the call to business.execute() was caught. After the catch, it will call rollback, and TM will notify TC that the corresponding XID needs to roll back the transaction.
public void rollback() throws TransactionException {
// Only the Launcher can initiate this rollback
if (role == GlobalTransactionRole.Participant) {
// Participant has no responsibility of committing
if (LOGGER.isDebugEnabled()) {
LOGGER.debug("Ignore Rollback(): just involved in global transaction [" + xid + "]");
}
return;
}
if (xid == null) {
throw new IllegalStateException();
}
status = transactionManager.rollback(xid);
if (RootContext.getXID() != null) {
if (xid.equals(RootContext.getXID())) {
RootContext.unbind();
}
}
}

TC summarizes and sends rollback instructions to the participants. RM receives the rollback notice in the doBranchRollback method of the AbstractRMHandler class.

protected void doBranchRollback(BranchRollbackRequest request, BranchRollbackResponse response) throws TransactionException {
String xid = request.getXid();
long branchId = request.getBranchId();
String resourceId = request.getResourceId();
String applicationData = request.getApplicationData();
LOGGER.info("Branch rolling back: " + xid + " " + branchId + " " + resourceId);
BranchStatus status = getResourceManager().branchRollback(request.getBranchType(), xid, branchId, resourceId, applicationData);
response.setBranchStatus(status);
LOGGER.info("Branch rollback result: " + status);
}

Then the rollback request is passed to the branchRollback method of the DataSourceManager class.

public BranchStatus branchRollback(BranchType branchType, String xid, long branchId, String resourceId, String applicationData) throws TransactionException {
// Get the corresponding data source based on resourceId
DataSourceProxy dataSourceProxy = get(resourceId);
if (dataSourceProxy == null) {
throw new ShouldNeverHappenException();
}
try {
UndoLogManager.undo(dataSourceProxy, xid, branchId);
} catch (TransactionException te) {
if (te.getCode() == TransactionExceptionCode.BranchRollbackFailed_Unretriable) {
return BranchStatus.PhaseTwo_RollbackFailed_Unretryable;
} else {
return BranchStatus.PhaseTwo_RollbackFailed_Retryable;
}
}
return BranchStatus.PhaseTwo_Rollbacked;
}

Ultimately, the undo method of the UndoLogManager is executed. Since it is pure JDBC operation, the code is relatively long and will not be posted here. You can view the source code on GitHub via the link. Briefly describe the undo process:

  1. Find the undo_log submitted in the PhaseOne stage based on xid and branchId.
  2. If found, generate the playback SQL based on the data recorded in the undo_log and execute it, that is, restore the data modified in the PhaseOne stage.
  3. After step 2 is completed, delete the corresponding undo_log data.
  4. If no corresponding undo_log is found in step 1, insert a new undo_log with the status GlobalFinished. The reason it is not found may be that the local transaction in the PhaseOne stage encountered an exception, resulting in no normal write-in. Because xid and branchId are unique indexes, the insertion in step 4 can prevent successful write-in after recovery in the PhaseOne stage. In this way, the business data will not be successfully submitted, achieving the effect of final rollback.

Summary

Combining distributed business scenarios with local, this article analyzes the main processing flow of the fescar client side and analyzes the main source code for the TM and RM roles, hoping to help you understand the working principles of fescar. With the rapid iteration of fescar and the planning of future Roadmaps, it is believed that fescar can become a benchmark solution for open-source distributed transactions in time.

· 21 min read

Not long ago, I wrote an analysis of the distributed transaction middleware Fescar. Shortly after, the Fescar team rebranded it as Seata (Simple Extensible Autonomous Transaction Architecture), whereas the previous Fescar's English full name was Fast & Easy Commit And Rollback. It can be seen that Fescar was more limited to Commit and Rollback based on its name, while the new brand name Seata aims to create a one-stop distributed transaction solution. With the name change, I am more confident about its future development.

Here, let's briefly recall the overall process model of Seata:

Seata Process Model

  • TM: Transaction initiator. Used to inform TC about the start, commit, and rollback of global transactions.
  • RM: Specific transaction resource. Each RM is registered as a branch transaction in TC.
  • TC: Transaction coordinator. Also known as Fescar-server, used to receive registration, commit, and rollback of transactions.

In previous articles, I provided a general introduction to the roles, and in this article, I will focus on the core role TC, which is the transaction coordinator.

2. Transaction Coordinator

Why has the emphasis been placed on TC as the core role? Because TC, like God, manages the RM and TM of countless beings in the cloud. If TC fails to function properly, even minor issues with RM and TM will lead to chaos. Therefore, to understand Seata, one must understand its TC.

So, what capabilities should an excellent transaction coordinator possess? I think it should have the following:

  • Correct coordination: It should be able to properly coordinate what RM and TM should do next, what to do if something goes wrong, and what to do if everything goes right.
  • High availability: The transaction coordinator is crucial in distributed transactions. If it cannot ensure high availability, it serves no purpose.
  • High performance: The performance of the transaction coordinator must be high. If there are performance bottlenecks, it will frequently encounter timeouts, leading to frequent rollbacks.
  • High scalability: This characteristic belongs to the code level. If it is an excellent framework, it needs to provide many customizable extensions for users, such as service registration/discovery, reading configuration, etc.

Next, I will explain how Seata achieves the above four points.

2.1 Seata-Server Design

Seata-Server Design

The overall module diagram of Seata-Server is shown above:

  • Coordinator Core: At the bottom is the core code of the transaction coordinator, mainly used to handle transaction coordination logic, such as whether to commit, rollback, etc.
  • Store: Storage module used to persist data to prevent data loss during restarts or crashes.
  • Discovery: Service registration/discovery module used to expose server addresses to clients.
  • Config: Used to store and retrieve server configurations.
  • Lock: Lock module used to provide global locking functionality to Seata.
  • RPC: Used for communication with other endpoints.
  • HA-Cluster: High availability cluster, currently not open-source, provides reliable high availability services to Seata, expected to be open-sourced in version 0.6.

2.2 Discovery

First, let's talk about the basic Discovery module, also known as the service registration/discovery module. After starting Seata-Sever, we need to expose our address to other users, which is the responsibility of this module.

Discovery Module

This module has a core interface RegistryService, as shown in the image above:

  • register: Used by the server to register the service.
  • unregister: Used by the server, typically called in JVM shutdown hooks.
  • subscribe: Used by clients to register event listeners to listen for address changes.
  • unsubscribe: Used by clients to cancel event listeners.
  • lookup: Used by clients to retrieve service address lists based on keys.
  • close: Used to close the Registry resource.

If you need to add your own service registration/discovery, just implement this interface. So far, with the continuous development and promotion in the community, there are already five service registration/discovery implementations, including redis, zk, nacos, eruka, and consul. Below is a brief introduction to the Nacos implementation:

2.2.1 register Interface:

Register Interface

Step 1: Validate the address. Step 2: Get the Naming instance of Nacos and register the address with the service name serverAddr (fixed service name) on the corresponding cluster group (configured in registry.conf).

The unregister interface is similar, and I won't go into detail here.

2.2.2 lookup Interface:

Lookup Interface

Step 1: Get the current cluster name. Step 2: Check if the service corresponding to the current cluster name has been subscribed. If yes, directly retrieve the subscribed data from the map.

Step 3: If not subscribed, actively query the service instance list once, then add subscription and store the data returned by subscription in the map. After that, retrieve the latest data directly from the map.

2.2.3 subscribe Interface:

Subscribe Interface

This interface is relatively simple, divided into two steps: Step 1: Add the cluster -> listener to be subscribed to the map. Since Nacos does not provide a single machine already subscribed list, it needs to be implemented by itself. Step 2: Subscribe using the Nacos API.

2.3 Config

The configuration module is also a relatively basic and simple module. We need to configure some common parameters such as the number of select and work threads for Netty, the maximum allowed session, etc. Of course, these parameters in Seata have their own default settings.

Similarly, Seata also provides an interface Configuration for customizing where we need to obtain configurations:

Config Interface

  • getInt/Long/Boolean/getConfig(): Retrieves the corresponding value based on the dataId. If the configuration cannot be read, an exception occurs, or a timeout occurs, it returns the default value specified in the parameters.
  • putConfig: Used to add configuration.
  • removeConfig: Deletes a configuration.
  • add/remove/get ConfigListener: Add/remove/get configuration listeners, usually used to listen for configuration changes.

Currently, there are four ways to obtain Config: File (file-based), Nacos, Apollo, and ZK (not recommended). In Seata, you first need to configure registry.conf to specify the config.type. Implementing Config is relatively simple, and I won't delve into it here.

2.4 Store

The implementation of the storage layer is crucial for Seata's performance and reliability. If the storage layer is not implemented well, data being processed by TC in distributed transactions may be lost in the event of a crash. Since distributed transactions cannot tolerate data loss, if the storage layer is implemented well but has significant performance issues, RM may experience frequent rollbacks, making it unable to cope with high-concurrency scenarios.

In Seata, file storage is provided as the default method for storing data. Here, we define the data to be stored as sessions. Sessions created by the TM are referred to as GlobalSessions, while those created by RMs are called BranchSessions. A GlobalSession can have multiple BranchSessions. Our objective is to store all these sessions.

the code of FileTransactionStoreManager:

The code snippet above can be broken down into the following steps:

  • Step 1: Generate a TransactionWriteFuture.
  • Step 2: Put this futureRequest into a LinkedBlockingQueue. Why do we need to put all the data into a queue? Well, in fact, we could also use locks for this purpose. In another Alibaba open-source project, RocketMQ, locks are used. Whether it's a queue or a lock, their purpose is to ensure single-threaded writing. But why is that necessary? Some might explain that it's to ensure sequential writing, which would improve speed. However, this understanding is incorrect. The write method of our FileChannel is thread-safe and already ensures sequential writing. Ensuring single-threaded writing is actually to make our write logic single-threaded. This is because there may be logic such as when a file is full or when records are written to specific positions. Of course, this logic could be actively locked, but to achieve simplicity and convenience, it's most appropriate to queue the entire write logic for processing.
  • Step 3: Call future.get to wait for the completion notification of our write logic.

Once we submit the data to the queue, the next step is to consume it. The code is as follows:

Write Data File

Here, a WriteDataFileRunnable() is submitted to our thread pool, and the run() method of this Runnable is as follows:

Store Run

It can be broken down into the following steps:

  • Step 1: Check if stopping is true. If so, return null.
  • Step 2: Get data from our queue.
  • Step 3: Check if the future has timed out. If so, set the result to false. At this point, our producer's get() method will unblock.
  • Step 4: Write our data to the file. At this point, the data is still in the pageCache layer and has not been flushed to the disk yet. If the write is successful, flush it based on certain conditions.
  • Step 5: When the number of writes reaches a certain threshold, or when the writing time exceeds a certain limit, the current file needs to be saved as a historical file, the old historical files need to be deleted, and a new file needs to be created. This step is to prevent unlimited growth of our files, which would waste disk resources.

In our writeDataFile method, we have the following code:

Write Data File

  • Step 1: First, get our ByteBuffer. If it exceeds the maximum loop BufferSize, create a new one directly; otherwise, use our cached Buffer. This step can greatly reduce garbage collection.
  • Step 2: Add the data to the ByteBuffer.
  • Step 3: Finally, write the ByteBuffer to our fileChannel. This will be retried three times. At this point, the data is still in the pageCache layer and is affected by two factors: the OS has its own flushing strategy, but this business program cannot control it. To prevent events such as crashes from causing a large amount of data loss, the business itself needs to control the flush. Below is the flush code:

Flush

Here, the flush condition is based on writing a certain number of data or exceeding a certain time. This also presents a small issue: in the event of a power failure, there may still be data in the pageCache that has not been flushed to disk, resulting in a small amount of data loss. Currently, synchronous mode is not supported, which means that each piece of data needs to be flushed, ensuring that each message is written to disk. However, this would greatly affect performance. Of course, there will be continuous evolution and support for this in the future.

Our store's core process mainly consists of the above methods, but there are also some other processes such as session reconstruction, which are relatively simple and readers can read them on their own.

2.5 Lock

As we know, the isolation level in databases is mainly implemented through locks. Similarly, in the distributed transaction framework Seata, achieving isolation levels also requires locks. Generally, there are four isolation levels in databases: Read Uncommitted, Read Committed, Repeatable Read, and Serializable. In Seata, it can ensure that the isolation level is Read Committed but provides means to achieve Read Committed isolation.

The Lock module is the core module of Seata for implementing isolation levels. In the Lock module, an interface is provided for managing our locks: Lock Manager

It has three methods:

  • acquireLock: Used to lock our BranchSession. Although a branch transaction Session is passed here, it is actually locking the resources of the branch transaction. Returns true upon successful locking.
  • isLockable: Queries whether the transaction ID, resource ID, and locked key are already locked.
  • cleanAllLocks: Clears all locks.

For locks, we can implement them locally or use Redis or MySQL to help us implement them. The official default provides local global lock implementation: Default Lock

In the local lock implementation, there are two constants to pay attention to:

  • BUCKET_PER_TABLE: Defines how many buckets each table has, aiming to reduce competition when locking the same table later.
  • LOCK_MAP: This map seems very complex from its definition, with many layers of Maps nested inside and outside. Here's a table to explain it specifically:
LayerKeyValue
1-LOCK_MAPresourceId (jdbcUrl)dbLockMap
2- dbLockMaptableName (table name)tableLockMap
3- tableLockMapPK.hashcode%Bucket (hashcode%bucket of the primary key value)bucketLockMap
4- bucketLockMapPKtransactionId

It can be seen that the actual locking occurs in the bucketLockMap. The specific locking method here is relatively simple and will not be detailed. The main process is to gradually find the bucketLockMap and then insert the current transactionId. If this primary key currently has a TransactionId, then it checks if it is itself; if not, the locking fails.

2.6 RPC

One of the key factors in ensuring Seata's high performance is the use of Netty as the RPC framework, with the default configuration of the thread model as shown in the diagram below:

Reactor

If the default basic configuration is adopted, there will be one Acceptor thread for handling client connections and a number of NIO-Threads equal to cpu*2. In these threads, heavy business operations are not performed. They only handle relatively fast tasks such as encoding and decoding, heartbeats, and TM registration. Time-consuming business operations are delegated to the business thread pool. By default, the business thread pool is configured with a minimum of 100 threads and a maximum of 500.

Seata currently allows for configuration of transport layer settings, as shown in the following diagram. Users can optimize Netty transport layer settings according to their needs, and the configuration takes effect when loaded for the first time.

Transport

It's worth mentioning Seata's heartbeat mechanism, which is implemented using Netty's IdleStateHandler, as shown below:

Idle State Handler

On the server side, there is no maximum idle time set for writing, and for reading, the maximum idle time is set to 15 seconds (the client's default write idle time is 5 seconds, sending ping messages). If it exceeds 15 seconds, the connection will be disconnected, and resources will be closed.

User Event Triggered

  • Step 1: Check if it is a read idle detection event.
  • Step 2: If it is, disconnect the connection and close the resources.

Additionally, Seata has implemented features such as memory pools, batch merging of small packets by the client for sending, and Netty connection pools (reducing the service unavailable time when creating connections), one of which is batch merging of small packets.

The client's message sending process does not directly send messages. Instead, it wraps the message into an RpcMessage through AbstractRpcRemoting#sendAsyncRequest and stores it in the basket, then wakes up the merge sending thread. The merge sending thread, through a while true loop, waits for a maximum of 1ms to retrieve messages from the basket and wraps them into merge messages for actual sending. If an exception occurs in the channel during this process, it will quickly fail and return the result through fail-fast. Before sending the merge message, it is marked in the map. Upon receiving the results, batch confirmation is performed (AbstractRpcRemotingClient#channelRead), and then dispatched to messageListener and handler for processing. Additionally, timerExecutor periodically checks for timeouts in sent messages, marking them as failed if they exceed the timeout. Specific details of the message protocol design will be provided in subsequent articles, so stay tuned.

Seata's Netty Client consists of TMClient and RMClient, distinguished by their transactional roles. Both inherit from AbstractRpcRemotingClient, which implements RemotingService (service start and stop), RegisterMsgListener (Netty connection pool connection creation callback), and ClientMessageSender (message sending), further inheriting from AbstractRpcRemoting (the top-level message sending and processing template for Client and Server).

The class diagram for RMClient is depicted below: RMClient Class Diagram

TMClient and RMClient interact with channel connections based on their respective poolConfig and NettyPoolableFactory, which implements KeyedPoolableObjectFactory<NettyPoolKey, Channel>. The channel connection pool locates each connection pool based on the role key+IP, and manages channels uniformly. During the sending process, TMClient and RMClient each use only one long-lived connection per IP. However, if a connection becomes unavailable, it is quickly retrieved from the connection pool, reducing service downtime.

2.7 HA-Cluster

Currently, the official HA-Cluster design has not been publicly disclosed. However, based on some hints from other middleware and the official channels, HA-Cluster could be designed as follows: HA-Cluster Design

The specific process is as follows:

Step 1: When clients publish information, they ensure that the same transaction with the same transaction ID is handled on the same master. This is achieved by horizontally scaling multiple masters to provide concurrent processing performance.

Step 2: On the server side, each master has multiple slaves. Data in the master is nearly real-time synchronized to the slaves, ensuring that when the master crashes, other slaves can take over.

However, all of the above is speculation, and the actual design and implementation will have to wait until version 0.5. Currently, there is a Go version of Seata-Server donated to Seata (still in progress), which implements replica consistency through Raft. However, other details are not clear.

2.8 Metrics

This module has not yet disclosed a specific implementation. However, it may provide a plugin interface for integrating with other third-party metrics. Recently, Apache SkyWalking has been discussing how to integrate with the Seata team.

3. Coordinator Core

We have covered many foundational modules of the Seata server. I believe you now have a general understanding of Seata's implementation. Next, I will explain how the transaction coordinator's specific logic is implemented, providing you with a deeper understanding of Seata's internal workings.

3.1 Startup Process

The startup method is defined in the Server class's main method, outlining our startup process:

step1: Create an RpcServer, which encapsulates network operations using Netty to implement the server.

step2: Parse the port number, local file address (for recovering incomplete transactions if the server crashes), and IP address (optional, useful for obtaining an external VIP registration service when crossing networks).

step3: Initialize SessionHolder, wherein the crucial step is to recover data from the dataDir folder and rebuild sessions.

step4: Create a Coordinator, the core logic of the transaction coordinator, and initialize it. The initialization process includes creating four scheduled tasks:

  • retryRollbacking: Retry rollback task, used to retry failed rollbacks, executed every 5ms.
  • retryCommitting: Retry commit task, used to retry failed commits, executed every 5ms.
  • asyncCommitting: Asynchronous commit task, used to perform asynchronous commits, executed every 10ms.
  • timeoutCheck: Timeout task check, used to detect timeout tasks and execute timeout logic, executed every 2ms.

step5: Initialize UUIDGenerator, a basic class used for generating various IDs (transactionId, branchId).

step6: Set the local IP and listening port in XID, initialize rpcServer, and wait for client connections.

The startup process is relatively straightforward. Next, I will describe how Seata handles common business logic in distributed transaction frameworks.

3.2 Begin - Start Global Transaction

The starting point of a distributed transaction is always to start a global transaction. Let's see how Seata implements global transactions:

Begin Global Transaction

step1: Create a GloabSession based on the application ID, transaction group, name, and timeout. As mentioned earlier, GloabSession and BranchSession represent different aspects of the transaction.

step2: Add a RootSessionManager to it for listening to some events. Currently, Seata has four types of listeners (it's worth noting that all session managers implement SessionLifecycleListener):

  • ROOT_SESSION_MANAGER: The largest, containing all sessions.
  • ASYNC_COMMITTING_SESSION_MANAGER: Manages sessions that need asynchronous commit.
  • RETRY_COMMITTING_SESSION_MANAGER: Manages sessions for retrying commit.
  • RETRY_ROLLBACKING_SESSION_MANAGER: Manages sessions for retrying rollback. Since this is the beginning of a transaction, other SessionManagers are not needed, so only add RootSessionManager.

step3: Start GloabSession, which changes the state to Begin, records the start time, and calls the onBegin method of RootSessionManager to save the session to the map and write it to the file.

step4: Finally, return the XID. This XID is composed of ip+port+transactionId, which is crucial. When the TM acquires it, it needs to pass this ID to RM. RM uses XID to determine which server to access.

3.3 BranchRegister - Register Branch Transaction

After the global transaction is initiated by TM, the branch transaction of our RM also needs to be registered on top of our global transaction. Here's how it's handled:

Branch Transaction Registration

step1: Retrieve and validate the global transaction's state based on the transactionId.

step2: Create a new branch transaction, which is our BranchSession.

step3: Lock the branch transaction globally. Here, the logic uses the lock module.

step4: Add the branchSession, mainly by adding it to the globalSession object and writing it to our file.

step5: Return the branchId, which is also important. We will need it later to roll back our transaction or update the status of our branch transaction.

After registering the branch transaction, it is necessary to report whether the local transaction execution of the branch transaction was successful or failed. Currently, the server simply records this information. The purpose of reporting is that even if this branch transaction fails, if the TM insists on committing the global transaction (catches exceptions without throwing), then when traversing to commit the branch transaction, this failed branch transaction does not need to be committed (users can choose to skip it).

3.4 GlobalCommit - Global Commit

When our branch transaction is completed, it's up to our TM - Transaction Manager to decide whether to commit or rollback. If it's a commit, then the following logic will be executed:

Global Commit

step1: First, find our globalSession. If it's null, it means it has already been committed, so perform idempotent operation and return success.

step2: Close our GloabSession to prevent new branches from coming in (rollback due to timeout in cross-service calls, provider continues execution).

step3: If the status is Begin, it means it hasn't been committed yet, so change its status to Committing, indicating that it's committing.

step4: Determine if it can be asynchronously committed. Currently, only AT mode can be asynchronously committed. In two-phase global commits, undolog is only deleted without strict order. Here, a timer task is used, and the client merges and deletes in batches after receiving it.

step5: If it's an asynchronous commit, directly put it into our ASYNC_COMMITTING_SESSION_MANAGER to let it asynchronously execute step6 in the background. If it's synchronous, then execute step6 directly.

step6: Traverse our BranchSessions for submission. If a branch transaction fails, determine whether to retry based on different conditions. This branchSession can be executed asynchronously, and if it fails, it can continue with the next one because it remains in the manager and won't be deleted until it succeeds. If it's a synchronous commit, it will be put into the retry queue for scheduled retries and will block and submit in sequence.

3.5 GlobalRollback - Global Rollback

If our TM decides to globally rollback, it will follow the logic below:

Global Rollback

This logic is basically the same as the commit process but in reverse. I won't delve into it here.

4. Conclusion

Finally, let's summarize how Seata solves the key points of distributed transactions:

  • Correct coordination: Through background scheduled tasks, various retries are performed correctly, and in the future, a monitoring platform may be introduced, possibly allowing manual rollback.
  • High availability: Ensured by HA-Cluster.
  • High performance: Sequential file writing, RPC implemented through Netty, and Seata can be horizontally scaled in the future to improve processing performance.
  • High extensibility: Provides places where users can freely implement, such as configuration, service discovery and registration, global lock, etc.

In conclusion, I hope you can understand the core design principles of Seata-Server from this article. Of course, you can also imagine how you would design a distributed transaction server if you were to implement one yourself.

Seata GitHub Repository: https://github.com/apache/incubator-seata

Article Authors:

Li Zhao, GitHub ID @CoffeeLatte007, author of the public account "咖啡拿铁", Seata community Committer, Java engineer at Yuanfudao, formerly employed at Meituan. Has a strong interest in distributed middleware and distributed systems. Ji Min (Qingming), GitHub ID @slievrly, Seata open source project leader, core R&D member of Alibaba middleware TXC/GTS, has long been engaged in core R&D work of distributed middleware, and has rich technical accumulation in the field of distributed transactions.

· 13 min read

Fescar 0.4.0 version released the TCC model, contributed by the ant gold service team, welcome to try, the end of the article also provides the project follow-up Roadmap, welcome to pay attention.

Preface: Application scenarios based on TCC model


1.png

The TCC distributed transaction model acts directly on the service layer. It is not coupled with the specific service framework, has nothing to do with the underlying RPC protocol, has nothing to do with the underlying storage media, can flexibly choose the locking granularity of the business resources, reduces the resource locking holding time, has good scalability, and can be said to be designed for independently deployed SOA services.

I. TCC model advantages

For TCC distributed transaction model, I think its application in business scenarios, there are two aspects of significance.

1.1 Distributed transaction across services

The splitting of services can also be thought of as horizontal scaling of resources, only in a different direction.

Horizontal extensions may go along two directions:

  1. functional scaling, where data is grouped according to function and different functional groups are distributed over multiple different databases, which is effectively servitisation under the SOA architecture.
  2. data sharding, which adds a new dimension to horizontal scaling by splitting data across multiple databases within functional groups.

The following figure briefly illustrates the horizontal data scaling strategy:

2.png

Therefore, one of the roles of TCC is to ensure the transaction property of multi-resource access when scaling resources horizontally by function.

1.2 Two-stage splitting

Another effect of TCC is that it splits the two phases into two separate phases that are related by means of resource business locking. The advantage of resource locking is that it does not block other transactions from continuing to use the same resources in the first phase, nor does it affect the correct execution of the second phase of the transaction.

The traditional model of concurrent transactions:
3.png

Concurrent transactions for the TCC model:
4.png

How does this benefit the business? Taking the secured transaction scenario of Alipay, the simplified case involves only two services, the transaction service and the billing service. The transaction service is the main business service, and the accounting service is the slave business service, which provides the Try, Commit, and Cancel interfaces:

  1. The Try interface deducts the user's available funds and transfers them to pre-frozen funds. Pre-frozen funds is the business locking programme, each transaction can only use the pre-frozen funds of this transaction in the second phase, and other concurrent transactions can continue to process the user's available funds after the first phase of execution.
  2. The Commit interface deducts the pre-frozen funds and increases the funds available in the intermediate account (secured transactions do not immediately credit the merchant and require an intermediate account for suspense).

Assuming there is only one intermediary account, every time the Commit interface of the payment service is called, it locks the intermediary account, and there are hotspot performance issues with the intermediary account. However, in the secured transaction scenario, the funds need to be transferred from the intermediate account to the merchant only after seven days, and the intermediate account does not need to be displayed to the public. Therefore, after executing the first stage of the payment service, it can be considered that the payment part of this transaction has been completed and return the result of successful payment to the user and the merchant, and does not need to execute the Commit interface of the second stage of the payment service right away, and wait until the low-frontal period, and then slowly digest it and execute it asynchronously.
5.png

This is the two-phase asynchronisation feature of TCC distributed transaction model, the first phase of execution from the business service is successful, the master business service can be committed to complete, and then the framework asynchronously execute the second phase of each slave business service.

General-purpose TCC solution

The generic TCC solution is the most typical implementation of the TCC distributed transaction model, where all the slave business services need to participate in the decision making of the master business service.
6.png

Applicable scenarios

Since the slave business services are invoked synchronously and their results affect the decisions of the master business service, the generic TCC distributed transaction solution is suitable for businesses with deterministic and short execution times, such as the three most core services of an Internet financial enterprise: transaction, payment, and accounting:
7.png

When a user initiates a transaction, the transaction service is accessed first to create the transaction order; then the transaction service calls the payment service to create the payment order for the transaction and performs the collection action, and finally, the payment service calls the billing service to record the account flow and bookkeeping.

In order to ensure that the three services work together to complete a transaction, either succeed or fail at the same time, you can use a general-purpose TCC solution that puts the three services in a distributed transaction, with the transaction as the master service, the payment as the slave service, and the billing as the nested slave service of the payment service, and the atomicity of the transaction is guaranteed by the TCC model.
8.png

The Try interface of the payment service creates the payment order, opens a nested distributed transaction, and calls the Try interface of the billing service; the billing service freezes the buyer's funds in the Try interface. After the first stage of the call is completed, the transaction is completed, the local transaction is submitted, and the TCC framework completes the second stage of the distributed transaction from the business service.

The second stage of the payment service first calls the Confirm interface of the accounting service to deduct the buyer's frozen funds and increase the seller's available funds. After the call is successful, the payment service modifies the payment order to the completed state and completes the payment.

When both payment and billing service phase 2 are finished, the whole distributed transaction is finished.

Asynchronous guaranteed TCC solution

The direct slave service of the asynchronous assured TCC solution is the reliable messaging service, while the real slave service is decoupled by the messaging service and executed asynchronously as the consumer of the messaging service.
9.png

The Reliable Messaging Service needs to provide three interfaces, Try, Confirm, and Cancel. The Try interface pre-sends, and is only responsible for persistently storing the message data; the Confirm interface confirms the sending, and this is when the actual delivery of the message begins The Confirm interface confirms the delivery, which is when the actual delivery of the message begins; and the Cancel interface cancels the delivery and deletes the message data.

The message data of the message service is stored independently and scaled independently, which reduces the coupling between the business service and the messaging system, and achieves the ultimate consistency of the distributed transaction under the premise that the message service is reliable.

This solution increases the maintenance cost of message service, but since message service implements TCC interface instead of slave business service, slave business service doesn't need any modification and the access cost is very low.

Application scenario

Since consuming messages from a business service is an asynchronous process, the execution time is uncertain, which may lead to an increase in the inconsistency time window. Therefore, the Asynchronous Ensured TCC Distributed Transaction Solution is only applicable to some passive businesses that are less sensitive to the final consistency time (the processing result of the slave business service does not affect the decision of the master business service, and only passively receives the decision result of the master business service). For example, the member registration service and the email sending service:
10.png
 
When a user registers for a membership successfully, an email needs to be sent to the user to tell the user that the registration was successful and to prompt the user to activate the membership. But pay attention to two points:

  1. If the user registration is successful, make sure to send an email to the user;
  2. if the user's registration fails, an email must not be sent to the user.

So again, this requires the membership service and the mail service to ensure atomicity, either both are executed or neither is executed. The difference is that the mail service is only a passive business, it does not affect whether the user can register successfully or not, it only needs to send an email to the user after the user has registered successfully, and the mail service does not need to be involved in the decision making of the activities of the membership service.

For this kind of business scenario, you can use the asynchronous ensured TCC distributed transaction solution, as follows:
11.png
 
 
The reliable messaging service decouples the member and mail services, and the member service and the messaging service comprise the TCC transaction model, which ensures the atomicity of transactions. Then through the reliable feature of the message service, it ensures that the message can definitely be consumed by the mail service, so that the member and the mail service are in the same distributed transaction. At the same time, the mail service will not affect the execution process of the member service, and will only passively receive the request to send mail after the member service is executed successfully.

Compensated TCC solution

Compensated TCC solution is similar in structure to generic TCC solution, and its slave services also need to participate in the decision making of the main business service. However, the difference is that the former slave service only needs to provide Do and Compensate two interfaces, while the latter needs to provide three interfaces.
12.png
 
The Do interface directly executes the real complete business logic, completes the business processing, and the result of the business execution is visible externally; the Compensate operation is used for the business compensation, which offsets or partially offsets the business result of the positive business operation. Compensate operation needs to satisfy idempotency.
Compensate operation is used to offset or partially offset the business results of positive business operations, and the Compensate operation needs to satisfy idempotency.
Compared with the general-purpose solution, Compensate solution does not need to transform the original business logic from the business service, and only needs to add an additional Compensate rollback logic, which is a lesser business transformation. However, it is important to note that the business executes the entire business logic in one phase and cannot achieve effective transaction isolation. When rollback is required, there may be a compensation failure, and additional exception handling mechanisms, such as manual intervention, are also required.

Applicable scenarios

Due to the existence of rollback compensation failure, the compensated TCC distributed transaction solution is only applicable to some of the less concurrent conflict or need to interact with external business, these external business is not a passive business, its execution results will affect the decision of the main business service, such as the ticket booking service of the air ticket agency:
13.png

This air ticket service provides multi-destination air ticket booking service, which can book air tickets for multiple itinerary flights at the same time, e.g., to travel from Beijing to St. Petersburg, it is necessary to take the first journey from Beijing to Moscow, as well as the second journey from Moscow to St. Petersburg.

When a user books a ticket, he/she would definitely want to book tickets for both flights at the same time, and booking only one flight does not make sense for the user. Therefore, such a business service also imposes the atomicity requirement that if the booking for one of the flights fails, the other flight needs to be able to be cancelled.

However, it is extremely difficult to push the airlines to change as they are external to the ticket agents and only provide booking and cancellation interfaces. Therefore, for this type of business service, a compensated TCC distributed transaction solution can be used, as follows:
14.png

The gateway service adds the Compensate interface on top of the original logic, which is responsible for calling the cancellation interface of the corresponding airline.

When the user initiates a ticket booking request, the ticket service first calls the booking interface of each airline through the Do interface of the gateway, and if all flights are booked successfully, the whole distributed transaction is executed successfully; once the booking of tickets for a certain flight fails, the distributed transaction is rolled back, and the Compensate interface of each gateway is called by the TCC transaction framework, which then calls the corresponding airline's The TCC transaction framework calls the Compensate compensation interface of each gateway, which then calls the corresponding airline's cancellation interface. In this way, the atomicity of multi-way ticket booking service can also be guaranteed.

V. Summary

For today's Internet applications, horizontal scaling of resources provides more flexibility and is a relatively easy to implement outward scaling solution, but at the same time, it also significantly increases the complexity and introduces some new challenges, such as data consistency issues between resources.

Horizontal data scaling can be done both by data slicing and by functionality. the TCC model ensures the transactional properties of multi-resource access while scaling resources horizontally by functionality.

TCC model in addition to the role of cross-service distributed transactions this layer , but also has a two-stage division of the function , through the business resource locking , allowing the second stage of asynchronous execution , and asynchronous idea is to solve the hot spot data concurrency performance problems of one of the tools .

Roadmap

Currently, we have released 0.4.0, and we will release 0.5 ~ 1.0 to continue to improve and enrich the functionality of AT and TCC modes, and to solve the problem of high availability of the server side. After 1.0, this open source product will reach the standard of production environment.


image1.png

· 7 min read

Fescar 0.4.0 version released the TCC schema, contributed by the Anthem team, you are welcome to try it out,
Sample address:[https://github.com/fescar-group/fescar-samples/tree/master/tcc](https. //github.com/fescar-group/fescar-samples/tree/master/tcc),
At the end of this article, we also provide the roadmap of the project, welcome to follow.

I. Introduction to TCC

In the Two Phase Commitment Protocol (2PC), the resource manager (RM, resource manager) needs to provide three functions: "prepare", "commit" and "rollback". "Rollback" 3 operations; while the transaction manager (TM, transaction manager) coordinates all resource managers in 2 phases, in the first phase asks all resource managers whether the "preparation" is successful, if all resources are If all resources are "ready" successfully, then perform "commit" operation of all resources in the second phase, otherwise perform "rollback" operation of all resources in the second phase to ensure that the final state of all resources is the same, either all commits or all commits, or the final state of all resources is the same. to ensure that the final state of all resources is the same, either all commit or all rollback.

Resource Manager has many implementations, among which TCC (Try-Confirm-Cancel) is a service-based implementation of Resource Manager; TCC is a relatively mature distributed transaction solution that can be used to solve the data consistency problem of cross-database and cross-service business operations; TCC's Try, Confirm and Cancel methods are implemented by business code. TCC's Try, Confirm, and Cancel methods are all implemented by business code, so TCC can be called a service-based resource manager.

The Try operation of TCC is the first stage, which is responsible for checking and reserving resources; Confirm operation is the second stage, which is the submit operation to execute the real business; Cancel is the second stage, which is the rollback operation, which is the cancellation of the reserved resources to return the resources to the initial state.

As shown in the figure below, after the user implements a TCC service, the TCC service will be one of the resources of the distributed transaction, participating in the whole distributed transaction; the transaction manager coordinates the TCC services in two stages, calling the Try method of all TCC services in the first stage, and executing the Confirm or Cancel method of all TCC services in the second stage; eventually all TCC services are either committed or cancelled; all TCC services are either committed or cancelled. services are either all committed or all rolled back.

image.png

II. TCC Design

When users access TCC, most of the work is focused on how to implement TCC service, after years of TCC application by Anthem, the following main TCC design and implementation of the main matters are summarised below:

1, Business operation is completed in two stages

Before connecting to TCC, business operation can be completed in one step only, but after connecting to TCC, we need to consider how to divide it into 2 phases to complete, put the resource checking and reserving in Try operation in the first phase, and put the execution of real business operation in Confirm operation in the second phase.

Below is an example of how the business model can be designed in two phases. Example scenario: "Account A has a balance of $100, of which $30 needs to be deducted";

Before accessing TCC, the user could write SQL: "update account table set balance = balance - 30 where account = A" to complete the deduction operation in one step.

After connecting to TCC, you need to consider how to split the debit operation into 2 steps:

  • Try operation: checking and reserving resources;

In the deduction scenario, what Try operation has to do is to check whether the balance of A account is enough, and then freeze the $30 to be deducted (reserved resources); no real deduction will happen at this stage.

  • Confirm operation: performs the submission of the real operation;

In the deduction scenario, the Confirm phase takes place when the real deduction occurs, deducting the $30 already frozen in A's account.

  • Cancel operation: whether or not the reserved resource is released;

In a debit scenario, the debit is cancelled and the Cancel operation performs the task of releasing the $30 that was frozen by the Try operation, returning Account A to its initial state.

image.png

2, Concurrency Control

Users should consider concurrency issues when implementing TCC and minimise lock granularity to maximise concurrency in distributed transactions.

The following is still an example of deducting money from account A. "There is $100 on account A. Transaction T1 has to deduct $30 of it, and transaction T2 also has to deduct $30, and there is concurrency".

In the first phase of the Try operation, distributed transaction T1 and distributed transaction T2 are freezing that part of the funds without interfering with each other; so that in the second phase of the distributed transaction, no matter whether T1 is a commit or a rollback, there will be no impact on T2, so that T1 and T2 are executing in parallel on the same piece of business data.

image.png

3, Allow empty rollback

As shown in the following figure, when the transaction coordinator invokes the first-phase Try operation of the TCC service, there may be a network timeout due to packet loss, and at this time the transaction manager triggers a two-phase rollback to invoke the Cancel operation of the TCC service, which is invoked without a timeout.

The TCC service receives a Cancel request without receiving a Try request, this scenario is called a null rollback; null rollbacks often occur in production environments, and users should allow for null rollbacks when implementing TCC services, i.e., return success when receiving a null rollback.

image.png

4. Anti-suspension control

As shown in the figure below, when the transaction coordinator calls the TCC service's one-phase Try operation, there may be a timeout due to network congestion, at this time, the transaction manager will trigger a two-phase rollback and call the TCC service's Cancel operation, and the Cancel call is not timed out; after this, the one-phase Try packet that is congested in the network is received by the TCC service, and there is a two-phase After this, the first-phase Try packet on the congested network is received by the TCC service, and the second-phase Cancel request is executed before the first-phase Try request, and the TCC service will never receive the second-phase Confirm or Cancel after executing the late Try, resulting in the suspension of the TCC service.

When you implement TCC service, you should allow empty rollback, but refuse to execute Try request after empty rollback to avoid hanging.

image.png

5. Idempotent control

Whether it is network packet retransmission or compensation execution of abnormal transaction, it will lead to the Try, Confirm or Cancel operation of TCC service to be executed repeatedly; users need to consider idempotent control when implementing TCC service, i.e., the business result of Try, Confirm, Cancel executed once and executed many times is the same.
image.png

Roadmap

Currently we have released version 0.4.0, we will release version 0.5 ~ 1.0, continue to improve and enrich the functions of AT, TCC mode, and solve the problem of high availability of the server side, after version 1.0, this open source product will reach the standard of production environment.

image1.png