How Apache Pulsar Helps Tencent Process Tens of Billions of Transactions Efficiently, by Ningguo Chen
- 1. How Apache Pulsar Helps Tencent Process Tens of Billions of Transactions Efficiently
- 2. Self-Introduction
• Ningguo Chen, Lead Architect.
• Joined Tencent in 2008. Lead Architect of the Tencent Midas Billing Platform. Led and participated in building Tencent QB E-shop and the Tencent Midas Standard, Enterprise, and Overseas Editions, helping Tencent Midas become an all-round, one-stop global billing platform.
• Specializes in billing and transaction systems. Holder of 20+ patents. Mainly focused on providing stable, efficient, and secure billing services for Tencent’s online and offline businesses.
- 5. Pain Points of the Billing Platform
• High transaction consistency: payments and goods delivery must stay consistent.
- Failures and timeouts: a single Midas payment often involves many internal and external systems. This leads to longer call chains and more exceptions, especially network timeouts (e.g. overseas payment services), DB operation failures, and goods delivery failures.
• High data reliability across regions
• Massive volumes of timed processing
- Subscriptions and recurring billing, timely account reconciliation
• High performance
• High scalability
- Scales elastically and automatically with business scale
- 6. Our Solution
• TDXA+TDMQ
TDXA: a framework for transaction control. It ensures high transaction consistency, high fault tolerance, and high scalability.
TDMQ: a distributed message queue based on Pulsar. It lets TDXA handle messaging with high consistency and availability, and converges the various exceptions through a state table.
[Figure: TDXA and TDMQ]
- 7. Deployment of Tencent billing system
[Figure: deployment across Shenzhen, Shanghai, and Hong Kong. Components shown: App, TDXA, TDSQL, and MQ, with external connections to WeChat Pay and a bank. Callouts 1 to 3 mark the points below.]
MQ's achievements:
1. Asynchronous data transmission between systems
2. System exception handling
3. High data consistency
- 8. Our Requirements for MQ
• High consistency and high availability across regions
• Massive storage, able to manage massive numbers of delayed messages and topics
• Highly concurrent consumption (the number of consumers is much larger than the number of queues)
• High scalability
• Support for message modification and more protocols, such as RESTful and AMQP
Pulsar perfectly supports the first 4 points. That’s why we chose Pulsar as our infrastructure.
- 9. Geo Replication
• Cross City: Shanghai, Shenzhen
• Cross Region/State: Shanghai (China), Toronto (Canada)
[Figure: brokers and bookies at each site]
2 replicas: up to 560k QPS
3 replicas: up to 360k QPS
3 geo replicas: up to 280k QPS
Compared to Kafka:
High throughput (async): up to 1 million QPS
2 replicas (sync): only 60k QPS
Benchmark hardware: CPU 24 cores, memory 48 GB, network 10G
- 10. Practice of DB replication across cities
• Based on database binlog, hash method
• Batch compression
• Conflict detection, ensuring correctness
Result:
Data replication at 100,000/s (10w/s), latency ~30 ms, from Shenzhen to Shanghai
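The three techniques on this slide can be illustrated with a minimal Python sketch. The slide does not describe the actual implementation, so everything here is an assumption made for illustration: the event shape (`key`, `base_version`, `value`), the channel count, the use of CRC32 as the hash, zlib as the compressor, and version comparison as the conflict check are all hypothetical.

```python
import json
import zlib
from collections import defaultdict

NUM_CHANNELS = 4  # hypothetical number of parallel replication channels


def channel_for(primary_key):
    """Hash the row key so all changes to one row share a channel and keep their order."""
    return zlib.crc32(primary_key.encode("utf-8")) % NUM_CHANNELS


def pack_batches(events):
    """Group binlog-like events per channel and compress each group as one batch."""
    grouped = defaultdict(list)
    for event in events:
        grouped[channel_for(event["key"])].append(event)
    return {
        ch: zlib.compress(json.dumps(evts).encode("utf-8"))
        for ch, evts in grouped.items()
    }


def apply_batch(replica, blob):
    """Decompress a batch and apply it; return events rejected by conflict detection."""
    conflicts = []
    for event in json.loads(zlib.decompress(blob).decode("utf-8")):
        current = replica.get(event["key"])
        current_version = current["version"] if current else 0
        # Conflict detection (assumed rule): a change must be based on the
        # version the replica currently holds, otherwise it is set aside.
        if event["base_version"] != current_version:
            conflicts.append(event)
            continue
        replica[event["key"]] = {
            "value": event["value"],
            "version": current_version + 1,
        }
    return conflicts


events = [
    {"key": "acct:1", "base_version": 0, "value": 100},
    {"key": "acct:2", "base_version": 0, "value": 50},
    {"key": "acct:1", "base_version": 1, "value": 80},
    {"key": "acct:2", "base_version": 5, "value": 10},  # divergent change
]

replica = {}
conflicts = []
for blob in pack_batches(events).values():
    conflicts.extend(apply_batch(replica, blob))
```

Hashing by row key keeps per-row ordering while letting different rows replicate in parallel, and compressing whole batches trades a little latency for less cross-city bandwidth.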
- 12. Real-time Interaction
• Question: how can MQ be used as an RPC?
• Answer: add a Request-Reply mode
[Figure: request-reply flow between Caller, Broker, and Callee. Labels: Produce, Consume, Msg, Rsp(+data), ACK, Produce(data), transmit, Session A, Session B, Leader, Read-only.]
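A request-reply exchange over a message queue is commonly built from a correlation ID plus a per-caller reply channel. The following is a minimal in-process sketch of that general pattern, not TDMQ's actual design: the standard-library `queue.Queue` objects stand in for broker topics, and the names `request_topic`, `reply_topics`, and `session-A` are hypothetical.

```python
import queue
import threading
import uuid

request_topic = queue.Queue()  # stands in for the topic the callee consumes
reply_topics = {}              # session id -> reply queue (hypothetical routing table)


def callee_loop():
    """Consume requests, do the work, and send Rsp(+data) back to the caller's session."""
    while True:
        msg = request_topic.get()
        if msg is None:  # shutdown signal
            break
        result = msg["payload"].upper()  # placeholder business logic
        reply_topics[msg["reply_to"]].put(
            {"correlation_id": msg["correlation_id"], "data": result}
        )


def call(session_id, payload, timeout=2.0):
    """Publish a request and block until the matching response arrives."""
    correlation_id = str(uuid.uuid4())
    request_topic.put(
        {"correlation_id": correlation_id, "reply_to": session_id, "payload": payload}
    )
    while True:
        # Raises queue.Empty if no response arrives within the timeout.
        rsp = reply_topics[session_id].get(timeout=timeout)
        if rsp["correlation_id"] == correlation_id:
            return rsp["data"]
        # Otherwise the response belongs to an earlier call; discard it.


reply_topics["session-A"] = queue.Queue()
worker = threading.Thread(target=callee_loop, daemon=True)
worker.start()
reply = call("session-A", "query balance")
request_topic.put(None)
worker.join()
```

The correlation ID lets a caller ignore late responses from calls that already timed out, which matters when the queue, unlike a direct RPC connection, can deliver a reply long after the caller gave up.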
- 13. Batch-Process
• Question: how to modify data when data errors occur?
• Answer: Offset Shadow + Priority Queue
[Figure: labels: produce, consume, Offset Shadow, "Data error, how to modify?", reproduce, Priority Queue, Correct Data]
Offset Shadow: prevents incorrect data from being consumed
Priority Queue: prevents revised data from being consumed too late
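One way to read the two mechanisms is sketched below in Python. The slide does not define their exact semantics, so this is an interpretation under stated assumptions: the offset shadow is modeled as a boundary that consumers cannot read past until a batch has been checked, and the priority queue as a heap of corrected records that consumers drain before the normal backlog. The class `ShadowedQueue` and all its method names are hypothetical.

```python
import heapq


class ShadowedQueue:
    def __init__(self):
        self.log = []            # append-only message log
        self.shadow_offset = 0   # consumers may only read offsets below this
        self.next_offset = 0     # next log offset to hand to a consumer
        self.rejected = set()    # offsets of records found to be incorrect
        self.corrections = []    # heap of (priority, seq, message)
        self._seq = 0            # tie-breaker so messages are never compared

    def produce(self, message):
        self.log.append(message)
        return len(self.log) - 1

    def release(self, upto):
        """Move the shadow forward once records below `upto` have been checked."""
        self.shadow_offset = max(self.shadow_offset, min(upto, len(self.log)))

    def reproduce(self, bad_offset, corrected, priority=0):
        """Reject a bad record and queue its corrected version for early delivery."""
        self.rejected.add(bad_offset)
        heapq.heappush(self.corrections, (priority, self._seq, corrected))
        self._seq += 1

    def poll(self):
        """Return corrected data first, then released log records, else None."""
        if self.corrections:
            return heapq.heappop(self.corrections)[2]
        while self.next_offset < self.shadow_offset:
            offset = self.next_offset
            self.next_offset += 1
            if offset not in self.rejected:
                return self.log[offset]
        return None


q = ShadowedQueue()
for amount in (10, 20, -999, 40):  # -999 stands for a bad record in the batch
    q.produce({"amount": amount})
q.release(2)                        # only the first two records are verified so far
first = [q.poll(), q.poll(), q.poll()]
q.reproduce(2, {"amount": 30})      # fix the bad record at offset 2
q.release(4)
rest = [q.poll(), q.poll(), q.poll()]
```

In this sketch the bad record at offset 2 is never delivered, and its corrected version is consumed ahead of the remaining backlog rather than waiting at the tail of the log.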