TGIPulsar - EP #006: Lifecycle of a Pulsar message
- 5. Brokers + Bookies
Bookie 0 Bookie 1 Bookie 2
The processes for storing
data are called bookies. They
persist data for Pulsar.
Broker 0 Broker 1 Broker 2
Brokers are “stateless”. They
serve clients for producing and
consuming events
- 6. ZooKeeper
ZooKeeper ZooKeeper ZooKeeper
ZooKeeper is used for storing the
metadata for Pulsar and
BookKeeper as well as for
discovering brokers and bookies.
- 7. Pulsar
Producer 0 Producer 1
Topic
Partition 0 Partition 1 Partition 2
Broker X Broker Y Broker Z
Subscription A
Consumer (P012)
- 8. Produce
Producer 0 Producer 1
Topic
Partition 0 Partition 1 Partition 2
Broker 0 Broker 1 Broker 2
Bookie 0 Bookie 1 Bookie 2
1. A message is created and a
partition is selected
2. The message is sent to the
owner broker that serves the
selected partition
3. The message is written to N bookies in
parallel by the owner broker. Each bookie writes
the message once and stores it in its entirety.
4. Once the message has been
written by 2 bookies, the broker
will acknowledge the message
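The four produce steps above can be sketched as a small simulation. This is an illustration only, not Pulsar client or broker code: the function names, the CRC32 key hashing, and the write quorum of 3 are assumptions made for the example. Only the ack quorum of 2 comes from the slide.

```python
import zlib

NUM_PARTITIONS = 3   # three partitions, as drawn on the slide
WRITE_QUORUM = 3     # the "N" bookies written per message (assumed value)
ACK_QUORUM = 2       # bookie confirmations needed before the broker acks


def select_partition(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Step 1: pick a partition from the message key (hash routing is assumed)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def write_to_bookies(payload: bytes, bookies: list, healthy: set) -> int:
    """Step 3: store the whole payload on every reachable bookie.

    The real writes happen in parallel; a loop is enough to count confirmations.
    """
    confirmations = 0
    for bookie_id, storage in enumerate(bookies[:WRITE_QUORUM]):
        if bookie_id in healthy:
            storage.append(payload)   # full copy, never a fragment
            confirmations += 1
    return confirmations


def produce(key: str, payload: bytes, bookies: list, healthy: set) -> dict:
    """Steps 1 to 4; for simplicity broker i is assumed to own partition i."""
    partition = select_partition(key)
    confirmations = write_to_bookies(payload, bookies, healthy)
    return {
        "partition": partition,
        "owner_broker": partition,                 # step 2
        "acked": confirmations >= ACK_QUORUM,      # step 4
    }
```

With two of three bookies reachable the message is still acknowledged. With only one reachable it is not, and a client would typically retry.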
- 9. Consume (Cached)
Topic
Partition 0 Partition 1 Partition 2
Broker 0 Broker 1 Broker 2
Bookie 0 Bookie 1 Bookie 2
Consumer (P012)
1. The consumer subscribes to a
topic. It connects to the owner
brokers serving the partitions.
2. The broker sends messages for
the partition from its in-memory
cache.
3. Consumer acknowledges a
message after processing it.
Broker updates cursor once it
receives acknowledgment.
- 10. Consume (BK)
Topic
Partition 0 Partition 1 Partition 2
Broker 0 Broker 1 Broker 2
Bookie 0 Bookie 1 Bookie 2
Consumer (P012)
1. The consumer subscribes to a
topic. It connects to the owner
brokers serving the partitions.
2. The broker does not have the data in
memory, so it reads from one of the
bookies that hold the data.
3. Consumer acknowledges a
message after processing it.
Broker updates cursor once it
receives acknowledgment.
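The two consume paths (cache hit and bookie read) and the cursor update can be sketched together. This is a toy model under stated assumptions: the `Broker` class and its methods are hypothetical, a single subscription is modeled, and acknowledgments are assumed to arrive in order.

```python
class Broker:
    """Toy owner broker for one partition with a single subscription."""

    def __init__(self, stored: dict, cached_ids: set):
        self.stored = dict(stored)                           # entries held by bookies
        self.cache = {i: stored[i] for i in cached_ids}      # recent entries in memory
        self.cursor = -1                                     # last acknowledged entry id
        self.bookie_reads = 0                                # how often the cache missed

    def dispatch(self) -> tuple:
        """Step 2: return the next entry after the cursor."""
        entry_id = self.cursor + 1
        if entry_id in self.cache:
            return entry_id, self.cache[entry_id]            # "Consume (Cached)"
        self.bookie_reads += 1
        return entry_id, self.stored[entry_id]               # "Consume (BK)"

    def acknowledge(self, entry_id: int) -> None:
        """Step 3: move the cursor once the consumer has processed the entry."""
        if entry_id == self.cursor + 1:                      # in-order acks only (assumed)
            self.cursor = entry_id
```

A consumer that has fallen behind is served from the bookies until it catches up with the entries still in the cache.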
- 11. Failures
Producer 0 Producer 1
Topic
Partition 0 Partition 1 Partition 2
Broker 0 Broker 1 Broker 2
Bookie 0 Bookie 1 Bookie 2
In-flight messages are
automatically retried by
Pulsar clients
Brokers are stateless. A
broker process that dies
doesn’t impact data storage.
Consumer (P012)
When a bookie dies, all the data
is still accessible and will be
re-replicated from the remaining replicas
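The bookie-failure case can be illustrated with a minimal sketch. The `re_replicate` function and the placement map are hypothetical, and picking the first spare bookie is a simplification rather than BookKeeper's actual placement policy.

```python
def re_replicate(placement: dict, dead_bookie: str, all_bookies: list) -> dict:
    """Return a new placement after one bookie dies.

    placement maps an entry id to the set of bookies holding a full copy.
    """
    repaired = {}
    for entry_id, holders in placement.items():
        survivors = holders - {dead_bookie}
        if dead_bookie in holders and survivors:
            # Copy the entry from a surviving replica to a bookie without a copy.
            spares = [b for b in all_bookies
                      if b != dead_bookie and b not in survivors]
            if spares:
                survivors = survivors | {spares[0]}
        repaired[entry_id] = survivors
    return repaired


def is_readable(holders: set, dead: set) -> bool:
    """An entry stays readable while at least one live bookie holds it."""
    return bool(holders - dead)
```

Each entry stays readable throughout because at least one replica survives, and the replica count returns to two once the copy completes.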
- 24. Message retention (5)
Msg 1 to Msg 8: Acked
Msg 9 to Msg 11: Unacked
Regions: Deleted, Retention, Yet to be processed
- 25. Message expiry (1)
Msg 5 to Msg 8: Acked
Msg 9 to Msg 11: Unacked
Msg 12 to Msg 15: Unacked
Regions: Deleted, Retention, Not within TTL (may still be processed), Within the applied TTL
- 26. Message expiry (2)
Msg 5 to Msg 12: Acked
Msg 13 to Msg 15: Unacked
Regions: Deleted, Retention, Not within TTL (may still be processed)
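The effect of TTL shown on the two expiry slides can be sketched as follows: an unacked message whose age exceeds the TTL is treated as acknowledged. The `apply_ttl` function, the message dictionaries, and the timestamps are assumptions made for illustration.

```python
def apply_ttl(messages: list, now: float, ttl_seconds: float) -> list:
    """Return copies of the messages with expired ones marked as acked.

    Each message is a dict with "id", "publish_time" (seconds) and "acked".
    """
    result = []
    for msg in messages:
        expired = (now - msg["publish_time"]) > ttl_seconds
        # Expiry never un-acks anything; it only acks on the consumer's behalf.
        result.append({**msg, "acked": msg["acked"] or expired})
    return result
```

Messages still within the TTL stay unacked and remain available to the consumer.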
- 27. Backlog (1)
Msg 5 to Msg 8: Acked
Msg 9 to Msg 15: Unacked
Regions: Deleted, Retention, Yet to be processed
- 28. Backlog (2)
Msg 5 to Msg 8: Acked
Msg 9 to Msg 15: Unacked
Regions: Deleted, Retention, Yet to be processed, Backlog
Subscriptions: SUB 0, SUB 2
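Backlog is tracked per subscription, so a rough sketch may help. The cursor positions below are hypothetical (the slide does not give them), and the function names are invented for the example.

```python
def backlog_per_subscription(last_msg: int, cursors: dict) -> dict:
    """Count the messages each subscription has not yet acknowledged.

    cursors maps a subscription name to the last message id it has acked.
    """
    return {sub: last_msg - acked_up_to for sub, acked_up_to in cursors.items()}


def first_message_in_backlog(cursors: dict) -> int:
    """The slowest subscription decides where the backlog starts."""
    return min(cursors.values()) + 1


# Hypothetical positions: SUB 0 has acked up to Msg 8, SUB 2 up to Msg 11.
example_cursors = {"SUB 0": 8, "SUB 2": 11}
example_backlog = backlog_per_subscription(15, example_cursors)
```

With these assumed positions, SUB 0 has 7 messages outstanding and SUB 2 has 4, so everything from Msg 9 onward must be kept for SUB 0.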
- 29. Message deletion (1)
Msg 5 to Msg 15: Acked
Regions: Deleted, Retention
- 30. Message deletion (2)
Msg 5 to Msg 15: Acked
Regions: Deleted, Retention
- 31. Message deletion (3)
Msg 5 to Msg 15: Acked
Regions: Deleted, Retention
Segments: S1, S2, S3, S4
- 32. Message deletion (4)
Msg 5 to Msg 15: Acked
Regions: Deleted, Retention
Segments: S1, S2, S3, S4
- 33. Message deletion (5)
Msg 5 to Msg 15: Acked
Regions: Deleted, Retention
Segments: S1, S2, S3, S4
- 35. Message deletion
✓ Messages are deleted segment by segment
✓ The disk space of a segment is reclaimed by a
garbage collector thread after it is deleted
✓ The garbage collector runs periodically
○ gcWaitTime
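Segment-by-segment deletion means a message can outlive its own retention if it shares a segment with messages that must be kept. A minimal sketch, assuming hypothetical segment boundaries for S1 to S4 and a simplified rule (segment closed, fully acked, and entirely outside the retention window):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    name: str
    first_msg: int
    last_msg: int
    closed: bool      # the segment currently being written is never deleted


def deletable_segments(segments: list, slowest_cursor: int,
                       retention_start: int) -> list:
    """Return the names of segments that can be dropped, oldest first.

    slowest_cursor: last message id acked by every subscription.
    retention_start: first message id that retention still protects.
    """
    names = []
    for seg in segments:
        fully_acked = seg.last_msg <= slowest_cursor
        outside_retention = seg.last_msg < retention_start
        if not (seg.closed and fully_acked and outside_retention):
            break                     # trimming stops at the first kept segment
        names.append(seg.name)
    return names


# Boundaries below are invented for the example; the slides only name S1..S4.
SEGMENTS = [
    Segment("S1", 5, 7, True),
    Segment("S2", 8, 10, True),
    Segment("S3", 11, 13, True),
    Segment("S4", 14, 15, False),
]
```

With `retention_start=9`, Msg 8 is already outside retention but stays on disk, because S2 also holds Msg 9 and Msg 10. The space of a deleted segment is only reclaimed later, when the bookie garbage collector runs.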
- 36. Retention settings
✓ Retention (broker / namespace)
○ defaultRetentionTimeInMinutes
○ defaultRetentionSizeInMB
✓ TTL (broker / namespace)
○ ttlDurationDefaultInSeconds
- 37. Trigger retention
✓ Ledger Rollover
○ managedLedgerMinLedgerRolloverTimeMinutes
○ managedLedgerMaxLedgerRolloverTimeMinutes
○ managedLedgerMaxEntriesPerLedger
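Retention can only remove closed ledgers, which is why rollover acts as the trigger. The sketch below shows how the three settings are commonly described as interacting: no rollover before the minimum time, then a rollover once either the maximum time or the entry limit is reached. The function is hypothetical and the numeric values are illustrative, not taken from the slides.

```python
def should_roll_over(age_minutes: float, entries: int,
                     min_rollover_minutes: float,
                     max_rollover_minutes: float,
                     max_entries: int) -> bool:
    """Decide whether the current ledger should be closed and a new one opened."""
    if age_minutes < min_rollover_minutes:
        return False                              # too young, whatever its size
    return age_minutes >= max_rollover_minutes or entries >= max_entries


# Illustrative values for the three settings (assumed, check your broker.conf).
SETTINGS = {
    "min_rollover_minutes": 10,     # managedLedgerMinLedgerRolloverTimeMinutes
    "max_rollover_minutes": 240,    # managedLedgerMaxLedgerRolloverTimeMinutes
    "max_entries": 50000,           # managedLedgerMaxEntriesPerLedger
}
```

On a low-traffic topic the entry limit may never be reached, so the maximum rollover time bounds how long acked data can sit in an open ledger.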
- 38. Garbage collection settings
✓ Bookie settings
○ gcWaitTime
○ majorCompactionThreshold
○ majorCompactionInterval
○ minorCompactionThreshold
○ minorCompactionInterval
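The compaction thresholds can be read as "compact an entry log once the share of data still in use drops below the threshold". A small sketch of that selection rule follows. The function, the log names, and the threshold values 0.2 and 0.8 are assumptions for illustration, so check the bookie configuration of your version for the real defaults.

```python
def logs_to_compact(entry_logs: dict, threshold: float) -> list:
    """Pick entry logs whose live-data ratio is below the threshold.

    entry_logs maps an entry log name to the fraction of its bytes that
    still belongs to ledgers that have not been deleted.
    """
    return sorted(name for name, live in entry_logs.items() if live < threshold)


# Hypothetical entry logs and thresholds.
ENTRY_LOGS = {"0.log": 0.1, "1.log": 0.5, "2.log": 0.9}
MINOR_THRESHOLD = 0.2    # stands in for minorCompactionThreshold
MAJOR_THRESHOLD = 0.8    # stands in for majorCompactionThreshold

minor_candidates = logs_to_compact(ENTRY_LOGS, MINOR_THRESHOLD)
major_candidates = logs_to_compact(ENTRY_LOGS, MAJOR_THRESHOLD)
```

Minor compaction rewrites only the nearly empty logs, while major compaction, expected to run less often, also picks up logs that are half live. `gcWaitTime` and the two interval settings control how often these passes run.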