Kafka on Pulsar
Kafka on Pulsar (KoP)
Jia Zhai (翟佳)
Who am I?
Apache Pulsar Committer & PMC Member
Apache BookKeeper Committer & PMC Member
EMC -> StreamNative
StreamNative Core Engineer
HUST -> ICT
Jia Zhai / 翟佳
What is Apache Pulsar?
Flexible pub/sub messaging backed by durable log/stream storage
Barriers for users?
Unified messaging protocol
Apps built on old systems
How does Pulsar handle it?
Pulsar Kafka Wrapper on Kafka Java API
https://pulsar.apache.org/docs/en/adaptors-kafka/
Pulsar IO Connect
https://pulsar.apache.org/docs/en/io-overview/
Kafka on Pulsar (KoP)
KoP Feasibility — Log
[Diagram: a topic as a log with a producer and a consumer, shown first generically, then for Kafka, then for Pulsar.]
KoP Feasibility — Others
Producer Consumer
Topic Lookup
Produce
Consume
Offset
Consumption State
KoP Overview
[Diagram: Kafka producers and consumers (Kafka lib) and Pulsar producers and consumers (Pulsar lib) connect to a Pulsar broker. The broker exposes a Pulsar protocol handler and a Kafka protocol handler and contains the Load Balancer, Managed Ledger, BK Client, Geo-Replicator, and Pulsar Topic. ZooKeeper and Bookies sit below the broker.]
KoP Implementation
Topic flat map: Broker sets `kafkaNamespace`
Message ID and Offset: LedgerId + EntryId
Message: Convert key/value/timestamp/headers (properties)
Topic Lookup: Pulsar admin topic lookup -> owner broker
Produce: Convert, then call PulsarTopic.publishMessage
Consume: Convert, then call non-durable-cursor.readEntries
Group Coordinator: Keep in topic `public/__kafka/__offsets`
KoP Implementation — Topic Map
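The flat topic map can be illustrated with a small sketch. It assumes that the broker's `kafkaNamespace` setting has the form `tenant/namespace` and that each Kafka partition maps to a Pulsar topic with the usual `-partition-N` suffix. The function name and the default namespace are hypothetical, chosen only for illustration.

```python
def to_pulsar_topic(kafka_topic: str, partition: int,
                    kafka_namespace: str = "public/default") -> str:
    """Map a flat Kafka topic name and partition index to a full Pulsar topic name.

    `kafka_namespace` stands in for the broker's `kafkaNamespace` setting;
    the default value used here is an assumption, not taken from the slides.
    """
    if partition < 0:
        raise ValueError("partition must be non-negative")
    # Pulsar names the partitions of a partitioned topic "<topic>-partition-<N>".
    return f"persistent://{kafka_namespace}/{kafka_topic}-partition-{partition}"


print(to_pulsar_topic("connect-test", 0))
```

Under these assumptions, the Kafka topic `connect-test` lands on the Pulsar topic whose short name is `connect-test-partition-0`, which is the input name used in the Pulsar Functions demo.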
KoP Implementation — Offset
[Diagram: the offset seen by a Kafka producer is composed of the Pulsar LedgerId and entryId.]
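One way to combine LedgerId and entryId into a single Kafka-style offset is to pack them into one 64-bit integer. The sketch below is an assumption-laden illustration: the 28-bit width reserved for the entry id is a hypothetical choice, and the slides do not state the actual bit layout.

```python
ENTRY_BITS = 28  # assumed width of the entry-id field; not stated in the slides


def to_offset(ledger_id: int, entry_id: int) -> int:
    """Pack (ledger_id, entry_id) into one integer usable as a Kafka offset."""
    if ledger_id < 0 or not 0 <= entry_id < (1 << ENTRY_BITS):
        raise ValueError("ledger_id or entry_id out of range for this layout")
    # The ledger id takes the high bits, so offsets grow across ledgers
    # as well as within a ledger.
    return (ledger_id << ENTRY_BITS) | entry_id


def from_offset(offset: int) -> tuple:
    """Recover (ledger_id, entry_id) from an offset produced by to_offset."""
    return offset >> ENTRY_BITS, offset & ((1 << ENTRY_BITS) - 1)


ledger_id, entry_id = from_offset(to_offset(12345, 678))
print(ledger_id, entry_id)
```

With a packing like this, offsets are monotonically increasing but not contiguous, so a client should not assume that offset N+1 follows offset N.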
KoP Implementation — Message Map
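The message conversion (key, value, timestamp, headers to properties) can be sketched with two plain data classes. Both classes and the function name are hypothetical stand-ins, not the real Kafka or Pulsar types. Mapping the Kafka timestamp to the Pulsar event time, and decoding keys and header values as UTF-8, are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class KafkaRecord:  # hypothetical stand-in for a Kafka record
    key: Optional[bytes]
    value: Optional[bytes]
    timestamp_ms: int
    headers: List[Tuple[str, bytes]] = field(default_factory=list)


@dataclass
class PulsarMessage:  # hypothetical stand-in for a Pulsar message
    key: Optional[str]
    payload: bytes
    event_time_ms: int
    properties: Dict[str, str] = field(default_factory=dict)


def kafka_to_pulsar(record: KafkaRecord) -> PulsarMessage:
    """Convert one Kafka-style record into a Pulsar-style message."""
    return PulsarMessage(
        # Pulsar message keys are strings; UTF-8 decoding is assumed here.
        key=record.key.decode("utf-8") if record.key is not None else None,
        # A null Kafka value becomes an empty payload in this sketch.
        payload=record.value or b"",
        event_time_ms=record.timestamp_ms,
        # Kafka allows repeated header keys; a dict keeps only the last one.
        properties={k: v.decode("utf-8") for k, v in record.headers},
    )


msg = kafka_to_pulsar(KafkaRecord(b"user-1", b"hello", 1234, [("trace", b"abc")]))
print(msg)
```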
KoP Implementation — Topic Lookup
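The lookup step can be pictured as answering a Kafka metadata request with the Pulsar broker that owns each partition. The sketch below is a toy model: the `OWNERS` table is a hypothetical stand-in for the Pulsar topic lookup, and the broker names, port, and namespace are made up for illustration.

```python
from typing import Dict, Tuple

# Hypothetical ownership table standing in for Pulsar's topic lookup.
OWNERS: Dict[str, Tuple[str, int]] = {
    "persistent://public/default/test-partition-0": ("broker-1", 9092),
    "persistent://public/default/test-partition-1": ("broker-2", 9092),
}


def partition_leaders(kafka_topic: str, num_partitions: int,
                      namespace: str = "public/default") -> Dict[int, Tuple[str, int]]:
    """Return {partition: (host, port)} using the owner broker as the Kafka leader."""
    leaders = {}
    for partition in range(num_partitions):
        pulsar_topic = f"persistent://{namespace}/{kafka_topic}-partition-{partition}"
        # The broker that owns the Pulsar topic is reported as the partition leader.
        leaders[partition] = OWNERS[pulsar_topic]
    return leaders


print(partition_leaders("test", 2))
```

A Kafka client would then send produce and fetch requests for each partition to the broker reported as its leader.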
KoP Implementation — Pro/Con
KoP Now
Layered architecture
Independent scaling
Instant recovery
Rebalance-free expansion
Ordering: guaranteed ordering
Multi-tenancy: a single cluster can support many tenants and use cases
High throughput: can reach 1.8 M messages/s in a single partition
Durability: data replicated and synced to disk
Geo-replication: out-of-the-box support for geographically distributed applications
Unified messaging model: supports both streaming and queuing
Delivery guarantees: at least once, at most once, and effectively once
Low latency: low publish latency of 5 ms
Highly scalable & available (HA): can support millions of topics
Demo
https://kafka.apache.org/quickstart
Demo1: Kafka Producer / Consumer
Demo2: Kafka Connect
https://archive.apache.org/dist/kafka/2.0.0/kafka_2.12-2.0.0.tgz
Demo1: K-Producer -> K-Consumer
[Diagram: a Kafka producer and a Kafka consumer (Kafka lib) attached to one broker; the Pulsar and Kafka protocol handlers sit in front of a Pulsar topic.]
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning
Demo1: P-Producer -> K-Consumer
[Diagram: Pulsar producer and consumer (Pulsar lib) and Kafka producer and consumer (Kafka lib) attached to one broker; the Pulsar and Kafka protocol handlers share a Pulsar topic.]
bin/pulsar-client produce test -n 1 -m "Hello from Pulsar Producer, Message 1"
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning
Demo1: K-Producer -> P-Consumer
[Diagram: Pulsar producer and consumer (Pulsar lib) and Kafka producer and consumer (Kafka lib) attached to one broker; the Pulsar and Kafka protocol handlers share a Pulsar topic.]
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
bin/pulsar-client consume -s sub-name test -n 0
Demo2: Kafka Connect
[Diagram: input file -> Kafka File Source -> topic on the broker (Pulsar and Kafka protocol handlers, Pulsar topic) -> Kafka File Sink -> output file.]
bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties config/connect-file-sink.properties
Demo2: Pulsar Functions
https://pulsar.apache.org/docs/en/functions-overview/
[Diagram: the Kafka Connect file source and sink pipeline on the broker, plus a Pulsar Function that reads the same topic and writes to an output topic.]
bin/pulsar-admin functions localrun --name pulsarExclamation --jar pulsar-functions-api-examples.jar --classname org…ExclamationFunction --inputs connect-test-partition-0 --output out-hello
Apache Pulsar & Apache Kafka
Thanks!
StreamNative
We are hiring
