Building Zhaopin’s enterprise
Event Center on Apache Pulsar
Penghui Li
(@lipenghui6)
Jia Zhai
(@Jia_Zhai)
Zhaopin.com
Zhaopin.com is the biggest online recruitment service
provider in China
Zhaopin.com provides job seekers with a comprehensive resume service, the latest
employment and career development information, and in-depth
online job search for positions throughout China.
Zhaopin.com provides professional HR services to over 2.2 million clients, and its
average daily pageviews exceed 68 million.
Who we are
❏ Penghui Li
❏ Tech lead of Infrastructure
team at Zhaopin
❏ 5+ years of experience in
messaging and microservices
❏ Apache Pulsar Committer
Who we are
❏ Jia Zhai
❏ Pulsar PMC Member / Committer
❏ BookKeeper PMC Member /
Committer
❏ Founding engineer at
StreamNative
Agenda
❏ Why build an Event Center
❏ Why Apache Pulsar
❏ Apache Pulsar at Zhaopin
❏ Streaming Platform
❏ Zhaopin’s contributions to Apache Pulsar
Why build an Event Center
Data Silos -> Unified Platform
Data Silos
❏ High Maintenance Cost
❏ Extremely hard to scale data across
teams
❏ Inconsistency between data silos
❏ Doesn’t scale
❏ No consistent SLA
Pain Points
[Diagram labels: To Enterprises, MSMQ; Data Processing, Kafka; To End Users, RabbitMQ]
Unification - MQService
Problem Solved
❏ Simplified Operations
❏ Scale-out Service
❏ High Availability
Problem Unsolved
❏ Keep messages for a longer period
❏ Data rewind
❏ Order guarantee
Unification - MQService
[Diagram labels: Online Services, MQService; Data Processing, Kafka]
Why Apache Pulsar
Pulsar == Messaging + Storage
Why Apache Pulsar?
Flexible Pub/Sub Messaging
backed by scalable log storage
Why Apache Pulsar / Multi Tenancy
Why Apache Pulsar / Queuing + Streaming
Why Apache Pulsar / Cloud Native Architecture
Apache Pulsar at Zhaopin
20+ core services, 20 billion events/day
Unification - MQService
Problem Solved
❏ No Data Silos
❏ Queue + Streaming
❏ Disaster Recovery
❏ Infinite Stream Storage (via Tiered Storage)
❏ Data rewind
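To make "Data rewind" concrete, here is a minimal sketch of the idea in plain Python. It does not use the Pulsar client: `EventLog`, `read`, and `seek` are hypothetical names chosen for illustration. The assumption is that events live in an append-only log and each subscription only holds a cursor into it, so rewinding means moving the cursor rather than re-publishing data.

```python
class EventLog:
    """Toy append-only log with one read cursor per subscription (illustrative only)."""

    def __init__(self):
        self._entries = []   # retained events, never modified after append
        self._cursors = {}   # subscription name -> next offset to read

    def append(self, event):
        """Store an event and return its offset in the log."""
        self._entries.append(event)
        return len(self._entries) - 1

    def read(self, subscription):
        """Return the next unread event for the subscription, or None if caught up."""
        pos = self._cursors.get(subscription, 0)
        if pos >= len(self._entries):
            return None
        self._cursors[subscription] = pos + 1
        return self._entries[pos]

    def seek(self, subscription, offset):
        """Rewind (or fast-forward) the subscription's cursor to the given offset."""
        if not 0 <= offset <= len(self._entries):
            raise ValueError("offset out of range")
        self._cursors[subscription] = offset
```

Because the log itself is untouched by reads, one subscription can replay from offset 0 while another keeps consuming at the tail.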
Milestones
Core Metrics
❏ 50+ Namespaces
❏ 5000+ Topics
❏ 20+ billion events/day
❏ 5 TB storage per day
❏ 20+ core services
System Metrics
Pulsar at Zhaopin
❏ One copy of data, single source-of-truth
❏ No need to worry about data consistency between RabbitMQ and
Kafka
❏ Multi-tenancy makes topic management easier
❏ Strong data durability allows us to stop worrying about
message loss
Event Streaming Platform
Beyond Pub/Sub Messaging
Event Streaming Platform
❏ Pulsar Functions: lightweight computing
❏ Flink: streaming-first, unified data processing
❏ Pulsar SQL (Presto): interactive queries on both historical and
real-time data
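As a rough illustration of "lightweight computing", a Pulsar Function in Python can be as small as a module-level `process` function that receives one message and returns the value to publish. The sketch below assumes a hypothetical event format of `<event_type>:<payload>`. That format and the normalization rule are invented for this example and are not taken from Zhaopin's deployment.

```python
def process(input):
    """Normalize one incoming event before it is published to the output topic.

    Assumed (hypothetical) input format: "<event_type>:<payload>".
    """
    # Split on the first colon only, so payloads may contain colons themselves.
    event_type, _, payload = input.partition(":")
    # Lower-case the event type and trim stray whitespace on both parts.
    return "{}:{}".format(event_type.strip().lower(), payload.strip())
```

Since the function is plain Python with no framework imports, it can be unit-tested locally before being deployed.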
More details
Coming soon at the next ApacheCon :-)
Contribute to Apache Pulsar
The Apache Way
Zhaopin’s Contributions to Apache Pulsar
❏ Client Interceptors
❏ Dead Letter Topic
❏ Time Partitioned Message Tracker
❏ Service Url Provider
❏ Key_Shared Subscription
❏ Pulsar SQL Improvements
❏ Multi-version Schema Support
❏ HDFS Offloader
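Of the contributions listed above, Key_Shared Subscription relates most directly to the earlier "Order guarantee" problem: several consumers share one subscription, but all messages with the same key go to the same consumer, so per-key order is kept. The sketch below simulates that idea in plain Python. It is a simplification under stated assumptions: `pick_consumer` and `dispatch` are hypothetical helpers, and the CRC32-modulo assignment stands in for whatever key-to-consumer mapping the broker actually uses.

```python
import zlib
from collections import defaultdict


def pick_consumer(key, consumers):
    """Map a message key to one consumer (simplified: CRC32 hash modulo consumer count)."""
    return consumers[zlib.crc32(key.encode("utf-8")) % len(consumers)]


def dispatch(messages, consumers):
    """Deliver (key, payload) pairs so that each key always lands on the same consumer.

    Returns a dict of consumer name -> list of (key, payload) in delivery order.
    """
    delivered = defaultdict(list)
    for key, payload in messages:
        delivered[pick_consumer(key, consumers)].append((key, payload))
    return dict(delivered)
```

With a fixed set of consumers, every event for a given key (for example, one user's events) is handled by a single consumer in publish order, while different keys can still be processed in parallel.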
Community
❏ Pulsar Website: https://pulsar.apache.org
❏ Twitter: @apache_pulsar / @streamnativeio
❏ Slack: https://apache-pulsar.herokuapp.com
❏ Mailing Lists
dev@pulsar.apache.org, users@pulsar.apache.org
❏ GitHub
https://github.com/apache/pulsar
❏ Medium
https://medium.com/streamnative
Thanks!
