© 2020 SPLUNK INC.
How Splunk Mission Control leverages Pulsar
Pulsar Summit
Pranav Dharma
June 17, 2020
Splunk Mission Control
A cloud-native, unified experience for modernizing the Security Operations Center (SOC)
Core requirements driving technology choices for Mission Control
1) Targeted at the Security Operations Center (SOC)
– Reliability: we cannot lose a single security event or its details during the security event lifecycle
2) Multi-tenant SaaS product
– Isolation of each tenant's data is a very big deal
3) Microservices-based architecture
– Performance and latency when communicating with downstream services are important: event investigation, automation, and collaboration need to happen with minimal latency
Security Event Lifecycle
Messaging use cases in Mission Control
• Sending user notifications as part of user collaboration and approval workflows
• Populating data for dashboard panels
• Triggering downstream services for automation and security metadata enrichment
• Generating audit trail
• Publishing WebSocket messages for UI refresh
• User-triggered resource provisioning
• Providing playbook (automation) debug logs
• Broadcasting important settings and ACL changes to pods
Why Pulsar?
The core Pulsar team is now part of Splunk following the Streamlio acquisition
(Drops mic)
Why Pulsar?
• Native multitenancy
– Satisfies our core requirement of data isolation
• Message-level acks instead of only offset-level acks
– Satisfies our core requirement of reliability
• Performance and scalability improve by adding more consumers, without adding partitions
– Satisfies our core requirement for performance and latency
• Unified messaging
– We can use both queueing and streaming without having to operate and maintain a different product for each purpose
• TTL
– Our use cases have varied TTL requirements, from none (WebSocket) to long (audit)
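The difference between message-level and offset-level acknowledgement can be illustrated with a toy in-memory model. This is a minimal sketch and not Pulsar's implementation: the `Subscription` class, its method names, and the `process` helper are hypothetical, and the model only tracks which message IDs are still eligible for redelivery.

```python
class Subscription:
    """Toy model of a subscription backlog (hypothetical, not a Pulsar API)."""

    def __init__(self, message_ids):
        self.unacked = set(message_ids)

    def ack_individual(self, message_id):
        # Message-level ack: only this one message is marked as done.
        self.unacked.discard(message_id)

    def ack_cumulative(self, message_id):
        # Offset-style ack: everything up to and including message_id is done.
        self.unacked = {m for m in self.unacked if m > message_id}


def process(sub, ids, failing, cumulative):
    """Process ids in order; ids in `failing` model a failed handler (no ack)."""
    for mid in ids:
        if mid in failing:
            continue  # handler failed: this message is not acknowledged
        if cumulative:
            sub.ack_cumulative(mid)
        else:
            sub.ack_individual(mid)
    return sorted(sub.unacked)


ids = [1, 2, 3, 4, 5]
# Message 3 fails. With individual acks it stays in the backlog for redelivery.
per_message = process(Subscription(ids), ids, failing={3}, cumulative=False)
# With offset-style acks, acking 4 and 5 also covers 3, so the failure is lost.
offset_style = process(Subscription(ids), ids, failing={3}, cumulative=True)
print(per_message)   # [3]
print(offset_style)  # []
```

In the message-level case the failed security event remains outstanding and can be retried, which is the property the reliability requirement depends on.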
Why Pulsar?
• Simple producers and clients
– Reduced complexity for developers and increased productivity
• Low operational overhead
– Several benefits; the ease of adding new brokers and additional storage is worth calling out
• Topic creation is lightweight
– New topics are easy and cheap to create when needed
(We evaluated Redis and Kafka when making this design decision)
Use case for Exclusive subscription
Each pod has a consumer subscribing to a topic with the subscription type ‘Exclusive’ – streaming or pub-sub paradigm
[Diagram: a producer publishes critical settings / ACL changes to Pulsar; three services, each with three pods, consume them]
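The broadcast pattern on this slide can be sketched with a toy in-memory topic. This is an illustrative model only, not the Pulsar client API: the `Topic` class and its methods are hypothetical, and it assumes each pod subscribes under a subscription name unique to that pod, so that every pod receives every settings change.

```python
class Topic:
    """Toy topic: each subscription gets its own copy of every message."""

    def __init__(self):
        self.subscriptions = {}  # subscription name -> (consumer name, inbox)

    def subscribe_exclusive(self, subscription, consumer):
        # Exclusive: a subscription accepts at most one consumer.
        if subscription in self.subscriptions:
            raise RuntimeError(f"{subscription!r} already has a consumer")
        inbox = []
        self.subscriptions[subscription] = (consumer, inbox)
        return inbox

    def publish(self, message):
        # Every subscription on the topic receives the message.
        for _consumer, inbox in self.subscriptions.values():
            inbox.append(message)


topic = Topic()
# Assumption: each pod uses a subscription name unique to that pod.
inboxes = {pod: topic.subscribe_exclusive(f"settings-{pod}", pod)
           for pod in ("pod-a", "pod-b", "pod-c")}
topic.publish("acl-changed")

try:
    # A second consumer on an existing exclusive subscription is refused.
    topic.subscribe_exclusive("settings-pod-a", "pod-d")
    second_consumer_rejected = False
except RuntimeError:
    second_consumer_rejected = True

print(inboxes)
print(second_consumer_rejected)
```

In the Pulsar Python client the corresponding subscription type is `pulsar.ConsumerType.Exclusive`.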
Use case for Shared subscription
[Diagram: the pods of three services publish to Pulsar; the pods of a consumer service consume from it]
Data for use cases such as WebSocket notifications, emails, dashboard data, and audit records can be generated by any of the services and is published to Pulsar. Consumers in the consumer service subscribe using a ‘Shared’ subscription (queueing paradigm)
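The queueing behaviour of a shared subscription can be approximated with a simple round-robin dispatcher. This is a simplified sketch, not how the broker is implemented: `dispatch_shared` and the consumer and message names are hypothetical, and real dispatch also depends on consumer availability.

```python
from collections import deque
from itertools import cycle


def dispatch_shared(messages, consumers):
    """Toy round-robin dispatch: each message goes to exactly one consumer."""
    received = {name: [] for name in consumers}
    turn = cycle(consumers)
    backlog = deque(messages)
    while backlog:
        received[next(turn)].append(backlog.popleft())
    return received


messages = ["email-1", "audit-1", "ws-1", "email-2", "audit-2", "ws-2"]
two = dispatch_shared(messages, ["consumer-1", "consumer-2"])
three = dispatch_shared(messages, ["consumer-1", "consumer-2", "consumer-3"])
print(two)
print(three)
```

Adding a third consumer lowers the share of work per consumer from three messages to two without touching the topic, which is the scaling property listed under "Why Pulsar?".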
Use case for Key_Shared subscription
First, some background
• Automation consists of playbooks
– Playbooks consist of discrete units of work called actions
– Examples of actions: “block user”, “create ticket”, “restart server”
– Actions can run sequentially or concurrently
– A final “on_finish” handler (part of the playbook) is called when the playbook completes; it should be called only once
– Only one pod should be able to call the “on_finish” handler
– A consumer with a Key_Shared subscription on the pod that initiated the playbook ensures that only this pod calls the “on_finish” handler
• Action runs
– Action runs can be cancelled
– Cancel messages for an action run need to be routed to the pod running the action
• Used as the message bus between the automation services
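The key property relied on above is that messages with the same key always reach the same consumer. A minimal sketch of that routing, assuming a plain hash-modulo scheme (the broker's actual key-to-consumer assignment is more involved, and this model does not show how a particular pod comes to own a particular key); `route_by_key`, `dispatch_key_shared`, and the pod and playbook names are hypothetical:

```python
import zlib


def route_by_key(key, consumers):
    """Pick a consumer for a key using a stable hash (simplified model)."""
    index = zlib.crc32(key.encode("utf-8")) % len(consumers)
    return consumers[index]


def dispatch_key_shared(messages, consumers):
    """messages is a list of (key, payload); the same key -> the same consumer."""
    received = {name: [] for name in consumers}
    for key, payload in messages:
        received[route_by_key(key, consumers)].append((key, payload))
    return received


pods = ["pod-1", "pod-2", "pod-3"]
messages = [
    ("playbook-17", "action block_user done"),
    ("playbook-42", "action create_ticket done"),
    ("playbook-17", "action restart_server done"),
]
received = dispatch_key_shared(messages, pods)
owners = {key: route_by_key(key, pods) for key, _ in messages}
print(owners)
```

Because both `playbook-17` messages land on one pod, in publish order, that pod sees the complete picture for the playbook run.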
Use case for Key_Shared subscription
[Diagram: pods of two services exchange messages through Pulsar. Labels: “Run action for <playbook_id>”; “Message published with <playbook_id> key” (twice); “Message with <playbook_id> key consumed”]
The pod initializing the playbook run creates a Key_Shared subscription to decide when the playbook run is complete
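Once all messages for a playbook reach a single pod, deciding when the run is complete becomes local bookkeeping. A minimal sketch under assumptions: the `PlaybookRun` class, the message shape `(key, action_id)`, and the action names are hypothetical, not taken from Mission Control.

```python
class PlaybookRun:
    """Tracks action completions for one playbook run (hypothetical helper)."""

    def __init__(self, playbook_id, action_ids, on_finish):
        self.playbook_id = playbook_id
        self.pending = set(action_ids)
        self.on_finish = on_finish
        self.finished = False

    def handle(self, key, action_id):
        # Ignore messages for other playbooks and anything after completion.
        if key != self.playbook_id or self.finished:
            return
        self.pending.discard(action_id)
        if not self.pending:
            self.finished = True
            self.on_finish(self.playbook_id)


calls = []
run = PlaybookRun("playbook-17", ["block_user", "create_ticket"], calls.append)
for key, action_id in [
    ("playbook-17", "block_user"),
    ("playbook-42", "create_ticket"),  # different playbook: ignored
    ("playbook-17", "create_ticket"),
    ("playbook-17", "create_ticket"),  # redelivery: must not re-trigger
]:
    run.handle(key, action_id)
print(calls)  # ['playbook-17']
```

The `finished` flag is what keeps a redelivered completion message from invoking the handler a second time.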
Use case for Key_Shared subscription
[Diagram: pods of a service exchange messages with other pods through Pulsar. Labels: “Run action for <action_id>”; “Cancel action message published with key <action_id>”; “Cancel action message consumed with key <action_id>”]
A Key_Shared subscription is used to route cancel messages for an action run to the pod running the action
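On the receiving pod, a cancel message keyed by `<action_id>` only needs to find the matching running action and signal it. A minimal sketch, assuming each running action polls a cancellation flag; the `ActionRunner` class and its method names are hypothetical.

```python
import threading


class ActionRunner:
    """Per-pod registry of running actions (hypothetical helper)."""

    def __init__(self):
        self._cancel_flags = {}

    def start(self, action_id):
        # The action's work loop is expected to poll this flag.
        flag = threading.Event()
        self._cancel_flags[action_id] = flag
        return flag

    def on_cancel_message(self, action_id):
        # Called by the consumer for a cancel message keyed by action_id.
        flag = self._cancel_flags.get(action_id)
        if flag is None:
            return False  # this pod is not running that action
        flag.set()
        return True


runner = ActionRunner()
flag = runner.start("action-9")
handled = runner.on_cancel_message("action-9")
unknown = runner.on_cancel_message("action-10")
print(flag.is_set(), handled, unknown)  # True True False
```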
What’s next?
• So far, the Pulsar integration has been relatively painless
• Still learning, tweaking, and optimizing
• Unbundle the various queueing-based consumers into their own services
• This opens up the possibility of an event-based design
Thank You
