© 2020 SPLUNK INC.
Pulsar Functions
A Deep Dive | Pulsar Summit 2020
Sanjeev Kulkarni
sanjeevk@splunk.com
Pulsar Functions:- A Deep Dive
Brief introduction to Pulsar Functions
Deep Dive into internals
• Submission workflow
• Scheduling workflow
• Execution workflow
• Java Instance concepts
Current/Future Work
Agenda
Pulsar Functions:- A Brief Introduction
Bringing Serverless concepts to the
streaming world.
Execute processing logic per
message on input topic
Function output goes to an output
topic
• Optional
Abstract View
Core Concept
Pulsar Functions:- A Brief Introduction
Emphasis on simplicity
Great for 90% of use cases on streams
• Filtering
• Routing
• Enrichment
Not meant to replace Spark/Flink
SDK-less API
import java.util.function.Function;

public class ExclamationFunction implements Function<String, String> {
    @Override
    public String apply(String input) {
        return input + "!";
    }
}
Simple API
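The same SDK-less style also covers the filtering use case. A minimal sketch, assuming that a null return value means nothing is published for that message; the class name ErrorFilterFunction and the "ERROR" prefix are made up for illustration.

```java
import java.util.function.Function;

// Hypothetical filter: forwards only lines that start with "ERROR".
class ErrorFilterFunction implements Function<String, String> {
    @Override
    public String apply(String input) {
        // Assumption: returning null means "publish nothing" for this message.
        return (input != null && input.startsWith("ERROR")) ? input : null;
    }
}
```

Routing and enrichment follow the same shape: one input in, zero or one output out.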
Pulsar Functions:- A Brief Introduction
Flexible execution environments
• Pulsar managed
– Thread
– Process
• Externally managed
– Kubernetes
CRUD-based REST API
Function lifecycle
Pulsar Functions:- Submission Workflow
Submit to any worker
JSON representation of FunctionConfig
• tenant/namespace/name
• Input/Output
• configs
• many more knobs ….
Function Code
• jars/.py/zip/etc
FunctionConfig
public class FunctionConfig {
    private String tenant;
    private String namespace;
    private String name;
    private String className;
    private Collection<String> inputs;
    private String output;
    private ProcessingGuarantees processingGuarantees;
    private Map<String, Object> userConfig;
    private Map<String, Object> secrets;
    private Integer parallelism;
    private Resources resources;
    ...
}
Function Representation
Pulsar Functions:- Submission Workflow
AuthN/AuthZ checks
FunctionConfig validation
• missing parameters
• Incorrect parameters
• Local Configs
Function Code Validation
• class presence, etc
Copy Code to BookKeeper
FunctionMetaData
message FunctionMetaData {
    FunctionDetails functionDetails;
    PackageLocationMetaData packageLocation;
    uint64 version;
    uint64 createTime;
    map<int32, FunctionState> instanceStates;
    FunctionAuthenticationSpec functionAuthSpec;
}
Submission Checks
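The FunctionConfig validation step can be sketched as below. This is an illustration only: MiniFunctionConfig is a trimmed, hypothetical stand-in for FunctionConfig, and the specific rules (required fields, positive parallelism) are assumptions, not the worker's actual rule set.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Trimmed, hypothetical stand-in for FunctionConfig with only the fields checked here.
class MiniFunctionConfig {
    String tenant;
    String namespace;
    String name;
    String className;
    Collection<String> inputs;
    Integer parallelism;
}

class ConfigValidator {
    // Returns human-readable problems; an empty list means the config passed.
    static List<String> validate(MiniFunctionConfig c) {
        List<String> errors = new ArrayList<>();
        // Missing parameters
        if (isBlank(c.tenant)) errors.add("tenant is missing");
        if (isBlank(c.namespace)) errors.add("namespace is missing");
        if (isBlank(c.name)) errors.add("name is missing");
        if (isBlank(c.className)) errors.add("className is missing");
        if (c.inputs == null || c.inputs.isEmpty()) errors.add("inputs are missing");
        // Incorrect parameters
        if (c.parallelism != null && c.parallelism < 1) {
            errors.add("parallelism must be at least 1");
        }
        return errors;
    }

    private static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }
}
```

Collecting every problem before rejecting lets the submitter fix the config in one round trip.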
Pulsar Functions:- Submission Workflow
System of record
Stores all Functions
• map from <FQFN, FunctionMetaData>
FQFN:- Fully Qualified Function
Name
Backed by Pulsar Topic
• Function MetaData Topic
Contains a MetaData Topic Tailer
Function MetaData Manager
Function MetaData Manager
MetaData Topic Tailer
MetaData Topic
foo -> {functionDetails : {...},
version: 2,
…}
Pulsar Functions:- Submission Workflow
Just before Function
creation/update/delete
Make a copy of the current state
Merge the updates
Increment the version
Write to MetaData Topic
Tailer reads and verifies
Upon no conflict, tailer updates
Function MetaData Manager:- Update State Machine
MetaData Topic Tailer
MetaData Topic
foo -> {functionDetails : {...},
version: 2,
…}
becomes
foo -> {functionDetails : {.....},
version: 3,
…}
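The update state machine can be sketched in memory as follows. Every name here is hypothetical: a list stands in for the MetaData Topic, details are a plain string map, and the tailer's check is assumed to be "apply only if the version advances".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, trimmed stand-in for FunctionMetaData: details plus a version.
class MetaDataSketch {
    final Map<String, String> details;
    final long version;

    MetaDataSketch(Map<String, String> details, long version) {
        this.details = details;
        this.version = version;
    }

    // Copy the current state, merge the updates, increment the version.
    MetaDataSketch withUpdates(Map<String, String> updates) {
        Map<String, String> merged = new HashMap<>(details);
        merged.putAll(updates);
        return new MetaDataSketch(merged, version + 1);
    }
}

// One worker's manager; the list plays the role of the MetaData Topic.
class MetaDataManagerSketch {
    private final List<Map.Entry<String, MetaDataSketch>> topic = new ArrayList<>();
    private final Map<String, MetaDataSketch> state = new HashMap<>();
    private int tailerPosition = 0;

    // Write the proposed new version to the topic; local state is not touched here.
    void submit(String fqfn, Map<String, String> updates) {
        MetaDataSketch current =
                state.getOrDefault(fqfn, new MetaDataSketch(new HashMap<>(), 0));
        topic.add(Map.entry(fqfn, current.withUpdates(updates)));
    }

    // The tailer reads new entries and applies only those that advance the version.
    void runTailer() {
        while (tailerPosition < topic.size()) {
            Map.Entry<String, MetaDataSketch> entry = topic.get(tailerPosition++);
            MetaDataSketch current = state.get(entry.getKey());
            long currentVersion = (current == null) ? 0 : current.version;
            if (entry.getValue().version > currentVersion) {
                state.put(entry.getKey(), entry.getValue());
            }
        }
    }

    MetaDataSketch get(String fqfn) {
        return state.get(fqfn);
    }
}
```

Note that submit never mutates the map directly; state changes only when the tailer reads the entry back, which is what makes the topic the system of record.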
Pulsar Functions:- Submission Workflow
Multiple Workers
Concurrent updates to same function
First Writer Wins
Others are rejected
Function MetaData Manager:- When do conflicts occur?
MetaData Topic Tailer
MetaData Topic
foo -> {functionDetails : {...},
version: 3,
…}
Worker 1
MetaData Topic Tailer
foo -> {functionDetails : {...},
version: 3,
…}
Worker 2
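A small simulation of the conflict case, under stated assumptions: two hypothetical workers share one in-memory list as the MetaData Topic, both propose the next version from the same starting state, and each tailer accepts an entry only if its version is higher than what it already holds.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// One entry on the shared topic; all names here are hypothetical.
class UpdateRequest {
    final String fqfn;
    final long version;
    final String writer;

    UpdateRequest(String fqfn, long version, String writer) {
        this.fqfn = fqfn;
        this.version = version;
        this.writer = writer;
    }
}

class WorkerSketch {
    final String id;
    final List<UpdateRequest> rejected = new ArrayList<>();
    private final List<UpdateRequest> topic; // shared by all workers
    private final Map<String, UpdateRequest> state = new HashMap<>();
    private int position = 0;

    WorkerSketch(String id, List<UpdateRequest> topic) {
        this.id = id;
        this.topic = topic;
    }

    // Propose the next version based on this worker's current view.
    void propose(String fqfn) {
        UpdateRequest current = state.get(fqfn);
        long next = (current == null) ? 1 : current.version + 1;
        topic.add(new UpdateRequest(fqfn, next, id));
    }

    // Every worker replays the same log with the same rule, so all reach the same state.
    void tail() {
        while (position < topic.size()) {
            UpdateRequest update = topic.get(position++);
            UpdateRequest current = state.get(update.fqfn);
            long currentVersion = (current == null) ? 0 : current.version;
            if (update.version > currentVersion) {
                state.put(update.fqfn, update); // first writer of this version wins
            } else {
                rejected.add(update);           // stale version: rejected
            }
        }
    }

    UpdateRequest current(String fqfn) {
        return state.get(fqfn);
    }
}
```

Because the topic imposes a single order on the two writes, both workers independently reject the same request without talking to each other.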
Pulsar Functions:- Submission Workflow
Submit to any worker
Validation load scales linearly
Deterministic State Machine
MetaData Topic is audit log
Advantages
MetaData Topic Tailer
MetaData Topic
foo -> {functionDetails : {...},
version: 3,
…}
Worker 1
MetaData Topic Tailer
foo -> {functionDetails : {...},
version: 3,
…}
Worker 2
Pulsar Functions:- Submission Workflow
MetaData Topic growth
MetaData Topic compaction is
non-trivial
Worker Start time
All Workers know everything
Pitfalls
MetaData Topic Tailer
MetaData Topic
foo -> {functionDetails : {...},
version: 3,
…}
Worker 1
MetaData Topic Tailer
foo -> {functionDetails : {...},
version: 3,
…}
Worker 2
Pulsar Functions:- Scheduling Workflow
Abstracts out Scheduler
Executed only on a Leader
Invoked when
• Function CRUD operations
– create/update
– delete
• Worker Changes
– Unresponsive/dead workers
– New workers
– Periodic
– Leadership changes
IScheduler Interface
public interface IScheduler {
    List<Assignment> schedule(List<Instance> unassigned,
                              List<Instance> current,
                              Set<String> workers);
}
Pluggable Scheduler
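To show what plugging in a scheduler might look like, here is a round-robin sketch. The interface mirrors the shape of the one on the slide, but InstanceSketch, AssignmentSketch and SchedulerSketch are hypothetical stand-ins, and round-robin is just one possible policy (it ignores the current instances).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-ins for Instance and Assignment.
class InstanceSketch {
    final String fqfn;
    final int instanceId;

    InstanceSketch(String fqfn, int instanceId) {
        this.fqfn = fqfn;
        this.instanceId = instanceId;
    }
}

class AssignmentSketch {
    final InstanceSketch instance;
    final String workerId;

    AssignmentSketch(InstanceSketch instance, String workerId) {
        this.instance = instance;
        this.workerId = workerId;
    }
}

interface SchedulerSketch {
    List<AssignmentSketch> schedule(List<InstanceSketch> unassigned,
                                    List<InstanceSketch> current,
                                    Set<String> workers);
}

class RoundRobinScheduler implements SchedulerSketch {
    @Override
    public List<AssignmentSketch> schedule(List<InstanceSketch> unassigned,
                                           List<InstanceSketch> current,
                                           Set<String> workers) {
        // Sort worker ids so the result does not depend on set iteration order.
        List<String> ordered = new ArrayList<>(new TreeSet<>(workers));
        List<AssignmentSketch> result = new ArrayList<>();
        if (ordered.isEmpty()) {
            return result; // nobody to assign to
        }
        int next = 0;
        for (InstanceSketch instance : unassigned) {
            result.add(new AssignmentSketch(instance, ordered.get(next % ordered.size())));
            next++;
        }
        return result;
    }
}
```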
Pulsar Functions:- Scheduling Workflow
Empty Coordination Topic
Failover Subscription
Active Consumer is the Leader
Leader Election
Leader Election
Coordination Topic
Worker 1, Worker 2, Worker 3
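The election idea can be modelled without a broker. A toy sketch, assuming the active consumer is simply the earliest connected one (how a real broker picks the active consumer may differ); FailoverSubscriptionSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a failover subscription on the coordination topic.
class FailoverSubscriptionSketch {
    private final List<String> consumers = new ArrayList<>(); // in connection order

    void connect(String workerId) {
        consumers.add(workerId);
    }

    void disconnect(String workerId) {
        consumers.remove(workerId);
    }

    // The active consumer is treated as the leader; null when nobody is connected.
    String leader() {
        return consumers.isEmpty() ? null : consumers.get(0);
    }
}
```

When the leader disconnects, the next consumer becomes active, so leadership fails over with no extra coordination service.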
Pulsar Functions:- Scheduling Workflow
Assignment Topic
Written by the Leader
Compacted based on key(FQFN +
Instance Id)
All workers know about all
assignments
Function Assignments
Assignment Topic
Worker 1: Assignment Tailer, {foo, 1} : worker-1, ...
Worker 2: Assignment Tailer, {foo, 1} : worker-1, ...
Worker 3: Assignment Tailer, {foo, 1} : worker-1, ...
Pulsar Functions:- Scheduling Workflow
Stores Assignment
Compacted
Key -> (FQFN + InstanceId)
Assignment
message Instance {
    FunctionMetaData functionMetaData = 1;
    int32 instanceId = 2;
}
message Assignment {
    Instance instance = 1;
    string workerId = 2;
}
Assignment Topic
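Key-based compaction keeps only the latest assignment per (FQFN + InstanceId). A minimal sketch of that effect, assuming a ":" separator in the key and worker ids as plain strings; both are illustrative choices.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class AssignmentCompactionSketch {
    // Hypothetical key format: FQFN and instance id joined by ":".
    static String key(String fqfn, int instanceId) {
        return fqfn + ":" + instanceId;
    }

    // Replays the log in order; a later entry for a key overwrites the earlier one.
    static Map<String, String> compact(List<Map.Entry<String, String>> log) {
        Map<String, String> latest = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : log) {
            latest.put(entry.getKey(), entry.getValue());
        }
        return latest;
    }
}
```

A worker that starts late only needs the compacted view, not every historical reassignment.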
Pulsar Functions:- Execution Workflow
Triggered by Changes to
Assignment Table
Takes care of the worker’s specific
assignments
Function lifecycle management via
Spawner
Function RunTime Manager
Assignment Topic
Worker
{foo, 1} : worker-1,
...
Assignment Tailer
RunTime Manager
Spawner Spawner
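How a worker might react to assignment changes can be sketched as follows. RuntimeManagerSketch is hypothetical, instances are identified by a plain string key, and starting or stopping a Spawner is reduced to recording an action.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class RuntimeManagerSketch {
    final List<String> actions = new ArrayList<>(); // what a real manager would ask Spawners to do
    private final String workerId;
    private final Set<String> running = new HashSet<>();

    RuntimeManagerSketch(String workerId) {
        this.workerId = workerId;
    }

    // Called by the assignment tailer for every assignment it reads.
    void onAssignment(String instanceKey, String assignedWorker) {
        boolean mine = workerId.equals(assignedWorker);
        if (mine && running.add(instanceKey)) {
            actions.add("start " + instanceKey);   // newly assigned to this worker
        } else if (!mine && running.remove(instanceKey)) {
            actions.add("stop " + instanceKey);    // moved away from this worker
        }
        // Assignments for other workers that were never ours are ignored.
    }
}
```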
Pulsar Functions:- Execution Workflow
Abstracts out execution
environments using Runtime Factory
Manages Function lifecycle
Maintains a gRPC connection with
the Function instance
Spawner
gRPC Channel
Spawner
Function
Pulsar Functions:- Execution Workflow
Short-circuit MetaData Manager and
Runtime Manager
Directly use Spawner
Local Runner
gRPC Channel
Spawner
Function
Local Runner
Pulsar Functions:- Execution Workflow
Simple interface for creating
execution environments
Creates Runtimes
Runtime Factory
public interface RuntimeFactory {
    void initialize(WorkerConfig workerConfig);
    Runtime createContainer(InstanceConfig instanceConfig,
                            String codeFile);
    void close();
}
Runtime Factory
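A thread-based factory gives a feel for the abstraction. This sketch deliberately simplifies the signature: InstanceRuntime and RuntimeFactorySketch are hypothetical, and the instance body is passed as a Runnable instead of an InstanceConfig and code file.

```java
// Hypothetical, simplified runtime handle.
interface InstanceRuntime {
    void start();
    void stop();
    boolean isAlive();
}

// Hypothetical, simplified factory.
interface RuntimeFactorySketch {
    InstanceRuntime createContainer(String instanceName, Runnable instanceBody);
    void close();
}

// Runs each function instance on its own thread inside the worker process.
class ThreadRuntimeFactorySketch implements RuntimeFactorySketch {
    @Override
    public InstanceRuntime createContainer(String instanceName, Runnable instanceBody) {
        Thread thread = new Thread(instanceBody, instanceName);
        return new InstanceRuntime() {
            @Override
            public void start() {
                thread.start();
            }

            @Override
            public void stop() {
                // Ask the instance to exit, then wait for it to finish.
                thread.interrupt();
                try {
                    thread.join();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }

            @Override
            public boolean isAlive() {
                return thread.isAlive();
            }
        };
    }

    @Override
    public void close() {
        // Nothing to release in this sketch.
    }
}
```

A process or Kubernetes factory would return a different InstanceRuntime behind the same two interfaces, which is the point of the abstraction.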
Pulsar Functions:- Java Instance
Java Instance is (source, function,
sink) ensemble.
Source abstracts reading from input
topics
Sink abstracts writing to output topic
Java Instance
Source -> Process -> Sink
Source Sink
f
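The Source -> Process -> Sink ensemble boils down to a small loop. A sketch with hypothetical single-method SourceSketch and SinkSketch interfaces; a null from the source is assumed to mean "no more data", which a real streaming source would not do.

```java
import java.util.function.Function;

// Hypothetical single-method stand-ins for the Source and Sink abstractions.
interface SourceSketch<T> {
    T read(); // null means the source is exhausted (sketch-only convention)
}

interface SinkSketch<T> {
    void write(T value);
}

class JavaInstanceSketch<I, O> {
    private final SourceSketch<I> source;
    private final Function<I, O> function;
    private final SinkSketch<O> sink;

    JavaInstanceSketch(SourceSketch<I> source, Function<I, O> function, SinkSketch<O> sink) {
        this.source = source;
        this.function = function;
        this.sink = sink;
    }

    // Read one record, apply the function, write the result; repeat.
    void run() {
        I input;
        while ((input = source.read()) != null) {
            O output = function.apply(input);
            if (output != null) {
                sink.write(output); // assumption: null output means nothing to write
            }
        }
    }
}
```

Swapping the source or the sink while keeping the loop unchanged is what lets the same instance serve regular functions and connectors alike.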
Pulsar Functions:- Java Instance
Pulsar Source implements the
Source interface to read from Pulsar
topics
Pulsar Sink implements Sink
interface to write to Pulsar topic
Java Instance
Regular Pulsar Functions
Pulsar Source -> f -> Pulsar Sink
Pulsar Functions:- Java Instance
Java Instance
What if we have non-Pulsar Source?
Non-Pulsar Source -> f -> Pulsar Sink
Pulsar Functions:- Java Instance
Java Instance
Pulsar IO
Non-Pulsar Source -> f -> Pulsar Sink
Pulsar Functions:- Java Instance
Non Pulsar Source reads from
external system
Identity Function lets the data pass
through
Pulsar Sink writes to Pulsar
Java Instance
Pulsar IO Source
Non-Pulsar Source -> Identity -> Sink
Pulsar Functions:- Java Instance
Pulsar Source reads from Pulsar
topics
Identity Function lets the data pass
through
Non Pulsar Sink writes to an external
system
Java Instance
Pulsar IO Sink
Pulsar Source -> Identity -> Non-Pulsar Sink
Pulsar Functions:- Future Work
Each setup only supports a static
Runtime (Process/Thread/Pods)
Change it to be dynamically
specified during submission
Function RunTime Manager
Changes
Dynamic Runtime Selection
Assignment Topic
Worker
{foo, 1} : worker-1,
...
Assignment Tailer
RunTime Manager
Spawner
Function-1
Spawner
Function-2
Thread
Process
Pulsar Functions:- Future Work
MetaData Topic not compacted
Stores all function change requests
Worker needs to read from
beginning upon startup
Function MetaData Topic Compaction
MetaData Topic Tailer
MetaData Topic
foo -> {functionDetails : {...},
version: 2,
…}
Pulsar Functions:- Future Work
Chaining Functions
Output of one going as input of
others
A simple workflow API
Function Mesh
f1
f2
f3
f4
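The dataflow of a chain can be illustrated in-process with plain function composition. This is only an analogy: in a deployed chain each function would presumably run separately and be linked through topics, and the three stages below are invented for illustration.

```java
import java.util.function.Function;

class ChainSketch {
    // Three hypothetical stages.
    static final Function<String, String> f1 = String::trim;
    static final Function<String, String> f2 = String::toUpperCase;
    static final Function<String, String> f3 = s -> s + "!";

    // The output of one stage becomes the input of the next.
    static final Function<String, String> pipeline = f1.andThen(f2).andThen(f3);
}
```

A workflow API would let the user declare this wiring once instead of matching output and input topics by hand.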
Pulsar Functions:- Future Work
Discover/Collect Cycle
Repeating Cycle
Don’t drop discovered tasks on
failures
BatchSource
public interface BatchSource<T> {
    void open(final Map<String, Object> config,
              SourceContext context);
    void discover(Consumer<byte[]> taskEater);
    void prepare(byte[] task);
    Record<T> readNext();
}
Batch Sources
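One pass of the discover/collect cycle might look like the sketch below. MiniBatchSource is a trimmed, hypothetical version of the interface (no open, and readNext returns the value directly), LettersSource is a toy implementation, and the in-memory queue stands in for whatever store keeps discovered tasks safe across failures.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Consumer;

// Trimmed, hypothetical version of the BatchSource interface.
interface MiniBatchSource<T> {
    void discover(Consumer<byte[]> taskEater);
    void prepare(byte[] task);
    T readNext(); // null when the current task is exhausted
}

// Toy source: each task is a string, each record is one of its characters.
class LettersSource implements MiniBatchSource<String> {
    private final Deque<String> pending = new ArrayDeque<>();

    @Override
    public void discover(Consumer<byte[]> taskEater) {
        taskEater.accept("ab".getBytes(StandardCharsets.UTF_8));
        taskEater.accept("c".getBytes(StandardCharsets.UTF_8));
    }

    @Override
    public void prepare(byte[] task) {
        for (char ch : new String(task, StandardCharsets.UTF_8).toCharArray()) {
            pending.add(String.valueOf(ch));
        }
    }

    @Override
    public String readNext() {
        return pending.poll();
    }
}

class BatchDriver {
    // Discover all tasks first, then collect the records of each task in turn.
    static <T> List<T> runOnce(MiniBatchSource<T> source) {
        Deque<byte[]> tasks = new ArrayDeque<>();
        source.discover(tasks::add);
        List<T> records = new ArrayList<>();
        byte[] task;
        while ((task = tasks.poll()) != null) {
            source.prepare(task);
            T record;
            while ((record = source.readNext()) != null) {
                records.add(record);
            }
        }
        return records;
    }
}
```

Calling runOnce repeatedly gives the repeating cycle; separating discovery from collection is what allows discovered tasks to outlive a failed collector.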
Thank You