PIP-261: Restructure Getting Started section #19912
Comments
Hi Asaf, thanks for proposing this improvement. Generally it looks good. I support the motivation.
|
Thanks for the feedback @momo-jun.
Getting Started Guide
Then each section will have the heading it contains per the TOC (depth level 3 will be H1, ...). Regarding your second question on what to do with existing Getting Started section. "Run a Pulsar cluster locally with Docker Compose" is actually missing
|
Thanks for the further explanation. Adding a branch mode for Now I only have one concern - the structure of the TOC is not MECE and might be difficult to understand.
And logically, |
Your feedback is much appreciated and straight to the point. How about we name the main headings as below?
|
Hi @asafm Thanks for your awesome proposal. The real-world examples are great additions to the docs!
Issues
There are some issues in the current proposal:
Solutions
To resolve the issues above, I would suggest that:
Benefits
This solution: (1) Highlights the roles and gives them what they need clearly. Nothing missing or duplicated (MVP). (2) Makes short headings possible. Besides, the common docs (e.g., concepts, references), which should be reused, can be linked richly from the 3 guides.
Examples
|
Thanks for the detailed reply! Regarding the suggestion of moving the getting started for each role into its own sub-section of a bigger guide (developer, operator): I was thinking about it.
So, when I think about it, in my opinion it's confusing to have two contradictory sections in the same guide: we'll have a Getting Started section which is basically a tutorial. So the people who like to read first and do later will be confused - "so we're supposed to get started here, but what's going on? I see code here, no no no. I want to understand first. What's going on here?".
On the other hand, the people who like to do first and read later will not search for the getting started inside a Developer Guide. For them, a Developer's Guide is a big scary book, filled with way too many details. If you ask them, all they want is tutorials, from the getting started ones to more complicated ones. So I imagined having a section in the docs named Tutorials that contains exactly that, grouped by role (developer, operator). So from that perspective I prefer to have:
Regarding the second suggestion of having sub-pages of docs: you mean each guide will have its own "doc site"? Regarding
Why do you think that if the side bar has: then people will read them one after another? I do agree the titles are too lengthy.
|
Hi @asafm Thanks for your detailed explanations! I understand your points, and I'm trying to make the learning path clearer, simpler, and more direct for each role.
Reasons for designing 3 guides
1. Give prominent directions for users
Suppose that you're at a fork in the road: it's easiest to choose a way if a sign indicates the direction. This is the same for doc users. Whatever the user archetypes ("doing" or "learning") are, the most important thing is that they're seeking solutions to resolve issues based on their roles. The roles are the signs. So if we design the doc IA as below, users just need to choose one way based on their roles and finish the rest of the journey. There is nothing else they need to take into consideration. It's simple, clear, and direct.
2. Provide required minimal info for users
Clarification: "read them all by sequence" means "all roles need to glance over all 3 headings (even though they are just interested in one and click it later)" rather than "read them (docs) one after another". But if we design 3 guides respectively:
In this way, we provide the required info for each role at a minimal amount. Users will like it because it:
|
I wholeheartedly agree that the role-based learning paths are a great idea for future iterations of the documentation. In the short-term, it's a quick win to incrementally update the GSG. I suggest we title the first one "quickstart" and also link to it from GS menu on home page. And then call the others what they are: tutorials. WDYT?
Getting Started
|
Hi @asafm! Thanks for starting this thread. I reviewed this proposal in two aspects.
Content
For three journeys in the proposal, we have contents for two of them:
The closest pages for getting started with operations are under the "Administration" chapter https://pulsar.apache.org/docs/2.11.x/administration-zk-bk/, while we don't have a portal page or getting started page.
Structure
The "Get Started" chapter is located at the top of the sidebar, and it should be fine. The "Tutorial" chapter is somewhat hard to find, so we may set up some links or refactor the content and merge it into the Get Started chapter. For reference, grpc-java has a Quickstart page to run a very simple demo and then a "Basic tutorial" page that talks about every basic concept. The operations getting started content still needs to be written, and we may prepend it as the first item of the "Administration" chapter. |
Ok. I'll try to combine the suggestion made above. How about we'll have 3 headings as below:
The quick start will contain the content I've placed under "Consume and Produce messages using the CLI". The main idea: give any role the ability to "feel" Pulsar locally, using the CLI. The Developer Guide will be, over time, a comprehensive guide, like a book, for learning Pulsar, targeted at Developers. Its Getting Started section will contain what is placed under "Developer Getting Started", mainly aimed at people who wish to learn by "hand", as I explained in previous comments and in the PIP. Same with the Operator Guide, but for Operators (DevOps). @tisonkun @D-2-Ed
The CLI journey - I plan to take the content from all three, but simply structure it differently: based on the revised solution I wrote, it will be located under Quick-start. The (1) will contain subsections to start it locally using a downloaded binary or Docker, both in stand-alone mode, or using Docker Compose as a complete cluster (including ZK, BK). Today, it's copied and pasted across each flavor of starting Pulsar. So in summary, I plan to re-use the existing content, and mainly restructure it. Regarding the Developer Guide / Getting Started: you mentioned "https://pulsar.apache.org/docs/2.11.x/how-to-landing/". This gives you a broken-up tutorial (not one with steps). I hope I answered all of your comments @tisonkun @D-2-Ed @Anonymitaet. |
@asafm @D-2-Ed thanks for your explanations! Record some discussions here for further learning:
|
I've updated the PIP according to all comments. |
Thanks for your updates @asafm! I believe it's good to go for a vote now. |
@asafm Thanks for your proposal. From the engineering side, the new document structure matches a beginner's reading behavior. I like learning by doing and understanding the key concepts in practice. The getting started section works like a book with a real-world example to show which cases Pulsar can work for and how it works. Our current website divides the content into several parts, and it's a little hard for beginners to link them together on a first reading. As for the discussion of creating 3 clear guides for 3 roles, I think it may not be so important in the Getting Started section. The Getting Started section aims to provide the basic knowledge of Pulsar for all the roles. In fact, that knowledge is the common foundation for all 3 roles. For the concept part, I suggest doing some comparisons between different concepts, such as different subscription types.
Overall LGTM. I think we can start the vote. |
The issue had no activity for 30 days, mark with Stale label. |
This will be in motion soon |
The issue had no activity for 30 days, mark with Stale label. |
My plan - TOC - for the documentation site:
Quick-start Guides
💡 The basic premise of a quick-start guide is to cater to those who want to learn by “hand”, thus we’ll learn by running a ready-made application, or building it piece by piece. We’ll have a couple of applications to showcase several features (e.g. streaming, job-queue)
Developer Guide
Operator Guide
Contributor Guide
|
Background
As PIP-98 explained, the Pulsar documentation site today is built like an encyclopedia. New and existing users alike are overwhelmed by it. Without a clear path per role (developer / DevOps / …), they resort to skim-reading or reading it all to fit the pieces of the puzzle together and form a complete picture of the knowledge they need.
New users usually start with the Getting Started section, which today is mainly focused on starting Pulsar on your development computer in several ways, and then test drive it by publishing and consuming messages using the CLI. It lacks a brief intro into subjects and terminology used throughout that section.
New users approaching a subject for the first time are mainly divided into two types by learning method:
Today, the people who learn by reading are forced to read the entire Pulsar documentation site and fit the pieces together, which is an immensely high bar for newcomers. The ones learning by example don’t have any examples in today’s getting started section and are forced to google their way around many sites until they get their answers.
PIP-98, among other things, explained we should have several guides:
The people that learn by reading, in the future, will use the Developer or Operator guide, as it will be their “book” for it. The people who learn by doing will use the new getting started section we aim to present here, catering to both developers and operators (also referred to as SREs, Infrastructure, DevOps roles).
This PIP is focused on providing a new structure (table of contents) for the Getting Started Guide.
Goal
Table of Contents
1. Quickstart
In this section, we will let the users, either a developer or DevOps (operator) role, “feel” Pulsar using the command line. First, we’ll present two ways to start Pulsar in stand-alone mode (which includes BK and ZK all within a single process) - by downloading a binary and running it or by issuing a single `docker run` command. We’ll also present a way to start Pulsar in cluster mode, which includes a process for each component, using Docker Compose. Then we’ll continue by starting a producer, which will produce a message every 5 seconds, and in another terminal window, a consumer displaying those messages. We’ll utilize the Pulsar shell scripts for that, either directly if the user downloaded them or via `docker exec`.
1.1 Step 1: Start Pulsar locally
1.1.1. Standalone mode
Here we’ll explain the standalone mode and explain two ways to start pulsar on your development machine. In each section, we’ll show how to view the logs to check if Pulsar started ok.
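As a complement to reading the logs, a readiness check can be scripted. The following is only a minimal sketch, not something the PIP prescribes: it assumes a standalone Pulsar listening on its usual default ports (6650 for the binary protocol, 8080 for the HTTP admin endpoint) on `localhost`, and the helper names `port_is_open` and `pulsar_looks_up` are hypothetical. An open port does not prove the broker is healthy, so the logs remain the authoritative check.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def pulsar_looks_up(host: str = "localhost", ports=(6650, 8080)) -> bool:
    """Assumption: 6650 and 8080 are the ports a default standalone exposes."""
    return all(port_is_open(host, port) for port in ports)


if __name__ == "__main__":
    print("Pulsar reachable:", pulsar_looks_up())
```

If the ports were remapped in the `docker run` command, pass the mapped ports through the `ports` argument instead.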
1.1.1.1 Using release binary
1.1.1.1.1. Downloading
1.1.1.1.2. Running
1.1.1.2. Using Docker
1.1.2. Cluster mode (Docker Compose)
Here we’ll reuse the content we already have on the site showing how to start a Pulsar cluster locally using Docker Compose.
1.2. Step 2: Publish and Consume messages using the CLI
1.2.1. Publish messages
Here we will explain how to use the CLI bundled with pulsar to produce a message every 5 seconds. Here we’ll take the opportunity to explain what a topic is briefly.
We’ll use tabs to display code running the CLI since, if you downloaded a binary, it’s one way and if you have used Docker then we’ll issue a `docker exec` command.
1.2.2 Consume messages
Here we will explain how to use the CLI bundled with Pulsar to consume those messages and display them to the standard output.
Here we will take the opportunity to explain what a subscription is briefly.
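The two "tabs" described above differ only in how the same CLI invocation is wrapped. The sketch below illustrates that idea by building the argument lists for both flavors. It assumes the `bin/pulsar-client` script with its `produce`/`consume` subcommands and the `--messages`, `-n`, `-r`, and `-s` flags; these should be verified with `bin/pulsar-client --help` for the release in use. The container name `pulsar-standalone`, the topic, and the subscription name are hypothetical placeholders.

```python
from typing import List


def produce_args(topic: str, message: str, count: int, rate_per_sec: float) -> List[str]:
    """Arguments for sending `message` `count` times at a limited rate."""
    return ["produce", topic, "--messages", message,
            "-n", str(count), "-r", str(rate_per_sec)]


def consume_args(topic: str, subscription: str) -> List[str]:
    """Arguments for consuming indefinitely (-n 0) on a named subscription."""
    return ["consume", topic, "-s", subscription, "-n", "0"]


def cli_command(args: List[str], mode: str = "binary",
                container: str = "pulsar-standalone") -> List[str]:
    """Wrap the CLI call for a release binary or for a Docker container."""
    base = ["bin/pulsar-client"] + args
    if mode == "binary":
        return base
    if mode == "docker":
        # Hypothetical container name; use the one given to `docker run`.
        return ["docker", "exec", "-it", container] + base
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    # One message every 5 seconds corresponds to a rate of 0.2 messages/second.
    print(" ".join(cli_command(produce_args("my-topic", "hello", 10, 1 / 5))))
    print(" ".join(cli_command(consume_args("my-topic", "my-subscription"), "docker")))
```

Keeping the subcommand arguments separate from the wrapper mirrors the tabbed layout: the Pulsar part of the command stays identical across tabs.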
1.4. Stopping Pulsar
Contains short steps on how to stop Pulsar, be it a release binary, Docker, or Docker Compose, using tabs for the different ways.
2. Developer Guide
This will be a full-blown guide for developers. For now we’re adding the first section: Getting Started.
2.1. Getting Started
This section is focused on developers wanting to have an introduction to Pulsar - basic level - by doing rather than by reading. Some people prefer to learn by doing and “feeling” it in their hands. Developers who prefer to learn by reading will skip and go straight to an Overview section.
We will have 2 tutorials, each featuring a ready-made application (micro-service) showcasing pulsar features and concepts (the most basic ones). Each tutorial will have a link to a repository containing the full example if they just want to see the complete code or just run the example. The tutorial will be a step-by-step explanation of the example app and basically building it in steps.
The tutorials were chosen because, in my opinion, they cover the most popular use cases for Pulsar or any other messaging system. For other cases, you will resort to the Tutorials section (explained briefly at the beginning of the PIP), which contains more use cases that are less popular.
Since Pulsar SDK is available in several languages, we’ll write the same application first in Java and eventually in all languages Pulsar supports. Each directory in the repository will be dedicated to a single language. Each code snippet will have tabs allowing you to choose which language to see this code snippet for.
2.1.1 Basic Job Queue
In this section, we’ll present a ready-made app that showcases Pulsar's ability to be used as a Job Queue. In our example, it will be a micro-service in charge of video encoding. Each message in the topic represents an encoding task to be done (download the file from S3, encode it, then upload it back to S3).
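To make "each message represents an encoding task" concrete, here is one possible shape for such a message. It is a sketch under assumptions: the PIP does not define the payload, so the `EncodingTask` class, its field names, and the JSON encoding are all hypothetical. The only property relied on is that a message body is ultimately a sequence of bytes.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class EncodingTask:
    """Hypothetical job description carried by one message in the topic."""
    source_uri: str   # where the worker downloads the raw video from
    target_uri: str   # where the worker uploads the encoded result
    codec: str        # requested output codec

    def to_bytes(self) -> bytes:
        """Serialize to UTF-8 JSON so it can be sent as a message body."""
        return json.dumps(asdict(self), sort_keys=True).encode("utf-8")

    @classmethod
    def from_bytes(cls, payload: bytes) -> "EncodingTask":
        """Rebuild the task on the consumer (worker) side."""
        return cls(**json.loads(payload.decode("utf-8")))


if __name__ == "__main__":
    task = EncodingTask("s3://videos/raw/cat.mov", "s3://videos/encoded/cat.mp4", "h264")
    print(EncodingTask.from_bytes(task.to_bytes()))
```

A worker would decode the payload, perform the download, encode, and upload steps, and only then acknowledge the message, so that an unfinished task can be redelivered.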
We’ll explain:
2.1.1.1 Prerequisite: running Pulsar in Standalone mode
Link to (1), where we show how to start Pulsar locally.
We prefer that option to Testcontainers since this library doesn’t exist in all languages yet.
2.1.1.2…2.1.1.x :
2.1.2. Event Sourcing example app
This section will showcase partitioned topics, Failover subscriptions, Key-shared subscriptions, and scaling producers.
The app environment is a beer factory. It has a warehouse micro-service for managing the warehouse. It writes the current stock level as a message into a partitioned topic each time the stock increases or decreases inside the physical warehouse. The key is the beer catalog number, and the message is the stock level in a number.
Another micro-service, Inventory, exposes a REST interface to retrieve current stock levels per beer catalog number. It consumes the stock level messages and persists them to Cassandra (key = beer catalog name).
At first, the rate of changes and the number of beers in the catalog were small. The beer factory owners started with the partitioned topic with one single partition and a Failover subscription since they had to update the inventory levels in Cassandra in order with respect to the same beer catalog number.
Once the beer factory got bigger, more changes were introduced, and more beers were added to the catalog. They were bottlenecked by the update to Cassandra, so they scaled Cassandra, but the bottleneck was now at the consumer, so they wanted to scale out the Inventory micro-service. Hence they switched to a Key-shared subscription to keep updates ordered per beer catalog number.
As they got even bigger, the bottleneck was now the broker. They increased the number of partitions and made sure they used a partitioner that writes the same key to the same partition.
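The scaling story above rests on one invariant: every update for a given beer catalog number must land in the same place, so that updates for that key stay in order. The toy model below sketches this with a hash-modulo rule. It is an illustration only: Pulsar's actual partition routing and Key-shared dispatch use their own hashing schemes, and `zlib.crc32`, `bucket_for_key`, and `route` are stand-ins chosen for this example.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def bucket_for_key(key: str, num_buckets: int) -> int:
    """Map a key to a bucket (a partition, or a consumer) deterministically."""
    # crc32 is a stand-in hash; unlike Python's hash(), it is stable across runs.
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def route(events: Iterable[Tuple[str, int]], num_buckets: int) -> Dict[int, List[Tuple[str, int]]]:
    """Distribute (catalog_number, stock_level) events while keeping arrival order."""
    buckets: Dict[int, List[Tuple[str, int]]] = defaultdict(list)
    for key, stock_level in events:
        buckets[bucket_for_key(key, num_buckets)].append((key, stock_level))
    return dict(buckets)


if __name__ == "__main__":
    events = [("beer-1", 10), ("beer-2", 5), ("beer-1", 9), ("beer-1", 12)]
    for bucket, items in sorted(route(events, 4).items()):
        print(bucket, items)
```

The same rule can be read twice: with buckets as partitions it models a key-based partitioner on the producer side, and with buckets as consumers it models how a Key-shared subscription keeps one key on one consumer.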
This example will include a brief explanation about:
3. Operator Guide
3.1. Getting Started
This section is aimed at a person with an operator role (sometimes referred to as Infrastructure / SRE / DevOps), who wants to get started with Pulsar. This role implies different needs compared to the developer getting started. Operators want to try out Pulsar on their k8s cluster (whether minikube or a test k8s cluster) as opposed to Docker Compose or running a binary. The learning mostly focuses on how to operate it: monitoring, security, and handling failure scenarios.
We’ll start by deploying Pulsar, BK, and ZK using helm charts to k8s and test driving by publishing and consuming messages using the CLI.
We’ll then proceed to deploy a demo application, with one service generating data constantly and writing to Pulsar and the other consuming it and increasing a metric to showcase it. It will be deployed alongside a Prometheus instance for collecting metrics and Grafana with bundled dashboards for Pulsar and the demo app.
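The "consume and increase a metric" part of the demo app can be pictured as a counter that Prometheus scrapes. The sketch below hand-rolls such a counter and renders it in the Prometheus text exposition format. It is an assumption-laden illustration: the PIP does not specify the demo app, the metric name `demo_messages_consumed_total` is hypothetical, and a real service would normally use an official Prometheus client library rather than this class.

```python
class Counter:
    """Minimal monotonically increasing counter, for illustration only."""

    def __init__(self, name: str, help_text: str) -> None:
        self.name = name
        self.help_text = help_text
        self.value = 0

    def inc(self, amount: int = 1) -> None:
        """Increase the counter; counters must never go down."""
        if amount < 0:
            raise ValueError("a counter can only increase")
        self.value += amount

    def render(self) -> str:
        """Render in the Prometheus text exposition format."""
        return (
            f"# HELP {self.name} {self.help_text}\n"
            f"# TYPE {self.name} counter\n"
            f"{self.name} {self.value}\n"
        )


if __name__ == "__main__":
    consumed = Counter("demo_messages_consumed_total", "Messages consumed by the demo app")
    for _ in range(3):   # stands in for three received messages
        consumed.inc()
    print(consumed.render(), end="")
```

Serving this text on an HTTP endpoint is what would let the bundled Prometheus instance collect it and Grafana chart it next to the Pulsar dashboards.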
Next, we’ll see if the demo app is working and learn a bit about pulsar using the ready-made Pulsar and BK dashboards.
Next, we’ll walk through several scenarios to showcase pulsar features:
Sidebar
The sidebar will look like this:
Links
Discussion: https://lists.apache.org/thread/p8d8ks2ygqnq53oxqczxg2mtpf932wpg
Vote: https://lists.apache.org/thread/95p5mn873d6d3lsk5kgfks4n6x07x5pq