Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PIP-149] Making the REST Admin API fully async #14365

Closed
Technoboy- opened this issue Feb 18, 2022 · 10 comments
Closed

[PIP-149] Making the REST Admin API fully async #14365

Technoboy- opened this issue Feb 18, 2022 · 10 comments
Assignees
Labels
Stale type/enhancement The enhancements for the existing features or docs. e.g. reduce memory usage of the delayed messages type/PIP

Comments

@Technoboy-
Copy link
Contributor

Technoboy- commented Feb 18, 2022

Co-author: @mattisonchao @Technoboy-

Motivation

The Rest API was originally designed to be implemented asynchronously, but with the iteration of functions, some synchronous implementations were added, resulting in many asynchronous methods called synchronous implementations. Also, many synchronous calls do not add timeouts. This greatly reduces concurrency, user operations, and experience.
In order to prevent more problems, and improve code readability and maintainability, we intend to refactor these synchronous calls and standardize the implementation of the API.

Related discussion: https://lists.apache.org/thread/pkkz2jgwtzpksp6d4rdm1pyxzb3z6vmg

Goals

  • Try to avoid synchronous method calls in asynchronous methods.
  • Async variable (AsyncResponse) is placed in the first parameter position.
  • Async variable (AsyncResponse) cannot be substituted into method implementations.
  • Add more tests and increase the coverage.

Modification

  1. Avoid synchronous method calls in asynchronous methods.

    protected void internalDeleteNamespace(boolean authoritative) {
       validateTenantOperation(namespaceName.getTenant(), TenantOperation.DELETE_NAMESPACE);
       validatePoliciesReadOnlyAccess();
    }
    

    Suggest to do like this:

    protected CompletableFuture<Void> internalDeleteNamespace(boolean authoritative) {
        return validateTenantOperationAsync(namespaceName.getTenant(), TenantOperation.DELETE_NAMESPACE)
       .thenCompose(__ -> validatePoliciesReadOnlyAccessAsync());
    }
    
  2. Async variable (AsyncResponse) is placed in the first parameter position

    public void deleteNamespace(@Suspended final AsyncResponse asyncResponse, 
             @PathParam("tenant") String tenant,
             @PathParam("namespace") String namespace,
             @QueryParam("force") @DefaultValue("false") boolean force,
             @QueryParam("authoritative") @DefaultValue("false") boolean authoritative) {
    
    
  3. Async variable (AsyncResponse) cannot be substituted into method implementations

    internalCreateNonPartitionedTopicAsync(asyncResponse, authoritative, properties);
    

    Suggest to do like this:

    internalCreateNonPartitionedTopicAsync(authoritative, properties)
            .thenAccept(__ -> asyncResponse.resume(Response.noContent().build()))
            .exceptionally(ex -> {
                resumeAsyncResponseExceptionally(asyncResponse, ex.getCause());
                return null;
            });
    
  4. Exception
    Some methods will validate ownership, like namespace ownership, topic ownership, so will throw REDIRECT exception. we need to filter this exception and not print log.

 internalCreateNonPartitionedTopicAsync(authoritative, properties)
        .thenAccept(__ -> asyncResponse.resume(Response.noContent().build()))
        .exceptionally(ex -> {
            if (!isRedirectException(ex)) {
                 log.error("Failed to xxx");
            }
            resumeAsyncResponseExceptionally(asyncResponse, ex.getCause());
            return null;
        });

Task tracking

In order to unify the modification and track the modified part, it's better to open an issue to track, like #14353, #14013, #13854.

@Technoboy- Technoboy- added the type/enhancement The enhancements for the existing features or docs. e.g. reduce memory usage of the delayed messages label Feb 18, 2022
@Technoboy- Technoboy- self-assigned this Feb 18, 2022
@Technoboy-
Copy link
Contributor Author

Technoboy- commented Feb 18, 2022

@codelipenghui @lhotari @eolivelli @BewareMyPower @nodece @mattisonchao
I have opened this issue to discuss first then decide to make to PIP

@lhotari
Copy link
Member

lhotari commented Feb 18, 2022

@codelipenghui @lhotari @eolivelli @BewareMyPower @nodece @mattisonchao I have opened this issue to discuss first then decide to make to PIP

Thank you @Technoboy- . Good work. I shared my thoughts and some expectations in yesterdays Pulsar Community meeting, you can find the discussion and @merlimat's responses of the rationale for the changes. It would be good to document the arguments that Matteo explained why mixing async and blocking (sync) causes issues.
Please check the last 35 minutes from the recording of the meeting (it's not available yet, @merlimat could you please add the recording on https://github.com/apache/pulsar/wiki/Community-Meetings#recordings ?)
I'll be able to follow up at the beginning of March the next time. There's no need to wait for me to reply.

While digging into the Jetty settings in Pulsar, I noticed a few gaps in backpressure handling, which are relevant when there are more requests which are handled asynchronously. I have a draft PR #14353 . I'll resume work on that in March. The current values for queue sizes and thread pool sizes are just guesses. Most likely we will use much lower values to prevent the broker taking in too much work in parallel. That's the essence of back pressure that it limits the in progress work so that incoming requests also slow down. Currently that is not the case since the thread pool queue can grow in an unbounded way (LinkedBlockingQueue is used under the covers). There are several kludges that attempt to add backpressure, but they aren't very effective in Pulsar currently. #14353 will help address backpressure issues in Pulsar Admin API. These problems will come more evident when there are more APIs which are implemented using asynchronous Servlet API. /cc @merlimat @codelipenghui

@codelipenghui
Copy link
Contributor

Is it better to combine this one and #14353 want to do to one PIP such as PIP-142?
WDYT? @Technoboy- @lhotari @merlimat

The advantage is that we can provide a clearer image to the community that what we want to improve for the REST API part.

@Technoboy-
Copy link
Contributor Author

Is it better to combine this one and #14353 want to do to one PIP such as PIP-142? WDYT? @Technoboy- @lhotari @merlimat

The advantage is that we can provide a clearer image to the community that what we want to improve for the REST API part.

Yes, agree.

@eolivelli
Copy link
Contributor

I agree with this proposal and @codelipenghui 's proposal

One formality...
If this is a discussion it should stay on the mailing list.

This looks like a PIP doc.
I suggest to rename this to a PIP and let the discussion happen mostly on the dev@ list

@Technoboy- Technoboy- changed the title [DISCUSS] Standardize Admin REST API Feb 25, 2022
@mattisonchao
Copy link
Member

Hi, @lhotari @Technoboy-

Is there any news in this PIP?

@Technoboy- Technoboy- changed the title [PIP-142] Standardize Admin REST API Mar 16, 2022
@Technoboy- Technoboy- changed the title [PIP-143] Standardize Admin REST API Mar 18, 2022
@Technoboy- Technoboy- changed the title [PIP-144] Standardize Admin REST API Mar 18, 2022
@Technoboy- Technoboy- changed the title [PIP-149] Standardize Admin REST API Mar 23, 2022
@lhotari
Copy link
Member

lhotari commented Apr 11, 2022

When more and more asynchronous request handling is happening, it is importance of having a backpressure handling that prevents too much work entering the system. Backpressure handling is broken at the moment. Please review PR #14353. It is fixes bugs in the Admin API / Jetty backpressure configuration.

@frankjkelly
Copy link
Contributor

Great point @lhotari I would also suggest that thread pool monitoring is also critical to understand

  • How many threads are in use
  • Are any threads in BLOCKED state or taking a long time
    So we'll probably need to make sure the appropriate metrics are exposed as well as ensure there is documentation on the monitoring of these threads and what to do if thread pools are exhausted, threads are blocked etc.
RobertIndie added a commit that referenced this issue Jul 13, 2022
Master Issue: #14365

### Motivation

Please see #14365

### Modifications

* Make Namespaces.deleteNamespaceBundle async
* Combine internalDeleteNamespaceBundle
* Make removeOwnedServiceUnit async
gaozhangmin pushed a commit to gaozhangmin/pulsar that referenced this issue Jul 14, 2022
…#16287)

Master Issue: apache#14365

### Motivation

Please see apache#14365

### Modifications

* Make Namespaces.deleteNamespaceBundle async
* Combine internalDeleteNamespaceBundle
* Make removeOwnedServiceUnit async
wuxuanqicn pushed a commit to wuxuanqicn/pulsar that referenced this issue Jul 14, 2022
…#16287)

Master Issue: apache#14365

### Motivation

Please see apache#14365

### Modifications

* Make Namespaces.deleteNamespaceBundle async
* Combine internalDeleteNamespaceBundle
* Make removeOwnedServiceUnit async
@Technoboy-
Copy link
Contributor Author

Close this as we have done all the tracking works

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Stale type/enhancement The enhancements for the existing features or docs. e.g. reduce memory usage of the delayed messages type/PIP
6 participants