The major breaking change is that Channel::execute no longer internally
spawns RPC handlers, because it is no longer possible to place a Send
bound on the return type of Serve::serve. Instead, Channel::execute
returns a stream of RPC handler futures.
Service authors can reproduce the old behavior by spawning each response
handler (the compiler knows whether or not the futures can be spawned;
it's just that the bounds can't be expressed generically):
```rust
channel.execute(server.serve())
    .for_each(|rpc| async move { tokio::spawn(rpc); })
    .await;
```
This allows plugging in horizontal functionality, such as authorization,
throttling, or latency recording, that should run before and/or after
execution of every request, regardless of the request type.
The tracing example is updated to show off both client stubs as well as
server hooks.
As part of this change, there were some changes to the Serve trait:
1. Serve's output type is now a Result<Response, ServerError>.
Serve previously did not allow returning ServerErrors, which
prevented using Serve for horizontal functionality like throttling or
auth. Now that it can fail with a ServerError, Serve is a more natural
integration point for horizontal capabilities.
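To make the integration point concrete, here is a minimal sketch of a throttling wrapper. The trait below is a simplified, synchronous stand-in for tarpc's Serve (the real trait is async and takes a request context), and Throttle, its fields, and the error message are hypothetical:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical stand-ins for tarpc's types.
#[derive(Debug, PartialEq)]
struct ServerError(String);

trait Serve {
    type Req;
    type Resp;
    fn serve(&self, req: Self::Req) -> Result<Self::Resp, ServerError>;
}

// A throttling wrapper: rejects requests once a concurrency limit is hit.
// Because Serve's output is a Result, the wrapper can fail with a
// ServerError without knowing anything about the inner service.
struct Throttle<S> {
    inner: S,
    in_flight: AtomicUsize,
    max_in_flight: usize,
}

impl<S: Serve> Serve for Throttle<S> {
    type Req = S::Req;
    type Resp = S::Resp;

    fn serve(&self, req: Self::Req) -> Result<Self::Resp, ServerError> {
        if self.in_flight.fetch_add(1, Ordering::SeqCst) >= self.max_in_flight {
            self.in_flight.fetch_sub(1, Ordering::SeqCst);
            return Err(ServerError("too many requests; back off".into()));
        }
        let resp = self.inner.serve(req);
        self.in_flight.fetch_sub(1, Ordering::SeqCst);
        resp
    }
}
```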
2. Serve's generic Request type changed to an associated type. The
primary benefit of the generic type is that it allows one type to
impl a trait multiple times (for example, u64 impls TryFrom<usize>,
TryFrom<u128>, etc.). In the case of Serve impls, while it is
theoretically possible to contrive a type that could serve multiple
request types, in practice I don't expect that to be needed. Most
users will use the Serve impl generated by #[tarpc::service], which
only ever serves one type of request.
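The difference can be seen in a small sketch; ServeOne, EchoService, and handle are illustrative names, not tarpc's real API:

```rust
// A generic parameter lets one type implement a trait many times, as u64
// does for TryFrom<usize>, TryFrom<u128>, and so on. An associated type
// pins each implementor to exactly one choice:
trait ServeOne {
    type Req;
    fn handle(&self, req: Self::Req) -> String;
}

struct EchoService;

impl ServeOne for EchoService {
    type Req = String;
    fn handle(&self, req: String) -> String {
        req
    }
}

// A second `impl ServeOne for EchoService` with `Req = u64` would be
// rejected as a conflicting implementation, whereas `u64: TryFrom<usize>`
// and `u64: TryFrom<u128>` coexist because the source type there is a
// generic parameter.
```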
Also adds a Client stub trait alias for each generated service.
Now that generic associated types are stable, it's almost possible to
define a trait for Channel that works with async fns on stable. `impl
trait in type aliases` is still necessary (and unstable), but we're
getting closer.
As a proof of concept, three more implementations of Stub are added:
1. A load balancer that round-robins requests between different stubs.
2. A load balancer that selects a stub based on a request hash, so that
the same requests go to the same stubs.
3. A stub that retries requests based on a configurable policy.
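As a sketch of the first of these, here is a round-robin balancer over a simplified, synchronous stand-in for the Stub trait (the real tarpc Stub is async; RoundRobin and the other names here are hypothetical):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical simplified Stub; tarpc's real Stub trait is async.
trait Stub {
    type Req;
    type Resp;
    fn call(&self, req: Self::Req) -> Self::Resp;
}

// A load balancer that round-robins requests between the wrapped stubs.
struct RoundRobin<S> {
    stubs: Vec<S>,
    next: AtomicUsize,
}

impl<S: Stub> Stub for RoundRobin<S> {
    type Req = S::Req;
    type Resp = S::Resp;

    fn call(&self, req: Self::Req) -> Self::Resp {
        // Pick the next stub in order, wrapping around at the end.
        let i = self.next.fetch_add(1, Ordering::Relaxed) % self.stubs.len();
        self.stubs[i].call(req)
    }
}
```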
The "serde/rc" feature is added to the "full" feature because the Retry
stub wraps the request in an Arc, so that the request is reusable for
multiple calls.
Server implementors commonly need to operate generically across all
services or request types. For example, a server throttler may want to
return errors telling clients to back off, which is not specific to any
one service.
* Make client::InFlightRequests generic over result.
Previously, InFlightRequests required the client response type to be a
server response. However, this prevented injection of non-server
responses: for example, if the client fails to send a request, it should
complete the request with an IO error rather than a server error.
* Gracefully handle client-side send errors.
Previously, a client channel would immediately disconnect when
encountering an error in Transport::try_send. One kind of error that can
occur in try_send is message validation, e.g. validating a message is
not larger than a configured frame size. The problem with shutting down
the client immediately is that debuggability suffers: it can be hard to
understand what caused the client to fail. Also, these errors are not
always fatal, as with frame size limits, so complete shutdown was
extreme.
By bubbling up errors, it's now possible for the caller to
programmatically handle them. For example, the error could be walked
via anyhow::Error:
```
2023-01-10T02:49:32.528939Z WARN client: the client failed to send the request
Caused by:
0: could not write to the transport
1: frame size too big
```
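The same walk can be sketched with only the standard library via std::error::Error::source (anyhow's Error exposes the equivalent chain); the error types below are hypothetical stand-ins for the client's layered errors:

```rust
use std::error::Error;
use std::fmt;

// Hypothetical innermost error: the frame size validation failure.
#[derive(Debug)]
struct FrameTooBig;
impl fmt::Display for FrameTooBig {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "frame size too big")
    }
}
impl Error for FrameTooBig {}

// Hypothetical transport-level error wrapping the cause.
#[derive(Debug)]
struct TransportError(FrameTooBig);
impl fmt::Display for TransportError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "could not write to the transport")
    }
}
impl Error for TransportError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.0)
    }
}

// Walk the chain, like the numbered "Caused by" lines in the log above.
fn causes(mut err: &dyn Error) -> Vec<String> {
    let mut chain = Vec::new();
    while let Some(source) = err.source() {
        chain.push(source.to_string());
        err = source;
    }
    chain
}
```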
* Some follow-up work: right now, read errors will bubble up to all pending RPCs. However, on the write side, only `start_send` bubbles up. `poll_ready`, `poll_flush`, and `poll_close` do not propagate back to pending RPCs. This is probably okay in most circumstances, because fatal write errors likely coincide with fatal read errors, which *do* propagate back to clients. But it might still be worth unifying this logic.
---------
Co-authored-by: Tim Kuehn <tikue@google.com>
Example tarpc service that encodes messages with bincode and sends them over a TLS-over-TCP transport.
Certs were generated with openssl 3 using https://github.com/rustls/rustls/tree/main/test-ca
New dependencies:
- tokio-rustls to set up the TLS connections
- rustls-pemfile to load certs from .pem files
mem::forget is a dangerous tool, and it was being used carelessly for
things that have safer alternatives. There was at least one bug where a
cloned tokio::sync::mpsc::UnboundedSender used for request cancellation
was being leaked on every successful server response, so its refcounts
were never decremented. Because these are atomic refcounts, they'll wrap
around rather than overflow when reaching the maximum value, so I don't
believe this could lead to panics or unsoundness.
This trait fn returns a private type, which means it's useless for
anyone using the Channel.
Instead, add an inert (now-public) ResponseGuard to TrackedRequest that,
when taken out of the ManuallyDrop, ensures a Channel's request state is
cleaned up. It's preferable to make ResponseGuard public instead of
RequestCancellations because it's a smaller API surface (no public
methods, just a Drop fn) and harder to misuse, because it is already
associated with the correct request ID to cancel.
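A minimal sketch of the pattern, with hypothetical names and a std mpsc channel standing in for the cancellation path:

```rust
use std::mem::ManuallyDrop;
use std::sync::mpsc;

// A hypothetical guard: when dropped, it asks the channel to clean up the
// state for its request ID. No public methods, just a Drop impl, so it is
// hard to misuse.
struct ResponseGuard {
    request_id: u64,
    cleanup: mpsc::Sender<u64>,
}

impl Drop for ResponseGuard {
    fn drop(&mut self) {
        // Ignore send errors: the channel may already be gone.
        let _ = self.cleanup.send(self.request_id);
    }
}

// The guard sits inert inside ManuallyDrop until the response path takes
// it out; only then does its Drop run and trigger cleanup.
struct TrackedRequest {
    response_guard: ManuallyDrop<ResponseGuard>,
}

impl TrackedRequest {
    fn complete(mut self) {
        // Taking the guard out of the ManuallyDrop arms its Drop; the
        // ManuallyDrop itself never drops its (now moved-out) contents.
        let guard = unsafe { ManuallyDrop::take(&mut self.response_guard) };
        drop(guard); // cleanup message sent here
    }
}
```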
## Problem
Library users might get stuck or run into issues while using tarpc because of incompatible third-party libraries, in particular tokio_serde and tokio_util.
## Solution
This PR does the following:
1. Re-exports tokio_serde as part of the serde-transport feature, because end users import it to use some serde-transport APIs.
2. Updates third-party packages to their latest releases and fixes the issues resulting from that.
## Important Notes
The tokio_util 0.7.3 DelayQueue::poll_expired API changed [0]; therefore, InFlightRequests::poll_expired now returns Poll<Option<u64>>.
[0] https://docs.rs/tokio-util/latest/tokio_util/time/delay_queue/struct.DelayQueue.html#method.poll_expired
When an InFlightRequest is dropped before response completion, request
data in the Channel persists until either the request expires or the
client cancels the request. In rare cases, requests with very large
deadlines could clog up the Channel long after request processing
ceases.
This commit adds a drop hook to InFlightRequest so that if it is dropped
before execution completes, a cancellation message is sent to the
Channel so that it can clean up the associated request data.
This only works when using `InFlightRequest::execute` or
`Channel::execute`. However, users of raw `Channel` have access
to the `RequestCancellation` handle via `Channel::request_cancellation`,
so they can implement a similar method if they wish to manually clean up
request data.
Note that once a Channel's request data is cleaned up, that request can
never be responded to, even if a response is produced afterward.
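A sketch of the drop-hook pattern, using hypothetical names and a std mpsc channel in place of the Channel's cancellation handle: completing the request defuses the hook, while dropping the request early sends a cancellation.

```rust
use std::sync::mpsc;

// Hypothetical simplified in-flight request; the real type also carries
// the request itself and a response channel.
struct InFlightRequest {
    request_id: u64,
    cancellation: Option<mpsc::Sender<u64>>,
}

impl InFlightRequest {
    fn execute(mut self) {
        // ... handle the request and send the response ...
        // Completing normally defuses the drop hook.
        self.cancellation = None;
    }
}

impl Drop for InFlightRequest {
    fn drop(&mut self) {
        // Dropped before execution completed: tell the Channel to clean
        // up the associated request data.
        if let Some(tx) = self.cancellation.take() {
            let _ = tx.send(self.request_id);
        }
    }
}
```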
Fixes https://github.com/google/tarpc/issues/314
These types do nothing unless polled / .awaited.
Annotating them with #[must_use] helps prevent a common class of coding errors.
Fixes https://github.com/google/tarpc/issues/368.
Duration was previously serialized as SystemTime. However, absolute
times run into problems with clock skew: if the remote machine's clock
is too far in the future, the RPC deadline will be exceeded before
request processing can begin. Conversely, if the remote machine's clock
is too far in the past, the RPC deadline will not be enforced.
By converting the absolute deadline to a relative duration, clock skew
is no longer relevant, as the remote machine will convert the deadline
into a time relative to its own clock. This mirrors how the gRPC HTTP2
protocol includes a Timeout in the request headers [0] but the SDK uses
timestamps [1]. Keeping the absolute time in the core APIs maintains all
the benefits of today, namely, natural deadline propagation between RPC
hops when using the current context.
This serialization strategy means that, generally, the remote machine's
deadline will be slightly in the future compared to the local machine.
Depending on network transfer latency, the skew could range from
microseconds to milliseconds, or more in the worst case. Because the
deadline is not intended for high-precision scenarios, I don't view
this as problematic.
Because this change only affects the serialization layer, local
transports that bypass serialization are not affected.
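A sketch of the conversion under these assumptions (function names are illustrative): the sender serializes the remaining time-to-live relative to its own clock, and the receiver re-anchors that duration to its clock, so skew between the two clocks never enters the picture.

```rust
use std::time::{Duration, SystemTime};

// Sender side: convert the absolute deadline into time remaining
// relative to the local clock; zero if the deadline has already passed.
fn serialize_deadline(deadline: SystemTime) -> Duration {
    deadline
        .duration_since(SystemTime::now())
        .unwrap_or(Duration::ZERO)
}

// Receiver side: re-anchor the time-to-live to the local clock,
// recovering an absolute deadline for the core APIs.
fn deserialize_deadline(ttl: Duration) -> SystemTime {
    SystemTime::now() + ttl
}
```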
[0] https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md
[1] https://grpc.io/blog/deadlines/#setting-a-deadline