Previously, `Context::current` would always return a new context. Now,
it uses tokio task-local data to look for the current context. Tokio
task locals are not actually tied to a tokio executor; instead, they
provide data scoped to a future.
The basic pattern is:
```rust
let ctx = Context::new_root();
ctx.scope(async {
let ctx2 = context::current();
assert_eq!(ctx2.trace_context.span_id, ctx.trace_context.span_id);
});
```
`server::InFlightRequest::execute` uses `Context::scope` to set the
current context before executing a request, so calls to
`context::current` in request handlers will return the context provided
by the client. This does not propagate to new spawned tasks. To
propagate the client context to child tasks, the following pattern will
work:
```rust
tokio::spawn(context::current().scope(async { /* do work here */ }));
```
This commit also introduces a breaking change to Context serialization.
Previously, the deadline only serialized second-level precision. Now, it
provides full fidelity serialization to the nanosecond.
👋 hello there! I'm a fellow Googler who works on projects that leverage GitHub Actions for CI/CD. Recently I noticed a large increase in our queue time, and I've tracked it down to the [limit of 180 concurrent jobs](https://docs.github.com/en/actions/reference/usage-limits-billing-and-administration) for an organization. To help be better citizens, I'm proposing changes across a few repositories that will reduce GitHub Actions hours and consumption. I hope these changes are reasonable and I'm happy to talk through them in more detail.
- Only run GitHub Actions for pushes and PRs against the main branch of the repository. If your team uses a forking model, this change will not affect you. If your team pushes branches to the repository directly, this changes actions to only run against the primary branches or if you open a Pull Request against a primary branch.
- For long-running jobs (especially tests), I added the "Cancel previous" workflow. This is very helpful to prevent a large queue backlog when you are doing rapid development and pushing multiple commits. Without this, GitHub Actions' default behavior is to run all actions on all commits.
There are other changes you could make, depending on your project (but I'm not an expert):
- If you have tests that should only run when a subset of code changes, consider gating your workflow to particular file paths. For example, we have some jobs that do Terraform linting, but [they only run when Terraform files are changed](c4f59fee71/.github/workflows/terraform.yml (L3-L11)).
Hopefully these changes are not too controversial and also hopefully you can see how this would reduce actions consumption to be good citizens to fellow Googlers. If you have any questions, feel free to respond here or ping me on chat. Thank you!
Previously, there were two loops:
- Expired in-flight requests are polled until Pending.
- New requests are polled until Pending.
Now there is one loop that alternates between polling expired requests
and new requests. This way, neither type of action can face starvation.
There is some important logic that is easy to overlook in the client and
server channels: streams of data to write to the transport should not be
polled until the transport is known to be ready to buffer a message. In
the case that a transport's buffer is full, it needs to be flushed to
make room for more messages.
Without this logic, start_send() could return an error when the buffer
is full, which would cause the entire Channel to error out.
Due to the importance of this logic, it's now factored out into its own
method that's easier to understand: fn ensure_writeable. There is one in
the client module and and one in the server module.
The deadline-exceeded response was largely redundant, because the client
shouldn't normally be waiting for such a response, anyway -- the normal
client will automatically remove the in-flight request when it reaches
the deadline.
This also allows for internalizing the expiration+cleanup logic entirely
within BaseChannel, without having it leak into the Channel trait and
requiring action taken by the Requests struct.
- Remove unnecessary Sync and Clone bounds.
- Merge client and client::channel modules.
- Run cargo clippy in the pre-push hook.
- Put DispatchResponse.cancellation in an Option. Previously, the
cancellation logic looked to see if `complete == true`, but it's a bit
less error prone to put the Cancellation in an Option, so that the
request can't accidentally be cancelled.
- Remove some unnecessary pins/projections.
- Clean up docs a bit. rustdoc had some warnings that are now gone.
This required the breaking change of removing the Client trait. The
intent of the Client trait was to facilitate the decorator pattern by
allowing users to create their own Clients that added behavior on top of
the base client. Unfortunately, this trait had become a maintenance
burden, consistently causing issues with lifetimes and the lack of
generic associated types. Specifically, it meant that Client impls could
not use async fns, which is no longer tenable today.
Before this commit, each request future had its own timeout and would
communicate to the client Channel when a request was no longer being
listened to. Now, instead, the Channel tracks deadlines of in-flight
requests and completes requests with deadline-exceeded errors when they
expire.
This should be functionally equivalent to the previous way. It just cuts
down on the amount of two-way processing required. Unfortunately,
dropping a response future early still requires the client to send a
cancellation message to the Channel.
Before this commit, deadlines were handled by a timeout future that
wrapped each request handler. However, request handlers can be dropped
before sending a response back to the channel, so they can't be relied
on for channel state cleanup. Additionally, clients can't be relied on
to send cancellation messages. It was therefore theoretically possible
for pathological behaviors to cause an unbounded growth in orphan
request data in the Channel.
With this change, as long as requests sent have reasonable deadlines,
then the channel will be able to clean itself up. It is still possible
for requests to be sent with very large deadlines, which would prevent
the channel from cleaning itself up.
1. Renames
Some of the items in this module were renamed to be less generic:
- Handler => Incoming
- ClientHandler => Requests
- ResponseHandler => InFlightRequest
- Channel::{respond_with => requests}
In the case of Handler: handler of *what*? Now it's a bit clearer that
this is a stream of Channels (aka *incoming* connections).
Similarly, ClientHandler was a stream of requests over a single
connection. Hopefully Requests better reflects that.
ResponseHandler was renamed InFlightRequest because it no longer
contains the serving function. Instead, it is just the request, plus
the response channel and an abort hook. As a result of this,
Channel::respond_with underwent a big change: it used to take the
serving function and return a ClientHandler; now it has been renamed
Channel::requests and does not take any args.
2. Execute methods
All methods thats actually result in responses being generated
have been consolidated into methods named `execute`:
- InFlightRequest::execute returns a future that completes when a
response has been generated and sent to the server Channel.
- Requests::execute automatically spawns response handlers for all
requests over a single channel.
- Channel::execute is a convenience for `channel.requests().execute()`.
- Incoming::execute automatically spawns response handlers for all
requests over all channels.
3. Removal of Server.
server::Server was removed, as it provided no value over the Incoming/Channel
abstractions. Additionally, server::new was removed, since it just
returned a Server.
warning: explicit lifetimes given in parameter types where they could be elided (or replaced with `'_` if needed by type declaration)
--> tarpc/src/rpc/server/filter.rs:127:5
|
127 | fn channel<'a>(self: Pin<&'a mut Self>) -> Pin<&'a mut C> {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: `#[warn(clippy::needless_lifetimes)]` on by default
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_lifetimes
warning: 1 warning emitted
I don't know what the intention was behind using u32::MAX + 1 but since the
argument's type is usize this is the only giant value that makes sense to me.
- it's more portable (some architectures like MIPS don't support AtomicU64)
- for most 64bit architectures usize should be 64bit as well
- for most users even 32bit would probably be enough because:
- it's tied to the connection(for streaming sockets)
- the ID wraps and by the time that happens, all previous requests would have
timed out unless you send a lot of requests and have a ton of RAM
Instead, serde_transport::tcp::connect returns a future named Connect
that has methods to directly access the framing config. This is
consistent with how serde_transport::tcp::listen returns a future with
methods to access the framing config. In addition to this consistency,
it reduces the API surface and provides a simpler user transition from
"zero config" to "some config".