👋🏽 Welcome

Welcome to the wg-async-foundations website!

Leads

The leads of this working group are @tmandry and @nikomatsakis. Both of them can be found on Zulip.

🛠️ Getting involved

There is a weekly triage meeting that takes place in our Zulip stream. Feel free to stop by then (or any time!) to introduce yourself.

If you're interested in fixing bugs, though, there is no need to wait for the meeting! Take a look at the instructions here.

We are actively working on bringing the async vision to reality, so there are lots of ways to help. Check out the Roadmap to see the various things we are working on. Each of the high-level goals should have further instructions for how to get started helping with that goal in particular. Look for the 🛠️ icon, which highlights areas where further "how to help" resources are available.

What is the goal of this working group?

This working group is focused on the design and implementation of the "foundations" for Async I/O. This means that we are focused on designing and implementing extensions to the language, standard library, and other "core" bits of support offered by the Rust organization. We do not directly work on external projects like tokio, async-std, smol, embassy, and so forth, although we definitely discuss ideas and coordinate with them where appropriate.

Zulip

We hold discussions on the #wg-async-foundations stream in Zulip.

License

Licensed under either of

  • Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
  • MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this crate by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

🔮 The vision

What is this

We believe Rust can become one of the most popular choices for building distributed systems, ranging from embedded devices to foundational cloud services. Whatever they're using it for, we want all developers to love using Async Rust. For that to happen, we need to move Async Rust beyond the "MVP" state it's in today and make it accessible to everyone.

This document is a collaborative effort to build a shared vision for Async Rust. Our goal is to engage the entire community in a collective act of the imagination: how can we make the end-to-end experience of using Async I/O not only a pragmatic choice, but a joyful one?

🚧 Under construction! Help needed! 🚧

The first version of this document is not yet complete, but it's getting very close! We are in the process of finalizing the set of "status quo" and "shiny future" stories and the details of the proposed roadmap. The current content, however, is believed to be relatively final; at this point we are elaborating on and improving it.

Where we are and where we are going

The "vision document" starts with a cast of characters. Each character is tied to a particular Rust value (e.g., performance, productivity, etc.) determined by their background; this background also informs the expectations they bring when using Rust. Grace, for example, wants to keep the same level of performance she currently gets with C, but with the productivity benefits of memory safety. Alan, meanwhile, is hoping Rust will give him higher performance without losing the safety and ergonomics that he enjoys with garbage-collected languages.

For each character, we write "status quo" stories that describe the challenges they face as they try to achieve their goals (and typically fail in dramatic fashion!). These stories are not fiction. They are an amalgamation of the real experiences of people using Async Rust, as reported to us through interviews, blog posts, and tweets. Writing these stories helps us gauge the cumulative impact of the various papercuts and challenges that one encounters when using Async Rust.

The ultimate goal of the vision doc, of course, is not just to tell us where we are now, but where we are going and how we will get there. For this, we include "shiny future" stories that tell us how those same characters will fare in a few years' time, when we've had a chance to improve the Async Rust experience.

The vision drives the work

The vision is not just idle speculation. It is the central document that we use to organize ourselves. When we think about our roadmap for any given year, it is always with the aim of moving us closer to the vision we lay out here.

Involving the whole community

The async vision document provides a forum where the Async Rust community can plan a great overall experience for Async Rust users. Async Rust was intentionally designed not to have a "one size fits all" mindset, and we don't want to change that. Our goal is to build a shared vision for the end-to-end experience while retaining the loosely coupled, exploration-oriented ecosystem we have built.

โ“ How to vision

How you can help

When | What
🛑 Coming soon | Participate in discussions and development towards roadmap goals
🛑 Coming soon | Take ownership of "help wanted" goals from the roadmap
⚠️ Winding down | Propose new "status quo" stories or comment on existing PRs
⚠️ Winding down | Propose new "shiny future" stories or comment on existing PRs
🛑 Coming soon | Vote for the awards on the status quo and shiny future stories!

Making the vision real

We are currently working towards implementing the async vision described in the shiny future section. On the roadmap page, you can get an overview of the major goals that are part of implementing that future and how we have divided up the work. Each of the goals also has several initiatives, and those initiatives have upcoming milestones. If you'd like to participate in an initiative, you can find the appropriate Zulip stream and see if they are looking for help!

Goal and initiative owners

Each top-level goal and initiative in the roadmap has an owner. The owner of the top-level goal manages the goal overall, while the owner of an initiative manages the "nitty gritty" design work (for example, preparing the evaluation, authoring any RFCs required, or supervising the implementation). You can learn more about the responsibilities of owners on this page. If you have questions about whether you can help out with a goal or an initiative, the owner is probably the one to talk to.

Help wanted goals

Some of the top-level goals are marked with ✋, which means "help wanted". Those goals are looking for an owner. If you think you might be interested, you can read about the responsibilities of owners and contact the wg leads.

Stakeholders

While we always encourage feedback from the broader public, many of our initiatives also have identified sets of stakeholders. These are people who are specially consulted as part of the process to give feedback on the design and implementation. They can be representatives from major projects in the ecosystem, production users, or other sorts of experts.

Living document

Although many of the pieces are complete, the vision doc is a living document and it will never be done. During the brainstorming period, we had a lot of stories submitted and we are now in the process of "harmonizing" those into a small set of status quo and shiny future narratives, each based around a representative project and the same set of characters. If you'd like to help out with that, contact the wg leads.

We also plan to regularly revisit the vision once we've made significant progress on implementation or if new information has come to light.

Submitting status quo and shiny future story PRs

Although the brainstorming period has ended, we are still open to new PRs, particularly if they cover space that has not been well covered:

Wait, did somebody say awards?

Yes! We are planning on giving awards in various categories for folks who write status quo and shiny future PRs. The precise categories are TBD. Check out the awards page for more details.

Owning a goal or initiative

This page describes the roles and responsibilities associated with being the owner of an item on the roadmap. Roadmap items fall into two categories, top-level goals and initiatives. In both cases, being an owner means that you are responsible for ensuring that the item gets done, but the details of owning a top-level goal are different from owning an initiative.

Summary

Goal owners are responsible for splitting their area into a set of initiatives. These can be active or on hold.

They are also responsible for ensuring that for each initiative:

  • An owner is assigned
  • A landing page exists
  • Milestones are defined on the landing page
  • Stakeholders are identified and looped in at the proper stages

Finally, they are expected to attend sprint meetings.

Sprint meetings

We are organizing the working group in two-week sprints. This means that every two weeks we have a sprint planning meeting. All goal owners are expected to attend! Initiative owners or other contributors are welcome as well.

The purpose of the sprint planning meeting is to check in on the progress towards the milestones for each initiative and to see if they need to be adjusted. It's also a chance to raise interesting questions or get advice about tricky things or unexpected problems, as well as to celebrate our progress.

Owning a top-level goal

As the owner of a top-level goal, your role is to figure out the overall plan for how that goal will be achieved and to track progress. This means breaking up the goal into different initiatives, finding owners for those initiatives (which can be you!), and helping those owners to plan milestones. You are also generally responsible for staying on top of the state of things and updating other owners as to new or interesting developments.

Owning an initiative

Our definition of initiative is precisely the same as that used by the Rust lang team: it corresponds to some active effort with a clear goal or deliverable(s). As the owner of an initiative, your role is to ensure that the work gets done (which doesn't necessarily mean you do it yourself; it may be that you instead coordinate with volunteers or other implementors). You also guide the design of the deliverables within the initiative.

As in the lang team process, the role of the owner is not to make the final decision (that belongs to the relevant Rust team(s)), but to develop the "menu" of design choices, elaborate the tradeoffs involved, and make recommendations. For particularly complex designs, these evaluations will take the form of evaluation documents and are developed in collaboration with a defined set of stakeholders.

Making a landing page

Each initiative should have a landing page, linked to from the roadmap. This can be a page on this website or a dedicated repo.

For in-progress initiatives the landing page should include, or have pointers to:

  • Goals and impact of the initiative
  • Milestones
  • Design notes and documentation
  • Links to any organizing tools, such as a project board
  • The initiative owner
  • The current set of stakeholders and the area(s) they represent
  • Notes on how to get involved
  • For landing pages not on this website, a link back to the overall roadmap

For making a dedicated repo, it's recommended to use this initiative template as a starting point.

Planning initiative milestones

When you own an initiative, you should work with the owner of the top-level goal and others to plan out a series of milestones around the initiative. These milestones correspond to the various steps you need to take to complete the initiative.

Milestones are not fixed and they frequently change as you progress. They usually start out quite vague, such as "author an RFC", and then get more precise as you learn more about what is required: "figure out the design for X", "implement feature Y". We update the status and set of milestones for each sprint status meeting.

Stakeholders

Many initiatives in the roadmap have an associated set of stakeholders. The role of a stakeholder is as follows:

  • They are consulted by the owner over the course of working on the initiative.
  • They do not have veto power; that belongs to the team.
  • When they do raise concerns, those concerns should either be addressed in the design or discussed explicitly in the FAQ.

Stakeholders can be:

  • Domain experts (perhaps from other languages)
  • Representatives from major libraries
  • Production users

Stakeholders can be selected in coordination with the async foundations working group leads. Potential new stakeholders can also get in touch with the owner.

Feedback on the design

One role for stakeholders is to give feedback on the design as it progresses. Stakeholders are thus consulted in the course of preparing evaluation docs or RFCs.

Experimenting with the implementation

Another role for stakeholders is evaluating the implementation. This is particularly important for production users. Stakeholders might, for example, agree to port their code to use the nightly version of the feature and adapt it as the design evolves.

Writing an evaluation

When an initiative involves a complex design task, the initiative owner begins by writing an evaluation. The evaluation documents the various design options and their tradeoffs, and also includes a recommendation. Evaluations are posted publicly and presented to the relevant Rust teams, which will discuss them with the owners and stakeholders and ultimately make a choice on how to proceed.

The current draft for each evaluation will be maintained in some git repository, often a dedicated repository for the initiative. The repository will also list the stakeholders associated with that particular effort.

Getting feedback

Developing an evaluation consists of first preparing an initial draft by surveying initial work and then taking the following steps (repeat until satisfied):

  • Review draft in meetings with stakeholders
    • These meetings can be a small, productive group of people
    • Often better to have multiple stakeholders together so people can brainstorm together, but 1:1 may be useful too
  • Present the draft to the teams and take feedback
  • Review issues raised on the repo (see below)
  • Adjust draft in response to the above comments

Issues on the repo

In addition to the active outreach to stakeholders, anyone can submit feedback by opening issues on the repositories storing the draft evaluations. These repositories will have issue categories with templates that categorize the feedback and provide some structure. For example:

  • Experience report
  • Proposal feedback
  • Crazy new idea

โ“ How to vision: "Status quo" stories

We want to make sure all Async Rust users and their experiences are reflected in the async vision doc, so please help us by writing 'status quo' stories about your experiences or the experiences of others! Remember, status quo stories are not "real", but neither are they fiction. They are constructed from the real experiences of people using Async Rust (often multiple people).

TL;DR

Just want to get started? Here are quick instructions to get you going:

Optional: open an issue to discuss your story or find others with similar experiences

If you have a story idea but you don't have the time to write about it, or if you would like to know whether other folks have encountered the same sorts of problems, you can open up a "status quo" story issue on the wg-async-foundations repository. Alternatively, if you're looking for a story to write, you can browse the open issues tagged as status-quo-story-idea and see if anything catches your eye. If you see people describing problems you have hit, or have questions about the experiences people are sharing, then please leave a comment -- but remember to comment supportively. (You can also come to Zulip to discuss.)

How to open a PR

If you have an idea you'd like to write about, please open a PR using this template and adding a new file into the status_quo directory. Do not add your file to SUMMARY.md -- that will create conflicts. We'll do it manually after merging.

Goals of a status quo PR

When writing a status quo story, your goal is to present what you see as a major challenge for Async Rust. You want to draw upon people's experiences (sometimes multiple people) to show all the aspects of the problem in an engaging and entertaining way.

Each story is always presented from the POV of a particular character. Stories should be detailed, not abstract -- it's better to give specifics than generalities. Don't say "Grace visited a website to find the answer to her question"; instead, tell us whether she went to Stack Overflow, asked on Reddit, or found the answer in some random blog post. Ideally you should get this detail from whatever your "source" of the story is -- but if you are using multiple sources and they disagree, you can pick one and use the FAQ to convey some of the other alternatives.

The role of the FAQ

Every status quo PR includes a FAQ. This FAQ should always include answers to some standard questions:

  • What are the morals of the story?
    • Talk about the major takeaways -- what do you see as the biggest problems?
  • What are the sources for this story?
    • Talk about what the story is based on, ideally with links to blog posts, tweets, or other evidence.
  • Why did you choose NAME to tell this story?
    • Talk about the character you used for the story and why.
  • How would this story have played out differently for the other characters?
    • In some cases, there are problems that only occur for people from specific backgrounds, or which play out differently. This question can be used to highlight that.

You can feel free to add whatever other FAQs seem appropriate. You should also expect to grow the FAQ in response to questions that come up on the PR.

The review process

When you open a status quo PR, people will start to comment on it. These comments should always be constructive, with the goal not of negating the story but of making it more precise or more persuasive. Ideally, you should respond to every comment in one of two ways:

  • Adjust the story with more details or to correct factual errors.
  • Add something to the story's FAQ to explain the confusion.
    • If the question is already covered by a FAQ, you can just refer the commenter to that.

The goal is that, at the end of the review process, the status quo story has a lot more details that address the major questions people had.

🤔 Frequently Asked Questions

What is the process to propose a status quo story?

What if my story applies to multiple characters?

  • Look at the "morals" of your story and decide which character will let you get those across the best.
  • Use the FAQ to talk about how other characters might have been impacted.
  • If the story would play out really differently for other characters, maybe write it more than once!

How much detail should I give? How specific should I be?

  • Detailed is generally better, but only if those details are helpful for understanding the morals of your story.
  • Specific is generally better, since an abstract story doesn't feel as real.

What should I do when I'm trying to be specific but I have to make an arbitrary choice?

Add a FAQ with some of the other alternatives, or just acknowledge that you made an arbitrary choice there.

None of the characters are a fit for my story.

It doesn't have to be perfect. Pick the one that seems like the closest fit. If you really feel stuck, though, come talk to us on Zulip about it!

How should I describe the "evidence" for my status quo story?

The more specific you can get, the better. If you can link to tweets or blog posts, that's ideal. You can also add notes into the conversations folder and link to those. Of course, you should be sure people are ok with that.

โ“ How to vision: "Shiny future" stories

We want all Async Rust users and their hopes and dreams for what Async Rust should be in the future to be reflected in the async vision doc, so please help us by writing 'shiny future' stories about what you would like async Rust to look like! Remember: we are in a brainstorming period. Please feel free to leave comments in an effort to help someone improve their PRs, but if you would prefer a different approach, you are better off writing your own story. (In fact, you should write your own story even if you like their approach but just have a few alternatives that are worth thinking over.)

TL;DR

Just want to get started? Here are quick instructions to get you going:

  • To write your own story:

How to open a PR

If you have an idea you'd like to write about, please open a PR using this template and adding a new file into the shiny_future directory. Do not add your file to SUMMARY.md -- that will create conflicts. We'll do it after merging.

Goals of a shiny future PR

Shiny future PRs "retell" the story from one or more status quo PRs. The story is now taking place 2-3 years in the future, when Async Rust has had the chance to make all sorts of improvements. Shiny future stories are aspirational: we don't have to know exactly how they will be achieved yet! (Of course, it never hurts to have a plan too.)

Like status quo stories, each shiny future story is always presented from the POV of a particular character. They should be detailed. Sometimes this will mean you have to make stuff up, like method names or other details -- you can use the FAQ to spell out areas of particular uncertainty.

The role of the FAQ

Every shiny future PR includes a FAQ. This FAQ should always include answers to some standard questions:

  • What status quo story or stories are you retelling?
    • Link to the status quo stories here. If there isn't a story that you're retelling, write it!
  • What is Alan most excited about in this future? Is he disappointed by anything?
    • Think about Alan's top priority (performance) and the expectations he brings (ease of use, tooling, etc.). How do they fare in this future?
  • What is Grace most excited about in this future? Is she disappointed by anything?
    • Think about Grace's top priority (memory safety) and the expectations she brings (still able to use all the tricks she knows and loves). How do they fare in this future?
  • What is Niklaus most excited about in this future? Is he disappointed by anything?
    • Think about Niklaus's top priority (accessibility) and the expectations he brings (strong community that will support him). How do they fare in this future?
  • What is Barbara most excited about in this future? Is she disappointed by anything?
    • Think about Barbara's top priority (productivity, maintenance over time) and the expectations she brings (fits well with Rust). How do they fare in this future?
  • If this is an alternative to another shiny future, which one, and what motivated you to write an alternative?
    • Cite the story. Be specific, but focus on what you like about your version, not what you dislike about the other.
    • If this is not an alternative, you can skip this one. =)
  • What projects benefit the most from this future?
  • Are there any projects that are hindered by this future?

There are also some optional questions:

  • What are the incremental steps towards realizing this shiny future?
    • Talk about the actual work we will do. You can link to design docs or even add new ones, as appropriate.
    • You don't have to have the whole path figured out yet!
  • Does realizing this future require cooperation between many projects?
    • For example, if you are describing an interface in libstd that runtimes will have to implement, talk about that.

You can feel free to add whatever other FAQs seem appropriate. You should also expect to grow the FAQ in response to questions that come up on the PR.

The review process

When you open a shiny future PR, people will start to comment on it. These comments should always be constructive. They usually take the form of asking "in this future, what does NAME do when X happens?" or asking you to elaborate on other potential problems that might arise. Ideally, you should respond to every comment in one of two ways:

  • Adjust the story with more details or to correct factual errors.
  • Add something to the story's FAQ to explain the confusion.
    • If the question is already covered by a FAQ, you can just refer the commenter to that.

The goal is that, at the end of the review process, the shiny future story has a lot more details that address the major questions people had.

🤔 Frequently Asked Questions

What is the process to propose a shiny future story?

What character should I use for my shiny future story?

  • Usually you would use the same character from the status quo story you are retelling.
  • If for some reason you chose a different character, add a FAQ to explain why.

What do I do if there is no status quo story for my shiny future?

Write the status quo story first!

What happens when there are multiple "shiny future" stories about the same thing?

During this brainstorming period, we want to focus on getting as many ideas as we can. Having multiple "shiny futures" that address the same problem is a feature, not a bug, as it will let us mix-and-match later to try and find the best overall plan.

How much detail should I give? How specific should I be?

  • Detailed is generally better, but only if those details are helpful for understanding the morals of your story.
  • Specific is generally better, since an abstract story doesn't feel as real.

What is the "scope" of a shiny future story? Can I tell shiny future stories that involve ecosystem projects?

All the stories in the vision doc are meant to cover the full "end to end" experience of using async Rust. That means that sometimes they will talk about things that are really part of projects that are outside of the Rust org. For example, we might write a shiny future that involves how the standard library has published standard traits for core concepts and those concepts have been adopted by libraries throughout the ecosystem. There is a FAQ that asks you to talk about what kinds of coordination between projects will be required to realize this vision.

What do I do when I get to details that I don't know yet?

Take your best guess and add a FAQ explaining which details are still up in the air.

Do we have to know exactly how we will achieve the "shiny future"?

You don't have to know how your idea will work yet. We will eventually have to figure out the precise designs, but at this point we're more interested in talking about the experience we aim to create. That said, if you do have plans for how to achieve your shiny future, you can also include design docs in the PR, or add FAQs that specify what you have in mind (and perhaps what you still have to figure out).

What do I do if somebody leaves a comment about how my idea will work and I don't know the answer?

Add it to the FAQ!

What if we write a "shiny future" story but it turns out to be impossible to implement?

Glad you asked! The vision document is a living document, and we intend to revisit it regularly. This is important because it turns out that predicting the future is hard. We fully expect that some aspects of the "shiny future" stories we write are going to be wrong, sometimes very wrong. We will be regularly returning to the vision document to check how things are going and adjust our trajectory appropriately.

โ“ How to vision: Constructive comments

Figuring out the future is tricky business. We all know the internet is not always a friendly place. We want this discussion to be different.

Be respectful and supportive

Writing a "status quo" or "shiny future" story is an act of bravery and vulnerability. In the status quo, we are asking people to talk about the things that they or others found hard, to admit that they had trouble figuring something out. In the case of the shiny future, we're asking people to put out half-baked ideas so that we can find the seeds that will grow into something amazing. It doesn't take much to make that go wrong.

Comment to understand or improve, not to negate or dissuade

"Most people do not listen with the intent to understand; they listen with the intent to reply."

-- Stephen Covey

The golden rule is that when you leave a comment, you are looking to understand or improve the story.

For status quo stories, remember that these are true stories about people's experiences -- they can't be wrong (though they could be inaccurate). It may be that somebody tries for days to solve a problem that would've been easy if they had just known to call a particular method. That story is not wrong; it's an opportunity to write a shiny future story in which you explain how they would've learned about that method, or perhaps about how that method would become unnecessary.

For shiny future stories, even if you don't like the idea, you should leave comments with the goal of better understanding what the author likes about it. Understanding that may give you an idea for how to get those same benefits in a way that you are happier with. It's also valid to encourage the author to elaborate on the impact their story will have on different characters.

You might just want to write your own story

Remember, opening your own PR is free (in fact, we're giving an award for being "most prolific"). If you find that you had a really different experience than a status quo story, or you have a different idea for a shiny future, consider just writing your own PR instead of commenting negatively on someone else's. The goal of the brainstorming phase is to put forward a lot of alternatives, so that we can look for opportunities to combine them and make something with the best of both.

Good questions for status quo stories

Here are some examples of good questions for "status quo" stories:

  • Tell me more about this step. What led NAME to do X?
  • What do you think OTHER_NAME would have done here?
  • Can you be more specific about this point? What library did they use?

Good questions for shiny future stories

Here are some examples of good questions for "shiny future" stories:

  • How does NAME do X in this future?
  • It seems like this would interfere with X, which is important for application A. How would NAME handle that case in this future?

You should not be afraid to raise technical concerns -- we need to have a robust technical discussion! But do so in a way that leaves room to find an answer that satisfies both of you.

โ“ How to vision: Awards

At the end of the brainstorming period, we'll also vote on various awards to give to the status quo and shiny future PRs that were submitted.

Award categories

These are the award categories:

  • Most humorous story
  • Most creative story
  • Most supportive -- who left the most helpful comments?
  • Most prolific -- who wrote the most stories?
  • Most unexpected -- which status quo story (or shiny future) took you by surprise?
  • Most painful "status quo" story
  • Most ambitious "shiny future" story
  • Most extensive FAQ

However, if you have an idea for another award category, we are happy to take suggestions. One rule: the awards can't be negative (e.g., no "most unrealistic"), and they can't be about which thing is "best". That would work against the brainstorming spirit.

Voting

At the end of the brainstorming period, we're going to have a voting session to select which PRs and people win the awards. The winners will be featured in a blog post. 🏆

How using async Rust ought to feel (and why it doesn't today)

This section is, in many ways, the most important. It aims to identify the way it should feel to use Async Rust.

Consistent: "just add async/await"

Async Rust should be a small delta atop Sync Rust. People who are familiar with sync Rust should be able to leverage what they know to make adopting Async Rust straightforward. Porting a sync code base to async should be relatively smooth: just add async/await, adopt the async variants of the various libraries, and you're done.
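As a rough illustration of what such a "small delta" could look like, here is a minimal sketch of one function in sync and async form. The function names (parse_line, sum_lines, and their _async counterparts) are hypothetical and do not come from any library. The block_on helper is only a tiny stand-in executor built from standard-library pieces so that the sketch needs no external crates; a real project would normally rely on a runtime instead.

```rust
use std::future::Future;
use std::num::ParseIntError;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

// --- Sync version ---------------------------------------------------------

/// Hypothetical helper: parse one line of text as an integer.
fn parse_line(line: &str) -> Result<i64, ParseIntError> {
    line.trim().parse::<i64>()
}

/// Sum every line of the input, stopping at the first parse error.
fn sum_lines(input: &str) -> Result<i64, ParseIntError> {
    let mut total = 0;
    for line in input.lines() {
        total += parse_line(line)?;
    }
    Ok(total)
}

// --- Async version: same shape, plus `async` and `.await` -----------------

/// Hypothetical async counterpart of `parse_line`.
async fn parse_line_async(line: &str) -> Result<i64, ParseIntError> {
    line.trim().parse::<i64>()
}

/// Same control flow and error handling as `sum_lines`.
async fn sum_lines_async(input: &str) -> Result<i64, ParseIntError> {
    let mut total = 0;
    for line in input.lines() {
        total += parse_line_async(line).await?;
    }
    Ok(total)
}

// --- Stand-in executor (an assumption of this sketch, not a real runtime) --

/// Waker that unparks the thread that is blocked in `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Drive a single future to completion on the current thread.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(),
        }
    }
}
```

In this sketch, the loop, the ? operator, and the return type are unchanged between the two versions; only the async keyword and the .await are added. Calling block_on(sum_lines_async(input)) yields the same result as sum_lines(input) for the same input.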

Reliable: "if it compiles, it works"

One of the great things about Rust is the feeling of "if it compiles, it works". This is what allows you to do a giant refactoring and find that the code runs on the first try. It is what lets you deploy code that uses parallelism or other fancy features without exhaustive fuzz testing and without worrying about every possible corner case.

Empowering: "complex stuff feels easy"

Rust's great strength is taking formerly complex, wizard-like things and making them easy to do. In the case of async, that means letting people use the latest and greatest stuff, like io_uring. It also means enabling parallelism and complex scheduling patterns very easily.

Performant: "ran well right out of the box"

Rust code tends to perform "quite well" right out of the box. You don't have to give up the "nice things" in the language, like closures or high-level APIs, in order to get good performance and tight memory usage. In fact, those high-level APIs often perform as well as or better than what you would get if you wrote the code yourself.

Productive: "great crates for every need, just mix and match"

Being able to leverage a large ecosystem of top-notch crates is a key part of what makes Rust (and most any modern language) productive. When using async Rust, you should be able to search crates.io and find crates that cover all kinds of things you might want to do. You should be able to add those crates to your Cargo.toml and readily connect them to one another without surprising hiccups.

Transparent and tunable: "it's easy to diagnose deadlocks and performance bottlenecks"

Using Rust means most things work and perform well by default, but of course it can't prevent all problems. When you do find bugs, you need to be able to easily track what happened and figure out how to fix it. When your performance is subpar, you need to be able to peek under the covers and understand what's going on so that you can tune it up. In synchronous Rust, this means integrating with but also improving on existing tooling like debuggers and profilers. In asynchronous Rust, though, there's an extra hurdle, because the terms that users are thinking in (asynchronous tasks etc) exist within the runtime, but are not the same terms that synchronous debuggers and profilers expose. There is a need for more customized tooling to help users debug problems without having to map between the async concept and the underlying implementation.

Control: "I can do all the weird things"

Part of what's great about Rust is that it lets you get in and explore all the corner cases. Want to target the kernel? Develop embedded systems using async networking without any operating system? Run on WebAssembly? No problem, we can do that.

Interoperable: "integrating with C++, node.js, etc is easy"

Much like C, Rust aims to be a "lingua franca", something you can integrate into your existing systems on a piecemeal basis. In synchronous Rust, this means that functions can "speak" the C ABI and Rust structures can be compiled with C-compatible layouts, and that we use native system functionality like the default memory allocator or the native threading APIs. In asynchronous Rust, it means that we are able to integrate into other systems, like C++ futures, Grand Central Dispatch, or JavaScript promises.

🙋‍♀️ Cast of characters

What is this?

We've created four characters that we use to guide our thinking. These characters are the protagonists of our status quo and shiny future stories, and they help us to think about the different kinds of priorities and expectations that people bring to Async Rust. Having names and personalities also makes the stories more fun and approachable.

The characters

  • Alan: the experienced "GC'd language" developer, new to Rust
    • Top priority: performance -- that's what he is not getting from his current GC'd language
    • Expectations: absence of memory safety bugs (he gets that now from his GC), strong ecosystem, great tooling
  • Grace: the systems programming expert, new to Rust
    • Top priority: memory safety -- that's what she is not getting from C/C++
    • Expectations: able to do all the things she's used to from C/C++
  • Niklaus: new programmer from an unconventional background
    • Top priority: accessibility -- he's learning a lot of new things at once
    • Expectations: community -- the community enabled him to have early success, and he is excited to have it support him and help him grow more
  • Barbara: the experienced Rust developer
    • Top priority: overall productivity and long-term maintenance -- she loves Rust, and wants to see it extended to new areas; she has an existing code base to maintain
    • Expectations: elegance and craftsmanship, fits well with Rust

🤔 Frequently Asked Questions

Where do the names come from?

Famous programming language designers and theorists: Alan Turing, Grace Hopper, Niklaus Wirth, and Barbara Liskov.

I don't see myself in these characters. What should I do?

Come to Zulip and talk to us about it! Maybe they need to be adjusted!

I see myself in more than one of these characters!

Yeah, me too.

🙋‍♀️ Cast of characters

Alan: the experienced "GC'd language" developer, new to Rust

Variant A: Dynamic languages

Alan has been programming for years. He has built systems in Ruby on Rails, node.js, and used Django too. Lately he's been learning Rust and he is tinkering with integrating Rust into some of his projects to get better performance and reliability. He's also building some projects entirely in Rust.

Variant B: Java

Alan works at a Java shop. They run a number of network services built in Java, along with some that use Kotlin or Scala. He's very familiar with the Java ecosystem and the tooling that the JVM offers. He's also sometimes had to tweak his code to work around garbage collector latencies or to reduce overall memory usage. He's curious to try porting some systems to Rust to see how it works.

Variant C: Kotlin

Alan is developing networking programs in Kotlin. He loves Kotlin for its expressive syntax and clean integration with Java. Still, he sometimes encounters problems running his services due to garbage collection latencies or overall memory usage. He's heard that Rust can be fun to use too, and is curious to try it out.

Variant D: Go

Alan develops a distributed database in Go, enjoying its simplicity and first-class treatment of concurrency. He's successfully built a transactional database that handles over 100K QPS. Intrigued by Rust's promise of "fearless concurrency", Alan tries Rust for more efficient use of memory and CPU. He's curious what classes of errors Rust async prevents and how Rust guarantees safety without sacrificing speed.

🤔 Frequently Asked Questions

What does Alan want most from Async Rust?

  • The promise of better performance and memory usage than the languages he's been using. Rust's safety guarantees are important too; he's considered using C++ in the past but always judged the maintenance burden would be too high.

What expectations does Alan bring from his current environment?

  • A focus on ease of use, a strong ecosystem, and great tooling.

🙋‍♀️ Cast of characters

Grace: the systems programming expert, new to Rust

Grace has been writing C and C++ for a number of years. She's accustomed to hacking lots of low-level details to coax the most performance she can from her code. She's also experienced her share of epic debugging sessions resulting from memory errors in C. She's intrigued by Rust: she likes the idea of getting the same control and performance she gets from C but with the productivity benefits she gets from memory safety. She's currently experimenting with introducing Rust into some of the systems she works on, and she's considering Rust for a few greenfield projects as well.

🤔 Frequently Asked Questions

What does Grace want most from Async Rust?

Grace is most interested in memory safety. She is comfortable with C and C++ but she's also aware of the maintenance burden that arises from the lack of memory safety.

What expectations does Grace bring from her current environment?

  • Grace expects to be able to get the same performance she used to get from C or C++.
  • Grace is accustomed to various bits of low-level tooling, such as gdb or perf. It's nice if Rust works reasonably well with those tools, but she'd be happy to have access to better alternatives if they were available. She's happy using cargo instead of make, for example.

🙋‍♀️ Cast of characters

Niklaus: new programmer from an unconventional background

Niklaus has always been interested in programming but doesn't have experience with it. He's been working as a tech writer and decided to dip his toe in by opening PRs to improve the documentation for one of the libraries he was playing with. The feedback was positive, so he fixed a small bug. He's now considering getting involved in a deeper way.

🤔 Frequently Asked Questions

What does Niklaus want most from Async Rust?

  • Niklaus values accessibility. He's learning a lot of new things at once and it can be overwhelming.

What expectations does Niklaus bring from his current environment?

  • Niklaus expects a strong and supportive community. The Rust community enabled him to have early success, and he is excited to have it support him and help him grow more.

🙋‍♀️ Cast of characters

Barbara: the experienced Rust developer

Barbara has been using Rust since the 0.1 release. She remembers some of the crazy syntax in Ye Olde Rust of Yore and secretly still misses the alt keyword (don't tell anyone). Lately she's maintaining various projects in the async space.

🤔 Frequently Asked Questions

What does Barbara want most from Async Rust?

  • She is using Rust for its feeling of productivity, and she expects Async Rust to continue in that tradition.
  • She maintains several existing projects, so stability is important to her.

What expectations does Barbara bring from her current environment?

  • She wants a design that feels like the rest of Rust.
  • She loves Rust and she expects Async Rust to share its overall values.

⚡ Projects

What is this?

This section describes various sample projects that are referenced in our stories. Each project is meant to represent some domain that we are targeting.

List of projects

See the sidebar for the full list.

Don't find a project like yours here?

Don't despair! This is just a list of fun projects that we've needed for stories. If you'd like to add a project for your story, feel free to do so! Note though that you may find that some existing project has the same basic characteristics as your project, in which case it's probably better to reuse the existing project.

⚡ Projects: NAME (DOMAIN)

This is a template for adding new projects. See the instructions for more details on how to add a new project!

What is this?

This is a sample project for use within the various "status quo" or "shiny future" stories.

Description

Give a fun description of the project here! Include whatever details are needed.

🤔 Frequently Asked Questions

What makes this project different from others?

Does this project require a custom tailored runtime?

How much of this project is likely to be built with open source components from crates.io?

What is of most concern to this project?

What is of least concern to this project?

⚡ Projects: MonsterMesh (embedded sensors)

What is this?

This is a sample project for use within the various "status quo" or "shiny future" stories.

Description

"MonsterMesh" is a sensor mesh on microcontrollers using Rust. The nodes communicate wirelessly to relay their results. These sensors are built using very constrained and low-power hardware without an operating system, so the code is written in a #![no_std] environment and is very careful about available resources.

🤔 Frequently Asked Questions

What makes embedded projects like MonsterMesh different from others?

  • Embedded developers need to write error-free applications outside of the comfort zone of an operating system. Rust helps to prevent many classes of programming errors at compile time, which inspires confidence in the software quality and cuts time-intensive build-flash-test iterations.
  • Embedded developers need good hardware abstraction. Frameworks in other languages do not provide the sophisticated tooling for mapping memory-mapped I/O to safe type abstractions that has been created by the Rust teams.
  • Embedded developers care about hard real-time capabilities; the concept of "you only pay for what you use" is very important in embedded applications. The combination of the inherently asynchronous interrupt handling of microcontrollers with the Rust async building blocks is a perfect match for effortlessly creating applications with hard real-time capabilities.
  • Embedded developers are particularly appreciative of strong tooling support. The availability of the full environment via rustup and the integration of the full toolchain with cargo and build.rs make them very happy because they can focus on what they do best instead of having regular fights with the environment.

Does MonsterMesh require a custom tailored runtime?

Yes! The tradeoffs for an embedded application like MonsterMesh and a typical server are very different. Further, most server-grade frameworks are not #![no_std] compatible and far exceed the available footprint on the sensor nodes.

How much of this project is likely to be built with open source components from crates.io?

Having no operating system to provide abstractions to it, MonsterMesh will contain all the logic it needs to run. Much of this, especially around the hardware-software interface, is unlikely to be unique to MonsterMesh and will be sourced from crates.io. However, the further up the stack one goes, the more specialized the requirements will become.

How did you pick the name?

So glad you asked! Please watch this entertaining video.

⚡ Projects: DistriData (Generic Infrastructure)

What is this?

This is a sample project for use within the various "status quo" or "shiny future" stories.

Description

DistriData is the latest in containerized, micro-service distributed database technology. Developed completely in the open as part of the Cloud Native Computing Foundation, this utility is now deployed in a large portion of networked server applications across the entire industry. Since it's so widely used, DistriData has to balance flexibility with having sensible defaults.

🤔 Frequently Asked Questions

What makes DistriData different from others?

  • This project is meant to be used in many different ways in many different projects, and is not unique to any one application.
  • Many of those using this project will not even need or want to know that it's written in Rust.

Does DistriData require a custom tailored runtime?

DistriData's concerns are at a higher level than the runtime. A fast, reliable, and resource conscious general purpose runtime will serve DistriData's needs.

How much of this project is likely to be built with open source components from crates.io?

A lot. While DistriData receives many contributions, it's important to the team that, when possible, they use existing technologies that developers are already familiar with, so that contributing to the project stays easy.

What is of most concern to this project?

It needs to be resource conscious, fast, reliable, but above all else it needs to be easy to run, monitor, and maintain.

What is of least concern to this project?

While DistriData is resource conscious, it's not resource starved. There's no need to make life difficult to save on a memory allocation here or there.

⚡ Projects: TrafficMonitor (Custom Infrastructure)

What is this?

This is a sample project for use within the various "status quo" or "shiny future" stories.

Description

TrafficMonitor is a utility written by AmoogleSoft, a public cloud provider, for monitoring network traffic as it comes into its data centers to prevent things like distributed denial-of-service attacks. It monitors all network traffic, looking for patterns, and deciding when to take action against certain threat vectors. TrafficMonitor runs across almost all server racks in a data center, and while it does run on top of an operating system, it is resource conscious. It's also extremely important that TrafficMonitor stay running and handle network traffic with as few "hiccups" as possible. TrafficMonitor is highly tuned to the needs of AmoogleSoft's cloud offering and won't run anywhere else.

🤔 Frequently Asked Questions

What makes networking infrastructure projects like TrafficMonitor different from others?

  • Networking infrastructure powers entire datacenters or even public internet infrastructure, and as such it is imperative that it run without failure.
  • It is also extremely important that such projects take as few resources as possible. Being on an operating system and large server racks may mean that using the standard library is possible, but memory and CPU usage should be kept to a minimum.
  • This project is worked on by software developers with different backgrounds. Some are networking infrastructure experts (usually using C) while others have experience in networked applications (usually using GCed languages like Java, Go, or Node).

Does TrafficMonitor require a custom tailored runtime?

Maybe? TrafficMonitor runs on top of a full operating system and takes full advantage of that operating system's networking stack. It's possible that a runtime meant for server workloads will work with TrafficMonitor.

How much of this project is likely to be built with open source components from crates.io?

  • TrafficMonitor is highly specialized to the internal workings of AmoogleSoft's public cloud offering. Thus, "off-the-shelf" solutions will only work if they're highly flexible and highly tuneable.
  • TrafficMonitor is central to AmoogleSoft's success, meaning that getting things "just right" is much more important than having something from crates.io that mostly works but requires little custom tuning.

What is of most concern to this project?

  • Reliability is the number one concern. This infrastructure is at the core of the business - it needs to work extremely reliably. A close second is being easy to monitor. If something goes wrong, AmoogleSoft needs to know very quickly what the issue is.
  • AmoogleSoft is a large company with a lot of existing custom tooling for building, monitoring, and deploying its software. TrafficMonitor has to play nicely in a world that existed long before it came around.

What is of least concern to this project?

AmoogleSoft is a large company with time and resources. High-level frameworks that remove control in favor of peak developer productivity are not what they're after. Sure, the easier things are to get working, the better, but that should not come at the expense of control.

⚡ Projects: YouBuy (Traditional Server Application)

What is this?

This is a sample project for use within the various "status quo" or "shiny future" stories.

Description

YouBuy is a growing e-commerce website that now has millions of users. The team behind YouBuy is struggling to keep up with traffic and keep server costs low. Having originally written YouBuy in a mix of Ruby on Rails and Node, the YouBuy team decides to rewrite many parts of their service in Rust, which they've investigated and found to be performant while still allowing for the high levels of abstraction they're used to.

🤔 Frequently Asked Questions

What makes YouBuy and other server applications different from others?

  • Many server applications are written in languages with garbage collectors. Many of the things that Rust forces users to care about are not first order concerns for those working on server applications (e.g., memory management, stack vs heap allocations, etc.).
  • Many server applications are written in languages without static type checking. The developers of YouBuy don't have much experience with statically typed languages, and some of the developers early in their Rust learning journeys expressed frustration that they found it hard to get their programs to compile, especially when using async constructs.

Does YouBuy require a custom tailored runtime?

YouBuy should be perfectly fine with a runtime from crates.io. In fact, their concern isn't at the runtime level but at the high-level server framework level.

How much of this project is likely to be built with open source components from crates.io?

YouBuy is in fierce competition with many other e-commerce sites. Therefore, the less that YouBuy engineers have to write themselves, the better. Ideally, YouBuy can focus 100% of its energy on features that differentiate it from its competition and none of its time on tweaking its networking stack.

What is of most concern to this project?

It seems like YouBuy is always on the verge of either becoming the next billion-dollar company with hundreds of millions of users or completely going out of business. YouBuy needs to be able to move fast and focus on the application business logic.

What is of least concern to this project?

Since moving fast is of primary concern, the ins and outs of the underlying networking stack are only of concern when something goes wrong. The hope is that that rarely if ever happens and when it does, it's easy to find the source of the issue.

⚡ Projects: SLOW (Protocol implementation)

What is this?

This is a sample project for use within the various "status quo" or "shiny future" stories.

Description

SLOW is an open source implementation of a fancy new protocol. This protocol uses a mix of TCP and UDP packets and is designed to operate particularly well over high latency, low throughput links.

🤔 Frequently Asked Questions

What makes this project different from others?

SLOW is a library, not an application.

Does this project require a custom tailored runtime?

Ideally, SLOW would be developed in an independent way that permits it to be used across many runtimes in a variety of different environments.

How much of this project is likely to be built with open source components from crates.io?

SLOW builds on other generic libraries available from crates.io. For example, it would like to make use of compression algorithms that others have written, or to use future adapters.

What is of most concern to this project?

Uh, I don't really know! If you develop software like this, maybe open a PR and tell me! --nikomatsakis

What is of least concern to this project?

Uh, I don't really know! If you develop software like this, maybe open a PR and tell me! --nikomatsakis

Why is this called SLOW?

It's like QUIC, but slow! Get it? Get it? :D

😱 Status quo

Where did all the stories go?

The full set of "submitted" status quo stories has been moved here. This area will be used for a "combined" status quo story which has not yet been written!

✨ Shiny future

This page represents a complete vision for where we want async to go. This vision is what we believe to be the best way to achieve the experiences that we want async to provide.

Work in progress

Note that while a lot of the steps needed are fairly clear, several of them also have significant unknowns or points of controversy. We have attempted to highlight those and expect to be working through those points as we go.

Certainty levels

  • 🌈 -- Implemented and stable
  • 🌞 -- Everything is looking good
  • 🌤️ -- Still some stuff to figure out, but unlikely to see major changes in the design
  • 🌥️ -- Got one or two solid leads, but still have to figure out if it will work
  • 🌧️ -- No clear path yet, this may not even be a good idea

Key aspects of the future

  • 🌤️ If you know sync Rust, getting started in Async Rust is straightforward ([more][async_fn_everywhere])
    • 🌤️ Mostly, you change fn to async fn, add some calls to await, and change over to other parts of the stdlib, though supporting dyn Trait requires making some choices, particularly in a no-std environment
    • 🌤️ It still has that "if it compiles, it generally works, and it runs pretty darn fast" feeling
    • 🌤️ Destructors and cleanup also work the same way as in sync Rust, thanks to AsyncDrop, which extends Drop to async code
    • 🌤️ No need to write poll functions or to interact with pin except in quite specialized scenarios
  • 🌤️ High-quality documentation and tutorials help you to get started and learn the ropes
    • 🌤️ The docs also identify common patterns for structuring your async programs and their advantages and disadvantages
  • 🌥️ Tooling and debugger integration gives insight into the behavior of your program
    • 🌥️ Easy to get a snapshot of overall activity (e.g., to find out what tasks exist or why a task is blocked)
    • 🌥️ Easy to see aggregate performance trends over time (e.g., number of active connections, waiting connections, etc)
    • 🌥️ Easy to profile things in terms of your async tasks (e.g., to get a flamegraph of a specific connection)
  • 🌥️ Variety of high-quality runtimes available in cargo, and it's easy to change between them:
    • 🌧️ When you use things from the standard library, they work across runtimes automatically
    • 🌥️ There are standardized, foundational traits for common operations like I/O, spawning tasks, timers
  • 🌥️ Hierarchical scopes allow you to easily spawn parallel and concurrent tasks
    • 🌥️ These can reference borrowed data, enabling easy parallel processing of async iterators (think "async rayon")
  • 🌥️ Cancellation works well and without surprises
    • 🌥️ When cancellation is requested, it propagates to subtasks within a scope
    • 🌧️ I/O operations and the like begin to fail, so that cancellation is automatic and flows through familiar error paths
    • 🌥️ If desired, you can "opt-in" to synchronous cancellation, in which case any await becomes a cancellation point. This allows your async fn to be used with select without spawning a task.

Learn more

Check out...

Where did all the stories go?

The full set of "submitted" shiny future stories has been moved here.

User's Manual of the Future

I always dreamed of seeing the future

This text is written from the perspective of async Rust's "shiny future". It describes the async Rust that future users will experience. Embedded within are links of the form "deliv_xxx" that connect to the specific deliverables that are being described.

Note: Not everything in the future is great. Search for "Caveat" and you'll find a few notes of problems that we don't expect to fix.

Introduction: Async I/O as a user

What is async I/O?

These days, most Rust code that interacts with the network or does high-performance I/O is Async I/O. Async I/O is, in some sense, an implementation detail. It is a set of language extensions that make it easy to run many asynchronous tasks using only a small number of underlying operating system threads. This means that you can scale up to a very large number of tasks using only a small amount of resources. To be frank, for many applications, async I/O is overkill. However, there are some for which it is absolutely essential, and that's why most of the high quality libraries are using asynchronous interfaces. Fortunately, async Rust is quite easy to use, so even if you don't really need the power right now, that's not a problem.

Choosing a runtime

When you use sync Rust, operations like I/O and so forth are taken care of by your operating system (or your libc implementation, in any case). When you use async Rust, though, the mapping between asynchronous tasks and operating system threads is performed by a library, called a runtime. One of Rust's key distinguishing features is that it doesn't bake in the choice of a runtime. This means that people are free to develop libraries which use a variety of different strategies to schedule tasks, I/O, and so forth. The choice of runtime can in some cases make a big difference to your overall performance, or what kind of environments you can run in.

If this seems overwhelming, don't worry. Rust makes it easy to experiment with runtimes and try different ones (deliv_portable). Here is a list of some of the popular runtimes, and the sorts of applications where they are suitable:

  • General purpose, good for just about anything: tokio, async-std
  • High-performance file I/O, thread-per-core architecture: glommio
  • Focused on reliability: bastion
  • Embedded environments: embassy

If you are not sure what's best for you, we recommend picking any of the general purpose runtimes.

Async fn: where it all starts

Getting started with async Rust is easy. Most anywhere that you write fn in Rust, you can now write async fn (exception: extern blocks), starting with the main function:

#[tokio::main] // or async_std::main, glommio::main, etc
async fn main() {
    println!("Hello, world!"); // <-- expect a warning here
}

You can see that we decorated main with a #[tokio::main] attribute. This is how we select the runtime we will use: most runtimes provide a similar attribute, so you could change this to #[async_std::main], #[glommio::main], or #[embassy::main] and all the examples and code we talk about in this document would work just the same. (deliv_portable)

Whichever runtime you choose, if you actually try to compile this, you're going to see that you get a warning (deliv_lint_blocking_fn):

    println!("Hello, world!");
    ^^^^^^^ synchronous I/O in async fn

This is because macros like println! expand to blocking operations that take control of the underlying thread and don't allow the scheduler to continue. You need to use the async equivalent (deliv_portable_stdlib), then await the result:

async fn main() {
    async_println!("Hello, world!").await;
}

When you await on something, you are pausing execution and waiting for it to complete before you continue. Under the hood, an await corresponds to giving up the current thread of control so that the runtime can do something else instead while you wait (e.g., process another task).
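
To make the "giving up the current thread of control" part concrete, here is a minimal sketch that uses only the standard library. The names YieldNow, NoopWaker, count_polls, and hello are made up for illustration and are not part of any runtime. The toy driver simply polls in a loop; a real runtime would go off and run another task between polls instead.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical helper: pending on its first poll, ready on its second.
struct YieldNow(bool);

impl Future for YieldNow {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        // Ask to be polled again, then hand control back to the driver.
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// A waker that does nothing; enough for a driver that polls in a loop.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Toy stand-in for a runtime: polls `fut` to completion and counts the polls.
fn count_polls<F: Future>(fut: F) -> (F::Output, u32) {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return (out, polls);
        }
    }
}

async fn hello() -> &'static str {
    YieldNow(false).await; // first suspension point
    YieldNow(false).await; // second suspension point
    "Hello, world!"
}
```

Driving hello() with count_polls takes three polls: one for each await that had to wait, plus the final one that produces the value. Each Pending in between is a moment where the runtime is free to do something else.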

Documentation and common patterns

This document is a survey of some of the major aspects of writing async functions. If you'd like a deeper introduction, the async book explains how to get started in async, and also covers common patterns, mistakes to avoid, and some of the details of the various runtimes you can choose from. (deliv_documentation)

Async iterators

So far, using async seems like mostly more work to accomplish the same thing, since you have to add await keywords everywhere. But async functions are like synchronous functions with superpowers: they have the ability to easily compose complex schedules of parallel and concurrent workloads. This is particularly true when you start messing around with asynchronous iterators.

Consider this example. Imagine that you have a bunch of networking requests coming in. For each one, you have to do a bit of lightweight preparation, and then some heavyweight processing. This processing can take up a lot of RAM, and takes a while, so you can only process one request at a time, but you would like to do up to 5 instances of that lightweight preparation in parallel while you wait, so that things are all queued up and ready to go. You want a schedule like this, in other words:

   ┌───────────────┐
   │ Preparation 1 │ ─────┐
   └───────────────┘      │
                          │
   ┌───────────────┐      │     ┌───────────────┐
   │ Preparation 2 │ ─────┼────►│ Process item  │ ─────►
   └───────────────┘      │     └───────────────┘
                          │
     ...                  │
                          │
   ┌───────────────┐      │
   │ Preparation 5 │ ─────┘
   └───────────────┘

You can create that quite easily:


#![allow(unused)]
fn main() {
async fn do_work(database: &Database) {
    let work = do_select(database, FIND_WORK_QUERY)?;
    stream::iter(work)
        .map(async |item| preparation(database, item).await)
        .buffered(5)
        .for_each(async |work_item| process_work_item(database, work_item).await)
        .await;
}
}

The buffered combinator on async iterators creates a schedule that does up to 5 items in parallel, but still produces one item at a time as the result. Thus for_each executes on only one item at a time.
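To make the schedule concrete, here is a minimal sketch of the idea using only the standard library. It is not the buffered combinator itself: buffered_collect and Prep are hypothetical names, and the driver simply busy-polls instead of relying on a runtime. It keeps at most limit futures in flight and hands results back in their original order, which is the property the paragraph above describes.

```rust
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; enough for a busy-polling toy driver.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical stand-in for a "preparation" step: it reports `Pending`
/// for `polls_left` polls and then yields `id`.
pub struct Prep {
    pub id: u32,
    pub polls_left: u32,
}

impl Future for Prep {
    type Output = u32;

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<u32> {
        if self.polls_left == 0 {
            Poll::Ready(self.id)
        } else {
            self.polls_left -= 1;
            Poll::Pending
        }
    }
}

/// Drives the futures with at most `limit` in flight at once and
/// returns their outputs in the original order.
pub fn buffered_collect<F: Future>(
    futures: impl IntoIterator<Item = F>,
    limit: usize,
) -> Vec<F::Output> {
    assert!(limit > 0, "limit must be at least 1");
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut waiting = futures.into_iter();
    let mut in_flight: VecDeque<(Pin<Box<F>>, Option<F::Output>)> = VecDeque::new();
    let mut results = Vec::new();
    loop {
        // Top up the window of in-flight work.
        while in_flight.len() < limit {
            match waiting.next() {
                Some(fut) => in_flight.push_back((Box::pin(fut), None)),
                None => break,
            }
        }
        if in_flight.is_empty() {
            return results;
        }
        // Poll every unfinished future once.
        for (fut, slot) in in_flight.iter_mut() {
            if slot.is_none() {
                if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
                    *slot = Some(value);
                }
            }
        }
        // Release finished results from the front only, preserving order.
        while matches!(in_flight.front(), Some((_, Some(_)))) {
            let (_, value) = in_flight.pop_front().unwrap();
            results.push(value.unwrap());
        }
    }
}
```

Because results are only released from the front of the queue, a slow first item holds back faster items behind it, just as the consumer in the example above sees one item at a time.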

How does all this work? The basic AsyncIterator trait (deliv_async_iter) looks quite similar to the standard Iterator trait, except that it has an async fn (this fn also has a #[repr] annotation; you can ignore it for now, but we discuss it later).


#![allow(unused)]
fn main() {
trait AsyncIter {
    type Item;

    #[repr(inline)]
    async fn next(&mut self) -> Self::Item;
}
}
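As a rough illustration of how such a trait might be implemented and consumed, the sketch below uses async fn in traits as available in stable Rust today. It leaves out the #[repr(inline)] annotation, which does not exist yet, and it assumes that next returns Option<Self::Item> so that the sequence can end. Countdown, collect_all, and the tiny block_on executor are hypothetical helpers for this example only.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor, just enough to run the example.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

/// Assumed shape of the trait: `None` signals the end of the sequence.
trait AsyncIter {
    type Item;

    async fn next(&mut self) -> Option<Self::Item>;
}

/// Hypothetical iterator that counts down to 1.
struct Countdown {
    remaining: u32,
}

impl AsyncIter for Countdown {
    type Item = u32;

    async fn next(&mut self) -> Option<u32> {
        if self.remaining == 0 {
            None
        } else {
            self.remaining -= 1;
            Some(self.remaining + 1)
        }
    }
}

/// Drains any async iterator into a vector.
async fn collect_all<I: AsyncIter>(mut iter: I) -> Vec<I::Item> {
    let mut items = Vec::new();
    while let Some(item) = iter.next().await {
        items.push(item);
    }
    items
}
```

Calling block_on(collect_all(Countdown { remaining: 3 })) drives the iterator to completion without any explicit pinning at the call site.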

However, when you use combinators like buffered that introduce parallelism, you are now using a parallel async iterator (deliv_async_iter), similar to the parallel iterators offered by rayon. The core operation here is for_each (which processes each item in the iterator):


#![allow(unused)]
fn main() {
trait ParAsyncIter {
    type Item;

    async fn for_each(&mut self, op: impl AsyncFn(Self::Item));
}
}

Editor's note: There's a subtle difference between for_each here and Rayon's for_each. It might actually be nice to rework Rayon's approach too. Detail hammering still required!

Scopes

Parallel async iterators are implemented atop something called scopes (deliv_scope_api). Scopes are a way of structuring your async tasks into a hierarchy. In this hierarchy, every parent task waits for its children to complete before it is itself complete. Scopes are also connected to cancellation: when you mark a parent task as cancelled, the cancellation propagates down to its children as well (but the parent still waits for them to finish up) (deliv_cancellation).

Scopes allow you to spawn parallel tasks that access borrowed data (deliv_borrowed_data). For example, you could rewrite the parallel iterator above using scopes. For simplicity, we'll ignore the "up to 5 items being prepared" and just spawn a task for all items at once:


#![allow(unused)]
fn main() {
async fn do_work(database: &Database) {
    std::async_thread::scope(async |s| {
        // Channel to send prepared items over to the
        // task that processes them one at a time:
        let (tx, mut rx) = std::async_sync::mpsc::channel();

        // Spawn a task to spawn tasks:
        s.spawn(async move || {
            let work = do_select(database, FIND_WORK_QUERY)?;
            work.for_each(|item| {
                // Spawn a task preparing each item and then
                // sending it on the channel:
                let tx = tx.clone();
                s.spawn(async move || {
                    let prepared_item = preparation(database, item).await;
                    tx.send(prepared_item).await;
                });
            });
        });

        // Spawn a task to process the prepared items one at a time:
        s.spawn(async move || {
            while let Some(item) = rx.next().await {
                process_item(item).await;
            }
        });
    }).await;
}
}

Cancellation

Cancelling a task is a common operation in async code. Often this is because of a dropped connection, but it could also be because of non-error conditions, such as waiting for the first of two requests to complete and taking whichever finished first. (deliv_cancellation)

Editor's note: Clearly, this needs to be elaborated. Topics:

  • Ambient cancellation flag vs explicit passing
  • Connecting to I/O operations so they produce errors
  • Opt-in synchronous cancellation, select

Async read and write traits

The AsyncRead and AsyncWrite traits are the most common way to do I/O. They are the async equivalents of the std::io::Read and std::io::Write traits, and they are used in a similar way. (deliv_async_read_write)

Editor's note: This requires elaboration. The challenge is that the best design for these traits is unclear.

Async fns in traits, overview

Async functions work in traits, too (deliv_async_fundamentals):


#![allow(unused)]
fn main() {
trait HttpRequest {
    async fn request(&self, url: &Url) -> HttpResponse;
}
}

Desugaring async fn in traits into impl Trait and generic associated types

Async functions actually desugar into functions that return an impl Future. When you use an async function in a trait (deliv_impl_trait_in_trait), that is desugared into a (generic) associated type in the trait (deliv_gats) whose value is inferred by the compiler (deliv_tait):


#![allow(unused)]
fn main() {
trait SomeTrait {
    async fn foo(&mut self);
}

// becomes:

trait SomeTrait {
    fn foo(&mut self) -> impl Future<Output = ()> + '_;
}

// becomes something like:
//
// Editor's note: The name of the associated type is under debate;
// it may or may not be something user can name, though they should
// have *some* syntax for referring to it.

trait SomeTrait {
    type Foo<'me>: Future<Output = ()> + 'me
    where
        Self: 'me;

    fn foo(&mut self) -> Self::Foo<'_>;
}
}

What this means is that the future type SomeTrait::Foo is going to be a type generated by the compiler that is specific to that future.
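The first step of this desugaring can already be tried out in stable Rust, where a trait method may return impl Future and an impl may satisfy it with either an async fn or an explicit async block. The sketch below is only an illustration: the two implementing types, call_twice, and block_on are hypothetical, the method returns u32 rather than () so the effect is observable, and the + '_ bound is omitted because return-position impl Trait in traits captures the in-scope lifetimes automatically.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor, just enough to run the example.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

/// The "desugared" form of `async fn foo(&mut self) -> u32`.
trait SomeTrait {
    fn foo(&mut self) -> impl Future<Output = u32>;
}

/// Implements the method with `async fn` sugar.
struct ViaAsyncFn {
    calls: u32,
}

impl SomeTrait for ViaAsyncFn {
    async fn foo(&mut self) -> u32 {
        self.calls += 1;
        self.calls
    }
}

/// Implements the same method by returning an async block explicitly.
struct ViaImplFuture {
    calls: u32,
}

impl SomeTrait for ViaImplFuture {
    fn foo(&mut self) -> impl Future<Output = u32> {
        async move {
            self.calls += 1;
            self.calls
        }
    }
}

/// Generic callers cannot tell the two styles apart.
async fn call_twice<T: SomeTrait>(value: &mut T) -> u32 {
    value.foo().await;
    value.foo().await
}
```

In both impls the concrete future type is chosen by the compiler, which is the role the inferred associated type plays in the final desugaring step above.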

Caveat: Gritty details around dyn Trait and no-std

However, there is a catch here. When a trait contains async fn, using dyn types (e.g., dyn HttpRequest, for the trait above) can get a bit complicated. (deliv_dyn_async_trait) By default, we assume that folks using dyn HttpRequest are doing so in a multithreaded, standard environment. This means that, by default:

  • A reference like &T can only be cast to &dyn HttpRequest if all the async fn in its impl are Send
    • Note that you can still write impls whose async fn are not Send, but you cannot use them with dyn (again, by default).
  • Async calls that go through a dyn HttpRequest will allocate a Box to store their data
    • This is usually fine, but in particularly tight loops can be a performance hazard.
    • Note that this only applies when you use dyn HttpRequest; most tight loops tend to use generics like T: HttpRequest anyway, and here there is no issue.

These assumptions don't work for everyone, so there are some knobs you can turn:

  • You can request that the futures not be assumed to be Send.
  • You can change the "smart pointer" type used to allocate data; for example, instead of Box, a choice like Stack<32> would stack allocate up to 32 bytes (compilation errors will result if more than 32 bytes are required), and SmallBox<32> would stack allocate up to 32 bytes but heap allocate after that. (deliv_dyn_async_trait)
  • You can use 'inline' async functions, though these are not always suitable. (These are covered under "Diving into the details".)

The way that all of this is implemented is that users can define their own impls of the form impl Trait for dyn Trait (deliv_dyn_trait). This permits us to supply a number of derives that can be used to implement the above options.
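For comparison, the default described above (a Send future stored in a Box) is roughly what people write by hand today to make such a trait usable through dyn. The following sketch shows that manual pattern, not the envisioned derive-based mechanism. BoxFuture here is a local alias, Canned and fetch are hypothetical, and a plain &str stands in for the Url type.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor, just enough to run the example.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

/// A heap-allocated, type-erased future that must be `Send`.
type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

/// Hand-written, dyn-compatible version of the trait: each call
/// allocates a `Box` to hold the future's state.
trait HttpRequest {
    fn request<'a>(&'a self, url: &'a str) -> BoxFuture<'a, String>;
}

/// Hypothetical client that returns a canned response.
struct Canned;

impl HttpRequest for Canned {
    fn request<'a>(&'a self, url: &'a str) -> BoxFuture<'a, String> {
        Box::pin(async move { format!("200 OK from {url}") })
    }
}

/// Works through dynamic dispatch, with no generics involved.
async fn fetch(client: &dyn HttpRequest, url: &str) -> String {
    client.request(url).await
}
```

The knobs listed above correspond to the two fixed choices in the BoxFuture alias: the Send bound and the use of Box as the container.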

Tooling

There are a number of powerful development tools available for debugging, profiling, and tuning your Async Rust applications (deliv_tooling). These tools allow you to easily view the current tasks in your application, find out what they are blocked on, and do profiling to see where they spend their time.

Async Rust includes profiling tools that are sufficiently lightweight that you can run them in your production runs, giving very accurate data about what is really happening in your system. They also allow you to process the data in a number of ways, such as viewing profiles per request, or for requests coming from a specific source.

The tools also include "hazard detection" that uncovers potential bugs or performance problems that you may not have noticed. For example, they can identify functions that run too long without any form of await or yield, which can lead to "hogging" the CPU and preventing other tasks from running.

Finally, the tools can make suggestions to help you to tune your async code performance. They can identify code that ought to be outlined into separate functions, for example, or instances where the size of futures can be reduced through judicious use of heap allocation (deliv_boxable). These edits come in the form of suggestions, much like the compiler, which can be automatically applied with cargo fix.

Bridging the sync and async worlds

One of the challenges of async programming is how to embed synchronous snippets of code. A synchronous snippet is anything that may occupy the thread for a long period of time without executing an await. This might be because it is a very long-running loop, or because it invokes blocking primitives (like synchronous I/O). For efficiency, the async runtimes are set up to assume that this doesn't happen. This means that it is your responsibility to mark any piece of synchronous code with a call to blocking. This is a signal to the runtime that the code may block, and it allows the runtime to execute the code on another thread or take other forms of action:


#![allow(unused)]
fn main() {
std::async::blocking(|| ...).await;
}

Note that blocking is an async function. Internally, it is built on the scope method spawn_blocking, which spawns a task into an inner scope (deliv_scope_api):


#![allow(unused)]
fn main() {
async fn blocking<R>(f: impl FnOnce() -> R) -> R {
    scope(|s| s.spawn_blocking(f).await).await
}
}
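To give a feel for what "execute the code on another thread" can mean, here is a minimal sketch of a blocking-style helper built only on the standard library. It is an assumption-laden stand-in, not the proposed API: it spawns a dedicated thread eagerly instead of going through a scope or a runtime thread pool, it requires a 'static closure, and Blocking, Shared, ThreadWaker, and block_on are hypothetical names.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread;

/// State shared between the worker thread and the awaiting task.
struct Shared<R> {
    result: Option<R>,
    waker: Option<Waker>,
}

/// Future that completes once the worker thread has stored a result.
struct Blocking<R> {
    shared: Arc<Mutex<Shared<R>>>,
}

impl<R> Future for Blocking<R> {
    type Output = R;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<R> {
        let mut shared = self.shared.lock().unwrap();
        match shared.result.take() {
            Some(value) => Poll::Ready(value),
            None => {
                // Remember who to wake when the worker finishes.
                shared.waker = Some(cx.waker().clone());
                Poll::Pending
            }
        }
    }
}

/// Runs `f` on its own thread so the async task is free to yield.
fn blocking<R, F>(f: F) -> Blocking<R>
where
    F: FnOnce() -> R + Send + 'static,
    R: Send + 'static,
{
    let shared = Arc::new(Mutex::new(Shared { result: None, waker: None }));
    let for_thread = Arc::clone(&shared);
    thread::spawn(move || {
        let value = f();
        let mut guard = for_thread.lock().unwrap();
        guard.result = Some(value);
        if let Some(waker) = guard.waker.take() {
            waker.wake();
        }
    });
    Blocking { shared }
}

/// Waker that unparks the thread running `block_on`.
struct ThreadWaker(thread::Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Small executor that sleeps until woken instead of spinning.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(),
        }
    }
}
```

The 'static bound is exactly the limitation that the scope-based design is meant to lift: with spawn_blocking on a scope, the closure could borrow from the caller.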

Caveat: Beware the async sandwich

One challenge with integrating sync and async code is called the "async sandwich". This occurs when you have async code that calls into sync code which in turn wishes to invoke async code:

  • an async fn A that calls ..
  • a synchronous fn B that wishes to block on ..
  • an async fn C doing some I/O

The problem here is that, for this to work, the async fn A really needs to call the synchronous function with blocking, but that may not be apparent, and A may not be in your control (that is, you may be authoring B and/or C, and not be able to modify A). This is a difficult situation without a great answer. Some runtimes offer methods that can help in this situation, but deadlocks may result.

We hope to address this with 'overloaded async' functions, but more work is needed to flesh out that design (deliv_async_overloading).

Diving into the details

The previous topics covered the "high-level view" of async. This section dives a bit more into some of the details of how things work.

"Inline" async functions

Inline async functions (deliv_inline_async_fn) are an optimization that is useful for traits where the trait represents the primary purpose of the type that implements it; typically such traits are implemented by dedicated types that exist just for that purpose. Examples include:

  • The read and write traits.
  • Async iterators.
  • Async functions.

Inline async functions are also crucial to AsyncDrop (deliv_async_drop), discussed below.

Inline async functions are declared within a trait body. They indicate that all intermediate state for the function is stored within the struct itself:


#![allow(unused)]
fn main() {
trait AsyncIter {
    type Item;

    #[repr(inline)]
    async fn next(&mut self) -> Self::Item;
}
}

This implies some limitations, but it has some benefits as well. For example, traits that contain only inline async functions are purely dyn safe without any overhead or limitations.

Boxable heap allocation

One of the challenging parts of writing a system that juggles many concurrent requests is deciding how much stack to allocate. Pthread-based systems solve this problem by reserving a very large portion of memory for the stack, but this doesn't scale up well when you have very large numbers of requests. A better alternative is to start with a small stack and grow it dynamically, but that can be tricky to do without hitting potential performance hazards.

Rust occupies an interesting spot in the design space. For simple Rust futures, we will allocate exactly as much stack space as is needed. This is done by analyzing the future and seeing what possible calls it may make.

Sometimes, though, this analysis is not possible. For example, a recursive function can use an unbounded amount of stack. In cases like this, you can annotate your async function to indicate that its stack space should be allocated on the heap where it is called (deliv_boxable):


#![allow(unused)]
fn main() {
box async fn foo() { .. }
}

These annotations are also useful for tuning performance. The tooling (deliv_tooling) can be used to suggest inserting box keywords on cold code paths, thus avoiding allocating stack space that is rarely used.
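Until something like box async fn exists, the same effect for recursion is usually achieved by boxing the future by hand. The sketch below shows that existing pattern; sum_to and block_on are hypothetical helpers for this example.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor, just enough to run the example.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

/// Recursive async computation. Each level of recursion lives in its
/// own heap allocation, so the future's size does not depend on depth.
fn sum_to(n: u64) -> Pin<Box<dyn Future<Output = u64>>> {
    Box::pin(async move {
        if n == 0 {
            0
        } else {
            n + sum_to(n - 1).await
        }
    })
}
```

Without the Box, the compiler would have to embed the future for sum_to(n - 1) inside the future for sum_to(n), which has no finite size.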

Async drop

Cleaning up resources in async code is done using destructors, just as in synchronous Rust. Simply implement the AsyncDrop trait (deliv_async_drop) instead of Drop, and you're good to go:


#![allow(unused)]
fn main() {
impl AsyncDrop for MyType {
    async fn drop(&mut self) {
        ...
    }
}
}

Just as in synchronous Rust, you are advised to keep destructors limited in their effects.

Caveat: Synchronous drop

One thing to be aware of when you implement AsyncDrop is that, because any Rust value can be dropped at any point, the type system will allow your type to be dropped synchronously as well. We do however have a lint that detects the most common cases and gives you a warning, so this is rare in practice.

Note: If a type that implements AsyncDrop but not Drop is dropped synchronously, the program will abort!

Caveat: Implicit await points

One other thing to be aware of is that async drop will trigger implicit awaits each time a value is dropped (e.g., when a block is exited). This is rarely an issue.

Roadmap

What follows is a list of high-level goals, like "async fn everywhere", that capture some part of the improved user experience. Each goal has associated initiatives, which are particular streams of work within that goal. Each goal and each initiative have an associated owner -- in some cases multiple owners -- who are the people responsible for ensuring that the goal/initiative is making progress. If you click on a goal/initiative, you will get a high-level description of its impact. That is, how the experience of using async Rust is going to change as a result of this work.

We categorize the goals and initiatives into four states:

State | Meaning
✅ | Done.
🦀 | In progress: Work is ongoing!
✋ | Help wanted: Seeking an owner to pursue this! Talk to the wg leads if you are interested.
💤 | Paused: we are waiting to work on this until some other stuff gets done.

Some goals and initiatives have further "how to help" instructions for those wanting to contribute. These are marked by the 🛠️ symbol.

Impact and milestones

Clicking on active initiatives also shows a list of milestones. These milestones (things like "write an evaluation doc") indicate the planned work ahead of us. We meet every 2 weeks to assess our progress on these milestones and to update the list as needed.

Overview

Deliverable | State | Progress | Owner
🔻 Async fn everywhere | 🦀 | ▰▰▱▱▱▱ | tmandry
  ↳ Type Alias Impl Trait | 🦀 | ▰▰▰▰▰▱ | oli-obk
  ↳ Generic Associated Types | 🦀 | ▰▰▰▰▰▱ | jackh726
  ↳ Fundamentals | 🦀 | ▰▰▱▱▱▱ | tmandry
  ↳ Boxable async functions | 💤 | ▰▱▱▱▱▱ |
  ↳ Async main and tests | 💤 | ▰▱▱▱▱▱ |
🔻 Scoped spawn and reliable cancellation | 💤 | ▰▱▱▱▱▱ |
  ↳ Capability | 💤 | ▰▱▱▱▱▱ |
  ↳ Scope API | 💤 | ▰▱▱▱▱▱ |
🔻 Async iteration | 🦀 | ▰▰▱▱▱▱ | estebank
  ↳ Async iteration trait | 💤 | ▰▰▰▱▱▱ |
  ↳ Generator syntax | 💤 | ▰▰▱▱▱▱ |
🔻 Portable across runtimes | 💤 | ▰▰▱▱▱▱ |
  ↳ Read/write traits | 💤 | ▰▰▱▱▱▱ |
  ↳ Timer traits | 💤 | ▰▱▱▱▱▱ |
  ↳ Spawn traits | 💤 | ▰▱▱▱▱▱ |
  ↳ Runtime trait | 💤 | ▰▱▱▱▱▱ |
🔻 Polish [🛠️] | 🦀 | ▰▰▰▱▱▱ | eholk
  ↳ Error messages | 💤 | ▰▰▰▱▱▱ |
  ↳ Must not suspend lint | 🦀 | ▰▰▰▰▱▱ |
  ↳ Blocking function lint | 💤 | ▰▰▱▱▱▱ |
  ↳ Lint against large copies | 💤 | ▰▰▱▱▱▱ |
  ↳ Cleaner async stacktraces | 💤 | ▰▱▱▱▱▱ |
  ↳ Precise generator captures | 🦀 | ▰▱▱▱▱▱ |
  ↳ Sync and async behave the same | 🦀 | ▰▱▱▱▱▱ |
🔻 Tooling | 🦀 | ▰▰▱▱▱▱ | pnkfelix
  ↳ Tokio console | 🦀 | ▰▰▰▰▱▱ | eliza weisman
  ↳ Crashdump debugging | 🦀 | ▰▰▱▱▱▱ | michaelwoerister
🔻 Documentation | 🦀 | ▰▰▱▱▱▱ |
  ↳ Async book | 💤 | ▰▰▱▱▱▱ |
🔻 Testing | 💤 | ▰▱▱▱▱▱ |
  ↳ tbd | 💤 | ▰▱▱▱▱▱ |
🔻 Threadsafe portability | 💤 | ▰▱▱▱▱▱ |
  ↳ tbd | 💤 | ▰▱▱▱▱▱ |
🔻 Async overloading | 💤 | ▰▱▱▱▱▱ |
  ↳ tbd | 💤 | ▰▱▱▱▱▱ |

Async fn everywhere

Impact

Boxable async fn

Impact

  • Able to easily cause some async functions, blocks, or closures to allocate their stack space lazily when called (by 'boxing' it)
    • Combined with profiler or other tooling support, this can help to tune the size of futures
  • Boxed async blocks allow particular portions of a function to be boxed, e.g. cold paths

Milestones

MilestoneStateKey participants
Author evaluation doc๐Ÿ’ค
Feature complete implementation๐Ÿ’ค

Design notes

One example might be to use an attribute:


#![allow(unused)]
fn main() {
#[boxed]
async fn foo() { }
}

This does not have to desugar to -> Box<dyn Future<...>>; it can instead desugar to Box<impl Future>, or perhaps a nominal type to permit recursion.

Another approach is the box keyword:


#![allow(unused)]
fn main() {
box async fn foo() { }
}

We can apply the keyword modifier to async blocks and closures:


#![allow(unused)]
fn main() {
fn foo() -> BoxFuture<Output = ()> {
    box async { ... }
}
}

#![allow(unused)]
fn main() {
async fn stuff(s: impl AsyncIterator) {
    s.map(box async |x| { ... })
}
}

This is useful for breaking up future types to make them more shallow.

Async main and tests

Impact

  • Able to write #[test] functions that easily use async functions.
  • In the case of portable libraries, end users are able to re-run test suites with distinct runtimes.

Milestones

Able to write async fn main and #[test] async fn just like you would in synchronous code.

This initiative is on hold while we investigate mechanisms for portability across runtimes.

Scopes

Impact

  • Able to spawn parallel tasks or blocking work that accesses borrowed data
  • Easily create expressive scheduler patterns that make use of borrowed data using high-level combinators and APIs
  • When data is no longer needed, able to cancel work and have it reliably and promptly terminate, including any subtasks or other bits of work it may have created
  • Cancellation does not leave work "half-finished", but reliably cleans up program state
  • Able to use DMA, io-uring, etc to write directly into output buffers, and to recover in the case of cancellation

Requires

Design notes

Async functions are commonly written with borrowed references as arguments:


#![allow(unused)]
fn main() {
async fn do_something(db: &Db) { ... }
}

but important utilities like spawn and spawn_blocking require 'static tasks. Building on non-cancelable traits, we can implement a "scope" API that allows one to introduce an async scope. This scope API should permit one to spawn tasks into a scope, but have various kinds of scopes (e.g., synchronous execution, parallel execution, and so forth). It should ultimately reside in the standard library and hook into different runtimes for scheduling. This will take some experimentation!


#![allow(unused)]
fn main() {
async fn foo(db: &Database) {
    let result = std::async_thread::scope(async |s| {
        let job1 = s.spawn(async || {
            async_thing(db).await
        });
        let job2 = s.spawn_blocking(|| {
            sync_thing(db)
        });

        (job1.await, job2.await)
    }).await;
}
}

Side-stepping the nested await problem

One goal of scopes is to avoid the "nested await" problem, as described in Barbara battles buffered streams (BBBS). The idea is like this: the standard combinators which run work "in the background" and which give access to intermediate results from that work should schedule that work into a scope.1 This would typically be done by using an "interior iterator" pattern, but it could also be done by taking a scope parameter. Some examples from today's APIs are FuturesUnordered and Stream::buffered.

1. This is not a hard rule. But invoking poll manually is best regarded as a risky thing to be managed with care -- not only because of the formal safety guarantees, but because of the possibility for "nested await"-style failures.

In the case of BBBS, the problem arises because of buffered, which spawns off concurrent work to process multiple connections. Under this system, the implementation of buffered would create an internal scope and spawn its tasks into that scope, side-stepping the problem. One could imagine also offering a variant of buffered like buffered_in that takes a scope parameter, permitting the user to choose the scope of those spawned tasks:


#![allow(unused)]
fn main() {
async fn do_work(database: &Database) {
    std::async_thread::scope(async |s| {
        let work = do_select(database, FIND_WORK_QUERY).await?;
        std::async_iter::from_iter(work)
            .map(|item| do_select(database, work_from_item(item)))
            .buffered_in(5, s)
            .for_each(|work_item| process_work_item(database, work_item))
            .await;
    }).await;
}
}

Concurrency without scopes: Join, select, race, and friends

It is possible to introduce concurrency in ways that both (a) do not require scopes and (b) avoid the "nested await" problem. Any combinator which takes multiple Async instances and polls them to completion (or cancels them) before it itself returns is ok. This includes:

  • join, because the join(a, b) doesn't complete until both a and b have completed;
  • select, because selecting will cancel the alternatives that are not chosen;
  • race, which is a variant of select.

This is important because embedded systems often avoid allocators, and the scope API implicitly requires allocation (one can spawn an unbounded number of tasks).
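A join of this kind needs no scope and no allocation, because the combinator owns both futures and does not return until both are done. The following is a minimal sketch written against today's Future trait rather than the proposed Async trait. join2, YieldOnce, and block_on are hypothetical names for this example.

```rust
use std::future::{poll_fn, Future};
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor, just enough to run the example.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

/// Future that returns `Pending` once before completing.
struct YieldOnce {
    yielded: bool,
}

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Polls both futures concurrently and completes only when both have
/// completed. All state lives inside this one future, so no allocation
/// and no task spawning is involved.
async fn join2<A, B>(a: A, b: B) -> (A::Output, B::Output)
where
    A: Future,
    B: Future,
{
    let mut a = pin!(a);
    let mut b = pin!(b);
    let mut out_a: Option<A::Output> = None;
    let mut out_b: Option<B::Output> = None;
    poll_fn(|cx| {
        // Never poll a future again after it has completed.
        if out_a.is_none() {
            if let Poll::Ready(value) = a.as_mut().poll(cx) {
                out_a = Some(value);
            }
        }
        if out_b.is_none() {
            if let Poll::Ready(value) = b.as_mut().poll(cx) {
                out_b = Some(value);
            }
        }
        if out_a.is_some() && out_b.is_some() {
            Poll::Ready((out_a.take().unwrap(), out_b.take().unwrap()))
        } else {
            Poll::Pending
        }
    })
    .await
}
```

Because the number of futures is fixed at compile time, the combined state has a known size, which is why this style suits allocator-free embedded code.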

Cancellation

In today's Rust, any async function can be synchronously cancelled at any await point: the code simply stops executing, and destructors are run for any extant variables. This leads to a lot of bugs. (TODO: link to stories)

Under systems like Swift's proposed structured concurrency model, or with APIs like .NET's CancellationToken, cancellation is "voluntary". What this means is that when a task is cancelled, a flag is set; the task can query this flag but is not otherwise affected. Under structured concurrency systems, this flag is propagated to all children (and transitively to their children).

Voluntary cancellation is a requirement for scoped access. If there are parallel tasks executing within a scope, and the scope itself is canceled, those parallel tasks must be joined and halted before the memory for the scope can be freed.

One downside of such a system is that cancellation may not take effect. We can make it more likely to work by integrating the cancellation flag into the standard library methods, similar to how tokio encourages "voluntary preemption". This means that file reads and things will start to report errors (Err(TaskCanceled)) once the task has been canceled. This has the advantage that it exercises existing error paths and permits recovery.
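As a rough sketch of what "integrating the cancellation flag into standard library methods" could look like, the example below uses a shared atomic flag that a hypothetical I/O routine checks before doing any work. Everything here is an assumption made for illustration: CancelFlag, read_chunk, and read_all are invented names, the code is synchronous for brevity, and TaskCanceled is modelled as a plain unit struct.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Error reported by I/O routines once the task has been canceled.
#[derive(Debug, PartialEq)]
struct TaskCanceled;

/// Shared flag: clones observe the same cancellation state, which is
/// how a parent could propagate cancellation to its children.
#[derive(Clone, Default)]
struct CancelFlag(Arc<AtomicBool>);

impl CancelFlag {
    fn cancel(&self) {
        self.0.store(true, Ordering::SeqCst);
    }

    fn is_canceled(&self) -> bool {
        self.0.load(Ordering::SeqCst)
    }
}

/// Hypothetical read routine: checks the flag first, then copies up to
/// `buf.len()` bytes from `source`.
fn read_chunk(flag: &CancelFlag, source: &[u8], buf: &mut [u8]) -> Result<usize, TaskCanceled> {
    if flag.is_canceled() {
        return Err(TaskCanceled);
    }
    let n = source.len().min(buf.len());
    buf[..n].copy_from_slice(&source[..n]);
    Ok(n)
}

/// Caller that never mentions cancellation: the error simply flows
/// through its ordinary `?` error path.
fn read_all(flag: &CancelFlag, mut source: &[u8]) -> Result<Vec<u8>, TaskCanceled> {
    let mut out = Vec::new();
    let mut buf = [0u8; 4];
    while !source.is_empty() {
        let n = read_chunk(flag, source, &mut buf)?;
        out.extend_from_slice(&buf[..n]);
        source = &source[n..];
    }
    Ok(out)
}
```

Note that read_all contains no cancellation logic of its own, which is the advantage mentioned above: cancellation exercises the error handling the code already has.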

Cancellation and select

The select macro chooses from N futures and returns the first one that matches. Today, the others are immediately canceled. This behavior doesn't play especially well with voluntary cancellation. There are a few options here:

  • We could make select signal cancellation for each of the things it is selecting over and then wait for them to finish.
  • We could also make select continue to take Future (not Async) values, which effectively makes Future a "cancel-safe" trait (or perhaps we introduce a CancelSafe marker trait that extends Async).
    • This would mean that typical async fn could not be given to select, though we might allow people to mark async fn as "cancel-safe", in which case they would implement Future. They would also not have access to ordinary async fn, though.
      • Effectively, the current Future trait becomes the "cancel-safe" form of Async. This is a bit odd, since it has other distinctions, like using Pin, so it might be preferable to use a 'marker trait'.
    • Of course, users could spawn a task that calls the function and give the handle to select.

Frequently asked questions

Could there be a convenient way to access the current scope?

If we wanted to integrate the idea of scopes more deeply, we could have some way to get access to the current scope and reference its lifetime. Lots of unknowns to work out here, though. For example, suppose you have a function that creates a scope and invokes a closure within. Do we have a way to indicate to the closure that 'scope in that closure may be different?

It starts to feel like simply passing "scope" values may be simpler, and perhaps we need a way to automate the threading of state instead. Another advantage of passing a scope explicitly is that it is clear when parallel tasks may be launched.

How does cancellation work in other settings?

Many other languages use a shared flag to observe when cancellation has been requested.

In some languages, there is also an immediate callback that is invoked when cancellation is requested which permits you to take immediate action. Swift proposal SE-0304, for example, includes "cancellation handlers" that are run immediately.

  • Kotlin cancellation:
    • You can invoke cancel on launched jobs (spawned tasks).
    • Cancelling sets a flag that the job can check for.
    • Builtin functions check for the flag and throw an exception if it is set.

What is the relationship between AsyncDrop and cancellation?

In async Rust today, one signals cancellation of a future by (synchronously) dropping it. This forces the future to stop executing, and drops the values that are on the stack. Experience has shown that this is something users have a lot of trouble doing correctly, particularly at fine granularities (see e.g. Alan builds a cache or Barbara gets burned by select).

Given AsyncDrop, we could adopt a similar convention, where canceling an Async is done by (asynchronously) dropping it. This would presumably amend the unsafe contract of the Async trait so that the value must be polled to completion or async-dropped. To avoid the footguns we see today, a typical future could simply continue execution from its AsyncDrop method (but disregard the result). It might however set an internal flag to true or otherwise allow the user to find out that it has been canceled. It's not clear, though, precisely what value is being added by AsyncDrop in this scenario versus the Async simply not implementing AsyncDrop -- perhaps though it serves as an elegant way to give both an immediate "cancellation" callback and an opportunity to continue.

An alternative is to use a cancellation token of some kind, so that scopes can be canceled and that cancelation can be observed. The main reason to have that token or observation mechanism be "built-in" to some degree is so that it can be observed and used to drive "voluntary cancellation" from I/O routines and the like. Under that model, AsyncDrop would be intended more for values (like database handles) that have cleanup to be done, much like Drop today, and less as a way to signal cancellation.

Capability

Impact

  • The ability to create async tasks that can be safely given access to borrowed data, similar to crossbeam or rayon scopes
  • There are potentially multiple routes with which this can be accomplished

Design notes

Today's Future trait lacks one fundamental capability compared to synchronous code: there is no (known?) way to "block" your caller and be sure that the caller will not continue executing until you agree. In synchronous code, you can use a closure and a destructor to achieve this, which is the technique used for things like rayon::scope and crossbeam's scoped threads. In async code, because the Future trait has a safe poll function, it is always possible to poll it part way and then mem::forget (or otherwise leak) the value; this means that one cannot have parallel threads executing and using those references.
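For reference, this is what the synchronous capability looks like with std::thread::scope, which follows the same closure-based approach: the call does not return until every thread spawned inside it has been joined, so those threads may borrow local data. sum_halves is a hypothetical function for this example.

```rust
use std::thread;

/// Sums a slice using two threads that borrow from the caller's data.
fn sum_halves(data: &[u64]) -> u64 {
    let (left, right) = data.split_at(data.len() / 2);
    // `scope` blocks the caller until both spawned threads are done,
    // which is what makes borrowing `left` and `right` sound.
    thread::scope(|s| {
        let a = s.spawn(|| left.iter().sum::<u64>());
        let b = s.spawn(|| right.iter().sum::<u64>());
        a.join().unwrap() + b.join().unwrap()
    })
}
```

The async equivalent cannot be written safely today, because a future that has been polled part way can be leaked while the borrowed data is freed.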

Async functions are commonly written with borrowed references as arguments:


#![allow(unused)]
fn main() {
async fn do_something(db: &Db) { ... }
}

but important utilities like spawn and spawn_blocking require 'static tasks. Without "unforgettable" traits, the only way to circumvent this is with mechanisms like FuturesUnordered, which is then subject to footguns as described in Barbara battles buffered streams.

There are two main approaches under consideration to address this issue:

Variant: Async trait

As proposed in https://github.com/Matthias247/rfcs/pull/1, one way to solve this is to introduce a new future trait with an unsafe poll method:


#![allow(unused)]
fn main() {
trait Async {
    type Output;

    /// # Unsafe conditions
    ///
    /// * Once polled, cannot be moved
    /// * Once polled, destructor must execute before memory is deallocated
    /// * Once polled, must be polled to completion
    ///
    /// FIXME: Have to specify how to manage panic.
    unsafe fn poll(
        &mut self,
        context: &mut core::task::Context<'_>,
    ) -> core::task::Poll<Self::Output>;
}
}

This would then require "bridging impls" to convert the (now likely deprecated, or at least repurposed) Future trait:


#![allow(unused)]
fn main() {
impl<F: Future> Async for F { .. } // impl A
}

which in turn creates an interesting question, since if we wish to have a single combinator that is usable with either trait, specialization would be required:


#![allow(unused)]
fn main() {
impl<F: Future> Future for Combinator<F> { .. } // impl B
impl<F: Async> Async for Combinator<F> { .. }  // impl C

// Coherence error: Given some type `F1: Future`, 
// two ways to show that `Combinator<F1>: Async`.
}

Bridging

Introduce "bridge impls" like the following:


#![allow(unused)]
fn main() {
impl<F> Async for F where F: Future {

}
}

Newer runtimes will be built atop the Async trait, but older code will still work with them, since everything that implements Future implements Async.
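One way the body of such a bridge impl could be filled in is sketched below. This is only an illustration of the idea under the contract listed earlier (no moves after the first poll, poll to completion); the Async trait is reproduced locally, and poll_once and NoopWake are hypothetical helpers.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Local copy of the proposed trait, for illustration only.
trait Async {
    type Output;

    /// # Safety
    ///
    /// Once polled, `self` must not be moved, and it must be polled
    /// to completion.
    unsafe fn poll(&mut self, cx: &mut Context<'_>) -> Poll<Self::Output>;
}

/// Bridge impl: every `Future` is automatically an `Async`.
impl<F: Future> Async for F {
    type Output = F::Output;

    unsafe fn poll(&mut self, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // SAFETY: the caller promises not to move `self` after the
        // first poll, which is the guarantee `Pin` normally encodes.
        let pinned = unsafe { Pin::new_unchecked(self) };
        Future::poll(pinned, cx)
    }
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a value once through the `Async` interface.
///
/// # Safety
///
/// The caller takes on the obligations of `Async::poll`.
unsafe fn poll_once<A: Async>(value: &mut A) -> Poll<A::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    unsafe { value.poll(&mut cx) }
}
```

With this impl in place, code written against Async accepts any existing future, for example one produced by std::future::ready.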

Combinators

One tricky case has to do with bridging combinators. If you have a combinator like Join:


#![allow(unused)]
fn main() {
struct Join<A, B> { ... }

impl<A, B> Future for Join<A, B>
where
    A: Future,
    B: Future,
{ }
}

This combinator cannot then be used with Async values. You cannot (today) add a second impl like the following for coherence reasons:


#![allow(unused)]
fn main() {
impl<A, B> Async for Join<A, B>
where
    A: Async,
    B: Async,
{ }
}

The problem is that this second impl creates multiple routes to implement Async for Join<A, B> where A and B are futures. These routes are of course equivalent, but the compiler doesn't know that.

Solution A: Don't solve it

We might simply introduce new combinators for the Async trait. Particularly given the move to scoped threads it is likely that the set of combinators would want to change anyhow.

Solution B: Specialization

Specialization can be used to resolve this, and it would be a great feature for Rust overall. However, specialization has a number of challenges to overcome. Some related articles:

Variant: Leak trait

(Requires elaboration)

API

Impact

  • Able to create hierarchical scopes, easily spawn async & blocking tasks within those scopes, and propagate cancellation.
  • Able to select any runtime to back the API

Flexible async iteration

Impact

  • Able to create and compose iterators that await async results
  • Able to create complex parallel or concurrent schedules that work reliably

Async iteration

Impact

  • Able to write code that takes "something iterable"
  • Able to use combinators similar to synchronous Iterator
  • Able to construct complex, parallel schedules that can refer to borrowed data

Requires

Design notes

The async iterator trait can leverage inline async functions:


#[repr(inline_async)]
trait AsyncIterator {
    type Item;

    // Returns `None` once the iterator is exhausted.
    async fn next(&mut self) -> Option<Self::Item>;
}

Note the name change from Stream to AsyncIterator.

One implication of this change is that pinning is no longer necessary when driving an async iterator. For example, one could now write an async iterator that recursively walks through a set of URLs like so (presuming std::async_iter::from_fn and async closures):


fn explore(start_url: Url) -> impl AsyncIterator<Item = Url> {
    let mut urls = vec![start_url];
    std::async_iter::from_fn(async move || {
        if let Some(url) = urls.pop() {
            // Borrow `url` for the fetch so it can still be returned below.
            let successor_urls = fetch_successor_urls(&url).await;
            urls.extend(successor_urls);
            Some(url)
        } else {
            None
        }
    })
}
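The same idea can be approximated on stable Rust today, using `async fn` in traits (stable since Rust 1.75). The following is a sketch under several assumptions: the `AsyncIterator` trait here is a local definition rather than a standard library item, URLs are plain string slices, `fetch_successor_urls` is a hypothetical in-memory lookup standing in for a network fetch, and `block_on` is a minimal busy-polling helper, not a real executor. Like the example above, it keeps no "visited" set, so the link graph must be acyclic.

```rust
use std::collections::HashMap;
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Local sketch of the trait; not a standard library item.
trait AsyncIterator {
    type Item;
    async fn next(&mut self) -> Option<Self::Item>;
}

struct Explore {
    links: HashMap<&'static str, Vec<&'static str>>,
    urls: Vec<&'static str>,
}

impl Explore {
    // Hypothetical stand-in for a network fetch.
    async fn fetch_successor_urls(&self, url: &str) -> Vec<&'static str> {
        self.links.get(url).cloned().unwrap_or_default()
    }
}

impl AsyncIterator for Explore {
    type Item = &'static str;

    async fn next(&mut self) -> Option<&'static str> {
        let url = self.urls.pop()?;
        let successors = self.fetch_successor_urls(url).await;
        self.urls.extend(successors);
        Some(url)
    }
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Minimal helper that polls in a loop; enough for futures that never park.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn crawl() -> Vec<&'static str> {
    let links = HashMap::from([("a", vec!["b", "c"]), ("b", vec!["d"])]);
    let mut iter = Explore { links, urls: vec!["a"] };
    block_on(async {
        let mut seen = Vec::new();
        // The caller never pins `iter`; it only calls `next` and awaits.
        while let Some(url) = iter.next().await {
            seen.push(url);
        }
        seen
    })
}
```

Because the pending URLs are kept on a stack, `crawl()` visits them depth-first: `a`, `c`, `b`, `d`.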

Parallel async iteration

We should have combinators like buffered that enable parallel async iteration, similar to the parallel iterators offered by rayon. The core operation here is for_each (which processes each item in the iterator):


trait ParAsyncIter {
    type Item;

    async fn for_each(&mut self, op: impl AsyncFn(Self::Item));
}

The buffered combinator would be implemented by creating an internal scope and spawning tasks into it as needed.

Generators

Impact

  • Able to write iterators (and async iterators) with ease, comparable to languages like Python or JavaScript
  • Able to extend the resulting iterators with "optimization" traits like ExactSizeIterator for maximum efficiency

Portable across runtimes, easy to switch

Impact

  • Able to grab libraries from crates.io and mix-and-match them with confidence, no matter what runtime or other libraries you are using
  • Able to easily author libraries that can be combined with other libraries and are independent of runtime
  • Able to easily change applications between runtimes to explore new possibilities
  • Able to easily author new runtimes that try out a new execution strategy or for some new environment and have them interoperate with most extant libraries, without the need to change those libraries
  • Able to find runtimes that fit a wide variety of scenarios and use patterns

Async read/write

Impact

  • Able to abstract over "something readable" and "something writeable"
  • Able to use these traits with dyn Trait
  • Able to easily write wrappers that "instrument" other readables/writeables
  • Able to author wrappers like SSL, where reading may require reading and writing on the underlying data stream

Design notes

Challenge: Permitting simultaneous reads/writes

The obvious version of the existing AsyncRead and AsyncWrite traits would be:


#[repr(inline_async)]
trait AsyncRead {
    async fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize>;
}

#[repr(inline_async)]
trait AsyncWrite {
    async fn write(&mut self, buf: &[u8]) -> std::io::Result<usize>;
}

This form doesn't permit one to simultaneously be reading and writing. Moreover, SSL requires changing modes, so that e.g. performing a read may require writing to the underlying socket, and vice versa. (Link?)

Note also that using std::io::Result would make the traits unusable in #![no_std] crates (this is also the case with the regular Read and Write traits), which might preclude embedded uses of these traits. These fundamental traits could all be added to alloc (but not core, because std::io::Error depends on Box).

Variant A: Readiness

One possibility is the design that Carl Lerche proposed, which separates "readiness" from the actual (non-async) methods to acquire the data:

pub struct Interest(...);
pub struct Ready(...);

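impl Interest {
    // Named to match the `Interest::READABLE | Interest::WRITABLE` usage below.
    pub const READABLE: Interest = ...;
    pub const WRITABLE: Interest = ...;
}

#[repr(inline_async)]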
pub trait AsyncIo {
    /// Wait for any of the requested input, returns the actual readiness.
    ///
    /// # Examples
    ///
    /// ```
    /// async fn main() -> Result<(), Box<dyn Error>> {
    ///     let stream = TcpStream::connect("127.0.0.1:8080").await?;
    ///
    ///     loop {
    ///         let ready = stream.ready(Interest::READABLE | Interest::WRITABLE).await?;
    ///
    ///         if ready.is_readable() {
    ///             let mut data = vec![0; 1024];
    ///             // Try to read data, this may still fail with `WouldBlock`
    ///             // if the readiness event is a false positive.
    ///             match stream.try_read(&mut data) {
    ///                 Ok(n) => {
    ///                     println!("read {} bytes", n);
    ///                 }
    ///                 Err(ref e) if e.kind() == io::ErrorKind::WouldBlock => {
    ///                     continue;
    ///                 }
    ///                 Err(e) => {
    ///                     return Err(e.into());
    ///                 }
    ///             }
    ///
    ///         }
    ///
    ///         if ready.is_writable() {
    ///             // Try to write data, this may still fail with `WouldBlock`
    ///             // if the readiness event is a false positive.
    ///             match stream.try_write(b"hello world") {
    ///                 Ok(n) => {
    ///                     println!("write {} bytes", n);
    ///                 }
    ///                 Err(ref e) if e.kind() == io::ErrorKind::WouldBlock => {
    ///                     continue
    ///                 }
    ///                 Err(e) => {
    ///                     return Err(e.into());
    ///                 }
    ///             }
    ///         }
    ///     }
    /// }
    /// ```
    async fn ready(&mut self, interest: Interest) -> io::Result<Ready>;
}

pub trait AsyncRead: AsyncIo {
    fn try_read(&mut self, buf: &mut ReadBuf<'_>) -> io::Result<()>;
}

pub trait AsyncWrite: AsyncIo {
    fn try_write(&mut self, buf: &[u8]) -> io::Result<usize>;
}

This allows users to:

  • Take T: AsyncRead, T: AsyncWrite, or T: AsyncRead + AsyncWrite

Note that it is always possible to ask whether writes are "ready", even for a read-only source; the answer will just be "no" (or perhaps an error).
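To make the readiness bookkeeping concrete, here is one possible shape for the elided `Interest` and `Ready` types. This is a sketch under assumptions, not the proposed definition: the bit-flag representation, the `readiness` helper, and its parameter names are all invented for illustration. It shows the point made above, that asking a read-only source about writes simply yields "not writable".

```rust
use std::ops::BitOr;

// Assumed representation: one bit per kind of interest.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Interest(u8);

impl Interest {
    pub const READABLE: Interest = Interest(0b01);
    pub const WRITABLE: Interest = Interest(0b10);
}

// Lets callers write `Interest::READABLE | Interest::WRITABLE`.
impl BitOr for Interest {
    type Output = Interest;
    fn bitor(self, rhs: Interest) -> Interest {
        Interest(self.0 | rhs.0)
    }
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Ready(u8);

impl Ready {
    pub fn is_readable(self) -> bool {
        (self.0 & Interest::READABLE.0) != 0
    }

    pub fn is_writable(self) -> bool {
        (self.0 & Interest::WRITABLE.0) != 0
    }
}

/// Hypothetical helper: what a source can do right now (`available`),
/// masked by what the caller asked about (`interest`).
pub fn readiness(available: Interest, interest: Interest) -> Ready {
    Ready(available.0 & interest.0)
}
```

For a read-only source, `readiness(Interest::READABLE, Interest::READABLE | Interest::WRITABLE)` reports readable but not writable.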

Can we convert all existing code to this form?

The try_read and try_write methods are basically identical to the existing "poll" methods. So the real question is what it takes to implement the ready async function. Note that tokio internally already adopts a model very similar to this on many types (though there is no trait for it).

It seems like the torture case to validate this is openssl.

Variant B: Some form of split

Another alternative is to have read/write traits and a way to "split" a single object into separate read/write traits:


#[repr(inline_async)]
trait AsyncRead {
    async fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize>;
}

#[repr(inline_async)]
trait AsyncWrite {
    async fn write(&mut self, buf: &[u8]) -> std::io::Result<usize>;
}

#[repr(inline_async)]
trait AsyncBidirectional: AsyncRead + AsyncWrite {
    async fn split(&mut self) -> (impl AsyncRead + '_, impl AsyncWrite + '_);
}

The challenge here is to figure out exactly how that definition should look. The version I gave above includes the possibility that the resulting readers/writers have access to the fields of self.

Variant C: Extend traits to permit expressing that functions can both execute

Ranging further out into unknowns, it is possible to imagine extending traits with a way to declare that two &mut self methods could both be invoked concurrently. This would be generally useful but would be a fundamental extension to the trait system for which we don't really have any existing design. There is a further complication that the read and write methods are in distinct traits (AsyncRead and AsyncWrite, respectively) and hence cannot


#[repr(inline_async)]
trait AsyncRead {
    async fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize>;
    async fn write(&mut self, buf: &[u8]) -> std::io::Result<usize>;
}

#[repr(inline_async)]
trait AsyncWrite {
}

#[repr(inline_async)]
trait AsyncBidirectional: AsyncRead + AsyncWrite {
    async fn split(&mut self) -> (impl AsyncRead + '_, impl AsyncWrite + '_);
}

Variant D: Implement the AsyncRead and AsyncWrite traits for &T

In std, there are Read and Write impls for &File, and the async-std runtime has followed suit. This means that you can express "can do both AsyncRead + AsyncWrite" as AsyncRead + AsyncWrite + Copy, more or less, or other similar tricks. However, it's not possible to do this for every type. Worth exploring.
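The synchronous precedent is easy to demonstrate. The sketch below relies only on the standard library impls of `Read`, `Write`, and `Seek` for `&File`; the function name `write_then_read` and its parameters are invented for this example. Two shared references to one `File` act as independent reader and writer handles without any `&mut File`.

```rust
use std::fs::OpenOptions;
use std::io::{self, Read, Seek, SeekFrom, Write};
use std::path::Path;

fn write_then_read(path: &Path, payload: &[u8]) -> io::Result<Vec<u8>> {
    let file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .truncate(true)
        .open(path)?;

    // Two handles backed by shared references to the same file.
    let mut writer = &file;
    let mut reader = &file;

    writer.write_all(payload)?;
    // Both handles share one cursor, so rewind before reading back.
    writer.seek(SeekFrom::Start(0))?;

    let mut out = Vec::new();
    reader.read_to_end(&mut out)?;
    Ok(out)
}
```

This works for `File` because the operating system synchronizes access to the underlying descriptor; a type that keeps its own buffer in ordinary fields has no such option, which is why the trick does not generalize.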

Async timer

Impact

  • Able to write libraries or applications that use a trait to create a timer without referring to a particular runtime
  • Able to use the trait in a dyn-safe fashion

Async spawn, spawn-blocking

Impact

  • Able to write libraries or applications that use a trait to spawn async or blocking tasks without referring to a particular runtime
  • Able to use the trait in a dyn-safe fashion

Runtime

Impact

  • Able to write simple, non-generic async Rust code that performs common operations like opening TCP sockets, sending UDP packets, accessing files, sleeping, and spawning tasks, but which is not specific to a particular runtime.
  • Able to retarget code that relies on these APIs across different runtimes with no effort.

Design notes

When writing sync code, it is possible to simply access I/O and other facilities without needing to thread generics around:


fn load_socket_addr() -> Result<SocketAddr, Box<dyn Error>> {
    Ok(std::fs::read_to_string("address.txt")?.parse()?)
}

This code will work no matter what operating system you run it on.

Similarly, if you don't mind hard-coding your runtime, you can use tokio or async_std in a similar fashion:


// Pick one:
//
// use tokio as my_runtime;
// use async_std as my_runtime;

async fn load_socket_addr() -> Result<SocketAddr, Box<dyn Error>> {
    Ok(my_runtime::fs::read_to_string("address.txt").await?.parse()?)
}

Given suitable traits in the stdlib, it would be possible to write generic code that feels similar:


async fn load_socket_addr<F: AsyncFs>() -> Result<SocketAddr, Box<dyn Error>> {
    Ok(F::read_to_string("address.txt").await?.parse()?)
}
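To make the generic variant concrete, here is a sketch that compiles on stable Rust (1.75 or later, for `async fn` in traits). Everything beyond `load_socket_addr` is an assumption for illustration: the `AsyncFs` trait shape, the in-memory `FakeFs` implementation, and the busy-polling `block_on` helper are not existing stdlib or runtime APIs.

```rust
use std::error::Error;
use std::future::Future;
use std::io;
use std::net::SocketAddr;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical trait that a runtime would implement.
trait AsyncFs {
    async fn read_to_string(path: &str) -> io::Result<String>;
}

// In-memory stand-in for a runtime's file system.
struct FakeFs;

impl AsyncFs for FakeFs {
    async fn read_to_string(path: &str) -> io::Result<String> {
        match path {
            "address.txt" => Ok("127.0.0.1:8080".to_string()),
            _ => Err(io::Error::new(io::ErrorKind::NotFound, "no such file")),
        }
    }
}

async fn load_socket_addr<F: AsyncFs>() -> Result<SocketAddr, Box<dyn Error>> {
    Ok(F::read_to_string("address.txt").await?.parse()?)
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Minimal helper that polls in a loop; enough for futures that never park.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

// Every caller has to name the file system type explicitly.
fn load_with_fake_fs() -> SocketAddr {
    block_on(load_socket_addr::<FakeFs>()).expect("address.txt should parse")
}
```

Note how the `::<FakeFs>` annotation leaks into every call site, and into every caller of those callers unless they become generic as well.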

Alternatively, that might be done with dyn trait:


async fn load_socket_addr(fs: &dyn AsyncFs) -> Result<SocketAddr, Box<dyn Error>> {
    Ok(fs.read_to_string("address.txt").await?.parse()?)
}

Either approach is significantly more annoying, both as the author of the library and for folks who invoke your library.

Preferred experience

The ideal would be that you can write an async function that is "as easy" to use as a non-async one, and have it be portable across runtimes:


async fn load_socket_addr() -> Result<SocketAddr, Box<dyn Error>> {
    Ok(std::async_fs::read_to_string("address.txt").await?.parse()?)
}

But how to achieve it?

The basic idea is to extract out a "core API" of things that a runtime must provide and to make those functions available as part of the Context that Async values are invoked with. To avoid the need for generics and monomorphization, this would have to be based purely on dyn values. This interface ought to be compatible with no-std runtimes as well, which imposes some challenges.

Frequently asked questions

What about async overloading?

Good question! The async overloading feature may be another, better route to this same goal. At minimum it implies that std::async_fs etc might not be the right names (although those modules could be deprecated and merged going forward).

It definitely suggests that the names and signatures of all functions, methods, and types should be kept very strictly analogous. In particular, sync APIs should be a subset of async APIs.

What about cap-std?

It's interesting to observe that the dyn approach is feeling very close to cap-std. That might be worth taking into consideration. Some targets, like wasm, may well prefer if we took a more "capability oriented" approach.

What about spawning and scopes?

Given that spawning should occur through scopes, it may be that we don't need a std::async_thread::spawn API so much as standards for scopes.

What about evolving the API?

We will want to be able to start with a small API and grow it. How is that possible, given that the implementation of the API lives in external runtimes?

What methods are needed?

We need to cover the things that exist in the sync stdlib:

  • spawn, spawn-blocking
  • timers (sleep)
  • TCP streams, UDP sockets
  • file I/O
  • channels and other primitives
    • mutexes?

Polish

Impact

  • Users can predict and understand why the compiler raises error messages. Errors are aligned with an experienced user's intuition about how Rust works.
  • Error messages identify common misconceptions, suggest solutions, and are generally on par with sync Rust.
    • Errors not only show that there is a problem, they help the user to fix it and to learn more about Rust (possibly directing the user to other documentation).
    • The compiler may suggest crates from the ecosystem to help solve problems when appropriate.
  • Lints guide the user away from common errors and help them both to get started with async Rust and to maintain async Rust programs over time.
  • Rust's async implementation is high quality and reflects an attention to detail.
    • No internal compiler errors
    • Compiler analysis and code generation passes are precise and not unnecessarily conservative.
    • Integration with low-level tooling and the like is high-quality.
    • The generated code from the compiler is high quality and performant.

๐Ÿ› ๏ธ How to Help

The goal of a highly polished async experience in Rust has many details and touches many aspects of the project, including both the async area in particular and the Rust project in general. This means there are lots of ways to get involved!

The weekly triage meeting primarily focuses on polish issues, so that is a great place to get to know people already working on the project and find out what people are actively working on. We meet over Zulip, so feel free to just lurk, or chime in if you want to. See the triage meeting page for details about when the meeting happens and how to join.

Even outside of regularly scheduled meetings, you are welcome to hang out in the Async Working Group's Zulip stream. There are usually a few people active there who are happy to discuss async-related topics.

If you are looking for a specific area to help, there are several places where we track work.

  • The Initiatives list down below.
  • The Async Work Group Project Board. The "On Deck" column is a good place to start looking.
  • Issues on the wg-async-foundations repo. These tend to relate to project organization and longer term objectives.
  • Issues on the Rust repo. Specifically, issues tagged AsyncAwait-Polish, A-async-await. Issues that are also tagged with E-mentor will have mentoring instructions, which are usually pointers to specific points in the code where changes will be needed to fix the issue.

Finally, a great way to contribute is to point out any rough edges you come across with writing async Rust. This can be done either through issues on the Rust repo, or by starting a topic on our Zulip stream. Examples of rough edges that we are interested in include confusing error messages or places where Rust behaved in a way you found surprising or counter-intuitive. Knowing about these issues helps to ensure we are fixing the right things.

Initiatives

| Initiative | State | Key participants |
| --- | --- | --- |
| Error messages | 💤 | |
| Lint: Must not suspend | 🦀 | Gus Wynn |
| Lint: Blocking in async context | 💤 | |
| Lint: Large copies, large generators | 💤 | |
| Cleaner async stacktraces | 💤 | |
| Precise generator captures | 🦀 | eholk |
| Sync and async behave the same | 💤 | |

Lint must not suspend

Impact

  • Warnings when values which ought not to be live over an await are, well, live over an await.
    • Example: lock guards.
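The lock-guard case can be sketched as follows. Both functions compile on stable Rust; the first shows the pattern such a lint is meant to flag, the second the usual fix of releasing the guard before the await point. The names `tick`, `guard_across_await`, `guard_released_first`, and `run_both`, as well as the busy-polling `block_on` helper, are assumptions made for this example.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical await point standing in for real async work.
async fn tick() {}

// Pattern the lint targets: `guard` is still live across the `.await`.
async fn guard_across_await(counter: &Mutex<u32>) -> u32 {
    let mut guard = counter.lock().unwrap();
    *guard += 1;
    tick().await;
    *guard
}

// Fix: confine the guard to a block that ends before the `.await`.
async fn guard_released_first(counter: &Mutex<u32>) -> u32 {
    let value = {
        let mut guard = counter.lock().unwrap();
        *guard += 1;
        *guard
    }; // `guard` is dropped here
    tick().await;
    value
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Minimal helper that polls in a loop; enough for futures that never park.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn run_both() -> (u32, u32) {
    let counter = Mutex::new(0);
    let first = block_on(guard_across_await(&counter));
    let second = block_on(guard_released_first(&counter));
    (first, second)
}
```

Both variants produce the same values here; the difference only matters once another task contends for the lock while the first is suspended.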

Milestones

| Milestone | Status | Key Participants |
| --- | --- | --- |
| Implemented the RFC | 🦀 | Gus Wynn |

Lint blocking fns

Impact

  • Identify calls to blocking functions from within async functions and guide the user to an async replacement.

Milestones

| Milestone | Status | Key Participants |
| --- | --- | --- |
| RFC proposed and accepted | 💤 | |
| Implemented | 💤 | |

Lint large copies

Impact

  • Identify when large types are being copied and issue a warning. This is particularly useful for large futures, but applies to other Rust types as well.

Milestones

| Milestone | Status | Key Participants |
| --- | --- | --- |
| Lang team initiative proposal | 💤 | |
| Implemented | 💤 | |

Design notes

This is already implemented in experimental form. We would also need easy and effective ways to reduce the size of a future, though, such as deliv_boxable.
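A small measurement illustrates why large futures matter and how boxing helps. This is a sketch with invented names (`tick`, `large_future`, `future_sizes`); it assumes only that a local which is live across an await point must be stored inside the future.

```rust
use std::mem::size_of_val;

// Hypothetical await point standing in for real async work.
async fn tick() {}

// `buf` is used after the await, so it has to live inside the future.
async fn large_future(seed: u8) -> u8 {
    let buf = [seed; 4096];
    tick().await;
    buf[4095]
}

/// Returns (size of the future itself, size of the boxed handle).
fn future_sizes() -> (usize, usize) {
    let inline = large_future(7);
    let inline_size = size_of_val(&inline);

    // After boxing, moving the handle only copies a pointer.
    let boxed = Box::pin(inline);
    (inline_size, size_of_val(&boxed))
}
```

The first number is at least 4096 bytes, and every move of the unboxed future copies all of them; the boxed handle is pointer-sized.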

Error messages for most confusing scenarios

Impact

  • Errors not only show that there is a problem, they help the user to fix it and to learn more about Rust (possibly directing the user to other documentation).

Design notes

Of course there are an infinite number of improvements one could make. The point of this deliverable is to target the most common situations and confusions people see in practice. The final list is still being enumerated:

Stacktraces

Impact

  • Async stacktraces contain only the information that people need to figure out what has happened, and are free of extraneous or runtime-internal details
  • Users are able to recover the full, unabridged stacktrace if needed

Precise Generator Captures

Impact

  • Users can predict and understand why the compiler raises error messages. Errors are aligned with an experienced user's intuition about how Rust works.
  • Compiler analysis and code generation passes are precise and not unnecessarily conservative.

Milestones

| Milestone | Status | Key Participants |
| --- | --- | --- |
| Prototyped | 🦀 | eholk |
| Documented in Rust Reference | 🦀 | eholk |
| Lang team initiative proposal | 💤 | eholk |
| Lang team signoff | 💤 | Lang team |
| Stabilized | 💤 | eholk |

Sync and async behave the same

Impact

Async code should not be surprising. In general, if you surround a block of synchronous code with async or mark a sync fn as async, nothing unexpected should happen.

  • The code should evaluate to the same value after awaiting.
  • Any compilation errors should be essentially the same, modulo details around implicit futures in the return type.
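As a baseline, the property can be checked for a simple function by writing it twice, once with `fn` and once with `async fn`, and comparing results. The function names and the busy-polling `block_on` helper below are assumptions made for this sketch.

```rust
use std::future::Future;
use std::num::ParseIntError;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

fn parse_sum(input: &str) -> Result<i32, ParseIntError> {
    let mut total = 0;
    for part in input.split(',') {
        total += part.trim().parse::<i32>()?;
    }
    Ok(total)
}

// Identical body; only the `async` keyword was added.
async fn parse_sum_async(input: &str) -> Result<i32, ParseIntError> {
    let mut total = 0;
    for part in input.split(',') {
        total += part.trim().parse::<i32>()?;
    }
    Ok(total)
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Minimal helper that polls in a loop; enough for futures that never park.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

// True when the sync and async versions agree, for success and error alike.
fn same_result(input: &str) -> bool {
    parse_sum(input) == block_on(parse_sum_async(input))
}
```

A test suite for this initiative would presumably apply the same comparison to much trickier bodies, for example ones involving temporaries, drop order, or borrowed locals.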

Milestones

MilestoneStatusKey Participants
Define "behave the same"๐Ÿ’ค
Create testing to ensure same behavior๐Ÿ’ค

Tooling

Impact

  • Tooling that gives insight into the state of async runtimes
    • How many tasks are running and what is their state
    • What are tasks blocked on and why?
    • Where is memory allocated?
    • Where is CPU time spent?
    • Flamegraph of where things spend their time
    • Perf-style profile of where things spend their time
  • Tooling should also allow you to limit profiles to a particular request or to requests that meet particular criteria (e.g., coming from a particular source)
  • Tooling should detect common hazards and identify them, suggesting fixes
    • Tasks that clone a Waker but don't trigger it
    • Tasks that don't respond to a cancellation request for a long time
    • Outlier tasks that sleep for a very long time without being awoken
  • Tooling should permit "always on" profiling that can be used in production
  • Tooling can provide profile-based feedback:
    • Where to "heap-allocate" futures
    • Poll functions that execute for a long time without yielding
    • Imbalanced workloads across cores
  • Tooling can be either customized or integrated into existing tools like perf, gdb, lldb, etc, as appropriate

Crashdump

  • Able to get information about the state of the runtime and async tasks from crashdumps.

Testing

Impact

  • Async applications need the ability to write tests that let them simulate and mock the outside world
  • Ability to test edge cases:
    • Long latencies
    • Dropped connections
    • Funky schedules

Design notes

At the moment, this is an "experimentation" area, but it represents a common need without well-established, widely used solutions.

Documentation

Impact

  • Quality, easily findable documentation to help folks get started with async Rust

Requires

Async book

Impact

  • Centralized documentation explaining how Async Rust works
  • Docs explain how to get started, identify common patterns, and cover concepts that are common to all or most runtimes

Threadsafe portability

Impact

  • Able to write code that can be easily made Send or not Send
    • The resulting code is able to switch between helper types, like Rc and Arc, appropriately.
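One way to approximate this today is to abstract over the pointer type. In the sketch below, `SharedPtr`, `wrap`, `sum_shared`, `assert_send`, and the busy-polling `block_on` helper are all hypothetical names introduced for illustration. The same async function yields a `Send` future when instantiated with `Arc` and a non-`Send` future when instantiated with `Rc`.

```rust
use std::future::Future;
use std::ops::Deref;
use std::pin::pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical abstraction over reference-counted pointers.
trait SharedPtr<T>: Clone + Deref<Target = T> {
    fn wrap(value: T) -> Self;
}

impl<T> SharedPtr<T> for Rc<T> {
    fn wrap(value: T) -> Self {
        Rc::new(value)
    }
}

impl<T> SharedPtr<T> for Arc<T> {
    fn wrap(value: T) -> Self {
        Arc::new(value)
    }
}

// Hypothetical await point standing in for real async work.
async fn tick() {}

// Holds shared pointers across an await; `Send`-ness follows from `P`.
async fn sum_shared<P: SharedPtr<Vec<u32>>>(values: Vec<u32>) -> u32 {
    let shared = P::wrap(values);
    let alias = shared.clone();
    tick().await;
    shared.iter().sum::<u32>() + alias.len() as u32
}

// Compiles only when `T` is `Send`.
fn assert_send<T: Send>(value: T) -> T {
    value
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Minimal helper that polls in a loop; enough for futures that never park.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn run_with_both_pointers() -> (u32, u32) {
    // The `Arc` instantiation is `Send`; wrapping the `Rc` one in
    // `assert_send` would fail to compile.
    let arc_future = assert_send(sum_shared::<Arc<Vec<u32>>>(vec![1, 2, 3]));
    let rc_future = sum_shared::<Rc<Vec<u32>>>(vec![1, 2, 3]);
    (block_on(arc_future), block_on(rc_future))
}
```

The cost of this workaround is an extra type parameter on every function involved, which is the kind of friction this deliverable aims to remove.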

Async overloading

Impact

  • By default, function definitions can be compiled into either sync or async mode
  • Able to overload a function with two variants, one for sync and one for async

Design notes

This is a highly speculative deliverable. However, it would be great if one were able to write code that is neither sync nor async, but potentially either. Further, one should be able to provide specialized variants that perform the same task but in slightly different ways; this would be particularly useful for primitives like TCP streams.

Monomorphize

The way to think of this is that every function has an implicit generic parameter indicating its scheduler mode. When one writes fn foo(), that is like creating a generic impl:


impl<SM> Fn<(), SM> for Foo
where
    SM: SchedulerMode,
{
    ...
}

When one writes async fn or sync fn, those are like providing specific impls:


impl Fn<(), AsyncSchedulerMode> for Foo {
    ...
}

impl Fn<(), SyncSchedulerMode> for Foo {
    ...
}

Further, by default, when you call a function, you invoke it in the same scheduler mode as the caller.

Implications for elsewhere

  • If we had this feature, then having distinct modules like use std::io and use std::async_io would not be necessary.
  • Further, we would want to design our traits and so forth to have a "common subset" of functions that differ only in the presence or absence of the keyword async.

Major unresolved questions or controversies

This section contains places where there remains significant design work to be done. It also contains some points of major controversy, where the path is clear, but many people disagree on whether to take it. These are places where further input can be useful.

The page for each controversy attempts to summarize the various options available and some of the tradeoffs involved.

Default runtime?

The User's Manual of the future suggests that one must still pick a runtime upfront and use a decorator like #[runtime::main]. This is "accidental complexity" for people learning async Rust: the choice of runtime is something they are not yet equipped to make. It would be better for users if they could just write async fn main and not choose a runtime yet (and then, later, once they are equipped to make the choice, opt for other runtimes).

However, we also wish to avoid shipping and maintaining a runtime in the Rust stdlib. We want runtimes to live in the ecosystem and evolve over time. If we were to pick a "default runtime", that might favor one runtime at the expense of others.

Should we pick a default runtime? If so, what criteria do we use to pick one, and how do we manage the technical side of things (e.g., we need to either ship the runtime with rustup or else insert some kind of implicit cargo dependency).

How to represent the AsyncFn traits?

As noted in the async fn page, the "inline async fn" technique cannot represent async closures.

How best to integrate voluntary cancellation?

Extend stdlib to permit portable async without generics?

To await or not to await?

Should we require you to use .await? After the epic syntax debates we had, wouldn't it be ironic if we got rid of it altogether, as carllerche has proposed?

Basic idea:

  • When you invoke an async function, it could await by default.
  • You would write async foo() to create an "async expression" -- i.e., to get an impl Async.
    • You might instead write async || foo(), i.e., create an async closure.

Appealing characteristics:

  • More analogous to sync code. In sync code, if you want to defer immediately executing something, you make a closure. Same in async code, but it's an async closure.
  • Consistency around async-drop. If we adopt an async drop proposal, that implies that there will be "awaits" that occur as you exit a block (or perhaps from the control-flow of a break or ?). These will not be signaled with a .await. So you can no longer rely on every await point being visible with a keyword.
  • No confusion around remembering to await. Right now the compiler has to go to some lengths to offer you messages suggesting you insert .await. It'd be nice if you just didn't have to remember.
  • Room for optimization. When you first invoke an async function, it can immediately start executing; it only needs to create a future in the event that it suspends. This may also make closures somewhat smaller.
    • This could be partially achieved by adding an optional method on the trait that compiles a version of the fn meant to be used when it is immediately awaited.

But there are some downsides:

  • Churn. Introducing a new future trait is largely invisible to users except in that it manifests as version mismatches. Removing the await keyword is a much more visible change.
  • Await points are less visible. There may be opportunity to introduce concurrency and so forth that is harder to spot when reading the code, particularly outside of an IDE. (In Kotlin, which adopts this model, suspend points are visible in the "gutter" of the editor, but this is not visible when reviewing patches on github.)
    • Await points today also indicate where a live Send or Sync value affects whether the future is Send or Sync (but with async-drop, this would no longer be true).
  • Async becomes an effect. In today's Rust, an "async function" desugars into a traditional function that returns a future. This function is called like any other, and hence it can implement the Fn traits and so forth. In this "await-less" Rust, an async function is called differently from other functions, because it induces an await. This means that we need to consider async as a kind of "effect" (like unsafe) in a way that it is not today.
    • Similarly, how do we handle the case of fn foo() -> impl Future? Does that auto-await, or does it require an explicit await keyword?
    • What happens when you invoke an async fn in a sync environment?

Frequently asked questions

How could you do this anyway? Wouldn't it be a massive breaking change?

It would have to take place over an edition.

๐Ÿ’ Appendix: Submitted stories

This appendix contains the full list of status quo and shiny future stories that were submitted by users as part of the vision doc construction. The lessons and ideas from these stories have been incorporated into the current roadmap.

😱 Status quo stories

🚧 Under construction! Help needed! 🚧

We are still in the process of drafting the vision document. The stories you see on this page are examples meant to give a feeling for how a status quo story looks; you can expect them to change. See the "How to vision" page for instructions and details.

What is this

The "status quo" stories document the experience of using Async Rust today. Each story narrates the challenges encountered by one of our characters as they try (and typically fail in dramatic fashion) to achieve their goals.

Writing the "status quo" stories helps us to compensate for the curse of knowledge: the folks working on Async Rust tend to be experts in Async Rust. We've gotten used to the workarounds required to be productive, and we know the little tips and tricks that can get you out of a jam. The stories help us gauge the cumulative impact all the paper cuts can have on someone still learning their way around. This gives us the data we need to prioritize.

Based on a true story

These stories may not be true, but they are not fiction. They are based on real-life experiences of actual people. Each story contains a "Frequently Asked Questions" section referencing sources used to create the story. In some cases, it may link to notes or summaries in the conversations section, though that is not required. The "Frequently Asked Questions" section also contains a summary of what the "morals" of the story are (i.e., what are the key takeaways), along with answers to questions that people have raised along the way.

The stories provide data we use to prioritize, not a prioritization itself

Just because a user story is represented here doesn't mean we're going to be able to fix it right now. Some of these user stories will indicate more severe problems than others. As we consider the stories, we'll select some subset to try and address; that choice is reflected in the roadmap.

Metanarrative

What follows is a kind of "metanarrative" of using async Rust that summarizes the challenges that are present today. At each point, we link to the various stories; you can read the full set in the table of contents on the left. We would like to extend this to also cover some of its glories, since reading the current stories is a litany of difficulties, but obviously we see great promise in async Rust. Note that many stories here appear more than once.

Rust strives to be a language that brings together performance, productivity, and correctness. Rust programs are designed to surface bugs early and to make common patterns both ergonomic and efficient, leading to a sense that "if it compiles, it generally works, and works efficiently". Async Rust aims to extend that same feeling to an async setting, in which a single process interweaves numerous tasks that execute concurrently. Sometimes this works beautifully. However, other times, the reality falls short of that goal.

Making hard choices from a complex ecosystem from the start

The problems begin from the very first moment a user starts to try out async Rust. The async Rust support in Rust itself is very basic, consisting only of the core Future mechanism. Everything else -- including the basic async runtimes themselves -- lives in user space. This means that users must make a number of choices from the very beginning:

Once your basic setup is done, the best design patterns are subtle and not always known.

Writing async programs turns out to have all kinds of subtle tradeoffs. Rust aims to be a language that gives its users control, but that also means that users wind up having to make a lot of choices, and we don't give them much guidance.

Even once you've chosen a pattern, getting things to compile can be a challenge.
Once you get it to compile, things don't "just work" at runtime, or they may be unexpectedly slow.
When you have those problems, you can't readily debug them or get visibility into what is going on.
Rust has always aimed to interoperate well with other languages and to fit itself into every niche, but that's harder with async.

😱 Status quo stories: Template

This is a template for adding new "status quo" stories. To propose a new status quo PR, do the following:

  • Create a new file in the status_quo directory named something like Alan_tries_to_foo.md or Grace_does_bar.md, and start from the raw source from this template. You can replace all the italicized stuff. :)
  • Do not add a link to your story to the SUMMARY.md file; we'll do it after merging, otherwise there will be too many conflicts.

For more detailed instructions, see the How To Vision: Status Quo page!

If you're looking for ideas of what to write about, take a look at the open issues. You can also open an issue of your own to throw out an idea for others.

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect people's experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

Write your story here! Feel free to add subsections, citations, links, code examples, whatever you think is best.

🤔 Frequently Asked Questions

Here are some standard FAQ to get you started. Feel free to add more!

What are the morals of the story?

Talk about the major takeaways-- what do you see as the biggest problems.

What are the sources for this story?

Talk about what the story is based on, ideally with links to blog posts, tweets, or other evidence.

Why did you choose NAME to tell this story?

Talk about the character you used for the story and why.

How would this story have played out differently for the other characters?

In some cases, there are problems that only occur for people from specific backgrounds, or which play out differently. This question can be used to highlight that.

😱 Status quo stories: Alan builds a task scheduler

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect people's experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

A core component of DistriData, called TaskScheduler, is in charge of (1) receiving client requests via an HTTP server, (2) serializing them in a task queue, (3) relaying each task to the state machine applier (e.g., apply change to the storage backend), and (4) returning the result back to the client.

TaskScheduler was originally implemented in Go. New to Rust, Alan believes Rust could provide the same quality of service but with less memory. He then decides to reimplement TaskScheduler in Rust, without knowing the challenges ahead.

Alan only read the first few chapters of the Rust book to understand core concepts like the ownership model and syntax. Already proficient in Go, Alan jumped into coding by working through a hands-on project. Alan often referred to the examples found in each Rust crate but may lack a deep understanding of how Rust works. Alan first focused on translating the Go code to Rust and, as a result, the first iteration may be filled with non-idiomatic Rust code.

Implementing request ID generator

Alan first transliterates the request ID generator code, originally written in Go:

import (
	"math"
	"sync/atomic"
)

type Generator interface {
	next() uint64
}

type generator struct {
	prefix uint64
	suffix uint64
}

func (gen *generator) next() uint64 {
	suffix := atomic.SwapUint64(&gen.suffix, gen.suffix+1)
	id := gen.prefix | (suffix & (math.MaxUint64 >> 16))
	return id
}

Alan learns that a Rust trait is the closest concept to a Go interface, but is now torn between std::sync::atomic and crossbeam::atomic::AtomicCell. Having read multiple articles about how great crossbeam is and about its thread-safety promises, Alan chooses crossbeam (see "crates better than std (from Reddit)"):


#![allow(unused)]
fn main() {
use crossbeam::atomic::AtomicCell;

pub struct Generator {
    prefix: u64,
    suffix: AtomicCell<u64>,
}

impl Generator {
    pub fn new(...) -> Self {
        ...
    }

    pub fn next(&self) -> u64 {
        let suffix = self.suffix.fetch_add(1);
        let id = self.prefix | (suffix & (u64::MAX >> 16));
        id
    }
}
}

Accustomed to an opinionated way of doing concurrency in Go, Alan loses confidence in Rust async support, as he sees fragmented but specialized solutions in the Rust async ecosystem.
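For comparison, the same generator can be sketched with only the standard library. This is a minimal sketch, not code from the story: the name StdGenerator and its constructor parameters are hypothetical, and Ordering::Relaxed assumes the counter only needs to hand out unique values and is not used to synchronize other memory.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical std-only counterpart of the crossbeam-based `Generator`.
pub struct StdGenerator {
    prefix: u64,
    suffix: AtomicU64,
}

impl StdGenerator {
    /// `prefix` is expected to occupy the upper 16 bits of the ID.
    pub fn new(prefix: u64, first_suffix: u64) -> Self {
        Self {
            prefix,
            suffix: AtomicU64::new(first_suffix),
        }
    }

    /// Returns the next ID. `fetch_add` atomically increments the counter
    /// and returns the value it held before the increment.
    pub fn next(&self) -> u64 {
        let suffix = self.suffix.fetch_add(1, Ordering::Relaxed);
        // Keep only the lower 48 bits of the counter, then add the prefix.
        self.prefix | (suffix & (u64::MAX >> 16))
    }
}
```

Because next takes &self, a single StdGenerator can be shared between threads (for example behind an Arc) without any additional lock.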

Implementing event notifier

Alan then implements the notifier to propagate the request and apply the progress with the scheduler and low-level state machine. In Go, it can be simply implemented as below:

type Notifier interface {
	register(id uint64) (<-chan string, error)
	trigger(id uint64, x string) error
}

type notifier struct {
	mu       sync.RWMutex
	requests map[uint64]chan string
}

func (ntf *notifier) register(id uint64) (<-chan string, error) {
	ntf.mu.Lock()
	defer ntf.mu.Unlock()
	ch := ntf.requests[id]
	if ch != nil {
		return nil, fmt.Errorf("dup id %x", id)
	}

	ch = make(chan string, 1)
	ntf.requests[id] = ch
	return ch, nil
}

func (ntf *notifier) trigger(id uint64, x string) error {
	ntf.mu.Lock()
	ch, ok := ntf.requests[id]
	if ch == nil || !ok {
		ntf.mu.Unlock()
		return fmt.Errorf("request ID %d not found", id)
	}
	delete(ntf.requests, id)
	ntf.mu.Unlock()
	ch <- x
	close(ch)
	return nil
}

Alan now needs the equivalent of Go's sync.RWMutex, and finds multiple options:

Already losing confidence in Rust std, Alan instead chooses parking_lot, as it claims up to 5x faster performance than std::sync::Mutex (see github). After numerous hours of trial and error, Alan discovers that parking_lot::RwLock is not intended for async/future environments (see github issue). Having to think about which library to use for thread and async programming, Alan appreciates the simplicity of Go concurrency, where threads are effectively abstracted away from its users. Alan is now using async_std::sync::RwLock, which seems nicely integrated with Rust async programming.

To send and receive events, Alan needs the equivalent of Go channel but is not sure about std::sync::mpsc::channel, as he sees two other options: Flume which claims to be much faster than std (see "Flume, a 100% safe MPSC that's faster than std (from Reddit)"), and crossbeam_channel. Having used crossbeam, Alan chose crossbeam channel:


#![allow(unused)]
fn main() {
use std::collections::HashMap;
use std::io;

use async_std::sync::RwLock;
use crossbeam_channel::{self, unbounded};

pub struct Notifier {
    requests: RwLock<HashMap<u64, crossbeam_channel::Sender<String>>>,
}

impl Notifier {
    pub fn new() -> Self {
        Self {
            requests: RwLock::new(HashMap::new()),
        }
    }

    pub fn register(&self, id: u64) -> io::Result<crossbeam_channel::Receiver<String>> {
        let mut _mu;
        match self.requests.try_write() {
            Some(guard) => _mu = guard,
            None => return Err(...),
        }

        let (request_tx, request_rx) = unbounded();
        if _mu.get(&id).is_none() {
            _mu.insert(id, request_tx);
        } else {
            return Err(...)
        }

        Ok(request_rx)
    }

    pub fn trigger(&self, id: u64, x: String) -> io::Result<()> {
        let mut _mu;
        match self.requests.try_write() {
            Some(guard) => _mu = guard,
            None => return Err(...),
        }

        let request_tx;
        match _mu.get(&id) {
            Some(ch) => request_tx = ch,
            None => return Err(...),
        }

        match request_tx.send(x) {
            // discard the removed sender so that both arms evaluate to ()
            Ok(_) => {
                _mu.remove(&id);
            }
            Err(e) => return Err(...),
        }

        Ok(())
    }
}
}

Alan is still not sure whether crossbeam_channel is safe for async programming or whether he should instead use another crate, async_std::channel. While crossbeam_channel seems to work, Alan is not confident about his choice. Disgruntled with the seemingly unnecessary divergence in the community, Alan wonders why all those cool improvements have not been merged back into Rust's core std libraries.
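To make the register/trigger protocol easier to follow in isolation, here is a minimal sketch that uses only the standard library. It is thread-based and blocking, so it says nothing about which crate is appropriate inside async tasks; the name StdNotifier and the chosen io::ErrorKind values are assumptions made for illustration.

```rust
use std::collections::HashMap;
use std::io;
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Mutex;

/// Hypothetical std-only notifier: one single-use channel per request ID.
pub struct StdNotifier {
    requests: Mutex<HashMap<u64, Sender<String>>>,
}

impl StdNotifier {
    pub fn new() -> Self {
        Self {
            requests: Mutex::new(HashMap::new()),
        }
    }

    /// Registers `id` and returns the receiving end for its result.
    pub fn register(&self, id: u64) -> io::Result<Receiver<String>> {
        let mut requests = self.requests.lock().unwrap();
        if requests.contains_key(&id) {
            return Err(io::Error::new(
                io::ErrorKind::AlreadyExists,
                format!("dup id {:x}", id),
            ));
        }
        let (tx, rx) = channel();
        requests.insert(id, tx);
        Ok(rx)
    }

    /// Removes the pending request and delivers `x` to whoever registered it.
    pub fn trigger(&self, id: u64, x: String) -> io::Result<()> {
        // The lock guard is a temporary that is released at the end of this
        // statement, so the send below happens without holding the lock.
        let tx = self.requests.lock().unwrap().remove(&id).ok_or_else(|| {
            io::Error::new(
                io::ErrorKind::NotFound,
                format!("request ID {} not found", id),
            )
        })?;
        tx.send(x)
            .map_err(|e| io::Error::new(io::ErrorKind::BrokenPipe, e.to_string()))
    }
}
```

As in the Go version, the entry is removed before the value is sent, so a second trigger for the same ID reports that the request was not found.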

Implementing task applier

Alan implements a task applier, which simply echoes the requested message, as in Go:

type EchoManager interface {
	apply(req *EchoRequest) (string, error)
}

type echoManager struct {
	mu sync.RWMutex
}

func (ea *echoManager) apply(req *EchoRequest) (string, error) {
	ea.mu.Lock()
	defer ea.mu.Unlock()
	switch req.Kind {
	case "create":
		return fmt.Sprintf("SUCCESS create %q", req.Message), nil
	case "delete":
		return fmt.Sprintf("SUCCESS delete %q", req.Message), nil
	default:
		return "", fmt.Errorf("unknown request %q", req)
	}
}

Having implemented event notifier above, Alan is now somewhat familiar with Rust mutex and writes the following Rust code:


#![allow(unused)]
fn main() {
// 1st version
use std::io;

use async_std::sync::RwLock;

pub struct Manager {
    mu: RwLock<()>,
}

impl Manager {
    pub fn new() -> Self {
        Self {
            mu: RwLock::new(()),
        }
    }

    pub fn apply(&self, req: &Request) -> io::Result<String> {
        let _mu;
        match self.mu.try_write() {
            Some(guard) => _mu = guard,
            None => return Err(...),
        }
        match req.kind.as_str() {
            "create" => Ok(format!(
                "SUCCESS create {}",
                to_string(req.message.to_owned())
            )),
            "delete" => Ok(format!(
                "SUCCESS delete {}",
                to_string(req.message.to_owned())
            )),
            _ => Err(...),
        }
    }
}
}

The code compiles and thus must be safe. However, after reviewing the code with Barbara, Alan learns that while std::sync::Mutex protects data from concurrent access, the std::sync::Mutex itself must also be shared safely between threads, and the code will not compile if he tries to use it from multiple threads. This is where std::sync::Arc comes in to provide safe multi-threaded access to the Mutex.

std::sync::Mutex documentation explains Arc in depth. If Alan had chosen std::sync::Mutex library, he would have known about Arc. Because Alan was initially given multiple alternatives for mutex, he overlooked the documentation in std::sync::Mutex and instead used async_std::sync::RwLock whose documentation did not explain Arc. As a result, Alan did not know how to properly use mutex in Rust.

Deeply confused, Alan made a quick fix to wrap Mutex with Arc:


// 2nd version
use async_std::sync::{Arc, RwLock};

pub struct Manager {
    mu: Arc<RwLock<()>>,
}

impl Manager {
    pub fn new() -> Self {
        Self {
            mu: Arc::new(RwLock::new(())),
        }
    }
    ...
}

This raises multiple questions for Alan:

  1. If Mutex itself has to be protected, why are Mutex and Arc not unified into a single type? Is the flexibility of having different types really worth the weaker safety guarantee?
  2. Rust claims unparalleled safety. Is that still true for async programming? Does the Rust compiler not complaining about the missing Arc mean that Mutex is still safe without Arc?
  3. What happens if the code goes into production without Arc? Would the code have race conditions?
  4. Does having Arc make code slower? Did I just introduce extra runtime cost?
  5. Which one is safe for async programming: std::sync::Arc or async_std::sync::Arc?
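A small std-only sketch may help separate the two roles behind these questions: Mutex provides exclusive access to the data, while Arc provides shared ownership so that several threads can each hold a handle to the same Mutex. The function name count_in_parallel is hypothetical, and std threads stand in for async tasks here.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Increments one shared counter from several threads and returns the total.
pub fn count_in_parallel(threads: usize, increments_per_thread: u64) -> u64 {
    // Arc: shared ownership of the lock. Mutex: exclusive access to the u64.
    let counter = Arc::new(Mutex::new(0u64));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            // Cloning the Arc copies a pointer and bumps a reference count;
            // the Mutex and the counter themselves are not copied.
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments_per_thread {
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}
```

If the Arc is left out and the closure borrows a Mutex owned by the calling function instead, thread::spawn rejects the code at compile time because the closure must be 'static. In this setting a missing Arc therefore shows up as a compile error rather than as a race condition at runtime.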

Implementing task scheduler

Alan then implements the task scheduler that calls event notifier and task applier above, as in Go:

type Request struct {
	echoRequest *EchoRequest
}

type Applier interface {
	start()
	stop() error
	apply(req Request) (string, error)
}

type applier struct {
	requestTimeout time.Duration

	requestIDGenerator Generator
	notifier           Notifier

	requestCh chan requestTuple

	stopCh chan struct{}
	doneCh chan struct{}

	echoManager EchoManager
}

type requestTuple struct {
	requestID uint64
	request   Request
}

func (ap *applier) start() {
	go func() {
		for {
			select {
			case tup := <-ap.requestCh:
				reqID := tup.requestID
				req := tup.request
				switch {
				case req.echoRequest != nil:
					rs, err := ap.echoManager.apply(req.echoRequest)
					if err != nil {
						rs = fmt.Sprintf("failed to apply %v", err)
					}
					if err = ap.notifier.trigger(reqID, rs); err != nil {
						fmt.Printf("failed to trigger %v", err)
					}
				default:
				}
			case <-ap.stopCh:
				ap.doneCh <- struct{}{}
				return
			}
		}
	}()
}

func (ap *applier) stop() error {
	select {
	case ap.stopCh <- struct{}{}:
	case <-time.After(5 * time.Second):
		return errors.New("took too long to signal stop")
	}
	select {
	case <-ap.doneCh:
	case <-time.After(5 * time.Second):
		return errors.New("took too long to receive done")
	}
	return nil
}

func (ap *applier) apply(req Request) (string, error) {
	reqID := ap.requestIDGenerator.next()
	respRx, err := ap.notifier.register(reqID)
	if err != nil {
		return "", err
	}

	select {
	case ap.requestCh <- requestTuple{requestID: reqID, request: req}:
	case <-time.After(ap.requestTimeout):
		if err = ap.notifier.trigger(reqID, fmt.Sprintf("failed to schedule %d in time", reqID)); err != nil {
			return "", err
		}
	}

	msg := ""
	select {
	case msg = <-respRx:
	case <-time.After(ap.requestTimeout):
		return "", errors.New("apply timeout")
	}

	return msg, nil
}

Not fully grokking the Rust ownership model in async, Alan implements the following code, but is faced with a bunch of compiler error messages:


use async_std::task;

pub struct Applier {
    notifier: notify::Notifier,
    ...
}

impl Applier {
    pub fn new(req_timeout: Duration) -> Self {
        ...
        Self {
            ...
            notifier: notify::Notifier::new(),
            ...
        }
    }
    ...

    pub async fn start(&self) -> io::Result<()> {
        task::spawn(apply_async(
            self.notifier,
            ...
        ));
        ...
        Ok(())
    }
    ...


pub async fn apply_async(
    notifier: notify::Notifier,
    ...
) -> io::Result<()> {
  ...
}

error[E0507]: cannot move out of `self.notifier` which is behind a shared reference
  --> src/apply.rs:72:13
   |
72 |             self.notifier,
   |             ^^^^^^^^^^^^^ move occurs because `self.notifier` has type `Notifier`, which does not implement the `Copy` trait

After discussing with Barbara, Alan adds Arc to provide a shared ownership between async tasks:


use async_std::{sync::Arc, task};

pub struct Applier {
    notifier: Arc<notify::Notifier>,
    ...
}

impl Applier {
    pub fn new(req_timeout: Duration) -> Self {
        ...
        Self {
            ...
            notifier: Arc::new(notify::Notifier::new()),
            ...
        }
    }
    ...

    pub async fn start(&self) -> io::Result<()> {
        task::spawn(apply_async(
            self.notifier.clone(),
            ...
        ));
        ...
        Ok(())
    }
    ...


pub async fn apply_async(
    notifier: Arc<notify::Notifier>,
    ...
) -> io::Result<()> {
  ...
}

Alan is satisfied with the compilation success for the moment, but doesn't feel confident about the production readiness of Rust async.
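The ownership issue behind E0507 can be reproduced without any async runtime. In the sketch below, std::thread::spawn stands in for task::spawn, and the Notifier and Applier types are simplified, hypothetical versions of the ones in the story. A method that only has &self cannot move a field out of the struct, but it can clone an Arc and move the clone into the spawned work.

```rust
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Simplified stand-in for the story's notifier.
pub struct Notifier {
    name: String,
}

pub struct Applier {
    notifier: Arc<Notifier>,
}

impl Applier {
    pub fn new(name: &str) -> Self {
        Self {
            notifier: Arc::new(Notifier {
                name: name.to_string(),
            }),
        }
    }

    /// Spawns background work that shares the notifier with `self`.
    pub fn start(&self) -> JoinHandle<String> {
        // Passing `self.notifier` directly would try to move the field out
        // of `&self`. Cloning the Arc creates a second owner instead.
        let notifier = Arc::clone(&self.notifier);
        thread::spawn(move || apply(notifier))
    }

    /// Number of Arc handles that currently point at the notifier.
    pub fn notifier_handles(&self) -> usize {
        Arc::strong_count(&self.notifier)
    }
}

fn apply(notifier: Arc<Notifier>) -> String {
    format!("applied via {}", notifier.name)
}
```

Once the spawned work has finished and been joined, its clone has been dropped and the Applier is again the only owner of the notifier.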

Implementing HTTP server handler

Familiar with Go standard libraries, Alan implemented the following request handler without any third-party dependencies:

import (
	"encoding/json"
	"fmt"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

type Handler interface {
	start()
}

type handler struct {
	listenerPort uint64
	applier      Applier
}

func (hd *handler) start() {
	hd.applier.start()

	serverMux := http.NewServeMux()
	serverMux.HandleFunc("/echo", hd.wrapFunc(handleRequest))

	httpServer := &http.Server{
		Addr:    fmt.Sprintf(":%d", hd.listenerPort),
		Handler: serverMux,
	}

	tch := make(chan os.Signal, 1)
	signal.Notify(tch, syscall.SIGINT)
	done := make(chan struct{})
	go func() {
		<-tch // wait for SIGINT before shutting the server down
		httpServer.Close()
		close(done)
	}()

	if err := httpServer.ListenAndServe(); err != nil {
		fmt.Printf("http server error: %v\n", err)
	}
	select {
	case <-done:
	default:
	}

	if err := hd.applier.stop(); err != nil {
		panic(err)
	}
}

func (hd *handler) wrapFunc(fn func(applier Applier, w http.ResponseWriter, req *http.Request)) func(w http.ResponseWriter, req *http.Request) {
	return func(w http.ResponseWriter, req *http.Request) {
		fn(hd.applier, w, req)
	}
}

func handleRequest(applier Applier, w http.ResponseWriter, req *http.Request) {
	switch req.Method {
	case "POST":
		var echoRequest EchoRequest
		err := json.NewDecoder(req.Body).Decode(&echoRequest)
		if err != nil {
			fmt.Fprintf(w, "failed to read request %v", err)
			return
		}
		s, err := applier.apply(Request{echoRequest: &echoRequest})
		if err != nil {
			fmt.Fprintf(w, "failed to apply request %v", err)
			return
		}
		fmt.Fprint(w, s)

	default:
		http.Error(w, "Method Not Allowed", 405)
	}
}

For Rust, Alan has multiple options to build a web server: hyper, actix-web, warp, rocket, tide, etc.

Alan strongly believes in Go's minimal dependency approach, and therefore chooses "hyper" for its low-level API. While "hyper" is meant to be a low-level building block, implementing a simple request handler in "hyper" still requires four different external dependencies. Alan is no longer surprised, and instead accepts the status quo of the split Rust ecosystem:

cargo add http
cargo add futures
cargo add hyper --features full
cargo add tokio --features full

After multiple days, Alan finally writes the following code:


#![allow(unused)]
fn main() {
use std::convert::Infallible;
use std::net::SocketAddr;

use async_std::sync::Arc;
use futures::TryStreamExt;
use http::{Method, Request, Response, StatusCode, Version};
use hyper::server::conn::AddrStream;
use hyper::service::{make_service_fn, service_fn};
use hyper::{Body, Server};
use tokio::signal;

pub struct Handler {
    listener_port: u16,
    applier: Arc<apply::Applier>,
}

impl Handler {
    ...
    pub async fn start(&self) -> Result<(), Box<dyn std::error::Error>> {
        println!("starting server");
        match self.applier.start().await {
            Ok(_) => println!("started applier"),
            Err(e) => panic!("failed to start applier {}", e),
        }

        let addr = ([0, 0, 0, 0], self.listener_port).into();
        let svc = make_service_fn(|socket: &AddrStream| {
            let remote_addr = socket.remote_addr();
            let applier = self.applier.clone();
            async move {
                Ok::<_, Infallible>(service_fn(move |req: Request<Body>| {
                    handle_request(remote_addr, req, applier.clone())
                }))
            }
        });

        let server = Server::bind(&addr)
            .serve(svc)
            .with_graceful_shutdown(handle_sigint());

        if let Err(e) = server.await {
            println!("server error: {}", e);
        }

        match self.applier.stop().await {
            Ok(_) => println!("stopped applier"),
            Err(e) => println!("failed to stop applier {}", e),
        }

        Ok(())
    }
}

async fn handle_request(
    addr: SocketAddr,
    req: Request<Body>,
    applier: Arc<apply::Applier>,
) -> Result<Response<Body>, hyper::Error> {
    let http_version = req.version();
    let method = req.method().clone();
    let cloned_uri = req.uri().clone();
    let path = cloned_uri.path();

    let resp = match http_version {
        Version::HTTP_11 => {
            match method {
                Method::POST => {
                    let mut resp = Response::builder()
                        .status(StatusCode::INTERNAL_SERVER_ERROR)...
                    match req
                        .into_body()
                        .try_fold(Vec::new(), |mut data, chunk| async move {
                            data.extend_from_slice(&chunk);
                            Ok(data)
                        })
                        .await
                    {
                        Ok(body) => {
                            let mut success = false;
                            let mut req = apply::Request::new();
                            match path {
                                "/echo" => match echo::parse_request(&body) {
                                    Ok(bb) => {
                                        req.echo_request = Some(bb);
                                        success = true;
                                    }
                                    Err(e) => {
                                        resp = Response::builder()
                                            .status(StatusCode::INTERNAL_SERVER_ERROR)...
                                    }
                                },
                                _ => {
                                    println!("unknown path {}", path);
                                    resp = Response::builder()
                                        .status(StatusCode::INTERNAL_SERVER_ERROR)...
                                }
                            }
                            if success {
                                match applier.apply(req).await {
                                    Ok(rs) => resp = Response::new(Body::from(rs)),
                                    Err(e) => {
                                        resp = Response::builder()
                                            .status(StatusCode::INTERNAL_SERVER_ERROR)...
                                    }
                                }
                            }
                        }
                        Err(e) => ...
                    }
                    resp
                }

                _ => Response::builder()
                    .status(StatusCode::NOT_FOUND)...
            }
        }

        _ => Response::builder()
            .status(StatusCode::HTTP_VERSION_NOT_SUPPORTED)...
    };
    Ok(resp)
}
}

🤔 Frequently Asked Questions

What are the morals of the story?

Alan's trust in Go mainly comes from its consistent and coherent approach to the problems. Alan prefers a standard way of doing things and as a result, multiple libraries available for async Rust caused Alan confusion. For instance, etcd relies on Go's standard HTTP libraries for HTTP/1 and grpc-go for HTTP/2 which is used by many other Go projects. The core networking library golang.org/x/net is actively maintained by Go team with common interests from the community.

The existing Rust syntax becomes more unwieldy and complicated to use for async Rust code. To make things worse, the lack of coherence in Rust async ecosystem can easily undermine basic user trust in a significant way.

What are the sources for this story?

  • Years of experience building a distributed key-value store in Go, etcd.
  • Simplified etcd server implementation in Go and Rust can be found at gyuho/task-scheduler-examples.

Why did you choose Alan to tell this story?

I chose Alan because he is used to Go, where these issues play out differently. Go natively supports: (1) asynchronous task with "goroutine", (2) asynchronous communication with "channel", and (3) performant HTTP server library. Each component is nicely composed together. There is no need to worry about picking the right external dependencies or resolving dependency conflicts. Concurrency being treated as first-class by Go maintainers built great confidence in Alan's decision to use Go.

How would this story have played out differently for the other characters?

This story would likely have played out the same for almost everyone new to Rust (except Barbara).

😱 Status quo stories: Alan Creates a Hanging Alarm

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect people's experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

Alan is a developer on the Bottlerocket project. Bottlerocket is a Linux-based open-source operating system that is purpose-built by Amazon Web Services for running containers. Alan created a Rust program, pubsys, to ensure that Bottlerocket update repositories are healthy. A repository verification alarm uses pubsys to check the validity of Bottlerocket update repositories and notifies the team if any issues are found.

Multiple Tokio Runtimes

Bottlerocket uses its own tough library to read and write TUF repositories. This library was created before async became widespread and reqwest changed its main interface to async. When reqwest switched to async, Alan used the reqwest::blocking feature instead of re-writing tough to be an async interface. (Maybe Alan should make tough an async interface, but he hasn't yet.) In order to provide a non-async interface, reqwest::blocking creates a tokio runtime so that it can await futures.

In pubsys, Alan created some parallel downloading logic while using the above libraries. Without realizing the danger, he created a tokio runtime in pubsys and used futures/await to do this parallelization, like this:


#![allow(unused)]
fn main() {
for target in targets {
    // use pubsys, which uses reqwest::blocking to get a response body reader
    let mut reader = pubsys_repo.read_target(&target).unwrap();

    // spawn a task in our own tokio runtime that conflicts with reqwest::blocking's runtime
    tasks.push(tokio::spawn(async move {
        io::copy(&mut reader, &mut io::sink()).context(error::TargetDownload {
            target: target.to_string(),
        })
    }));
}
}

Surprisingly, in retrospect, this worked... until it didn't.

Recently Alan discovered that his repository verification alarm was hanging. Alan discovered this by turning on trace level debugging and noticing that tokio was in an endless loop. Alan remembered previous development efforts when multiple tokio runtimes caused a panic, but he had never seen a hang for this reason. Still, he suspected multiple runtimes might be in play and audited the code. The root cause was, in fact, having multiple tokio runtimes, though Alan doesn't know what change exposed the issue. (Maybe it was a cargo update?)

The fix was to eliminate the need for a tokio runtime in the pubsys code path by doing the parallel downloads in a different way (first with threads for a quick fix, then with a thread pool).
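The thread-based quick fix might look roughly like the sketch below. It is not the actual pubsys code: drain_in_parallel is a hypothetical name, and any Read + Send value stands in for the blocking reader returned by read_target. Each reader is drained on its own OS thread, so this code path needs no tokio runtime of its own.

```rust
use std::io::{self, Read};
use std::thread;

/// Reads every reader to the end on its own thread, discarding the bytes,
/// and returns the total number of bytes that were read.
pub fn drain_in_parallel<R>(readers: Vec<R>) -> io::Result<u64>
where
    R: Read + Send + 'static,
{
    // Start one thread per reader; all of them run concurrently.
    let handles: Vec<_> = readers
        .into_iter()
        .map(|mut reader| thread::spawn(move || io::copy(&mut reader, &mut io::sink())))
        .collect();

    let mut total = 0;
    for handle in handles {
        // The first `?` reports a panicked thread, the second an I/O error.
        let copied = handle
            .join()
            .map_err(|_| io::Error::new(io::ErrorKind::Other, "download thread panicked"))??;
        total += copied;
    }
    Ok(total)
}
```

Spawning one thread per target is only reasonable for a small number of targets, which is presumably why the story moves on to a thread pool afterwards.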

Alan is surprised and sad since he thought the compiler would help him write safe code. Instead the compiler was ignorant of his misuse of the de-facto standard Rust async runtime.

Addendum: Multiple Tokio Major Versions

Alan is also sad that the cargo package manager doesn't understand the de-facto standard runtime's versioning requirements.

Alan had trouble updating to tokio v1 because:

  • Having two major versions of the tokio runtime can/will cause problems.
  • Cargo does not understand this and allows multiple major versions of tokio.

Ultimately Alan's strategy for this in Bottlerocket is to ensure that only one version of tokio exists in the Cargo.lock. This requirement delayed his ability to upgrade to tokio v1 and caused him to use a beta version of actix-web since all dependencies need to agree on tokio v1.

Not Easy to Block-On

When Alan is writing a procedural program in which it is perfectly fine to block, encountering an async function is problematic.


#![allow(unused)]
fn main() {
fn my_blocking_program() {
    blocking_function_1();
    blocking_function_2();

    // uh oh, now what?
    async_function_1().await
}
}

Uh oh. Now Alan needs to decide which third-party runtime to use. Should he create that runtime around main, or should he create it and clean it up around this one function call? Put differently, should he bubble up async throughout the program even though the program is blocking and procedural (non-async) by nature?

If he uses tokio, and gets it wrong (foot-guns described above), his program may hang or panic at runtime.

In this scenario, Alan would consider this a nicer experience:


#![allow(unused)]
fn main() {
fn my_blocking_program() {
    blocking_function_1();
    blocking_function_2();

    std::thread::block_on({
        async_function_1()
    })
}
}
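Note that std::thread::block_on in the snippet above is the API Alan wishes for, not something the standard library provides. As a rough illustration of what such a function involves, here is a minimal sketch built only from std::task and thread parking. The names ThreadWaker and block_on are this sketch's own, and async_function_1 is a trivial stand-in for the function in the story. It can only drive futures that do not depend on a runtime's reactor or timers, so it is not a replacement for a real executor.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Waker that unparks the thread which is blocked inside `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Polls `future` on the current thread until it completes.
pub fn block_on<F: Future>(future: F) -> F::Output {
    // Pin the future on the heap so that it can be polled repeatedly.
    let mut future = Box::pin(future);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match future.as_mut().poll(&mut cx) {
            Poll::Ready(output) => return output,
            // Sleep until the waker unparks this thread. A spurious wakeup
            // is harmless because the loop simply polls again.
            Poll::Pending => thread::park(),
        }
    }
}

/// Trivial stand-in for the async function that Alan wants to call.
pub async fn async_function_1() -> u32 {
    42
}
```

With a helper like this, the blocking program could end with block_on(async_function_1()) without making any other function async.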

🤔 Frequently Asked Questions

Here are some standard FAQ to get you started. Feel free to add more!

What are the morals of the story?

When you use a Rust async runtime, which is unavoidable these days, you really need to know what you're doing.

Although the first two of the following points are about tokio, they are really about Rust async since tokio serves as the de-facto std::runtime for Rust.

  • It is confusing and dangerous that multiple tokio runtimes can panic or hang at program runtime.
  • It is challenging that using multiple major versions of tokio (which is allowed by cargo) can fail at runtime.
  • It is unfortunate that we need a 3rd party runtime in order to block_on a future, even if we are not trying to write async code.

What are the sources for this story?

See the links embedded in the story itself (mostly at the top).

Why did you choose Bottlerocket to tell this story?

Bottlerocket is a real-life project that experienced these real-life challenges! Alan is representative of several programmers on the project that have experience with batteries-included languages like Go and Java.

How would this story have played out differently for the other characters?

  • Barbara would not have made this mistake given her experience.
  • Grace could have made the same mistake since this issue is very specific to the Rust ecosystem.
  • Niklaus could have easily made this mistake and might also have had a hard time understanding anything about the runtime or what went wrong.

😱 Status quo stories: Alan finds dropping database handles is hard.

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect people's experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The problem

Alan has been adding an extension to YouBuy that launches a singleton actor which interacts with a Sqlite database using the sqlx crate. The Sqlite database only permits a single active connection at a time, but this is not a problem, because the actor is a singleton, and so there only should be one at a time. He consults the documentation for sqlx and comes up with the following code to create a connection and do the query he needs:

use sqlx::{Connection, SqliteConnection};

#[async_std::main]
async fn main() -> Result<(), sqlx::Error> {
    // Create a connection

    let conn = SqliteConnection::connect("sqlite::memory:").await?;

    // Make a simple query to return the given parameter
    let row: (i64,) = sqlx::query_as("SELECT $1")
        .bind(150_i64)
        .fetch_one(&conn).await?;

    assert_eq!(row.0, 150);

    Ok(())
}

Things seem to be working fairly well but sometimes when he refreshes the page he encounters a panic with the message "Cannot open a new connection: connection is already open". He is flummoxed.

Searching for the Solution

Alan tries to figure out what happened from the logs, but the only information he sees is that a new connection has been received. Alan turns to the documentation for the sqlx crate to see if there are flags that might enable extra instrumentation but he can't find any.

He's a bit confused, because he's accustomed to having things generally be cleaned up automatically when they get dropped (for example, dropping a File will close it). Searching the docs, he sees the close method, but the comments confirm that he shouldn't have to call it explicitly: "This method is not required for safe and consistent operation. However, it is recommended to call it instead of letting a connection drop as the database backend will be faster at cleaning up resources." Still, just in case, he decides to add a call to close into his code. It does seem to help some, but he is still able to reproduce the problem if he refreshes often enough. Feeling confused, he adds a log statement right before calling close to see if it is working:


use sqlx::{Connection, SqliteConnection};

async fn do_the_thing() -> Result<(), sqlx::Error> {
    // Create a connection
    let mut conn = SqliteConnection::connect("sqlite::memory:").await?;

    // Make a simple query to return the given parameter
    let row: (i64,) = sqlx::query_as("SELECT $1")
        .bind(150_i64)
        .fetch_one(&mut conn).await?; // <----- if this await is cancelled, the `close` below never runs

    assert_eq!(row.0, 150);

    // he adds this:
    log::info!("closing the connection");
    conn.close().await?;

    Ok(())
}

He observes that in the cases where he has the problem, the log statement never executes. He asks Barbara for help, and she points him to this gist that explains how an await can be canceled, and that cancellation will invoke the destructors for things that are in scope. He reads the source for the SqliteConnection destructor and finds that the destructor spawns a task to actually close the connection.
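The cancellation behaviour described here can be reproduced with the standard library alone. The sketch below is not sqlx code: `FakeConnection`, `do_the_thing`, and `cancel_midway` are hypothetical names, and `std::future::pending` stands in for a query that has not completed yet. It shows that dropping a suspended future runs the destructors of its locals while the code after the await never executes.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

type Log = Rc<RefCell<Vec<&'static str>>>;

/// Hypothetical stand-in for a connection; its destructor records that it ran.
struct FakeConnection(Log);

impl Drop for FakeConnection {
    fn drop(&mut self) {
        self.0.borrow_mut().push("destructor ran");
    }
}

async fn do_the_thing(log: Log) {
    let _conn = FakeConnection(log.clone());
    // Stands in for a query that has not completed yet.
    std::future::pending::<()>().await;
    // Never reached if the future is dropped while suspended above.
    log.borrow_mut().push("closing the connection");
}

/// Waker that does nothing; enough to poll a future by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls `do_the_thing` once, then cancels it by dropping the future.
fn cancel_midway() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);

    let mut fut = Box::pin(do_the_thing(log.clone()));
    assert!(matches!(fut.as_mut().poll(&mut cx), Poll::Pending));
    drop(fut); // cancellation: locals are dropped, the rest never runs

    let entries = log.borrow().to_vec();
    entries
}
```

Calling `cancel_midway()` yields a log that contains the destructor entry but not the "closing the connection" entry, which matches what Alan sees in his logs.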

He realizes there is a race condition and the task may not have actually closed the connection before do_the_thing is called a second time. At this point, he is feeling pretty frustrated!

Next, Alan seeks verification and validation of his understanding of the source code from the sqlx forum. Someone on the forum explains why the destructor launches a fresh task: Rust doesn't have a way to execute async operations in a destructor.
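The race can be sketched without any async runtime. In the following illustration a background thread stands in for the task that the destructor spawns, and an `AtomicBool` models the database's single connection slot. `Connection`, `connect`, and the `cleanups` channel are hypothetical and exist only so the example can observe when cleanup finishes.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Hypothetical connection; `open` models the database's single slot.
struct Connection {
    open: Arc<AtomicBool>,
    cleanups: Sender<JoinHandle<()>>,
}

impl Connection {
    /// Explicit close: the slot is free by the time this returns.
    fn close(self) {
        self.open.store(false, Ordering::SeqCst);
    }
}

impl Drop for Connection {
    fn drop(&mut self) {
        if !self.open.load(Ordering::SeqCst) {
            return; // already closed explicitly
        }
        // A destructor cannot wait for asynchronous work, so cleanup is
        // handed to a background thread (standing in for a spawned task).
        let open = Arc::clone(&self.open);
        let handle = thread::spawn(move || open.store(false, Ordering::SeqCst));
        // The receiver may already be gone; that is fine for this sketch.
        let _ = self.cleanups.send(handle);
    }
}

/// Opens the hypothetical connection and returns handles for observing it.
fn connect() -> (Connection, Arc<AtomicBool>, Receiver<JoinHandle<()>>) {
    let open = Arc::new(AtomicBool::new(true));
    let (tx, rx) = channel();
    let conn = Connection { open: Arc::clone(&open), cleanups: tx };
    (conn, open, rx)
}
```

After `drop(conn)` returns, the slot may still be marked open until the background thread has run, which is the window in which a second `connect` would fail. After `conn.close()` the slot is free immediately.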

Finding the Solution

Alan briefly considers rearchitecting his application in more extreme ways to retain use of async, but he gives up and seeks a more straightforward solution. He discovers rusqlite, a synchronous database library, and adopts it. This requires some rearchitecting but solves the problem.

🤔 Frequently Asked Questions

What are the morals of the story?

  • Rust's async story is lacking a way of executing async operations in destructors. Spawning is a workaround, but it can have unexpected side-effects.
  • The story demonstrates solid research steps that Alan uses to understand and resolve his problem.
  • Completion of the Cancellation and timeouts docs may have been helpful. It's difficult to know how something absent might have improved the solution search process.

What are the sources for this story?

This specific story describes an actual bug encountered by Sergey Galich at 1Password.

Why did you choose Alan to tell this story?

His experience and understanding of other languages coupled with his desire to apply Rust would likely lead him to try solutions before deeply researching them.

How would this story have played out differently for the other characters?

This story would likely have played out the same for everyone.

😱 Status quo stories: Alan has an external event loop and wants to use futures/streams

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect people's experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

As a first Rust project, Alan decides to write his own IRC client.

Since it is Alan's first project in Rust, it is going to be a private one. He is going to use it on his Mac, so he decides to go with the cocoa crate to avoid having to learn any framework-specific quirks. This way Alan can get a feel for Rust itself.

Alan's hopes and dreams

Despite the learning curve, he manages to create a first window and get some buttons and menus working. After the initialisation is done, the app hands over control to CFRunLoop::Run.

Once Alan is happy with his Mock UI, he wants to make it actually do something. Reading about async Rust, he sees that several of the concepts there map pretty well to some core Cocoa concepts:

  • Promises => Futures
  • Observables => Streams.

Alan smiles, thinking he knows what and more importantly how to do this.

First time dealing with runtimes

Unfortunately, coming from frameworks like Angular or Node.js, Alan is not used to being responsible for driving the processing of Futures/Streams.

After reading up about Runtimes, his mental image of a runtime is something like:


impl Runtime {
    fn run(&mut self) {
        while !self.tasks.is_empty() {
            while let Some(task) = self.awoken_tasks.pop() {
                task.poll();
                //... remove finished task from 'tasks'
            }
        }
    }
}

Coming from single-threaded Angular development, Alan decides to keep his new app single-threaded. He does not feel like learning about Send/Sync/Mutex while also struggling with the borrow checker.

On top of that, his app is not doing any heavy computation, so he feels async should be enough to avoid blocking the main thread and ending up with a hanging UI.

Fun time is over

Soon Alan realises that he cannot use any of those runtimes, because they all take control of the thread and block, just like the OS event loop does.

Alan spends quite some time looking through several runtime implementations. Ignoring most of the internals, all he wants is a runtime that looks a bit like this:


impl Runtime {
    fn make_progress(&mut self) {
        while let Some(task) = self.awoken_tasks.pop() {
            task.poll();
            //... remove finished task from 'tasks'
        }
    }
    fn run(&mut self) {
        while !self.tasks.is_empty() {
            self.make_progress();
        }
    }
}

It could be so easy. Unfortunately he does not find any such solution. Having already looked through quite a bit of low level documentation and runtime code, Alan thinks about implementing his own runtime...

...but only for a very short time. Soon after looking into it, he finds out that he has to deal with RawWakerVTable, RawWaker, and raw pointers. Worst of all, he has to do that without the safety net of the Rust compiler, because this stuff is unsafe.
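For illustration, here is a minimal sketch of the stepwise runtime Alan has in mind. It assumes wakers are built through the standard library's `std::task::Wake` trait, which sidesteps a hand-written RawWakerVTable. All type names (`Runtime`, `TaskWaker`, `YieldOnce`) are hypothetical, the design is single-threaded, and it is nowhere near a production executor.

```rust
use std::collections::{HashMap, VecDeque};
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

type Task = Pin<Box<dyn Future<Output = ()>>>;
type WakeQueue = Arc<Mutex<VecDeque<usize>>>;

/// Waker that re-queues its task id; `Wake` keeps this free of `unsafe`.
struct TaskWaker {
    id: usize,
    queue: WakeQueue,
}

impl Wake for TaskWaker {
    fn wake(self: Arc<Self>) {
        self.queue.lock().unwrap().push_back(self.id);
    }
}

#[derive(Default)]
struct Runtime {
    tasks: HashMap<usize, Task>,
    awoken: WakeQueue,
    next_id: usize,
}

impl Runtime {
    fn spawn(&mut self, fut: impl Future<Output = ()> + 'static) {
        let id = self.next_id;
        self.next_id += 1;
        self.tasks.insert(id, Box::pin(fut));
        // Queue the new task so that it gets its first poll.
        self.awoken.lock().unwrap().push_back(id);
    }

    /// Polls every currently awoken task once, then returns to the caller,
    /// so an external event loop can call this on each iteration.
    fn make_progress(&mut self) {
        let batch: Vec<usize> = self.awoken.lock().unwrap().drain(..).collect();
        for id in batch {
            let waker = Waker::from(Arc::new(TaskWaker { id, queue: self.awoken.clone() }));
            let mut cx = Context::from_waker(&waker);
            let finished = match self.tasks.get_mut(&id) {
                Some(task) => task.as_mut().poll(&mut cx).is_ready(),
                None => false, // stale wake-up for a finished task
            };
            if finished {
                self.tasks.remove(&id);
            }
        }
    }

    fn is_idle(&self) -> bool {
        self.tasks.is_empty()
    }
}

/// Helper future that returns `Pending` once, after asking to be polled again.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}
```

An OS event loop callback could call `make_progress()` once per iteration. A task that awaits `YieldOnce(false)` then needs two such calls before it completes.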

Reimplementing the OS event loop is also not an option he wants to take. See the documentation: "Override run() if you want the app to manage the main event loop differently than it does by default. (This [is] a critical and complex task, however, that you should only attempt with good reason.)"

The cheap way out

Alan gives up and uses a runtime in a separate thread from the UI. This means he has to deal with the additional burden of syncing and he has to give up the frictionless use of some of the patterns he is accustomed to by treating UI events as Stream<Item = UIEvent>.
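The syncing burden looks roughly like the sketch below, which uses only standard-library channels. A plain thread stands in for the runtime thread, and `UiEvent`, `Reply`, `start_backend`, and `drain_replies` are hypothetical names. The UI side must never block, so it drains replies with `try_iter` on each pass through its event loop.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

/// Hypothetical events sent from the UI thread to the backend.
enum UiEvent {
    Send(String),
    Quit,
}

/// Hypothetical reply sent from the backend to the UI thread.
struct Reply(String);

/// Starts the background thread that stands in for the async runtime.
fn start_backend() -> (Sender<UiEvent>, Receiver<Reply>, thread::JoinHandle<()>) {
    let (event_tx, event_rx) = channel::<UiEvent>();
    let (reply_tx, reply_rx) = channel::<Reply>();
    let handle = thread::spawn(move || {
        // Blocking here is fine: this is not the UI thread.
        for event in event_rx {
            match event {
                UiEvent::Send(line) => {
                    if reply_tx.send(Reply(format!("echo: {line}"))).is_err() {
                        break; // the UI side has gone away
                    }
                }
                UiEvent::Quit => break,
            }
        }
    });
    (event_tx, reply_rx, handle)
}

/// Called from the UI event loop on every iteration; never blocks.
fn drain_replies(replies: &Receiver<Reply>) -> Vec<String> {
    replies.try_iter().map(|Reply(text)| text).collect()
}
```

Every interaction now crosses a channel in each direction, which is the friction the story describes compared with consuming UI events directly as a stream.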

🤔 Frequently Asked Questions

  • What are the morals of the story?
    • Coming from a language that has async support does not mean you are used to selecting and driving a runtime.
    • It should be possible to integrate runtimes into existing event loops.
  • What are the sources for this story?
  • Why did you choose Alan to tell this story?
    • The story deals with UI event loops, but the other characters could run into similar issues when trying to combine event loops from different systems/frameworks.
  • Is this Apple specific?
    • No! You have the same issue with other OSs/frameworks that don't already support Rust async.
  • How would this story have played out differently for the other characters?
    • Since this is a technical issue and not one of skill or experience, this would play out similarly for the other characters. Someone with deep knowledge of those event loops, like Grace, might be more willing to re-implement them, though.

😱 Status quo stories: Alan hates writing a Stream

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect people's experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

Alan is used to writing web server applications using async sockets, but wants to try Rust to get that signature vroom vroom.

After a couple weeks learning Rust basics, Alan quickly understands async and await, and therefore has several routes built for his application that await a few things and then construct an HTTP response and send a buffered body. To build the buffered response bodies, Alan was reading a file, and then appending a signature, and putting that all into a single buffer of bytes.

Eventually, Alan realizes that some responses have enormous bodies, and would like to stream them instead of buffering them fully in memory. He's used the Stream trait before. Using it was very natural, and followed a similar pattern to regular async/await:


while let Some(chunk) = body.next().await? {
    file.write_all(&chunk).await?;
}

However, implementing Stream turns out to be rather different. With a quick search, he learned the simple way to turn a File into a Stream with ReaderStream, but the signing part was much harder.

Imperatively Wrong

Alan first hoped he could simply write the signing stream imperatively, reusing his new knowledge of async and await, and assuming it'd be similar to JavaScript:


async* fn sign(file: ReaderStream) -> Result<Vec<u8>, Error> {
    let mut sig = Signature::new();

    while let Some(chunk) = file.next().await? {
        sig.push(&chunk);
        yield Ok(chunk)
    }

    yield Ok(sig.digest().await)
}

Unfortunately, that doesn't work. The compiler first complains about the async* fn syntax:

error: expected item, found keyword `async`
  --> src/lib.rs:21:1
   |
21 | async* fn sign(file: ReaderStream) -> Result<Vec<u8>, Error> {
   | ^^^^^ expected item

Less hopeful, Alan tries just deleting the asterisk:

error[E0658]: yield syntax is experimental
  --> src/lib.rs:27:9
   |
27 |         yield Ok(chunk)
   |         ^^^^^^^^^^^^^^^
   |
   = note: see issue #43122 <https://github.com/rust-lang/rust/issues/43122> for more information

After reading about how yield is experimental, and giving up reading the 100+ comments in the linked issue, Alan figures he's just got to implement Stream manually.

Implementing Stream

Implementing a Stream means writing async code in a way that doesn't feel like the async fn that Alan has written so far. He needs to write a poll function and it has a lot of unfamiliar concepts:

  • Pin
  • State machines
  • Wakers

Unsure of what the final code will look like, he starts with:


struct SigningFile;

impl Stream for SigningFile {
    type Item = Result<Vec<u8>, Error>;

    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>)
        -> Poll<Option<Self::Item>>
    {

    }
}

Pin 😱

First, he notices Pin. Alan wonders, "Why does self have bounds? I've only ever seen self, &self, and &mut self before". Curious, he reads the std::pin page, and a bunch of jargon about pinning data in memory. He also reads that this is useful to guarantee that an object cannot move, and he wonders why he cares about that. The only example on the page explains how to write a "self-referential struct", but he notices it needs unsafe code, and that triggers an internal alarm in Alan: "I thought Rust was safe..."

After asking Barbara, Alan realizes that the types he's depending on are Unpin, and so he doesn't need to worry about the unsafe stuff. It's just a more-annoying pointer type.
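What Barbara means can be shown with a small sketch that uses only the standard library. `CountPolls` and `poll_once` are hypothetical names. The point is that when the wrapped type is `Unpin`, `Pin<&mut Self>` can be treated as a plain `&mut Self`, and fields can be re-pinned with the safe `Pin::new`.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical wrapper around any `Unpin` future; counts how often it is polled.
struct CountPolls<F> {
    inner: F,
    polls: u32,
}

impl<F: Future + Unpin> Future for CountPolls<F> {
    type Output = (F::Output, u32);

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // `Self: Unpin`, so `Pin<&mut Self>` derefs to a plain `&mut Self`.
        let this = &mut *self;
        this.polls += 1;
        // Re-pinning an `Unpin` field needs only the safe `Pin::new`.
        match Pin::new(&mut this.inner).poll(cx) {
            Poll::Ready(out) => Poll::Ready((out, this.polls)),
            Poll::Pending => Poll::Pending,
        }
    }
}

/// Waker that does nothing; enough to poll a future by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls an `Unpin` future exactly once.
fn poll_once<F: Future + Unpin>(fut: &mut F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    Pin::new(fut).poll(&mut cx)
}
```

No `unsafe` appears anywhere. That only changes once a type holds something that is not `Unpin`, such as the anonymous future of an async block stored inline.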

State Machine

With Pin hopefully ignored, Alan next notices that in the imperative style he wanted originally, he didn't need to explicitly keep track of state. The state was simply the imperative order of the function. But in a poll function, the state isn't saved by the compiler. Alan finds blog posts about the dark ages of Futures 0.1, when it was more common for manual Futures to be written with a "state machine".

He thinks about his stream's states, and settles on the following structure:


struct SigningFile {
    state: State,
    file: ReaderStream,
    sig: Signature,
}

enum State {
    File,
    Sign,
}

It turns out it was more complicated than Alan thought (the author made this same mistake). The digest method of Signature is async, and it consumes the signature, so the state machine needs to be adjusted. The signature needs to be able to be moved out, and it needs to be able to store a future from an async fn. Trying to figure out how to represent that in the type system was difficult. He considered adding a generic T: Future to the State enum, but then wasn't sure what to set that generic to. Then, he tries just writing Signing(impl Future) as a state variant, but that triggers a compiler error that impl Trait isn't allowed outside of function return types. Patient Barbara helped again, so that Alan learns to just store a Pin<Box<dyn Future>>, wondering if the Pin there is important.


struct SigningFile {
    state: State,
}

enum State {
    File(ReaderStream, Signature),
    Signing(Pin<Box<dyn Future<Output = Vec<u8>>>>),
    Done,
}
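To see the boxed-future variant in isolation, here is a small self-contained sketch. `Signature` is a hypothetical stand-in for the type in the story, and its `digest` simply reverses the bytes as placeholder logic. The stream and file parts are left out so that only the "store and poll the future of an async fn" step remains.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for the story's `Signature`.
struct Signature(Vec<u8>);

impl Signature {
    /// Consumes the signature, like the `digest` method in the story.
    /// Reversing the bytes is placeholder logic only.
    async fn digest(self) -> Vec<u8> {
        self.0.iter().rev().copied().collect()
    }
}

enum State {
    Collecting(Signature),
    // The future type of an `async fn` cannot be named, so it is stored
    // as a boxed, pinned trait object.
    Signing(Pin<Box<dyn Future<Output = Vec<u8>>>>),
    Done,
}

impl State {
    /// Moves the signature out of the state and starts computing the digest.
    fn start_signing(&mut self) {
        if let State::Collecting(sig) = std::mem::replace(self, State::Done) {
            *self = State::Signing(Box::pin(sig.digest()));
        }
    }

    /// Polls the stored future; yields `None` when there is nothing to sign.
    fn poll_digest(&mut self, cx: &mut Context<'_>) -> Poll<Option<Vec<u8>>> {
        let poll = match &mut *self {
            State::Signing(fut) => fut.as_mut().poll(cx),
            _ => return Poll::Ready(None),
        };
        if poll.is_ready() {
            *self = State::Done;
        }
        poll.map(Some)
    }
}

/// Waker that does nothing; enough to poll by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}
```

The `Pin` matters because the compiler-generated future may be self-referential. `Box::pin` fixes its address on the heap, and `as_mut()` hands out the `Pin<&mut _>` that `poll` requires.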

Now he tries to write the poll_next method, checking readiness of individual steps (thankfully, Alan remembers ready! from the futures 0.1 blog posts he read) and proceeding to the next state, while grumbling away the weird Pin noise:


match self.state {
    State::File(ref mut file, ref mut sig) => {
        match ready!(Pin::new(file).poll_next(cx)) {
            Some(result) => {
                let chunk = result?;
                sig.push(&chunk);
                Poll::Ready(Some(Ok(chunk)))
            },
            None => {
                let sig = match std::mem::replace(&mut self.state, State::Done) {
                    State::File(_, sig) => sig,
                    _ => unreachable!(),
                };
                self.state = State::Signing(Box::pin(sig.digest()));
                Poll::Pending
            }
        }
    },
    State::Signing(ref mut sig) => {
        let last_chunk = ready!(sig.as_mut().poll(cx));
        self.state = State::Done;
        Poll::Ready(Some(Ok(last_chunk)))
    }
    State::Done => Poll::Ready(None),
}

Oh well, at least it works, right?

Wakers

So far, Alan hasn't paid too much attention to Context and Poll. It's been fine to simply pass them along untouched. There's a confusing bug in his state machine. Let's look more closely:


// zooming in!
match ready!(Pin::new(file).poll_next(cx)) {
    Some(result) => {
        let chunk = result?;
        sig.push(&chunk);
        return Poll::Ready(Some(Ok(chunk)));
    },
    None => {
        self.set_state_to_signing();
        // oops!
        return Poll::Pending;
    }
}

In one of the branches, the state is changed, and Poll::Pending is returned. Alan assumes that the task will be polled again with the new state. But, since the file was done (and has returned Poll::Ready), there was actually no waker registered to wake the task again. So his stream just hangs forever.

The compiler doesn't help at all, and he re-reads his code multiple times, but because of this easy-to-misunderstand logic error, Alan eventually has to ask for help in a chat room. After a half hour of explaining all sorts of details, a kind person points out he either needs to register a waker, or perhaps use a loop.
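The first option, registering a waker, can be sketched in isolation with the standard library. `TwoStep`, `CountingWake`, and `run_until_stalled` are hypothetical names. The tiny driver below only polls when a wake-up is outstanding, which is the executor behaviour that makes the missing wake-up show up as a hang.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical two-step future: the first poll only switches state.
enum TwoStep {
    First,
    Second,
}

impl Future for TwoStep {
    type Output = &'static str;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let this = self.get_mut(); // fine because `TwoStep` is `Unpin`
        match *this {
            TwoStep::First => {
                *this = TwoStep::Second;
                // Nothing else will wake this task, so request another poll
                // before returning `Pending`. Omitting this reproduces the hang.
                cx.waker().wake_by_ref();
                Poll::Pending
            }
            TwoStep::Second => Poll::Ready("signed"),
        }
    }
}

/// Waker that counts how many wake-ups are outstanding.
struct CountingWake(AtomicUsize);

impl Wake for CountingWake {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

/// Polls only when a wake-up is outstanding, as an executor would.
/// Returns `None` if the future stalls or `max_polls` is exhausted.
fn run_until_stalled<F: Future + Unpin>(mut fut: F, max_polls: usize) -> Option<F::Output> {
    // Start at 1 so that the future receives its initial poll.
    let counter = Arc::new(CountingWake(AtomicUsize::new(1)));
    let waker = Waker::from(counter.clone());
    let mut cx = Context::from_waker(&waker);
    for _ in 0..max_polls {
        if counter.0.swap(0, Ordering::SeqCst) == 0 {
            return None; // no wake-up pending: the task would hang here
        }
        if let Poll::Ready(out) = Pin::new(&mut fut).poll(&mut cx) {
            return Some(out);
        }
    }
    None
}
```

With the `wake_by_ref` call in place the driver completes on the second poll. Removing that call makes `run_until_stalled` return `None` after the first poll, which corresponds to the stream hanging forever.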

All too often, since we don't want to duplicate code in multiple branches, the solution for Alan is to add an odd loop around the whole thing, so that the next match branch uses the Context:


loop {
    match self.state {
        State::File(ref mut file, ref mut sig) => {
            match ready!(Pin::new(file).poll_next(cx)) {
                Some(result) => {
                    let chunk = result?;
                    sig.push(&chunk);
                    return Poll::Ready(Some(Ok(chunk)))
                },
                None => {
                    let sig = match std::mem::replace(&mut self.state, State::Done) {
                        State::File(_, sig) => sig,
                        _ => unreachable!(),
                    };
                    self.state = State::Signing(Box::pin(sig.digest()));
                    // loop again, to catch the `State::Signing` branch
                }
            }
        },
        State::Signing(ref mut sig) => {
            let last_chunk = ready!(sig.as_mut().poll(cx));
            self.state = State::Done;
            return Poll::Ready(Some(Ok(last_chunk)))
        }
        State::Done => return Poll::Ready(None),
    }
}

Gives Up

A little later, Alan needs to add some response body transforming to some routes, to add some app-specific framing. Upon realizing he needs to implement another Stream in a generic fashion, he instead closes the editor and complains on Twitter.

🤔 Frequently Asked Questions

What are the morals of the story?

  • Writing an async Stream is drastically different than writing an async fn.
  • The documentation for Pin doesn't provide much practical guidance in how to use it, instead focusing on more abstract considerations.
  • Missing a waker registration is a runtime error, and very hard to debug. If it's even possible, a compiler warning or hint would go a long way.

What are the sources for this story?

Part of this story is based on the original motivation for async/await in Rust, since similar problems exist writing impl Future.

Why did you choose Alan to tell this story?

Choosing Alan was somewhat arbitrary, but this does get to reuse the experience that Alan may already have around await coming from JavaScript.

How would this story have played out differently for the other characters?

  • This likely would have been a similar story for any character.
  • It's possible Grace would be more used to writing state machines, coming from C.

😱 Status quo stories: Alan iteratively regresses performance

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect people's experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

A core part of DistriData, called DDSplit, is in charge of splitting input data records into fragments that are stored on distinct servers, and then reassembling those fragments back into records in response to user queries.

DDSplit was originally implemented using Java code (plus some C, interfaced via JNI). Alan thinks that Rust could provide the same quality of service while requiring less memory. He decides to try reimplementing DDSplit in Rust, atop tokio.

Alan wants to copy some of the abstractions he sees in the Java code that are defined via Java interfaces. Alan sees Rust traits as the closest thing to Java interfaces. However, when he experimentally defines a trait with an async fn, he gets the following message from the compiler:

error[E0706]: functions in traits cannot be declared `async`
 --> src/main.rs:2:5
  |
2 |     async fn method() { }
  |     -----^^^^^^^^^^^^^^^^
  |     |
  |     `async` because of this
  |
  = note: `async` trait functions are not currently supported
  = note: consider using the `async-trait` crate: https://crates.io/crates/async-trait

This diagnostic leads Alan to add the async-trait crate as a dependency to his project. Alan then uses the #[async_trait] attribute provided by that crate to be able to define async fn methods within traits.

When Alan finishes the prototype code, he finds that the prototype has 20% lower throughput than the Java version.

Alan is disappointed; his experience has been that Rust code performs great (at least once you manage to get the code accepted by the compiler). Alan was not expecting to suffer a 20% performance hit relative to the Java code.

The DDSplit service is being developed on a Linux machine, so Alan is able to use the perf tool to gather sampling-based profiling data on the async/await port of DDSplit.

Looking at a flamegraph for the call stacks, Alan identified two sources of execution time overhead that he did not expect: calls into the memory allocator (malloc) with about 1% of the execution time, and calls to move values in memory (memcpy), with about 8% of execution time.

Alan reaches out to Barbara, as the local Rust expert, for help on how to identify where the performance pitfalls are coming from.

Alan asks Barbara whether the problem could be caused by the tokio executor. Barbara says it is hard to know that without more instrumentation. She explains it could be that the program is overloading tokio's task scheduler (for example), but it also could be that the application code itself has expensive operations, such as lots of small I/O operations rather than using a buffer.

Alan and Barbara look at the perf data. They find the output of perf report difficult to navigate and interpret. The data has stack trace fragments available, which gives them a few hints to follow up on. But when they try to make perf report annotate the original source, perf only shows disassembled machine code, not the original Rust source code. Alan and Barbara both agree that trying to dissect the problem from the machine code is not an attractive strategy.

Alan asks Barbara what she thinks about the malloc calls in the profile. Barbara recommends that Alan try to eliminate the allocation calls, and if they cannot be eliminated, then that Alan try tuning the parameters for the global memory allocator, or even switching which global memory allocator he is using. Alan looks at Barbara in despair: his time tweaking GC settings on the Java Virtual Machine taught him that allocator tuning is often a black art.
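As a sketch of the hook Barbara is referring to: Rust lets a program replace its global allocator through the `#[global_allocator]` attribute. The example below is not a tuning recommendation. It wraps the system allocator in a hypothetical `CountingAllocator` so that allocation calls in a piece of code can be counted, which is one way to check whether an attempt to eliminate allocations actually worked.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper around the system allocator that counts allocations.
struct CountingAllocator;

/// Number of `alloc` calls seen so far, process-wide.
static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        // Delegate the actual work to the system allocator.
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Every heap allocation in the program now goes through `CountingAllocator`.
#[global_allocator]
static GLOBAL: CountingAllocator = CountingAllocator;

/// Returns how many allocations happened process-wide while `work` ran.
fn allocations_during(work: impl FnOnce()) -> usize {
    let before = ALLOCATIONS.load(Ordering::Relaxed);
    work();
    ALLOCATIONS.load(Ordering::Relaxed) - before
}
```

The count is process-wide, so allocations made by other threads during the measured window are included. The same attribute is how an alternative allocator crate would be plugged in.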

Barbara suggests that they investigate where the calls to memcpy are arising, since they look like a larger source of overhead based on the profile data. From the call stacks in perf report, Alan and Barbara decide to skim over the source code files for the corresponding functions.

Upon seeing #[async_trait] in Alan's source code, Barbara recommends that if performance is a concern, then Alan should avoid #[async_trait]. She explains that #[async_trait] transforms a trait's async methods into methods that return Pin<Box<dyn Future>>, and that the overhead this injects will be hard to diagnose and impossible to remove. When Alan asks what other options he could adopt, Barbara thinks for a moment, and says he could make an enum that carries all the different implementations of the code. Alan says he'll consider it, but in the meantime he wants to see how far they can improve the code while keeping #[async_trait].
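Written out by hand, the transformation Barbara describes looks roughly like the following sketch. It is an approximation of the shape of the generated code, not the macro's exact output. `Fragments`, `InMemory`, and `block_on` are hypothetical names, and `block_on` is a busy-polling helper that is only suitable for futures that never wait on I/O.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical trait, written roughly the way `#[async_trait]` would expand
/// `async fn fetch(&self, key: u32) -> u32`.
trait Fragments {
    fn fetch<'a>(&'a self, key: u32) -> Pin<Box<dyn Future<Output = u32> + Send + 'a>>;
}

struct InMemory {
    offset: u32,
}

impl Fragments for InMemory {
    fn fetch<'a>(&'a self, key: u32) -> Pin<Box<dyn Future<Output = u32> + Send + 'a>> {
        // Each call heap-allocates the future and hides its type behind
        // `dyn`, so the call cannot be inlined into the caller's future.
        Box::pin(async move { self.offset + key })
    }
}

/// Waker that does nothing; enough for the busy-polling helper below.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Busy-polls a future to completion; only for futures that never wait on I/O.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

The allocation and the dynamic dispatch happen on every call to `fetch`, which is why they show up as diffuse overhead in a profile and not as a single hot spot.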

They continue looking at the code itself, essentially guessing at potential sources of where problematic memcpy's may be arising. They identify two potential sources of moves of large datatypes in the code: pushes and pops on vectors of type Vec<DistriQuery>, and functions with return types of the form Result<SuccessCode, DistriErr>.

Barbara asks how large the DistriQuery, SuccessCode, and DistriErr types are. Alan immediately notes that DistriQuery may be large, and they discuss options for avoiding the memory traffic incurred by pushing and popping DistriQuery.

For the other two types, Alan responds that the SuccessCode is small, and that the error variants are never constructed in his benchmark code. Barbara explains that the size of Result<T, E> has to be large enough to hold either variant, and that memcpy'ing a result is going to move all of those bytes. Alan investigates and sees that DistriErr has variants that embed byte arrays that go up to 50kb in size. Barbara recommends that Alan look into boxing the variants, or the whole DistriErr type itself, in order to reduce the cost of moving it around.
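Barbara's point about sizes can be checked directly with `std::mem::size_of`. In the sketch below, `SuccessCode` and `DistriErr` are hypothetical stand-ins whose shapes are assumed from the description above (a small success value and an error variant embedding a large byte array).

```rust
use std::mem::size_of;

/// Hypothetical small success value.
#[allow(dead_code)]
struct SuccessCode(u8);

/// Hypothetical error type with one variant that embeds a large byte array.
#[allow(dead_code)]
enum DistriErr {
    Timeout,
    Corrupt([u8; 50_000]),
}

/// Returns the size in bytes of the unboxed and the boxed `Result` types.
fn result_sizes() -> (usize, usize) {
    (
        // Must be able to hold the largest variant, even if it is never built.
        size_of::<Result<SuccessCode, DistriErr>>(),
        // Boxing moves the large payload to the heap; only a pointer remains.
        size_of::<Result<SuccessCode, Box<DistriErr>>>(),
    )
}
```

Every move of the unboxed `Result` copies the full size of the type, regardless of which variant is actually stored. With the boxed error type, the value that gets moved around is at most a couple of machine words.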

Alan uses Barbara's feedback to box some of the data, and this cuts the memcpy traffic in the perf report to one quarter of what it had been reporting previously.

However, there remains a significant performance delta between the Java version and the Rust version. Alan is not sure his Rust-rewrite attempt is going to get anywhere beyond the prototype stage.

🤔 Frequently Asked Questions

What are the morals of the story?

  1. Rust promises great performance, but when performance is not meeting one's targets, it is hard to know what to do next. Rust mostly leans on leveraging existing tools for native code development, but those tools (a.) are foreign to many of our developers, (b.) do not always measure up to what our developers have access to elsewhere, and (c.) do not integrate as well with Rust as they might with C or C++.

  2. Lack of certain language features leads developers to use constructs like #[async_trait] which add performance overhead that is (a.) hard to understand and (b.) may be significant.

  3. Rust makes some things very explicit, e.g. the distinction between Box<T> versus T is quite prominent. But Rust's expressive type system also makes it easy to compose types without realizing how large they have gotten.

  4. Programmers do not always have a good mental model for where expensive moves are coming from.

  5. An important specific instance of (1c.) for the async vision: Native code tools do not have any insight into Rust's async model, as that is even more distant from the execution model of C and C++.

  6. We can actually generalize (5.) further: When async performance does not match expectations, developers do not have much insight into whether the performance pitfalls arise from issues deep in the async executor that they have selected, or if the problems come directly from overheads built into the code they themselves have written.

What are the sources for this story?

Discussions with engineers at Amazon Web Services.

Why did you choose Alan to tell this story?

I chose Alan because he is used to Java, where these issues play out differently.

Java has very mature tooling, including for performance investigations. Alan has used JProfiler at his work, and VisualVM for personal hobby projects. Alan is frustrated by his attempts to use (or even identify) equivalent tools for Rust.

With respect to memory traffic: In Java, every object is handled via a reference, and those references are cheap to copy. (One pays for that convenience in other ways, of course.)

How would this story have played out differently for the other characters?

From her C and C++ background, Grace probably would avoid letting her types get so large. But then again, C and C++ do not have enums with a payload, so Grace would likely have fallen in the same trap that Alan did (of assuming that the cost of moving an enum value is proportional to its current variant, rather than to its type's overall size). Also, Grace might report that her experience with gcc-based projects yielded programs that worked better with perf, due in part to gcc producing higher quality DWARF debuginfo.

Barbara probably would have added direct instrumentation via the tracing crate, potentially even to tokio itself, rather than spend much time wrestling with perf.

Niklaus is unlikely to be as concerned about the 20% throughput hit; he probably would have been happy to get code that seems functionally equivalent to the original Java version.

😱 Status quo stories: Alan lost the world!

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect people's experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

Alan heard about a project to reimplement a deprecated browser plugin using Rust and WASM. This old technology had the ability to load resources over HTTP; so it makes sense to try and implement that functionality using the Fetch API. Alan looks up the documentation of web_sys and realizes they need to...

  1. Call one of the fetch methods, which returns a Promise
  2. Convert the Promise into a Rust thing called a Future
  3. await the Future in an async function
  4. Do whatever they want with the resulting data

use wasm_bindgen_futures::JsFuture;
use web_sys::{Request, window};

fn make_request(src: &str) -> Request {
    // Pretend this contains all of the complicated code necessary to
    // initialize a Fetch API request from Rust
    todo!()
}

async fn load_image(src: String) {
    let request = make_request(&src);
    // `fetch_with_request` returns a JS `Promise`; `JsFuture` makes it awaitable
    JsFuture::from(window().unwrap().fetch_with_request(&request)).await;
    log::error!("It worked");
}

Alan adds calls to load_image where appropriate. They realize that nothing is happening, so they look through more documentation and find a thing called spawn_local. Once they pass the result of load_image into that function, they see their log message pop up in the console, and figure it's time to actually do something to that loaded image data.

At this point, Alan wants to put the downloaded image onto the screen, which in this project means putting it into a Node of the current World. A World is a bundle of global state that's passed around as things are loaded, rendered, and scripts are executed. It looks like this:


/// All of the player's global state.
pub struct World<'a> {
    /// A list of all display Nodes.
    nodes: &'a mut Vec<Node>,

    /// The last known mouse position.
    mouse_pos: &'a mut (u16, u16),

    // ...
}

In synchronous code, this was perfectly fine. Alan figures it'll be fine in async code, too. So Alan adds the world as a function parameter and everything else needed to parse an image and add it to our list of nodes:


async fn load_image(src: String, inside_of: usize, world: &mut World<'_>) {
    let request = make_request(&src);
    let data = window().unwrap().fetch_with_request(&request).await.unwrap().etc.etc.etc;
    let image = parse_png(data, context);

    let new_node_index = world.nodes.len();
    if let Some(parent) = world.nodes.get(inside_of) {
        parent.set_child(new_node_index);
    }
    world.nodes.push(image.into());
}

Bang! Suddenly, the project stops compiling, giving errors like...

error[E0597]: `world` does not live long enough
  --> src/motionscript/globals/loader.rs:21:43

Hmm, okay, that's kind of odd. We can pass a World to a regular function just fine - why do we have a problem here? Alan glances over at loader.rs...


fn attach_image_from_net(world: &mut World<'_>, args: &[Value]) -> Result<Value, Error> {
    let this = args.get(0).coerce_to_object()?;
    let url = args.get(1).coerce_to_string()?;

    spawn_local(load_image(url, this.as_node().ok_or("Not a node!")?, world))
}

Hmm, the error is in that last line. spawn_local is a thing Alan had to put into everything that called load_image, otherwise his async code never actually did anything. But why is this a problem? Alan can borrow a World, or anything else for that matter, inside of async code; and it should get its own lifetime like everything else, right?

Alan has a hunch that this spawn_local thing might be causing a problem, so Alan reads the documentation. The function signature seems particularly suspicious:


pub fn spawn_local<F>(future: F)
where
    F: Future<Output = ()> + 'static

So, spawn_local only works with futures that return nothing - so far, so good - and are 'static. Uh-oh. What does that last bit mean? Alan asks Barbara, who responds that it's the lifetime of the whole program. Yeah, but... the async function is part of the program, no? Why wouldn't it have the 'static lifetime? Does that mean all functions that borrow values aren't 'static, or just the async ones?

Barbara explains that when you borrow a value in a closure, the closure doesn't gain the lifetime of that borrow. Instead, the borrow comes with its own lifetime, separate from the closure's. The only time a closure can have a non-'static lifetime is if one or more of its borrows is not provided by its caller, like so:


fn benchmark_sort() -> usize {
    let mut num_times_called = 0;
    let mut test_values = vec![1,3,5,31,2,-13,10,16];

    test_values.sort_by(|a, b| {
        // Count the call first; the comparison is the closure's return value.
        num_times_called += 1;
        a.cmp(b)
    });

    num_times_called
}

The closure passed to sort_by has to copy or borrow anything not passed into it. In this case, that would be the num_times_called variable. Since we want to modify the variable, it has to be borrowed. Hence, the closure has the lifetime of that borrow, not the whole program, because it can't be called anytime - only when num_times_called is a valid thing to read or write.
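The contrast Barbara describes can be sketched with the standard library alone. In the sketch below, `sort_counting` and `sum_on_thread` are hypothetical names chosen for illustration, and `std::thread::spawn` stands in for `spawn_local` only because it also carries a `'static` bound:

```rust
use std::thread;

/// Borrowing closure: it holds `&mut count`, so it may only be called
/// while `count` is alive. `sort_by` accepts that because it finishes
/// calling the closure before it returns.
fn sort_counting(values: &mut [i32]) -> usize {
    let mut count = 0;
    values.sort_by(|a, b| {
        count += 1;
        a.cmp(b)
    });
    count
}

/// `thread::spawn` demands a `'static` closure, much like `spawn_local`
/// demands a `'static` future.
fn sum_on_thread(values: Vec<i32>) -> i32 {
    // `move` gives the closure ownership of `values`. Without it the
    // closure would borrow a local and be rejected by the `'static` bound.
    let handle = thread::spawn(move || values.iter().sum::<i32>());
    handle.join().unwrap()
}
```

The first closure is fine because its caller outlives it; the second has to own everything it touches because nothing guarantees when the thread will run it.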

Async functions, it turns out, act like closures that don't take parameters! They have to, because all Futures have to implement the same trait method poll:


pub trait Future {
    type Output;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>;
}

When you call an async function, all of its parameters are copied or borrowed into the Future that it returns. Since we need to borrow the World, the Future has the lifetime of &'a mut World, not of 'static.
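A minimal, std-only sketch of that capture rule follows. `require_static` is a hypothetical stand-in for the bound on `spawn_local`, and `block_on` is a toy busy-polling executor written just for this example; neither comes from the project in the story.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for the `'static` bound on `spawn_local`.
fn require_static<F: Future + 'static>(fut: F) -> F {
    fut
}

/// Takes ownership: the returned future stores the `Vec` itself.
async fn count_owned(nodes: Vec<u32>) -> usize {
    nodes.len()
}

/// Borrows: the returned future stores `&mut Vec<u32>` and cannot outlive it.
async fn push_borrowed(nodes: &mut Vec<u32>, value: u32) -> usize {
    nodes.push(value);
    nodes.len()
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor: polls in a loop on the current thread until ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn demo() -> (usize, usize) {
    let mut nodes = vec![1, 2, 3];
    // Fine with a `block_on`-style executor: the borrow ends before
    // `demo` returns, so the future never outlives `nodes`.
    let borrowed = block_on(push_borrowed(&mut nodes, 4));
    // `require_static(push_borrowed(&mut nodes, 5))` would be rejected,
    // because that future captures a non-'static reference.
    let owned = block_on(require_static(count_owned(nodes)));
    (borrowed, owned)
}
```

Only the owned version passes the `'static` check, which is the same reason `load_image` is rejected once it takes `&mut World`.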

Barbara suggests changing all of the async function's parameters to be owned types. Alan asks Grace, who architected this project. Grace recommends holding a reference to the Plugin that owns the World, and then borrowing it whenever you need the World. That ultimately looks like the following:


async fn load_image(src: String, inside_of: usize, player: Arc<Mutex<Player>>) {
    let request = make_request(&src);
    let data = window().unwrap().fetch_with_request(&request).await.unwrap().etc.etc.etc;
    let image = parse_png(data, context);

    player.lock().unwrap().update(|world| {
        let new_node_index = world.nodes.len();
        if let Some(parent) = world.nodes.get(inside_of) {
            parent.set_child(new_node_index);
        }
        world.nodes.push(image.into());
    });
}

It works, well enough that Alan is able to finish his changes and PR them into the project. However, Alan wonders if this could be syntactically cleaner, somehow. Right now, async and update code have to be separated - if we need to do something with a World, then await something else, that requires jumping in and out of this update thing. It's a good thing that we only really have to be async in these loaders, but it's also a shame that we practically can't mix async code and Worlds.
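The shape of Grace's workaround can be reduced to a std-only sketch. Everything here is an assumption made for illustration: `World` is a cut-down struct, `fetch` is a hypothetical placeholder for the network request, and `block_on` is a toy executor. The point is only that the future owns an `Arc`, so it is `'static`, and the lock is taken for the short synchronous update after the await rather than across it.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

#[derive(Default)]
struct World {
    nodes: Vec<String>,
}

/// Hypothetical placeholder for the network fetch in the story.
async fn fetch(src: String) -> String {
    format!("image from {src}")
}

async fn load_image(src: String, world: Arc<Mutex<World>>) -> usize {
    // No lock is held across this await point.
    let image = fetch(src).await;
    // Lock only for the short synchronous update.
    let mut guard = world.lock().unwrap();
    guard.nodes.push(image);
    guard.nodes.len()
}

/// Hypothetical stand-in for the `'static` bound on `spawn_local`.
fn require_static<F: Future + 'static>(fut: F) -> F {
    fut
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor: polls in a loop on the current thread until ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn demo() -> (usize, String) {
    let world = Arc::new(Mutex::new(World::default()));
    let fut = require_static(load_image("a.png".to_string(), Arc::clone(&world)));
    let count = block_on(fut);
    let first = world.lock().unwrap().nodes[0].clone();
    (count, first)
}
```

This is also where the awkwardness Alan notices comes from: any code that needs the `World` has to sit inside a locked region, and any code that awaits has to sit outside one.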

🤔 Frequently Asked Questions

  • What are the morals of the story?
    • Async functions capture all of their parameters for the entire duration of the function. This allows them to hold borrows of those parameters across await points.
      • When the parameter represents any kind of "global environment", such as the World in this story, it may be useful for that parameter not to be captured by the future but rather supplied anew after each await point.
    • Non-'static Futures are of limited use to developers, as lifetimes are tied to the sync stack. The execution time of most asynchronous operations does not come with an associated lifetime that an executor could use.
      • It is possible to use borrowed futures with block_on style executors, as they necessarily extend all lifetimes to the end of the Future. This is because they turn asynchronous operations back into synchronous ones.
      • Most practical executors want to release the current stack, and thus all of its associated lifetimes. They need 'static futures.
    • Async programming introduces more complexity to Rust than it does, say, JavaScript. The complexity of async is sometimes explained in terms of 'color', where functions of one 'color' can only call those of another under certain conditions, and developers have to keep track of what is sync and what is async. Due to Rust's borrowing rules, we actually have three 'colors', not the two of other languages with async I/O:
      • Sync, or 'blue' in the original metaphor. This color of function can both own and borrow its parameters. If made into the form of a closure, it may have a lifetime if it borrows something from the current stack.
      • Owned Async, or 'red' in the original metaphor. This color of function can only own parameters, by copying them into itself at call time.
      • Borrowed Async. If an async function borrows at least one parameter, it gains a lifetime, and must fully resolve itself before the lifetime of its parameters expires.
  • What are the sources for this story?
    • This is personal experience. Specifically, I had to do almost exactly this dance in order to get fetch to work in Ruffle.
    • I have omitted a detail from this story: in Ruffle, we use a GC library (gc_arena) that imposes a special lifetime on all GC references. This is how the GC library upholds its memory safety invariants, but it's also what forces us to pass around contexts, and once you have that, it's natural to start putting even non-GC data into it. It also means we can't hold anything from the GC in the Future, as we cannot derive its Collect trait on an anonymous type.
  • Why did you choose Alan to tell this story?
    • Lifetimes on closures are already non-obvious to new Rust programmers, and using them in the context of Futures is particularly unintuitive.
  • How would this story have played out differently for the other characters?
    • Niklaus probably had a similar struggle as Alan.
    • Grace would have felt constrained by the async syntax preventing some kind of workaround for this problem.
    • Barbara already knows about Futures and 'static and carefully organizes her programs accordingly.

😱 Status quo stories: Alan misses C# async

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect peoples' experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

First attempt

Alan has gotten comfortable working in Rust and finally decides to try writing async code. He's used C#'s async and mostly loved the experience, so he decides to try writing it the same way:

async fn run_async() {
    println!("Hello async!");
}

fn main() {
    run_async();
}

But the compiler didn't like this:

warning: unused implementer of `Future` that must be used
 --> src/main.rs:6:5
  |
6 |     run_async();
  |     ^^^^^^^^^^^^
  |
  = note: `#[warn(unused_must_use)]` on by default
  = note: futures do nothing unless you `.await` or poll them

Alan has no idea what Future is; he's never seen this before and it's not in his code. He sees the note in the warning and adds .await to the line in main:

fn main() {
    run_async().await;
}

The compiler doesn't like this either.

error[E0728]: `await` is only allowed inside `async` functions and blocks
 --> src/main.rs:6:5
  |
5 | fn main() {
  |    ---- this is not `async`
6 |     run_async().await;
  |     ^^^^^^^^^^^^^^^^^ only allowed inside `async` functions and blocks

... so Alan adds async to main:

async fn main() {
    run_async().await;
}

which prompts yet another error from the compiler:

error[E0277]: `main` has invalid return type `impl Future`
 --> src/main.rs:5:17
  |
5 | async fn main() {
  |                 ^ `main` can only return types that implement `Termination`
  |
  = help: consider using `()`, or a `Result`

error[E0752]: `main` function is not allowed to be `async`
 --> src/main.rs:5:1
  |
5 | async fn main() {
  | ^^^^^^^^^^^^^^^ `main` function is not allowed to be `async`

So Alan does a lot of research online and hunts around on StackOverflow. He learns that async fn returns a value, but it's not the same as the value returned from async functions in C#. In C#, the object he gets back can only be used to query the result of an already running thread of work. The Rust one doesn't seem to do anything until you call .await on it. Alan thinks this is really nice because he now has more control over when the processing starts. You seem to get the same control as constructing a Task manually in C#, but with a lot less effort.
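The laziness Alan discovers can be observed directly with a small std-only sketch. The `ran` flag and the toy `block_on` executor below are assumptions added for the demonstration; in the story Alan uses tokio instead.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Records that its body actually executed.
async fn run_async(ran: Arc<AtomicBool>) {
    ran.store(true, Ordering::SeqCst);
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor: polls in a loop on the current thread until ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// Returns whether the body had run before and after driving the future.
fn demo() -> (bool, bool) {
    let ran = Arc::new(AtomicBool::new(false));
    // Calling the async fn only builds the future; the body has not started.
    let fut = run_async(Arc::clone(&ran));
    let before = ran.load(Ordering::SeqCst);
    // Something has to poll the future for the body to execute.
    block_on(fut);
    let after = ran.load(Ordering::SeqCst);
    (before, after)
}
```

This is the behavior behind the original warning: a future that is created and then discarded never runs at all.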

He also ends up finding out a little about executors. tokio seems to be really popular, so he incorporates that into his project:

async fn run_async() {
    println!("Hello async!");
}

#[tokio::main]
async fn main() {
    run_async().await;
}

And it works!

Hello async!

Attempting concurrency

Alan decides to try running two async functions concurrently. "This is pretty easy in C#," he thinks, "This can't be too hard in rust."

In C# Alan would usually write something like:

async Task expensive1() {
    ...
}

async Task expensive2() {
    ...
}

public static async Task Main() {
    Task task = expensive1();
    await expensive2();
    await task;
}

If the code was more dynamic, Alan could have also used the Task API to simplify the await:

public static void Main() {
    List<Task> tasks = new List<Task>();
    tasks.Add(expensive1());
    tasks.Add(expensive2());
    try {
        Task.WaitAll(tasks.ToArray());
    }
    // Ignore exceptions here.
    catch (AggregateException) {}
}

So Alan tries the first approach in rust:

use std::sync::mpsc::{self, Sender, Receiver};

async fn expensive1(tx: Sender<()>, rx: Receiver<()>) {
    println!("Doing expensive work in 1");
    tx.send(()).ok();
    let _ = rx.recv();
    println!("Got result, finishing processing in 1");
    println!("1 done");
}

async fn expensive2(tx: Sender<()>, rx: Receiver<()>) {
    println!("Doing simple setup in 2");
    let _ = rx.recv();
    println!("Got signal from 1, doing expensive processing in 2");
    tx.send(()).ok();
    println!("2 done");
}

#[tokio::main]
async fn main() {
    let (tx1, rx1) = mpsc::channel();
    let (tx2, rx2) = mpsc::channel();
    expensive1(tx1, rx2).await;
    expensive2(tx2, rx1).await;
}

But this just hangs after printing:

Doing expensive work in 1

Alan wonders if this means he can't run code concurrently... he does some research and learns about join, which doesn't seem to be part of the std. This seems like the second example in C#, but Alan is surprised it doesn't come with the standard library. He has to import futures as a dependency and tries again:

use futures::join;
...

#[tokio::main]
async fn main() {
    let (tx1, rx1) = mpsc::channel();
    let (tx2, rx2) = mpsc::channel();
    let fut1 = expensive1(tx1, rx2);
    let fut2 = expensive2(tx2, rx1);
    join!(fut1, fut2);
}

But this still hangs the same way as the first attempt. After more research, Alan learns that he can't use the standard mpsc::channel in async contexts. He needs to use the ones in the external futures crate. This requires quite a few changes since the APIs don't line up with the ones in std:

  • rx has to be mut
  • there are bounded and unbounded mpsc channels; Alan went with unbounded since the API seemed simpler for now
  • you need to import the StreamExt trait to be able to get a value out of rx; this took a lot of research to get right.

use futures::{
    join,
    channel::mpsc::{self, UnboundedSender, UnboundedReceiver},
    StreamExt,
};

async fn expensive1(tx: UnboundedSender<()>, mut rx: UnboundedReceiver<()>) {
    println!("Doing expensive work in 1");
    tx.unbounded_send(()).ok();
    let _ = rx.next().await;
    println!("Got result, finishing processing in 1");
    println!("1 done");
}

async fn expensive2(tx: UnboundedSender<()>, mut rx: UnboundedReceiver<()>) {
    println!("Doing simple setup in 2");
    let _ = rx.next().await;
    println!("Got signal from 1, doing expensive processing in 2");
    tx.unbounded_send(()).ok();
    println!("2 done");
}

#[tokio::main]
async fn main() {
    let (tx1, rx1) = mpsc::unbounded();
    let (tx2, rx2) = mpsc::unbounded();
    let fut1 = expensive1(tx1, rx2);
    let fut2 = expensive2(tx2, rx1);
    join!(fut1, fut2);
}

And now it works!

Doing expensive work in 1
Doing simple setup in 2
Got signal from 1, doing expensive processing in 2
2 done
Got result, finishing processing in 1
1 done

While this is more similar to using the Task.WaitAll from C#, there were a lot more changes needed than Alan expected.
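Why sequential awaits could never let the second function run, while join could, shows up in a std-only sketch. Everything here is written for illustration and is not the futures crate's implementation: `join2` is a hand-rolled two-future join, `YieldOnce` simulates waiting on a channel, and `block_on` is a toy executor.

```rust
use std::cell::RefCell;
use std::future::{poll_fn, Future};
use std::pin::{pin, Pin};
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

type Log = Rc<RefCell<Vec<String>>>;

/// Returns `Pending` once, then completes: a stand-in for "waiting".
struct YieldOnce(bool);
impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

async fn task(name: &'static str, log: Log) {
    log.borrow_mut().push(format!("{name} start"));
    YieldOnce(false).await;
    log.borrow_mut().push(format!("{name} done"));
}

/// Polls both futures on every wake-up, so each can progress while the
/// other is pending.
async fn join2<A: Future, B: Future>(a: A, b: B) -> (A::Output, B::Output) {
    let (mut a, mut b) = (pin!(a), pin!(b));
    let (mut out_a, mut out_b) = (None, None);
    poll_fn(|cx: &mut Context<'_>| {
        if out_a.is_none() {
            if let Poll::Ready(v) = a.as_mut().poll(cx) {
                out_a = Some(v);
            }
        }
        if out_b.is_none() {
            if let Poll::Ready(v) = b.as_mut().poll(cx) {
                out_b = Some(v);
            }
        }
        if out_a.is_some() && out_b.is_some() {
            Poll::Ready((out_a.take().unwrap(), out_b.take().unwrap()))
        } else {
            Poll::Pending
        }
    })
    .await
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor: polls in a loop on the current thread until ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// Awaiting one after the other: task 2 is not even created as a running
/// computation until task 1 has finished.
fn sequential() -> Vec<String> {
    let log = Log::default();
    block_on(async {
        task("1", log.clone()).await;
        task("2", log.clone()).await;
    });
    let events = log.borrow().clone();
    events
}

/// Joined: both tasks start before either finishes.
fn joined() -> Vec<String> {
    let log = Log::default();
    block_on(join2(task("1", log.clone()), task("2", log.clone())));
    let events = log.borrow().clone();
    events
}
```

A blocking call such as std's `recv()` defeats even the joined version, because it never returns `Pending`: it stops the thread inside `poll`, so the other future is never polled.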

Cancelling tasks

Another pattern Alan had to use frequently in C# was accounting for cancellation of tasks. Users of GUI applications might not want to wait for a long-running operation, and in a web server some remote calls might time out. C# has a really nice API surrounding CancellationTokens.

They can be used in a fashion similar to (overly simplified example):

async Task ExpensiveWork(CancellationToken token) {
    while (not_done) {
        // Do expensive operations...
        if (token.IsCancellationRequested) {
            // Cleanup...
            break;
        }
    }
}

public static async Task Main() {
    // Create the cancellation source and grab its token.
    CancellationTokenSource source = new CancellationTokenSource();
    CancellationToken token = source.Token;

    // Setup a handler so that on user input the expensive work will be canceled.
    SetupInputHandler(() => {
        // on user cancel
        source.Cancel();
    });

    // Pass the token to the functions that should be stopped when requested.
    await ExpensiveWork(token);
}

Alan does some research. He searches for "rust async cancellation" and can't find anything similar. He reads that "dropping a future is cancelling it". In his junior dev days, Alan might have run with that idea and moved on to the next task, but experienced Alan knows something is wrong here. If he drops a Future, how does he control the cleanup? Which await point is the one that will not be processed? This scares Alan, since he realizes he could get some really nasty bugs if this happens in production. In order to work around this, Alan needs to make sure every future around critical code is carefully reviewed for drops in the wrong places. Alan also decides he needs to come up with some custom code to handle cancelling.
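What "dropping a future is cancelling it" means in practice can be shown with a std-only sketch. The names are assumptions for illustration: `Cleanup` is a hypothetical drop guard, `YieldOnce` simulates an operation that is not ready yet, and the future is polled by hand instead of by a real executor.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

type Log = Rc<RefCell<Vec<&'static str>>>;

/// Returns `Pending` once, then completes.
struct YieldOnce(bool);
impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// Drop guard: the only cleanup hook a dropped future gets.
struct Cleanup(Log);
impl Drop for Cleanup {
    fn drop(&mut self) {
        self.0.borrow_mut().push("cleanup");
    }
}

async fn work(log: Log) {
    let _guard = Cleanup(log.clone());
    log.borrow_mut().push("step 1");
    YieldOnce(false).await; // the future is suspended here when dropped
    log.borrow_mut().push("step 2"); // never runs if cancelled above
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn cancel_after_first_poll() -> Vec<&'static str> {
    let log = Log::default();
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);

    let mut fut = Box::pin(work(Rc::clone(&log)));
    // The first poll runs up to the await point and suspends there.
    assert!(fut.as_mut().poll(&mut cx).is_pending());
    // Dropping the suspended future is the cancellation.
    drop(fut);

    let events = log.borrow().clone();
    events
}
```

Code after the await point is silently skipped, and only destructors of values that were live at that point run. That is the answer to Alan's two questions, and also why every await in critical code is a place where work may stop.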

Alan decides to ask around, and gets suggestions for searching with "rust cancel future" or "rust cancel async". He finds out about tokio's tokio_util::sync::CancellationToken, and also the stop-token and stopper crates. He decides to try working with the version in tokio_util since he's already using tokio. Looking at the docs for each, they all seem to behave how Alan expected, though he couldn't use stop-token since that only works with async-std. stopper also seems like a good alternative, but he decides to go with the type that is built by the tokio team.

Reading the docs it seems that the tokio CancellationToken acts more like a combination of C#'s CancellationTokenSource and CancellationToken. He needs to pass the tokens generated from a call to child_token() and keep the main token for triggering cancellation. One advantage that all of the token crates seem to have is that they can also integrate directly with streams and futures, or be polled directly (as a stream or boolean).

use tokio_util::sync::CancellationToken;
use futures::{Stream, StreamExt};
// ...

fn generate_work() -> impl Stream<Item = Work> {
    // ...
}

async fn expensive_work(token: CancellationToken) {
    // `next()` needs an `Unpin` stream, so pin the opaque stream first.
    let mut work_stream = Box::pin(generate_work());
    loop {
        if let Some(op) = work_stream.next().await {
            op.work().await;
        } else {
            break;
        }

        if token.is_cancelled() {
            break;
        }
    }
}

#[tokio::main]
async fn main() {
    let token = CancellationToken::new();
    let child_token = token.child_token();
    setup_input_handler(move || {
        token.cancel();
    });

    expensive_work(child_token).await;
}

This seems relatively straightforward!

🤔 Frequently Asked Questions

What are the morals of the story?

  • First Attempt
    • Unused implementer warnings for Futures are less clear than they are for, e.g. Result.
    • It's not as easy to jump into experimenting with async as compared to synchronous code. It requires a lot more front-end research on the user's end.
    • Developers might need to unlearn async behavior from other languages in order to understand async rust.
    • Dynamic languages with async provide async main, but rust does not. We could be more helpful by explaining this in compiler errors.
  • Attempting Concurrency
    • Trying to use items from std is the obvious thing to try, but wrong because they are blocking.
    • The corresponding async versions of the std items don't exist in std, but are in the futures crate. So it's hard to actually develop in async without the futures crate.
  • Cancelling Tasks
    • It's not obvious that futures could only run part-way.
    • Async types and crates can be bound to certain ecosystems, limiting developers' ability to reuse existing code.

What are the sources for this story?

  • The docs for oneshot::Canceled mention that dropping a Sender will cancel the future. Someone inexperienced might accidentally apply this to a broader scope of types.
  • This IRLO post has a nice discussion on cancellation, where the linked gist is a thorough overview of problems surrounding cancelation in async rust, with comparisons to other languages.

Why did you choose Alan to tell this story?

C# is a garbage collected language that has had async for a long time. Alan best fit the model for a developer coming from such a language.

How would this story have played out differently for the other characters?

  • Barbara may already be used to the idiosyncrasies of async in Rust. She may not realize how difficult it could be for someone who has a very different model of async ingrained in them.
  • Grace has likely never used async utilities similar to the ones in C# and other GC languages. C and C++ tend to use callbacks to manage async workflows. She may have been following the C++ proposals for coroutines (e.g. co_await, co_yield, co_return), but similar to Rust, the utilities are not yet thoroughly built out in those spaces. She may be familiar with cancellation in external libraries like cppcoro, or async in general with continuable.
  • Niklaus may not have had enough experience to be wary of some of the pitfalls encountered here. He might have introduced bugs around dropping futures (to cancel) without realizing it.

😱 Status quo stories: Alan needs async in traits

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect peoples' experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

Alan is working on a project with Barbara which has already gotten off to a somewhat rocky start. He is working on abstracting away the HTTP implementation the library uses so that users can provide their own. He wants the user to implement an async trait called HttpClient which has one method perform(request: Request) -> Response. Alan tries to create the async trait:


trait HttpClient {
    async fn perform(request: Request) -> Response;
}

When Alan tries to compile this, he gets an error:

 --> src/lib.rs:2:5
  |
2 |     async fn perform(request: Request) -> Response;
  |     -----^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |     |
  |     `async` because of this
  |
  = note: `async` trait functions are not currently supported
  = note: consider using the `async-trait` crate: https://crates.io/crates/async-trait

Alan, who has been using Rust for a little while now, has learned to follow compiler error messages and adds async-trait to his Cargo.toml. Alan follows the README of async-trait and comes up with the following code:


#[async_trait]
trait HttpClient {
    async fn perform(request: Request) -> Response;
}

Alan's code now compiles, but he also finds that his compile times have gone from under a second to around 6s, at least for a clean build.

After Alan finishes adding the new trait, he shows his work off to Barbara and mentions he's happy with the work but is a little sad that compile times have worsened. Barbara, an experienced Rust developer, knows that using async-trait comes with some additional issues. In this particular case she is especially worried about tying their public API to a third-party dependency. Even though it is technically possible to implement traits annotated with async_trait without using async_trait, doing so in practice is very painful. For example async_trait:

  • handles lifetimes for you if the returned future is tied to the lifetime of some inputs.
  • boxes and pins the futures for you.

which the implementer will have to manually handle if they don't use async_trait. She decides to not worry Alan with this right now. Alan and Barbara are pretty happy with the results and go on to publish their crate which gets lots of users.
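To give a feel for what an implementer faces without the macro, here is a rough std-only sketch of the boxed shape such a trait takes when written by hand. It is not the macro's actual expansion. `Request`, `Response`, `StubClient` and the `BoxFuture` alias are hypothetical, and a `&self` receiver is added (the story's method has none) so that the lifetime handling becomes visible. `block_on` is a toy executor used only to exercise the sketch.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct Request {
    path: String,
}

struct Response {
    status: u16,
}

/// A boxed, pinned future that may borrow for `'a`.
type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

trait HttpClient {
    // The lifetime ties the returned future to the borrow of `self`.
    fn perform<'a>(&'a self, request: Request) -> BoxFuture<'a, Response>;
}

struct StubClient {
    ok_status: u16,
}

impl HttpClient for StubClient {
    fn perform<'a>(&'a self, request: Request) -> BoxFuture<'a, Response> {
        // The implementer boxes and pins the future by hand.
        Box::pin(async move {
            let status = if request.path == "/" { self.ok_status } else { 404 };
            Response { status }
        })
    }
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor: polls in a loop on the current thread until ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

Every call allocates a `Box` and goes through dynamic dispatch, which is the cost Barbara has in mind.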

Later on, a potential user of the library wants to use their library in a no_std context where they will be providing a custom HTTP stack. Alan and Barbara have done a pretty good job of limiting the use of standard library features and think it might be possible to support this use case. However, they quickly run into a show stopper: async-trait boxes all of the futures returned from an async trait function. They report this to Alan through an issue.

Alan, feeling (over-) confident in his Rust skills, decides to try to see if he can implement async traits without using async-trait.


trait HttpClient {
   type Response: Future<Output = Response>;

   fn perform(request: Request) -> Self::Response;
}

Alan seems to have something working, but when he goes to update the examples of how to implement this trait in his crate's documentation, he realizes that he either needs to:

  • use trait object:

    
    struct ClientImpl;

    impl HttpClient for ClientImpl {
        type Response = Pin<Box<dyn Future<Output = Response>>>;

        fn perform(request: Request) -> Self::Response {
            Box::pin(async move {
                // Some async work here creating Response
            })
        }
    }
    

    which wouldn't work for no_std.

  • implement Future trait manually, which isn't particularly easy/straight-forward for non-trivial cases, especially if it involves making other async calls (likely).
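The second option, for the simplest possible case, might look like the std-only sketch below. All of the concrete names are assumptions: `Request`, `Response`, `StubClient` and `PerformFuture` are hypothetical, the associated type is called `ResponseFuture` to keep it apart from the `Response` struct, and the "not ready yet" state is simulated with a flag. `block_on` is a toy executor used only to exercise the sketch.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct Request {
    path: &'static str,
}

struct Response {
    status: u16,
}

trait HttpClient {
    type ResponseFuture: Future<Output = Response>;

    fn perform(request: Request) -> Self::ResponseFuture;
}

/// Hand-written state machine: no `Box`, no `async` block.
struct PerformFuture {
    request: Request,
    yielded: bool,
}

impl Future for PerformFuture {
    type Output = Response;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Response> {
        if !self.yielded {
            // Pretend the connection is not ready: ask to be polled again.
            self.yielded = true;
            cx.waker().wake_by_ref();
            return Poll::Pending;
        }
        let status = if self.request.path == "/" { 200 } else { 404 };
        Poll::Ready(Response { status })
    }
}

struct StubClient;

impl HttpClient for StubClient {
    type ResponseFuture = PerformFuture;

    fn perform(request: Request) -> PerformFuture {
        PerformFuture { request, yielded: false }
    }
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor: polls in a loop on the current thread until ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

Even this two-state future needs explicit bookkeeping. Once `perform` has to call other async functions, their futures must be stored as fields and polled by hand, which is where the approach stops being practical.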

After a lot of thinking and discussion, Alan and Barbara accept that they won't be able to support no_std users of their library and add mention of this in crate documentation.

🤔 Frequently Asked Questions

What are the morals of the story?

  • async-trait is awesome, but has some drawbacks
    • compile time increases
    • performance cost of boxing and dynamic dispatch
    • not a standard solution, so things might break when async traits come to the language
  • Trying to have a more efficient implementation than async-trait is likely not possible.

What are the sources for this story?

Why did you choose Alan to tell this story?

We could have used Barbara here but she'd probably know some of the work-arounds (likely even the details on why they're needed) and wouldn't need help so it wouldn't make for a good story. Having said that, Barbara is involved in the story still so it's not a pure Alan story.

How would this story have played out differently for the other characters?

  • Barbara: See above.
  • Grace: Probably won't know the solution to these issues much like Alan, but might have an easier time understanding the why of the whole situation.
  • Niklaus: would be lost - traits are somewhat new themselves. This is just more complexity, and Niklaus might not even know where to go for help (outside of compiler errors).

😱 Status quo stories: Alan runs into stack allocation trouble

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

The problem

One day, as Alan is working on his async Rust project, he runs his application and hits an error:

$ .\target\debug\application.exe
thread 'main' has overflowed its stack

Perplexed, Alan sees if anything with his application works by seeing if he can get output when the --help flag is passed, but he has no luck:

$ .\target\debug\application.exe --help
thread 'main' has overflowed its stack

Searching for the solution

Having really only ever seen stack overflow issues caused by recursive functions, Alan desperately tries to find the source of the bug by searching through the codebase for recursive functions, only to find none. Having learned that Rust favors stack allocation over heap allocation (a concept Alan didn't really need to worry about before), he started manually looking through his code, searching for structs that looked "too large"; he wasn't able to find any candidates.

Confused, Alan reached out to Grace for her advice. She suggested making the stack size larger. Although she wasn't a Windows expert, she remembered hearing that stack sizes on Windows might be smaller than on Linux. After much searching, Alan discovers an option to do just that: RUSTFLAGS = "-C link-args=-Wl,-zstack-size=<size in bytes>".

While eventually Alan gets the program to run, the stack size must be set to 4GB before it does! This seems untenable, and Alan goes back to the drawing board.

Alan reaches out to Barbara for her expertise in Rust to see if she has something to suggest. Barbara recommends using RUSTFLAGS = "-Zprint-type-sizes" to print some type sizes and see if anything jumps out. Barbara noted that if Alan does find a type that stands out, it's usually as easy as putting some boxes in that type to provide some indirection and not have everything be stack allocated. Alan never needs the nightly toolchain, but this option requires it, so he installs it using rustup. After searching through types, one did stand out as being quite large. Ultimately, this was a red herring, and putting parts of it in Boxes did not help.

Finding the solution

After getting nowhere, Alan went home for the weekend defeated. On Monday, he decided to take another look. One piece of code stuck out to him: the use of the select! macro from the futures crate. This macro allowed multiple futures to race against each other, returning the value of the first one to finish. This macro required the futures to be pinned, which the docs had shown could be done by using pin_mut!. Alan didn't fully grasp what pin_mut! was actually doing when he wrote that code. The compiler had complained to him that the futures he was passing to select! needed to be pinned, and pin_mut! was what he found to make the compiler happy.

Looking back at the documents made it clear to Alan that this could potentially be the issue: pin_mut! pins futures to the stack. It was relatively clear that a possible solution would be to pin to the heap instead of the stack. Some more digging in the docs led Alan to Box::pin, which did just that. An extra heap allocation was of no consequence to him, so he gave it a try. Lo and behold, this fixed the issue!
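The difference between the two kinds of pinning can be made visible with `std::mem::size_of_val`, which works on stable Rust. The sketch below is illustrative only: `big_future` is a hypothetical future, and its 16 KiB buffer is an arbitrary size chosen for the example, not a figure from Alan's application.

```rust
use std::mem::size_of_val;

const BUF_LEN: usize = 16 * 1024;

/// The buffer is used after the await, so it has to be stored inside
/// the future itself.
async fn big_future(seed: u8) -> u8 {
    let buf = [seed; BUF_LEN];
    std::future::ready(()).await;
    buf[BUF_LEN - 1]
}

/// Returns (size of the future held inline, size of the boxed handle).
fn sizes() -> (usize, usize) {
    let fut = big_future(7);
    // Held as a local (or pinned in place), the whole state machine
    // occupies the caller's stack frame.
    let inline_size = size_of_val(&fut);
    // `Box::pin` moves the state machine to the heap; only a pointer
    // stays on the stack.
    let boxed = Box::pin(fut);
    let handle_size = size_of_val(&boxed);
    (inline_size, handle_size)
}
```

Nesting makes this worse: a future that awaits other large futures contains them, so sizes add up quickly in a deep call chain.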

While Alan knew enough about pinning to know how to satisfy the compiler, he didn't originally take the time to fully understand what the consequences were of using pin_mut! to pin his futures. Now he knows!

🤔 Frequently Asked Questions

What are the morals of the story?

  • When coming from a background of GCed languages, taking the time to understand the allocation profile of a particular piece of code is not something Alan was used to doing.
  • It was hard to tell where in his code the stack was being exhausted. Alan had to rely on manually combing his code to find the culprit.
  • Pinning is relatively confusing, and although the code compiled, Alan didn't fully understand what he wrote and what consequences his decision to use pin_mut! would have.

What are the sources for this story?

This story is adapted from the experiences of the team working on the Krustlet project. You can read about this story in their own words here.

Why did you choose Alan to tell this story?

  • The programmers this story was based on have experience mostly in Go, a GCed language.
  • The story is rooted in the explicit choice of using stack vs heap allocation, a choice that in GCed languages is not in the hands of the programmer.

How would this story have played out differently for the other characters?

  • Grace would have likely had a similar hard time with this bug. While she's used to the tradeoffs of stack vs heap allocations, the analogy to the Pin API is not present in languages she's used to.
  • Barbara, as an expert in Rust, may have had the tools to understand that pin_mut is used for pinning to the stack while Box::pin is for pinning heap allocations.
  • This problem is somewhat subtle, so someone like Niklaus would probably have had a much harder time figuring this out (or even getting the code to compile in the first place).

Could Alan have used another API to achieve the same objectives?

Perhaps! Tokio's select! macro doesn't require explicit pinning of the futures it's provided, but it's unclear to this author whether it would have been smart enough to avoid pinning large futures to the stack. However, pinning is a part of the way one uses futures in Rust, so it's possible that such an issue would have arisen elsewhere.

😱 Status quo stories: Alan started trusting the Rust compiler, but then... async

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

The story

Trust the compiler

Alan has a lot of experience in C#, but in the meantime has created some successful projects in Rust. He has dealt with his fair share of race conditions/thread safety issues during runtime in C#, but is now starting to trust that if his Rust code compiles, he won't have those annoying runtime problems to deal with.

This allows him to try to squeeze his programs for as much performance as he wants, because the compiler will stop him when he tries things that could result in runtime problems. After seeing the performance and the lack of runtime problems, he starts to trust the compiler more and more with each project finished.

He knows what he can do with external libraries: he does not need to fear concurrency issues if a library cannot be used from multiple threads, because the compiler would tell him.

His trust in the compiler solidifies further the more he codes in Rust.

The first async project

Alan now starts with his first async project. He sees that there is no async in the standard library, but after googling for "rust async file open", he finds 'async_std', a crate that provides some async versions of the standard library functions. He has some code written that asynchronously interacts with some files:

use async_std::fs::File;
use async_std::prelude::*;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut file = File::create("a.txt").await?;
    file.write_all(b"Hello, world!").await?;
    Ok(())
}

But now the compiler complains that await is only allowed in async functions. He now notices that all the examples use #[async_std::main] as an attribute on the main function in order to be able to turn it into an async main, so he does the same to get his code compiling:

use async_std::fs::File;
use async_std::prelude::*;

#[async_std::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut file = File::create("a.txt").await?;
    file.write_all(b"Hello, world!").await?;

    Ok(())
}

This aligns with what he knows from C#, where you also change the entry point of the program to be async, in order to use await. Everything is great now, the compiler is happy, so no runtime problems, so Alan is happy.
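As a rough mental model (not async-std's actual macro expansion), an async main attribute arranges for an ordinary synchronous entry point to hand the async body to an executor and block until it finishes. The sketch below shows that idea with a deliberately tiny, standard-library-only block_on; ThreadWaker, block_on, async_main, and run are illustrative names.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

// Waker that unparks the thread which is blocked on the future.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Toy executor: polls one future to completion on the current thread.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            // Sleep until the waker signals that progress is possible.
            Poll::Pending => thread::park(),
        }
    }
}

// The body the programmer writes as `async fn main`.
async fn async_main() -> Result<&'static str, Box<dyn std::error::Error>> {
    let greeting = async { "Hello, world!" }.await;
    Ok(greeting)
}

/// What a synchronous entry point would do with that body.
pub fn run() -> Result<&'static str, Box<dyn std::error::Error>> {
    block_on(async_main())
}
```

A real runtime also sets up I/O and timer drivers before polling, which is exactly the part that differs from one runtime to another.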

The project is working like a charm.

Fractured futures, fractured trust

The project Alan is building is starting to grow, and he decides to add a new feature that needs to make some API calls. He starts using reqwest in order to help him achieve this task. After a lot of refactoring to make the compiler accept the program again, Alan is satisfied that his refactoring is done. His program now boils down to:

use async_std::fs::File;
use async_std::prelude::*;

#[async_std::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut file = File::create("a.txt").await?;
    file.write_all(b"Hello, world!").await?;

    let body = reqwest::get("https://www.rust-lang.org")
        .await?
        .text()
        .await?;
    println!("{}", body);

    Ok(())
}

He runs his project but is suddenly greeted with a runtime error. He is quite surprised. "How is this even possible?", he thinks. "I don't have any out-of-bounds accesses, and I never use .unwrap or .expect." At the top of the error message he sees: thread 'main' panicked at 'there is no reactor running, must be called from the context of a Tokio 1.x runtime'

He searches for what "Tokio" is in Rust, and he finds that it also provides an attribute to put on main, namely #[tokio::main], but how does it differ from #[async_std::main]? His curiosity leads him to watch videos, read blogs, and scour Reddit to find out why there are multiple runtimes in Rust. This leads him into a rabbit hole, and now he learns about Executors, Wakers, Pin, and more. He has a basic grasp of what they are, but does not have a good understanding of them or how they all fit together exactly. These are all things he had not needed to know nor heed in C#. (Note: there is another story about troubles/confusion that might arise when learning all these things about async: Alan hates writing a Stream)

He does understand the current problems and why there is no one-size-fits-all executor (yet). In trying to get his async Rust code to work, he has broadened his knowledge of what async code actually is, and he has gained another way to reason about asynchronous code, not only in Rust but also more generally.
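One way to picture why this failure only shows up at runtime: an I/O library can locate its runtime through ambient, per-thread state rather than through anything visible in its function signatures, so the compiler has nothing to check. The following is a toy model of that idea using only the standard library. REACTOR_RUNNING, with_toy_runtime, and toy_io_call are made-up names, not Tokio or reqwest APIs.

```rust
use std::cell::Cell;

thread_local! {
    // Ambient flag standing in for "a runtime's reactor is active on this thread".
    static REACTOR_RUNNING: Cell<bool> = Cell::new(false);
}

/// Toy runtime entry point: marks the reactor as running while `f` executes.
pub fn with_toy_runtime<T>(f: impl FnOnce() -> T) -> T {
    REACTOR_RUNNING.with(|r| r.set(true));
    let out = f();
    REACTOR_RUNNING.with(|r| r.set(false));
    out
}

/// Toy I/O call: nothing in its signature says it needs the toy runtime,
/// so calling it elsewhere compiles fine and panics when it runs.
pub fn toy_io_call() -> &'static str {
    if !REACTOR_RUNNING.with(|r| r.get()) {
        panic!("there is no reactor running");
    }
    "response body"
}
```

Calling with_toy_runtime(toy_io_call) succeeds, while calling toy_io_call() on its own panics, even though both compile. That is the shape of the surprise Alan ran into when one runtime's entry point drove another runtime's I/O types.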

But now he realizes that there is a whole new area of runtime problems that he did not have to deal with in C#, but he does in Rust. Can he even trust the Rust compiler anymore? What other kinds of runtime problems can occur in Rust that can't in C#? If his projects keep increasing in complexity, will other new kinds of runtime problems keep popping up? Maybe it's better to stick with C#, since Alan already knows all the runtime problems you can have over there.

The Spider-Man effect

Do you recall in Spider-Man, that after getting bitten by the radioactive spider, Peter first gets ill before he gains his powers? Well, imagine instead of being bitten by a radioactive spider, he was bitten by an async-rust spider...

At work, Alan sees an async call to a C# wrapper around SQLite, and his equivalent of a spider-sense (async-sense?) starts tingling. Knowing from Rust the complexities that arise when trying to create asynchronicity, he wonders what kind of complex mechanisms are at play here to enable these async calls from C# that end up in the C/C++ of SQLite.

He quickly discovers that there are no complex mechanisms at all! It's actually just a synchronous call all the way down, with just some extra overhead from wrapping it into an asynchronous function. There are no points where the async function will yield. He transforms all these asynchronous calls to their synchronous counterparts, and sees a slight improvement in performance. Alan is happy, product management is happy, customers are happy!
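The same situation can be sketched in Rust (as an analogue only; the story's code was C#). An async function that merely wraps a synchronous call has no yield points, so it finishes the first time it is polled. Here sync_query is a hypothetical stand-in for the blocking SQLite call.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Waker that does nothing; enough to poll a future by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Hypothetical stand-in for a synchronous database call.
fn sync_query(id: u32) -> u32 {
    id * 2
}

// "Async" wrapper with no await inside: it never yields to the executor.
async fn query_async(id: u32) -> u32 {
    sync_query(id)
}

/// Polls the wrapper exactly once and reports whether it already finished.
pub fn completes_on_first_poll() -> Option<u32> {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(query_async(21));
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(value) => Some(value),
        Poll::Pending => None,
    }
}
```

Because the future is ready on its first poll, the caller gets no concurrency out of the wrapper, only the extra cost of going through it.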

Over the next few months, he often takes a few seconds to reflect about why certain parts of the code are async, if they should be, or how other parts of the code might benefit from being async and if it's possible to make them async. He also uses what he learned from async Rust in his C# code reviews to find similar problems or general issues (With great power...). He even spots some lifetime bugs w.r.t. asynchronous code in C#, imagine that.

His team recognizes that Alan has a pretty good grasp about what async is really about, and he is unofficially crowned the "async guru" of the team.

Even though this spider-man might have gotten "ill" (his negative experience with async Rust), he has now become the superhero he was meant to be!

🤔 Frequently Asked Questions

What are the morals of the story?

  • Async I/O includes a new set of runtime errors and misbehaviors that the compiler can't help you find. These include cases like executing blocking operations in an async context but also mixing runtime libraries (something users may not even realize is a factor).
  • Rust users get used to the compiler not only flagging potential runtime problems as errors but also helping to fix them. Pushing error messages to runtime feels surprising and erodes some of their confidence in Rust.
  • The "cliff" in learning about async is very steep -- at first everything seems simple and similar to other languages, then suddenly you are thrown into a lot of information. It's hard to know what's important and what is not. But, at the same time, dipping your toes into async Rust can broaden the understanding a programmer has of asynchronous coding, which can help them even in other languages than Rust.

What are the sources for this story?

Personal experience of the author.

Why did you choose Alan to tell this story?

With his experience in C#, Alan probably has experience with async code. Even though C# protects him from certain classes of errors, he can still encounter other classes of errors, which the Rust compiler prevents.

How would this story have played out differently for the other characters?

For everyone except Barbara, I think these would play out pretty similarly, as this is a kind of problem unique to Rust. Since Barbara has a lot of Rust experience, she would probably already be familiar with this aspect.

How would this story have played out differently if Alan came from another GC'd language?

It would be very close, since all other languages (that I know of) provide async runtimes out of the box and it's not something the programmer needs to concern themselves with.

😱 Status quo stories: Alan thinks he needs async locks

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect peoples' experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

One of Alan's first Rust related tasks in his job at YouBuy is writing an HTTP based service. This service is a simple internal proxy router that inspects an incoming HTTP request and picks the downstream service to call based on certain aspects of the HTTP request.

Alan decides that he'll simply use some shared state that request handlers can read from in order to decide how to proxy the request.

Alan, having read the Rust book and successfully completed the challenge in the last chapters, knows that shared state can be achieved in Rust with reference counting (using std::sync::Arc) and locks (using std::sync::Mutex). Alan starts by throwing his shared state (a std::collections::HashMap<String, url::Url>) into an Arc<Mutex<T>>.

Alan, smitten with how quickly he can write Rust code, ends up with some code that compiles and looks roughly like this:


use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use url::Url;

// `Client`, `Request`, and `Response` stand for the service's own HTTP types.
#[derive(Clone)]
struct Proxy {
    // Shared routing table: route key -> downstream URL.
    state: Arc<Mutex<HashMap<String, Url>>>,
    client: Client,
}

impl Proxy {
    async fn handle(&self, key: String, request: Request) -> crate::Result<Response> {
        let routes = self.state.lock().unwrap();
        let route = routes.get(&key).ok_or(crate::error::MissingRoute)?;
        Ok(self.client.perform_request(route, request).await?)
    }
}

Alan is happy that his code seems to be compiling! The short but hard learning curve has been worth it. He's having fun now!

Unfortunately, Alan's happiness soon comes to an end as he starts integrating his request handler into calls to tokio::spawn, which he knows will allow him to manage multiple requests at a time. The error message is somewhat cryptic, but Alan is confident he'll be able to figure it out:

189 |     tokio::spawn(async {
    |     ^^^^^^^^^^^^ future created by async block is not `Send`
::: /home/alan/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.5.0/src/task/spawn.rs:129:21
    |
129 |         T: Future + Send + 'static,
    |                     ---- required by this bound in `tokio::spawn`

note: future is not `Send` as this value is used across an await
   --> src/handler.rs:787:9
      |
786   |         let routes = self.state.lock().unwrap();
      |             - has type `std::sync::MutexGuard<'_, HashMap<String, Url>>` which is not `Send`
787   |         Ok(self.client.perform_request(route, request).await?)
      |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ await occurs here, with `routes` maybe used later
788   |     })
      |     - `routes` is later dropped here

Alan stops and takes a deep breath. He tries his best to make sense of the error message. He sort of understands the issue the compiler is telling him about. Apparently routes is not Send, and because it is still alive across a call to await, it makes the future his handler returns not Send either. And tokio's spawn function seems to require that the future it receives be Send.

Alan reaches the boundaries of his knowledge of Rust, so he reaches out over chat to ask his co-worker Barbara for help. Not wanting to bother her, Alan provides the context he's already figured out for himself.

Barbara knows that mutex guards are not Send because sending mutex guards to different threads is not a good idea. She suggests looking into async locks which can be held across await points because they are Send. Alan looks into the tokio documentation for more info and is easily able to move the use of the standard library's mutex to tokio's mutex. It compiles!

Alan ships his code and it gets a lot of usage. After a while, Alan notices some potential performance issues. It seems his proxy handler does not have the throughput he would expect. Barbara, having newly joined his team, sits down with him to take a look at potential issues. Barbara is immediately worried by the fact that the lock is being held much longer than it needs to be. The lock only needs to be held while accessing the route and not during the entire duration of the downstream request.

She suggests that Alan stop holding the lock across the I/O operation. Alan first tries to do this by explicitly cloning the url and dropping the lock before the proxy request is made:


impl Proxy {
    async fn handle(&self, key: String, request: Request) -> crate::Result<Response> {
        let routes = self.state.lock().unwrap();
        // Clone the URL so that nothing borrows from the guard any more.
        let route = routes.get(&key).ok_or(crate::error::MissingRoute)?.clone();
        // Release the lock before the downstream request starts.
        drop(routes);
        Ok(self.client.perform_request(route, request).await?)
    }
}

This compiles fine and works in testing! After shipping to production, they notice a large increase in throughput. It seems their change made a big difference. Alan is really excited about Rust, and wants to write more!
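The effect of releasing the lock early can be sketched without any async machinery. In the minimal, synchronous example below, perform_request is a hypothetical stand-in for the downstream call; instead of doing I/O, it reports whether another handler could have taken the lock while the "request" was in flight. Routes, demo_state, and both handler functions are illustrative names.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

pub type Routes = Mutex<HashMap<String, String>>;

/// Small routing table for trying the sketch out.
pub fn demo_state() -> Routes {
    Mutex::new(HashMap::from([(
        "cart".to_string(),
        "http://cart.internal".to_string(),
    )]))
}

// Hypothetical downstream request: returns true if the lock is free
// while the request is being performed.
fn perform_request(state: &Routes) -> bool {
    state.try_lock().is_ok()
}

/// Guard stays alive until the function returns, so the lock is held
/// for the whole "request".
pub fn handle_holding_lock(state: &Routes, key: &str) -> Option<bool> {
    let routes = state.lock().unwrap();
    let _route = routes.get(key)?;
    Some(perform_request(state))
}

/// Guard is dropped at the end of the inner block, before the "request".
pub fn handle_releasing_lock(state: &Routes, key: &str) -> Option<bool> {
    let _route = {
        let routes = state.lock().unwrap();
        routes.get(key)?.clone()
    };
    Some(perform_request(state))
}
```

With demo_state(), handle_holding_lock reports that the lock was busy during the request, and handle_releasing_lock reports that it was free. Under load, that is the difference between handlers queueing behind each other and handlers proceeding in parallel.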

Alan continues his journey of learning even more about async Rust. After some enlightening talks at the latest RustConf, he decides to revisit the code that he and Barbara wrote together. He asks himself, is using an async lock the right thing to do? This lock should only be held for a very short amount of time. Yielding to the runtime is likely more expensive than just synchronously locking. But he remembers vaguely hearing that you should never use blocking code in async code as this will block the entire async executor from being able to make progress, so he doubts his intuition.

After chatting with Barbara, who encourages him to benchmark and measure, he decides to switch back to synchronous locks.

Unfortunately, switching back to synchronous locks brings back the old compiler error message about his future not being Send. Alan is confused as he's dropping the mutex guard before it ever crosses an await point.

Confused, Alan goes to Barbara for advice. She is also confused, and it takes several minutes of exploration before she comes to a solution that works: wrapping the mutex access in a block and implicitly dropping the mutex guard.


impl Proxy {
    async fn handle(&self, key: String, request: Request) -> crate::Result<Response> {
        let route = {
            let routes = self.state.lock().unwrap();
            routes.get(&key).ok_or(crate::error::MissingRoute)?.clone()
        }; // The guard goes out of scope here, before the await below.
        Ok(self.client.perform_request(route, request).await?)
    }
}

Barbara mentions she's unsure why explicitly dropping the mutex guard did not work, but they're both happy that the code compiles. In fact, it seems to have improved the performance of the service when it's under extreme load. Alan's intuition was right!
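A self-contained sketch of the block-scoped pattern follows, using only the standard library. require_send is a compile-time stand-in for the Send bound that tokio::spawn imposes, perform_request is a hypothetical downstream call that never actually waits, and the future is polled by hand instead of being spawned on a real runtime.

```rust
use std::collections::HashMap;
use std::future::Future;
use std::pin::pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

type Routes = Arc<Mutex<HashMap<String, String>>>;

// Compiles only if the future is `Send`, like the bound on `tokio::spawn`.
fn require_send<F: Future + Send>(fut: F) -> F {
    fut
}

// Hypothetical downstream request; it completes immediately.
async fn perform_request(route: String) -> String {
    format!("proxied to {route}")
}

async fn handle(state: Routes, key: String) -> Option<String> {
    let route = {
        let routes = state.lock().unwrap();
        routes.get(&key)?.clone()
    }; // The std MutexGuard is gone before the await below.
    Some(perform_request(route).await)
}

// Waker that does nothing; enough to poll the future by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

pub fn run_handler(key: &str) -> Option<String> {
    let state: Routes = Arc::new(Mutex::new(HashMap::from([(
        "cart".to_string(),
        "http://cart.internal".to_string(),
    )])));
    let mut fut = pin!(require_send(handle(state, key.to_string())));
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(out) => out,
        Poll::Pending => unreachable!("nothing in this sketch ever waits"),
    }
}
```

Because the guard's scope ends before the only await, the handler's future contains no MutexGuard and passes the Send check, while the lock itself remains an ordinary synchronous std::sync::Mutex.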

In the end, Barbara decides to write a blog post about how blocking in async code isn't always such a bad idea.

🤔 Frequently Asked Questions

What are the morals of the story?

  • Locks can be quite common in async code as many tasks might need to mutate some shared state.
  • Error messages can be fairly good, but they still require a decent understanding of Rust (e.g., Send, MutexGuard, drop semantics) to fully understand what's going on.
  • This can lead to needing to use certain patterns (like dropping mutex guards early) in order to get code working.
  • The advice to never block in async code is not always true: if blocking is short enough, is it even blocking at all?

What are the sources for this story?

  • Chats with Alice and Lucio.
  • Alice's blog post on the subject has some good insights.
  • The issue of conservative analysis of whether values are used across await points causing futures to be !Send is known, but it takes some digging to find out about this issue. A tracking issue for this can be found here.

Why did you choose Alan to tell this story?

  • While Barbara might be tripped up on some of the subtleties, an experienced Rust developer can usually tell how to avoid some of the issues of using locks in async code. Alan, on the other hand, might be surprised when his code does not compile, as the issue the Send error is protecting against (i.e., a mutex guard being moved to another thread) is not protected against in other languages.

How would this story have played out differently for the other characters?

  • Grace would have likely had a similar time to Alan. These problems are not necessarily issues you would run into in other languages in the same way.
  • Niklaus may have been completely lost. This stuff requires a decent understanding of Rust and of async computational systems.

😱 Status quo stories: Alan tries using a socket Sink

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect peoples' experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

Alan is working on a project that uses async-std. He has worked a bit with tokio in the past and is more familiar with that, but he is interested in learning how things work in async-std.

One of the goals is to switch from a WebSocket implementation using raw TCP sockets to one managed behind an HTTP server library, so both HTTP and WebSocket RPC calls can be forwarded to a transport-agnostic RPC server.

In this server implementation:

  • RPC call strings can be received over a WebSocket
  • The strings are decoded and sent to an RPC router that calls the methods specified in the RPC call
  • Some of the methods that are called can take some time to return a result, so they are spawned separately
    • RPC has built-in properties to organize call IDs and methods, so results can be sent in any order
  • Since WebSockets are bidirectional streams (duplex sockets), the response is sent back through the same client socket

He finds the HTTP server tide and it seems fairly similar to warp, which he was using with tokio. He also finds the WebSocket middleware library tide-websockets that goes with it.

However, as he's working, Alan encounters a situation where the socket needs to be written to from within a spawned async task, and the traits just aren't working. He wants to split the stream into a sender and receiver:


use futures::{SinkExt, StreamExt};
use async_std::sync::{Arc, Mutex};
use log::{debug, info, warn};

async fn rpc_ws_handler(ws_stream: WebSocketConnection) {
    let (ws_sender, mut ws_receiver) = ws_stream.split();
    let ws_sender = Arc::new(Mutex::new(ws_sender));

    while let Some(msg) = ws_receiver.next().await {
        debug!("Received new WS RPC message: {:?}", msg);

        let ws_sender = ws_sender.clone();

        async_std::task::spawn(async move {
            let res = call_rpc(msg).await?;

            match ws_sender.lock().await.send_string(res).await {
                Ok(_) => info!("New WS data sent."),
                Err(_) => warn!("WS connection closed."),
            };
        });
    }
}

The split method splits the ws_stream into two separate halves:

  • a producer (ws_receiver) that implements a Stream with the messages arriving on the websocket;
  • a consumer (ws_sender) that implements Sink, which can be used to send responses.

This way, one task can pull items from the ws_receiver and spawn out subtasks. Those subtasks share access to the ws_sender and send messages there when they're done. Unfortunately, Alan finds that he can't use this pattern here, as the Sink trait isn't implemented in the WebSockets middleware library he's using.
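The shape of this pattern can be sketched with the standard library alone, using OS threads in place of async tasks and a locked Vec in place of the socket's sending half. This is an analogue under those assumptions, not tide-websockets or futures code; call_rpc and handle_all are illustrative names.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical RPC router: turns a call string into a result string.
fn call_rpc(msg: &str) -> String {
    format!("result:{msg}")
}

/// One loop pulls incoming messages and spawns a worker per message;
/// every worker shares the same locked "sender".
pub fn handle_all(incoming: Vec<String>) -> Vec<String> {
    // Stand-in for the sending half of the socket.
    let sender = Arc::new(Mutex::new(Vec::<String>::new()));
    let mut workers = Vec::new();

    for msg in incoming {
        let sender = Arc::clone(&sender);
        workers.push(thread::spawn(move || {
            let res = call_rpc(&msg);
            // Lock only for the duration of the "send".
            sender.lock().unwrap().push(res);
        }));
    }
    for worker in workers {
        worker.join().unwrap();
    }

    // Results may arrive in any order, so sort them for a stable view.
    let mut sent = sender.lock().unwrap().clone();
    sent.sort();
    sent
}
```

The important property is that the receiving loop never needs the lock; only the workers contend for the sending half, and only while they write.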

Alan also tries creating a sort of poller worker thread using an intermediary messaging channel, but he has trouble reasoning about the code and wasn't able to get it to compile:


use async_std::channel;
use async_std::sync::{Arc, Mutex};
use log::{debug, info, warn};

async fn rpc_ws_handler(ws_stream: WebSocketConnection) {
    let (ws_sender, mut ws_receiver) = channel::unbounded::<String>();
    let ws_receiver = Arc::new(ws_receiver);

    let ws_stream = Arc::new(Mutex::new(ws_stream));
    let poller_ws_stream = ws_stream.clone();

    async_std::task::spawn(async move {
        while let Some(msg) = ws_receiver.next().await {
            match poller_ws_stream.lock().await.send_string(msg).await {
                Ok(msg) => info!("New WS data sent. {:?}", msg),
                Err(msg) => warn!("WS connection closed. {:?}", msg),
            };
        }
    });

    while let Some(msg) = ws_stream.lock().await.next().await {
        async_std::task::spawn(async move {
            let res = call_rpc(msg).await?;
            ws_sender.send(res);
        });
    }
}
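For comparison, the general idea behind an intermediary channel can be sketched with threads and std::sync::mpsc standing in for async tasks and an async channel. This is only an analogue of the approach and says nothing about how it would look with tide-websockets; handle_with_writer and the result format are made up for illustration.

```rust
use std::sync::mpsc;
use std::thread;

/// Workers send results into a channel; a single writer owns the "socket"
/// and is the only one that ever writes to it.
pub fn handle_with_writer(incoming: Vec<String>) -> Vec<String> {
    let (tx, rx) = mpsc::channel::<String>();

    // Writer: drains the channel until every sender has been dropped.
    let writer = thread::spawn(move || {
        let mut written = Vec::new();
        for res in rx {
            written.push(res); // Stand-in for writing to the socket.
        }
        written
    });

    // One worker per incoming message, each with its own sender handle.
    let workers: Vec<_> = incoming
        .into_iter()
        .map(|msg| {
            let tx = tx.clone();
            thread::spawn(move || {
                tx.send(format!("result:{msg}")).unwrap();
            })
        })
        .collect();

    // Drop the original sender so the writer can finish once workers are done.
    drop(tx);
    for worker in workers {
        worker.join().unwrap();
    }

    let mut written = writer.join().unwrap();
    written.sort();
    written
}
```

In this shape no lock is needed around the socket at all, at the cost of an extra task and a channel whose shutdown has to be managed explicitly.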

Alan wonders if he's thinking about it wrong, but the solution isn't as obvious as his earlier Sink approach. Looking around, he realizes that others have been in his shoes before and that a solution to his problem already exists in two nearly identical pull requests, both of which were closed by the project maintainers. He tries opening a third one with the same code, pointing to an example where it was actually found to be useful. To his joy, with the code from the closed pull requests applied to his local copy, his original approach works! Alan's branch is able to compile for the first time.

However, almost immediately, his request is closed with a comment suggesting that he try to create an intermediate polling task instead, much as he was trying before. Alan is feeling frustrated. "I already tried that approach," he thinks, "and it doesn't work!"

As a result of his frustration, Alan calls out one developer of the project on social media. He knows this developer is opposed to the Sink traits. Alan's message is not well-received: the maintainer sends a short response and Alan feels dismissed. Alan later finds out he was blocked. A co-maintainer responds to the thread, defending and supporting the other maintainer's actions, and suggests that Alan "get over it". Alan is given a link to a blog post. The post provides a number of criticisms of Sink but, after reading it, Alan isn't sure what he should do instead.

Because of this heated exchange, Alan grows concerned for his own career and about what these well-known community members might think or say about him to others. His confidence in the community surrounding this language that he really enjoys using is somewhat shaken.

Despite this, Alan takes a walk, gathers his determination, and commits to maintaining his fork with the changes from the other pull requests that were shut down. He publishes his version to crates.io, vowing to be more welcoming to "misfit" pull requests like the one he needed.

A few weeks later, Alan's work on his project at his job is merged, using his new forked crate. It's a big deal: his first professional open source contribution to a Rust project! Still, he doesn't feel like he has a sense of closure with the community. Meanwhile, his friends say they want to try Rust, but they're worried about its async execution issues, and he doesn't know what else to say, other than to offer a sense of understanding. Maybe the situation will get better someday, he hopes.

🤔 Frequently Asked Questions

What are the morals of the story?

  • There are often many sources of opinion in the community regarding futures and async, but these opinions aren't always backed up with examples of how it should be better accomplished. Sometimes we just find a thing that works and would prefer to stick with it, but others argue that some traits make implementations unnecessarily complex, and choose to leave it out. Disagreements like these in the ecosystem can be harmful to the reputation of the project and the participants.
  • If there's a source of substantial disagreement, the community becomes even further fragmented, and this may cause additional confusion in newcomers.
  • Alan is used to fragmentation from the communities he comes from, so this isn't too discouraging, but what's difficult is that there's enough functionality overlap in async libraries that it's tempting to get them to interop with each other as-needed, and this can lead to architectural challenges resulting from a difference in design philosophies.
  • It's also unclear if Futures are core to the Rust asynchronous experience, much as Promises are in JavaScript, or if the situation is actually more complex.
  • The Sink trait is complex but it solves a real problem, and the workarounds required to solve problems without it can be unsatisfactory.
  • Disagreement about core abstractions like Sink can make interoperability between runtimes more difficult; it also makes it harder for people to reproduce patterns they are used to from one runtime to another.
  • It is all too easy for technical discussions like this to become heated; it's important for all participants to try and provide each other with the "benefit of the doubt".

What are the sources for this story?

Why did you choose Alan to tell this story?

  • Alan is more representative of the original author's background in JS, TypeScript, and NodeJS.

How would this story have played out differently for the other characters?

  • (I'm not sure.)

😱 Status quo stories: Alan tries to debug a hang

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect peoples' experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

Alan's startup has officially launched and YouBuy is live for the world to use. The whole team is very excited, especially as this will be their first use of Rust in production! Normally, as a .NET shop, they would have written the entire application in C#, but because of the scalability and latency requirements on their inventory service, they decided to write a microservice in Rust utilizing the async features they've heard so much about.

The day's excitement soon turns into concern as reports begin coming in to support about customers who can't check out. After a few cases, a pattern begins to emerge: when a customer tries to buy the last available item, the checkout process hangs forever.

Alan suspects there is an issue with the lock used in the inventory service to prevent multiple people from buying the last available item at the same time. With this hunch, he builds the latest code and opens his local dev environment to conduct some tests. Soon enough, Alan has a repro of the bug.

With the broken environment still running, he decides to use a debugger to see if he can confirm his theory. In the past, Alan has used Visual Studio's debugger to diagnose a very similar issue in a C# application he wrote. The debugger was able to show him all the async Tasks currently waiting, their call stacks and what resource they were waiting on.

Alan hasn't used a debugger with Rust before; usually, a combination of the strict compiler and a bit of manual testing has been enough to fix all the bugs he's previously encountered. He does a quick Google search to see what debugger he should use and decides to go with gdb because it is already installed on his system and sounds like it should work. Alan also pulls up a blog post that has a helpful cheatsheet of gdb commands since he's not familiar with the debugger.

Alan restarts the inventory service under gdb and gets to work reproducing the issue. He reproduces the issue a few times in the hope of making it easier to identify the cause of the problem. Ready to pinpoint the issue, Alan presses Ctrl+C and then types bt to get a backtrace:

(gdb) bt
#0  0x00007ffff7d5e58a in epoll_wait (epfd=3, events=0x555555711340, maxevents=1024, timeout=49152)
    at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
#1  0x000055555564cf7d in mio::sys::unix::selector::epoll::Selector::select (self=0x7fffffffd008, events=0x7fffffffba40, 
    timeout=...) at /home/alan/.cargo/registry/src/github.com-1ecc6299db9ec823/mio-0.7.11/src/sys/unix/selector/epoll.rs:68
#2  0x000055555564a82f in mio::poll::Poll::poll (self=0x7fffffffd008, events=0x7fffffffba40, timeout=...)
    at /home/alan/.cargo/registry/src/github.com-1ecc6299db9ec823/mio-0.7.11/src/poll.rs:314
#3  0x000055555559ad96 in tokio::io::driver::Driver::turn (self=0x7fffffffce28, max_wait=...)
    at /home/alan/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.4.0/src/io/driver/mod.rs:162
#4  0x000055555559b8da in <tokio::io::driver::Driver as tokio::park::Park>::park_timeout (self=0x7fffffffce28, duration=...)
    at /home/alan/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.4.0/src/io/driver/mod.rs:238
#5  0x00005555555e9909 in <tokio::signal::unix::driver::Driver as tokio::park::Park>::park_timeout (self=0x7fffffffce28, 
    duration=...) at /home/alan/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.4.0/src/signal/unix/driver.rs:156
#6  0x00005555555a9229 in <tokio::process::imp::driver::Driver as tokio::park::Park>::park_timeout (self=0x7fffffffce28, 
    duration=...) at /home/alan/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.4.0/src/process/unix/driver.rs:84
#7  0x00005555555a898d in <tokio::park::either::Either<A,B> as tokio::park::Park>::park_timeout (self=0x7fffffffce20, 
    duration=...) at /home/alan/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.4.0/src/park/either.rs:37
#8  0x00005555555ce0b8 in tokio::time::driver::Driver<P>::park_internal (self=0x7fffffffcdf8, limit=...)
    at /home/alan/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.4.0/src/time/driver/mod.rs:226
#9  0x00005555555cee60 in <tokio::time::driver::Driver<P> as tokio::park::Park>::park (self=0x7fffffffcdf8)
    at /home/alan/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.4.0/src/time/driver/mod.rs:398
#10 0x00005555555a87bb in <tokio::park::either::Either<A,B> as tokio::park::Park>::park (self=0x7fffffffcdf0)
    at /home/alan/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.4.0/src/park/either.rs:30
#11 0x000055555559ce47 in <tokio::runtime::driver::Driver as tokio::park::Park>::park (self=0x7fffffffcdf0)
    at /home/alan/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.4.0/src/runtime/driver.rs:198
#12 0x000055555557a2f7 in tokio::runtime::basic_scheduler::Inner<P>::block_on::{{closure}} (scheduler=0x7fffffffcdb8, 
    context=0x7fffffffcaf0)
    at /home/alan/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.4.0/src/runtime/basic_scheduler.rs:224
#13 0x000055555557b1b4 in tokio::runtime::basic_scheduler::enter::{{closure}} ()
    at /home/alan/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.4.0/src/runtime/basic_scheduler.rs:279
#14 0x000055555558174a in tokio::macros::scoped_tls::ScopedKey<T>::set (
    self=0x555555701af8 <tokio::runtime::basic_scheduler::CURRENT>, t=0x7fffffffcaf0, f=...)
    at /home/alan/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.4.0/src/macros/scoped_tls.rs:61
#15 0x000055555557b0b6 in tokio::runtime::basic_scheduler::enter (scheduler=0x7fffffffcdb8, f=...)
    at /home/alan/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.4.0/src/runtime/basic_scheduler.rs:279
#16 0x0000555555579d3b in tokio::runtime::basic_scheduler::Inner<P>::block_on (self=0x7fffffffcdb8, future=...)
    at /home/alan/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.4.0/src/runtime/basic_scheduler.rs:185
#17 0x000055555557a755 in tokio::runtime::basic_scheduler::InnerGuard<P>::block_on (self=0x7fffffffcdb8, future=...)
    at /home/alan/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.4.0/src/runtime/basic_scheduler.rs:425
#18 0x000055555557aa9c in tokio::runtime::basic_scheduler::BasicScheduler<P>::block_on (self=0x7fffffffd300, future=...)
    at /home/alan/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.4.0/src/runtime/basic_scheduler.rs:145
#19 0x0000555555582094 in tokio::runtime::Runtime::block_on (self=0x7fffffffd2f8, future=...)
    at /home/alan/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.4.0/src/runtime/mod.rs:450
#20 0x000055555557c22f in inventory_service::main () at /home/alan/code/inventory_service/src/main.rs:4

Puzzled, Alan recognizes only one line: the main entry point function for the service. He knows that async tasks in Rust don't each run on their own thread, which lets them scale better and use fewer resources, but surely there has to be a thread somewhere that's running his code? Alan doesn't completely understand how async works in Rust, but he's seen the Future::poll method, so he assumes there is a thread that constantly polls tasks to see if they are ready to wake up. "Maybe I can find that thread and inspect its state?" he thinks, and then consults the cheatsheet for the appropriate command to see the threads in the program. info threads seems promising, so he tries that:

(gdb) info threads
  Id   Target Id                                          Frame 
* 1    Thread 0x7ffff7c3b5c0 (LWP 1048) "inventory_servi" 0x00007ffff7d5e58a in epoll_wait (epfd=3, events=0x555555711340, 
    maxevents=1024, timeout=49152) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30

Alan is now even more confused: "Where are my tasks?" he thinks. After looking through the cheatsheet and StackOverflow, he discovers there isn't a way to see which async tasks are waiting to be woken up in the debugger. Taking a shot in the dark, Alan concludes that this thread must be the thread that is polling his tasks, since it is the only one in the program. He googles "epoll_wait rust async tasks", but the results aren't very helpful, and inspecting the stack frame doesn't yield any clues as to where his tasks are, so this seems to be a dead end.

After thinking a bit, Alan realizes that since the runtime must know what tasks are waiting to be woken up, perhaps he can have the service ask the async runtime for that list of tasks every 10 seconds and print them to stdout? While crude, this would probably also help him diagnose the hang. Alan gets to work and opens the runtime docs to figure out how to get that list of tasks. After spending 30 minutes reading the docs, looking at StackOverflow questions and even posting on users.rust-lang.org, he discovers this simply isn't possible and he will have to add tracing to his application to figure out what's going on.

Disgruntled, Alan begins the arduous, boring task of instrumenting the application in the hope that the logs will be able to help him.
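
One low-tech form this instrumentation could take is a registry that each task joins while it runs, so that a periodic dump shows which tasks have started but not finished. The sketch below is only an illustration built on the standard library: TaskRegistry, TaskGuard, and fetch_inventory are hypothetical names, not part of any runtime, and the sketch assumes task names are unique.

```rust
use std::collections::BTreeSet;
use std::sync::{Arc, Mutex};

/// Hypothetical registry of in-flight task names (not a runtime API).
#[derive(Clone, Default)]
struct TaskRegistry {
    names: Arc<Mutex<BTreeSet<String>>>,
}

/// Guard that removes its task name from the registry when dropped.
struct TaskGuard {
    name: String,
    registry: TaskRegistry,
}

impl TaskRegistry {
    /// Record `name` as in flight until the returned guard is dropped.
    /// Assumes names are unique; duplicates would collapse into one entry.
    fn track(&self, name: &str) -> TaskGuard {
        self.names.lock().unwrap().insert(name.to_string());
        TaskGuard {
            name: name.to_string(),
            registry: self.clone(),
        }
    }

    /// Snapshot of the tasks that have started but not yet finished.
    fn in_flight(&self) -> Vec<String> {
        self.names.lock().unwrap().iter().cloned().collect()
    }
}

impl Drop for TaskGuard {
    fn drop(&mut self) {
        self.registry.names.lock().unwrap().remove(&self.name);
    }
}

/// Example task: the guard stays alive across every await point in the body.
async fn fetch_inventory(registry: TaskRegistry) -> u32 {
    let _guard = registry.track("fetch_inventory");
    // ... awaits on the database or network would go here ...
    42
}
```

A background thread or timer could print registry.in_flight() every 10 seconds. A task that hangs at an await point keeps its guard alive and therefore stays in the list, which is roughly the information Alan hoped the debugger would show him.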

🤔 Frequently Asked Questions

What are the morals of the story?

  • Developers, especially those coming from a language that has a tightly integrated development environment, expect their debugger to help them, particularly in situations where "println" debugging can't.
  • If the debugger can't help them, developers will often try to reach for a programmatic solution such as debug functions in their runtime that can be invoked at critical code paths.
  • Trying to debug an issue by adding logging and then triggering the issue is painful because of the long turn-around times when modifying code, compiling and then repro'ing the issue.

What are the sources for this story?

  • @erickt's comments in #76, similar comments I've heard from other developers.

Why did you choose Alan to tell this story?

  • Coming from a background in managed languages where the IDE, debugger and runtime are tightly integrated, Alan would be used to using those tools to diagnose his issue.
  • Alan has also been a bit insulated from the underlying OS and expects the debugger to understand the language and runtime even if the OS doesn't have similar concepts such as async tasks.

How would this story have played out differently for the other characters?

  • Some of the characters with either a background in Rust or a background in systems programming might know that Rust's async doesn't always map to an underlying system feature and so they might expect that gdb or lldb is unable to help them.
  • Barbara, the experienced Rust dev, might also have used a tracing/instrumentation library from the beginning and have that to fall back on rather than having to do the work to add it now.

😱 Status quo stories: Alan tries to cache requests, which doesn't always happen.

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect peoples' experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

Alan is working on an HTTP server. The server makes calls to some other service. The performance of the downstream service is somewhat poor, so Alan would like to implement some basic caching.

Alan writes up some code which does the caching:


async fn get_response(&mut self, key: String) {
    // Try to get the response from cache (borrow the key so it can be reused below)
    if let Some(cached_response) = self.cache.get(&key) {
        self.channel.send(cached_response).await;
        return;
    }

    // Get the response from the downstream service
    let response = self.http_client.make_request(key.clone()).await;
    self.channel.send(response.clone()).await;

    // Store the response in the cache
    self.cache.set(key, response);
}

Alan is happy with how things are working, but notices every once in a while the downstream service hangs. To prevent that, Alan implements a timeout.

He remembers from the documentation for his favorite runtime that there is a race function that kicks off two futures and polls both until one completes (similar to tokio's select and async-std's race, for example).


runtime::race(timeout(), get_response(key)).await

The bug

Alan ships to production but after several weeks he notices some users complaining that they receive old data.

Alan looks for help. The compiler unfortunately doesn't provide any hints. He turns to his second best friend clippy, who cannot help either. Alan tries debugging. He uses his old friend println!. After hours of working through the code, he notices that sometimes the line that sets the response in the cache never gets called.

The solution

Alan goes to Barbara and asks why in the world that might be ⁉️

💡 Barbara looks through the code and notices that there is an await point between sending the response over the channel and setting the cache.

Since the get_response future can be dropped at each available await point, it may be dropped after the http request has been made, but before the response has successfully been sent over the channel, thus not executing the remaining instructions in the function.

This means the cache might not be set.
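
Barbara's explanation can be reproduced without any runtime. The following is a minimal sketch, assuming a hand-written YieldOnce future stands in for the channel send and a Cell<bool> stands in for the cache (both are illustrative and not taken from Alan's code). It polls the future once, as the losing side of a race would be polled, and then drops it:

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; enough for polling a future by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Returns `Pending` on the first poll and `Ready` on the second,
/// standing in for a channel send that cannot complete immediately.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Mirrors the shape of Alan's original function: send first, cache second.
async fn send_then_cache(cache_was_set: &Cell<bool>) {
    YieldOnce(false).await; // stand-in for `self.channel.send(response).await`
    cache_was_set.set(true); // stand-in for `self.cache.set(key, response)`
}

/// Polls the future once, then drops it, as a lost race would.
/// Returns whether the "cache" line ever ran.
fn poll_once_then_drop() -> bool {
    let cache_was_set = Cell::new(false);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    {
        let mut fut = pin!(send_then_cache(&cache_was_set));
        assert!(fut.as_mut().poll(&mut cx).is_pending());
        // `fut` is dropped here while suspended at the await point.
    }
    cache_was_set.get()
}
```

poll_once_then_drop() returns false: the line after the await never ran, and nothing reported an error or a panic.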

Alan fixes it by setting the cache before sending the result over the channel. 🎉


async fn get_response(&mut self, key: String) {
    // ... cache miss happened here

    // We perform the HTTP request and our code might continue
    // after this .await once the HTTP request is complete
    let response = self.http_client.make_request(key.clone()).await;

    // Immediately store the response in the cache
    // (clone it so the original can still be sent below)
    self.cache.set(key, response.clone());

    self.channel.send(response).await;
}

🤔 Frequently Asked Questions

What are the morals of the story?

  • Futures can be "canceled" at any await point. Authors of futures must be aware that after an await, the code might not run.
    • This is similar to panic safety but way more likely to happen
  • Futures might be polled to completion causing the code to work. But then many years later, the code is changed and the future might conditionally not be polled to completion which breaks things.
  • The burden falls on the user of the future to poll to completion, and there is no way for the lib author to enforce this - they can only document this invariant.
  • Diagnosing and ultimately fixing this issue requires a fairly deep understanding of the semantics of futures.
  • Without a Barbara, it might be hard to even know where to start: No lints are available, Alan is left with a normal debugger and println!.

What are the sources for this story?

The relevant sources of discussion for this story have been gathered in this github issue.

Why did you choose Alan to tell this story?

Alan has enough experience and understanding of push based async languages to make the assumptions that will trigger the bug.

How would this story have played out differently for the other characters?

This story would likely have played out the same for almost everyone but Barbara, who has probably been bitten by that already. The debugging and fixing time would however probably have varied depending on experience and luck.

😱 Status quo stories: Alan tries processing some files

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect peoples' experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

Alan is new to Rust. He wants to build a program that recurses over all the files in a directory (and its subdirectories), reads each file, and produces some fingerprint of the file.

Since so much blocking I/O is involved, he chooses async in order to process many files concurrently.

Async

Alan does some research into async Rust. New to the language, he's heard that async support has recently landed, so he starts by reading the release notes and much of the Async Book, bookmarking the dense parts about Pinning as something he'll come back to when it makes more sense. Notably, he skips over the Recursion Workaround and other workaround bits.

As someone who hasn't followed the evolution of async Rust closely, the Ecosystem page of the Async Book provides a critical bit of context that he wishes he'd found first. Coming from Python and Go, where asyncio and goroutines are fully supported by the core language, Alan had been unclear exactly what was and what wasn't included in the language. This page puts everything into place.

The Popular Runtimes section makes it clear that he'll need to choose a third party ecosystem. He chooses Tokio because:

  • It's the only ecosystem of those listed that he's already heard about.
  • It seems to be widely used based on some web searches.
  • It has bite-sized, approachable tutorial pages that provide higher-level introduction than the average rustdoc.
  • It provides rich RPC libraries, like Tonic, which he plans to fiddle with in a future project.

Recursion

Alan starts by writing a recursive function that can call some operation on each regular file in a directory and recurse on each subdirectory.


async fn process_directory<'a, F, P, T>(path: PathBuf, processor: &'a P) -> Vec<F>
where
    P: Fn(DirEntry) -> F,
    F: Future<Output = T>,
{
    ReadDirStream::new(read_dir(path).await.unwrap())
        .filter_map(|x| async {
            let dir_entry = x.unwrap();
            let ft = dir_entry.file_type().await.unwrap();
            if ft.is_file() {
                Some(vec![processor(dir_entry)])
            } else if ft.is_dir() {
                Some(process_directory(dir_entry.path(), processor).await)
            } else {
                None
            }
        })
        .collect::<Vec<Vec<F>>>()
        .await
        .into_iter()
        .flatten()
        .collect()
}

The first paper cut comes when the compiler complains:

error[E0733]: recursion in an `async fn` requires boxing
  --> src/main.rs:23:77
   |
23 | async fn process_directory<'a, F, P, T>(path: PathBuf, processor: &'a P) -> Vec<F>
   |                                                                             ^^^^^^ recursive `async fn`
   |
   = note: a recursive `async fn` must be rewritten to return a boxed `dyn Future`

...
For more information about an error, try `rustc --explain E0733`.

From the explainer, Alan learns that he cannot use the async sugaring, and needs to use a Boxed Pin in his function signature:

fn process_directory<'a, F, P, T>(
    path: PathBuf,
    processor: &'static P,
) -> Pin<Box<dyn Future<Output = Vec<F>>>>

New to Rust, Alan still doesn't really understand what Pin does, so he reads the docs, sees that it marks which objects are "guaranteed not to move", and wonders why the compiler couldn't determine this automatically since he read so much about how the borrow checker can already detect moves versus borrows.

He's also not entirely sure why the returned Future needs to be Boxed. The suggested explainer helps a bit:

The `Box<...>` ensures that the result is of known size, and the pin is
required to keep it in the same place in memory.

But Alan figures that the size of Future<Output = T> should be determined by the type T. It's not like he's implementing a custom struct that is Future; he's returning a Vec<T> inside the standard async move {}. Alan wishes there was a way to express "Hey I'm returning a Future created by async move, whose Output attribute has a known size, so the resulting Future should have a known size too!"

But Alan does what the compiler tells him to do and adds some extra stuff to his function, which now looks like:


fn process_directory<'a, F, P, T>(
    path: PathBuf,
    processor: &'static P,
) -> Pin<Box<dyn Future<Output = Vec<F>> + 'a>>
where
    P: Fn(DirEntry) -> F,
    F: Future<Output = T>,
{
    Box::pin(async move {
        ReadDirStream::new(read_dir(path).await.unwrap())
            .filter_map(|x| async {
                let dir_entry = x.unwrap();
                let ft = dir_entry.file_type().await.unwrap();
                if ft.is_file() {
                    Some(vec![processor(dir_entry)])
                } else if ft.is_dir() {
                    Some(process_directory(dir_entry.path(), processor).await)
                } else {
                    None
                }
            })
            .collect::<Vec<Vec<F>>>()
            .await
            .into_iter()
            .flatten()
            .collect()
    })
}

Rate Limiting

Alan knows that process_directory may be called on directories with many thousands of files or subdirectories, and is wary of exhausting file descriptor limits. Since he can't find much documentation about how to keep the number of async tasks in check - Tokio's docs suggest we can spawn millions of tasks, but don't offer advice on how to manage tasks with expensive side effects - he decides he needs to build a simple rate limiter.

Alan's rate limiter will wrap some Future<Output = T>, acquire a semaphore, and then await the Future, returning the same type T:


async fn rate_limit<F, T>(fut: F, sem: &Semaphore) -> T
where
    F: Future<Output = T>,
{
    let _permit = sem.acquire().await;
    fut.await
}

Since the async fn foo<T>() -> T syntax desugars to roughly fn foo<T>() -> impl Future<Output = T>, and since fut.await returns T, Alan assumes that the above is equivalent to:


fn rate_limit<F, T>(fut: F, sem: &Semaphore) -> F
where
    F: Future<Output = T>,
{
    ...
}

So he plugs this new rate_limit logic into process_directory:

use futures::future::join_all;                 
use futures::stream::StreamExt;                
use futures::Future;                           
use std::path::PathBuf;                        
use std::pin::Pin;                                    
use tokio::fs::{read_dir, DirEntry};     
use tokio::sync::Semaphore;                    
use tokio_stream::wrappers::ReadDirStream;     

async fn rate_limit<F, T>(fut: F, sem: &Semaphore) -> T
where
    F: Future<Output = T>,
{
    let _permit = sem.acquire().await;
    fut.await
}

fn process_directory<'a, F, P, T>(
    path: PathBuf,
    processor: &'a P,
    sem: &'static Semaphore,
) -> Pin<Box<dyn Future<Output = Vec<F>> + 'a>>
where
    P: Fn(DirEntry) -> F,
    F: Future<Output = T>,
{
    Box::pin(async move {
        ReadDirStream::new(read_dir(path).await.unwrap())
            .filter_map(|x| async {
                let dir_entry = x.unwrap();
                let ft = dir_entry.file_type().await.unwrap();
                if ft.is_file() {
                    Some(vec![rate_limit(processor(dir_entry), sem)])
                } else if ft.is_dir() {
                    Some(process_directory(dir_entry.path(), processor, sem).await)
                } else {
                    None
                }
            })
            .collect::<Vec<Vec<F>>>()
            .await
            .into_iter()
            .flatten()
            .collect()
    })
}

async fn expensive(de: DirEntry) -> usize {
    // assume this function spawns a task that does heavy I/O on the file
    de.file_name().len()
}

#[tokio::main(flavor = "current_thread")]
async fn main() {
    let sem = Semaphore::new(10);
    let path = PathBuf::from("/tmp/foo");
    let results = join_all(process_directory(path, &expensive, &sem).await);
    dbg!(results.await);
}

And is met with a new complaint from the compiler:

error[E0308]: `if` and `else` have incompatible types
  --> src/main.rs:34:24
   |
18 |    fn process_directory<'a, F, P, T>(
   |                             - this type parameter
...
32 |  /                 if ft.is_file() {
33 |  |                     Some(vec![rate_limit(processor(dir_entry), sem)])
   |  |                     ------------------------------------------------- expected because of this
34 |  |                 } else if ft.is_dir() {
   |  |________________________^
35 | ||                     Some(process_directory(dir_entry.path(), processor, sem).await)
36 | ||                 } else {
37 | ||                     None
38 | ||                 }
   | ||                 ^
   | ||_________________|
   | |__________________`if` and `else` have incompatible types
   |                    expected opaque type, found type parameter `F`
   |
   = note: expected type `Option<Vec<impl futures::Future>>`
              found enum `Option<Vec<F>>`
   = help: type parameters must be constrained to match other types
   = note: for more information, visit https://doc.rust-lang.org/book/ch10-02-traits.html#traits-as-parameters

Alan is confused. In line 33, rate_limit returns Future<Output = usize>, so why is this an opaque Future? So far as he can tell, the Option<Vec<impl futures::Future<Output = usize>>> returned on line 33 is the same type as the Option<Vec<F>> where F: Future<Output = usize> returned on line 35.

So he strips the problem down to only a few lines of code, and still he cannot figure out why the compiler complains:

use futures::{future::pending, Future};

async fn passthru<F, T>(fut: F) -> T
where
    F: Future<Output = T>,
{
    fut.await
}

fn main() {
    let func = pending::<u8>;
    match true {
        true => passthru(func()),
        false => func(),
    };
}

To which the compiler nevertheless replies:

error[E0308]: `match` arms have incompatible types
  --> src/main.rs:14:18
   |
12 | /     match true {
13 | |         true => passthru(func()),
   | |                 ---------------- this is found to be of type `impl futures::Future`
14 | |         false => func(),
   | |                  ^^^^^^ expected opaque type, found struct `futures::future::Pending`
15 | |     };
   | |_____- `match` arms have incompatible types
   |
   = note: expected type `impl futures::Future`
            found struct `futures::future::Pending<u8>`
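
One common way to give both arms the same type is to erase the concrete future types behind a boxed trait object. The following is a sketch rather than a fix taken from the story: it swaps pending for std::future::ready so that the futures can actually complete, and BoxedU8 and choose are illustrative names.

```rust
use std::future::{ready, Future};
use std::pin::Pin;

/// Type-erased future; both match arms below coerce to this one type.
type BoxedU8 = Pin<Box<dyn Future<Output = u8>>>;

async fn passthru<F, T>(fut: F) -> T
where
    F: Future<Output = T>,
{
    fut.await
}

/// Picks one of two differently typed futures and returns it behind a `Box`.
fn choose(wrap: bool) -> BoxedU8 {
    match wrap {
        // Concrete type: the anonymous future returned by `passthru`.
        true => Box::pin(passthru(ready(7u8))),
        // Concrete type: `std::future::Ready<u8>`.
        false => Box::pin(ready(7u8)),
    }
}
```

Each async fn returns its own anonymous impl Future type, so passthru(fut) and fut never share a type even when their Output matches. Boxing trades that static type for a heap allocation and dynamic dispatch.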

🤔 Frequently Asked Questions

What are the morals of the story?

  • The manual desugaring required for async recursion erases some of the "magic" of async.
  • Some programmers may never implement custom types that are Future, instead using standard constructs like async blocks to produce them. In these cases, the programmer might assume the returned Futures should have concrete types with known sizes, which would allow them to work directly with the returned types rather than have to deal with the complexities of trait objects, Box-ing, and opaque type comparisons.
  • Pin documentation focuses on data that can or cannot "move" in memory. To someone new to Rust, it might be easy to confuse this concept with "move" semantics in the context of ownership.
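
On the question of sizes: the size of a future produced by an async block is not determined by its Output type alone, because the future also has to store every local variable that is alive across an await point. A small sketch of this, using only the standard library (the function names are illustrative):

```rust
use std::future::{ready, Future};
use std::mem::size_of_val;

/// Holds nothing of its own across its await point.
fn small() -> impl Future<Output = u8> {
    async { ready(1u8).await }
}

/// Same `Output` type, but a 1 KiB buffer must live across the await point.
fn large(seed: u8) -> impl Future<Output = u8> {
    async move {
        let buf = [seed; 1024];
        let first = ready(1u8).await;
        // `buf` is used after the await, so it is stored inside the future.
        first.wrapping_add(buf[0])
    }
}

/// Returns the in-memory sizes of the two futures above, without running them.
fn future_sizes() -> (usize, usize) {
    (size_of_val(&small()), size_of_val(&large(0)))
}
```

future_sizes() reports a second value of at least 1024 bytes even though both futures produce a u8. A recursive async fn would have to store its own future inside itself, which has no finite size; Box breaks that cycle by placing the inner future on the heap.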

What are the sources for this story?

I describe my own experience while working on my first Rust project.

Why did you choose Alan to tell this story?

I chose Alan to tell this story because I envision him coming from Python. I mostly work in asyncio Python by day, which means my exposure to async is shaped by what I'd expect from a language without traits, and one where heap wrangling and memory addressing are abstracted away.

How would this story have played out differently for the other characters?

I'm not sure, but I'd assume:

  • Grace would not get tripped up on the need for Box::pin
  • Niklaus might share the confusion expressed above
  • Barbara might wish we could use async sugaring in recursive functions.

😱 Status quo stories: Alan wants an async iterator with prefetch

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect peoples' experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

Alan once wrote a data processing microservice in a GC'd language, designed for high throughput. Now he wants to write it in Rust and have a strong ownership model.

The original service consumes messages from a source stream (e.g. Kafka), processes them, and produces results to another stream and/or saves them to a database. Since the service acquires some data from other sources, like external services and its own PostgreSQL database, Alan batches incoming messages to acquire as much data as possible from those sources with minimal overhead.

Since messages might arrive with delays between them, or might stop arriving for a while, their number is unknown. There is therefore an async iterator that reads the input stream and, if the next message isn't immediately ready, waits some time before producing a batch.

Alan explored FutureExt from async-std and found no evidence that it's possible to wait for multiple futures returning different results (this isn't possible for ValueTasks in .NET, but it worked well with Tasks, which can be awaited multiple times). Later, it was suggested that he use an enum and the race method to achieve his goal:


enum Choices<A, B, C> {
    A(A),
    B(B),
    C(C),
}

// convert each future into the type `Choices<...>`:
let future_a = async move { A(future_a.await) };
let future_b = async move { B(future_b.await) };
let future_c = async move { C(future_c.await) };

// await the race:
match future_a.race(future_b).race(future_c).await {
    A(a) => ...,
    B(b) => ...,
    C(c) => ...,
}
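
The enum-based suggestion can be sketched with only the standard library. The version below is an illustration, not async-std's implementation: Either and race2 are hypothetical names, and it handles two futures rather than three to stay short.

```rust
use std::future::{poll_fn, Future};
use std::pin::pin;
use std::task::Poll;

/// Which of the two futures finished first, carrying its output.
#[derive(Debug, PartialEq)]
enum Either<A, B> {
    Left(A),
    Right(B),
}

/// Hand-rolled race: polls `a` then `b` on every wake-up and resolves with
/// whichever is ready first. The loser is dropped when this future returns.
async fn race2<A, B>(a: A, b: B) -> Either<A::Output, B::Output>
where
    A: Future,
    B: Future,
{
    let mut a = pin!(a);
    let mut b = pin!(b);
    poll_fn(|cx| {
        if let Poll::Ready(v) = a.as_mut().poll(cx) {
            return Poll::Ready(Either::Left(v));
        }
        if let Poll::Ready(v) = b.as_mut().poll(cx) {
            return Poll::Ready(Either::Right(v));
        }
        Poll::Pending
    })
    .await
}
```

Because the two sides are wrapped in different enum variants, their output types do not need to match; the caller recovers each one by matching on Either::Left and Either::Right.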

While that helped Alan, it was not at all obvious to him. He expected to see a macro accepting futures and producing a new future to be awaited:


match race!(future_a, future_b, future_c).await {
    // ...
}

Having join! would be nice for Alan too, so he can avoid binding variables to futures that will be awaited later:


// How it is now
let future_a = do_async_a();
let future_b = do_async_b();
let future_c = do_async_c();

let result_a = future_a.await;
let result_b = future_b.await;
let result_c = future_c.await;

// How it could be
let (result_a, result_b, result_c) = join!(future_a, future_b, future_c).await;
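
A function-style join over futures with different output types can be written with the standard library's poll_fn. This is a minimal sketch under the assumption that two futures are enough to show the idea; join2 is a hypothetical name, not an async-std or futures API.

```rust
use std::future::{poll_fn, Future};
use std::pin::pin;
use std::task::Poll;

/// Hand-rolled join: polls both futures until each has produced its output,
/// then returns the two outputs as a tuple.
async fn join2<A, B>(a: A, b: B) -> (A::Output, B::Output)
where
    A: Future,
    B: Future,
{
    let mut a = pin!(a);
    let mut b = pin!(b);
    let mut out_a: Option<A::Output> = None;
    let mut out_b: Option<B::Output> = None;
    poll_fn(|cx| {
        // Only poll a future that has not finished yet.
        if out_a.is_none() {
            if let Poll::Ready(v) = a.as_mut().poll(cx) {
                out_a = Some(v);
            }
        }
        if out_b.is_none() {
            if let Poll::Ready(v) = b.as_mut().poll(cx) {
                out_b = Some(v);
            }
        }
        if out_a.is_some() && out_b.is_some() {
            Poll::Ready((out_a.take().unwrap(), out_b.take().unwrap()))
        } else {
            Poll::Pending
        }
    })
    .await
}
```

Unlike awaiting future_a and then future_b in sequence, join2 polls both on every wake-up, so neither has to finish before the other starts making progress.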

🤔 Frequently Asked Questions

Here are some standard FAQ to get you started. Feel free to add more!

What are the morals of the story?

  • Even though Alan had experience writing async code in other languages, he had a hard time figuring out how to do relatively simple things in Rust, like joining or racing on futures of different types.

What are the sources for this story?

Personal experience of the author.

Why did you choose Alan to tell this story?

As a backend developer in a GC'd language, Alan writes async code every day. He wants to gain the maximum performance and have memory safety at the same time.

How would this story have played out differently for the other characters?

In some cases, there are problems that only occur for people from specific backgrounds, or which play out differently. This question can be used to highlight that.

😱 Status quo stories: Alan wants to migrate a web server to Rust

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

The story

Is Rust ready for the web?

Alan has been following the arewewebyet site for quite some time. He is a Typescript full-stack developer and follows the project in order to know when it would be sensible to migrate the backend of a web application he's responsible for. Alan loves Rust and has used it for some tasks that didn't quite need async routines. Since arewewebyet is an official Rust language project, he trusts their reviews of several web frameworks, tools, libraries, etc.

Alan was thrilled during the 2020 Xmas holiday. It turns out that at that time Rust was declared to be web ready! Alan takes this as a sign that not only is Rust great for web servers, but also as confirmation that async features have matured and stabilised. After all, how can a language be web ready and not fully support asynchronous tasks?

Alan's points of reference are the Golang and Javascript languages. They were both created for web servers and clients. They also support async/await natively. At the same time, Alan is not aware of the complexities that these languages are "hiding" from him.

Picking a web server is ok

Golang's native http server is nice but, as a Typescript developer, Alan is also used to dealing with "Javascript fatigue". Javascript developers often use this term to refer to a fast-paced framework ecosystem, where every so often there is a "new" thing everybody else is migrating to. Similarly, Javascript engineers are used to having to pick from a myriad of options within the vast npm ecosystem. And so, the lack of a web server in Rust's standard library didn't surprise him. The number of options didn't overwhelm him either.

The arewewebyet site mentions four good web servers. Alan picks Tide because the interfaces and the emphasis on middleware remind him of Nodejs' Express framework.

The first endpoint

Alan sets up all the boilerplate and is ready to write the first endpoint. He picks PUT /support-ticket because it barely has any logic in it. When a request arrives, the handler only makes a request to Zendesk to create a support ticket. The handler is stateless and has no middleware.

The arewewebyet site doesn't recommend a specific http client, so Alan searches for one in crates.io. He picks reqwest simply because it's the most popular.

Alan combines the knowledge he has from programming in synchronous Rust and asynchronous Javascript to come up with a few lines that should work. If the compiler is happy, then so is he!

First problem: incompatible runtimes

The first problem he runs into is very similar to the one described in the compiler trust story: thread 'main' panicked at 'there is no reactor running, must be called from the context of a Tokio 1.x runtime'.

In short, Alan has problems because Tide is based on async-std and reqwest on the latest version of tokio. This is a real pain for Alan, as he now has to change either the http client or the server so that they use the same runtime.

He decides to switch to Actix web.

Second problem: incompatible versions of the same runtime

Alan migrates to Actix web and again the compiler seems to be happy. To his surprise, the same problem happens again. The program panics with the same message as before: there is no reactor running, must be called from the context of a Tokio 1.x runtime. He is utterly puzzled as Actix web is based on Tokio just like reqwest. Didn't he just fix problem number 1?

It turns out that the issue is that Alan's using v0.11.2 of reqwest, which uses tokio v1, and v3.3.2 of actix-web, which uses tokio v0.3.

The solution to this problem is then to dig into all the versions of reqwest until he finds one which uses the same version of tokio.
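
Both panics have the same shape: an I/O type looks for the runtime it was built against and does not find it on the current thread. The sketch below assumes a simplified model in which a runtime marks the threads it owns through thread-local state. IN_RUNTIME, enter_runtime, and connect are hypothetical stand-ins, not Tokio APIs, and the sketch returns an error where the real library panics.

```rust
use std::cell::Cell;

thread_local! {
    // Hypothetical marker: true while this thread is inside "our" runtime.
    static IN_RUNTIME: Cell<bool> = Cell::new(false);
}

/// Stand-in for a runtime's `block_on`: marks the thread, runs `f`, unmarks.
fn enter_runtime<T>(f: impl FnOnce() -> T) -> T {
    IN_RUNTIME.with(|flag| flag.set(true));
    let out = f();
    IN_RUNTIME.with(|flag| flag.set(false));
    out
}

/// Stand-in for an I/O resource constructor that needs the runtime's reactor.
fn connect() -> Result<&'static str, &'static str> {
    if IN_RUNTIME.with(|flag| flag.get()) {
        Ok("registered with reactor")
    } else {
        Err("there is no reactor running")
    }
}
```

Calling connect() on its own fails, while enter_runtime(connect) succeeds. Two semver-incompatible versions of a crate are compiled as separate crates, each with its own statics, so entering one version's runtime does nothing for code that checks the other version's marker. The compiler cannot see this mismatch, which is why it only surfaces at run time.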

Can Alan sell the Rust migration to his boss?

This experience has made Alan think twice about whether Rust is indeed web ready. On the one hand, there are very good libraries for web servers, ORMs, parsers, session management, etc. On the other, Alan is fearful that in 2, 3, or 6 months' time he will have to develop new features with libraries that already exist but turn out to be incompatible with the runtime chosen at the beginning of the project.

🤔 Frequently Asked Questions

What are the morals of the story?

  • Rust's ecosystem has a lot of great components that may individually be ready for the web, but combining them is still a fraught proposition. In a typical web server project, dependencies that use async features form an intricate web which is hard to decipher for both new and seasoned Rust developers. Alan picked Tide and reqwest, only to realise later that they are not compatible. How many more situations like this will he face? Can Alan be confident that it won't happen again? New users especially are not accustomed to having to think about what "runtime" they are using, since there is usually not a choice in the matter.
  • The situation is so complex that it's not enough to know that all dependencies use the same runtime. They all have to be compatible with the same version of that runtime. Newer versions of reqwest are incompatible with the latest stable version of actix web (verified at the time of writing).
  • Developers that need a stable environment may be fearful of the complexity that comes with managing async dependencies in Rust. For example, if reqwest had a security or bug fix in one of the latest versions that's not backported to older ones, Alan would not be able to upgrade because actix web is holding him back. In fact, he has to wait until ALL dependencies are using the same runtime to apply fixes and upgrades.

What are the sources for this story?

Personal experience of the author.

Why did you choose Alan to tell this story?

As a web developer in GC languages, Alan writes async code every day. A language without stable async features is not an option.

How would this story have played out differently for the other characters?

Learning what async means and what it entails in a codebase is usually hard enough. Niklaus would struggle to learn all that while at the same time dealing with the many gotchas that can happen when building a project with a lot of dependencies.

Barbara may be more tolerant of the setup since she probably knows the rationale behind keeping Rust's standard library lean and the need for external async runtimes.

How would this story have played out differently if Alan came from another GC'd language?

Like the trust story, it would be very close, since all other languages (that I know of) provide async runtimes out of the box and it's not something the programmer needs to concern themselves with.

😱 Status quo stories: Alan writes a web framework

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect people's experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

YouBuy is written using an async web framework that predates the stabilization of async function syntax. When Alan joins the company, it is using async functions for its business logic, but can't use them for request handlers because the framework doesn't support it yet. It requires the handler's return value to be Box<dyn Future<...>>. Because the web framework predates async function syntax, it requires you to take ownership of the request context (State) and return it alongside your response in the success/error cases. This means that even with async syntax, an http route handler in this web framework looks something like this (from the Gotham Diesel example):


#![allow(unused)]
fn main() {
// For reference, the framework defines these type aliases.
pub type HandlerResult = Result<(State, Response<Body>), (State, HandlerError)>;
pub type HandlerFuture = dyn Future<Output = HandlerResult> + Send;

fn get_products_handler(state: State) -> Pin<Box<HandlerFuture>> {
    use crate::schema::products::dsl::*;

    async move {
        let repo = Repo::borrow_from(&state);
        let result = repo.run(move |conn| products.load::<Product>(&conn)).await;
        match result {
            Ok(prods) => {
                let body = serde_json::to_string(&prods).expect("Failed to serialize prods.");
                let res = create_response(&state, StatusCode::OK, mime::APPLICATION_JSON, body);
                Ok((state, res))
            }
            Err(e) => Err((state, e.into())),
        }
    }
    .boxed()
}
}

and then it is registered like this:


#![allow(unused)]
fn main() {
    router_builder.get("/").to(get_products_handler);
}

The handler code is forced to drift to the right a lot, because of the async block, and the lack of ability to use ? forces the use of a match block, which drifts even further to the right. This goes against what he has learned from his days writing Go.

Rather than switching YouBuy to a different web framework, Alan decides to contribute to the web framework himself. After a bit of a slog and a bit of where-clause-soup, he manages to make the web framework capable of using an async fn as an http request handler. He does this by extending the router builder with a closure that boxes up the impl Future from the async fn and then passes that closure on to .to().


#![allow(unused)]
fn main() {
    fn to_async<H, Fut>(self, handler: H)
    where
        Self: Sized,
        H: (FnOnce(State) -> Fut) + RefUnwindSafe + Copy + Send + Sync + 'static,
        Fut: Future<Output = HandlerResult> + Send + 'static,
    {
        self.to(move |s: State| handler(s).boxed())
    }
}

The handler registration then becomes:


#![allow(unused)]
fn main() {
    router_builder.get("/").to_async(get_products_handler);
}

This allows him to strip out the async blocks in his handlers and use async fn instead.


#![allow(unused)]
fn main() {
// The type from the library again, in case you've forgotten:
pub type HandlerResult = Result<(State, Response<Body>), (State, HandlerError)>;

async fn get_products_handler(state: State) -> HandlerResult {
    use crate::schema::products::dsl::*;

    let repo = Repo::borrow_from(&state);
    let result = repo.run(move |conn| products.load::<Product>(&conn)).await;
    match result {
        Ok(prods) => {
            let body = serde_json::to_string(&prods).expect("Failed to serialize prods.");
            let res = create_response(&state, StatusCode::OK, mime::APPLICATION_JSON, body);
            Ok((state, res))
        }
        Err(e) => Err((state, e.into())),
    }
}
}
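
For readers who want to experiment with the boxing step outside the framework, here is a minimal sketch using only the standard library. It is not Gotham's code: the `u32` request id, the `String` response, and the names `BoxedFuture`, `BoxedHandler`, `hello`, `run_hello`, and `block_on` are all assumptions made for illustration. The idea is the same as in to_async above: accept any function that returns some future, and wrap it in a closure that boxes the future so the router can store one uniform handler type.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical stand-ins for the framework's handler types.
type BoxedFuture = Pin<Box<dyn Future<Output = String> + Send>>;
type BoxedHandler = Box<dyn Fn(u32) -> BoxedFuture + Send + Sync>;

/// Turns any function returning a future into the boxed form the
/// (hypothetical) router stores.
fn to_async<H, Fut>(handler: H) -> BoxedHandler
where
    H: Fn(u32) -> Fut + Send + Sync + 'static,
    Fut: Future<Output = String> + Send + 'static,
{
    // The explicit return type lets `Box::pin` coerce to the trait object.
    Box::new(move |request_id: u32| -> BoxedFuture { Box::pin(handler(request_id)) })
}

/// An ordinary async fn; its concrete future type is never named.
async fn hello(request_id: u32) -> String {
    format!("hello #{}", request_id)
}

/// A waker that does nothing, enough for futures that never wait.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Tiny busy-polling executor so the sketch runs without a runtime.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

/// Registers `hello` and invokes it the way a router might.
fn run_hello() -> String {
    let handler = to_async(hello);
    block_on(handler(7))
}
```

Calling `run_hello()` from a `main` function yields the string `hello #7`. The where clauses are the "where-clause-soup" the story mentions: one type parameter for the function, a second one for the future it returns.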

It's still not fantastically ergonomic though. Because the handler takes ownership of State and returns it in tuples in the result, Alan can't use the ? operator inside his http request handlers. If he tries to use ? in a handler, like this:


#![allow(unused)]
fn main() {
async fn get_products_handler(state: State) -> HandlerResult {
    use crate::schema::products::dsl::*;

    let repo = Repo::borrow_from(&state);
    let prods = repo
        .run(move |conn| products.load::<Product>(&conn))
        .await?;
    let body = serde_json::to_string(&prods).expect("Failed to serialize prods.");
    let res = create_response(&state, StatusCode::OK, mime::APPLICATION_JSON, body);
    Ok((state, res))
}
}

then he receives:

error[E0277]: `?` couldn't convert the error to `(gotham::state::State, HandlerError)`
  --> examples/diesel/src/main.rs:84:15
   |
84 |         .await?;
   |               ^ the trait `From<diesel::result::Error>` is not implemented for `(gotham::state::State, HandlerError)`
   |
   = note: the question mark operation (`?`) implicitly performs a conversion on the error value using the `From` trait
   = note: required by `std::convert::From::from`

Alan knows that the answer is to make another wrapper function, so that the handler can take an &mut reference to State for the lifetime of the future, like this:


#![allow(unused)]
fn main() {
async fn get_products_handler(state: &mut State) -> Result<Response<Body>, HandlerError> {
    use crate::schema::products::dsl::*;

    let repo = Repo::borrow_from(&state);
    let prods = repo
        .run(move |conn| products.load::<Product>(&conn))
        .await?;
    let body = serde_json::to_string(&prods).expect("Failed to serialize prods.");
    let res = create_response(&state, StatusCode::OK, mime::APPLICATION_JSON, body);
    Ok(res)
}
}

and then register it with:


#![allow(unused)]
fn main() {
    route.get("/").to_async_borrowing(get_products_handler);
}

but Alan can't work out how to express the type signature for the .to_async_borrowing() helper function. He submits his .to_async() pull request upstream as-is, but it nags at him that he has been defeated.
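
The reason the borrowing signature helps is easier to see in a synchronous setting, where no lifetimes of futures are involved. The sketch below is an assumption-laden illustration, not framework code: `State`, `HandlerError`, `parse_quantity`, and `call_borrowing` are hypothetical names. The inner function borrows the state and returns a plain error type, so ? can use an ordinary From conversion; the wrapper owns the state and re-attaches it to whichever result comes back.

```rust
// Hypothetical request context, standing in for the framework's State.
#[derive(Debug)]
struct State {
    request_id: u32,
}

#[derive(Debug, PartialEq)]
struct HandlerError(String);

// `?` relies on this conversion. No such impl exists for the tuple
// `(State, HandlerError)`, which is why `?` failed in the owning handler.
impl From<std::num::ParseIntError> for HandlerError {
    fn from(e: std::num::ParseIntError) -> Self {
        HandlerError(e.to_string())
    }
}

type HandlerResult = Result<(State, String), (State, HandlerError)>;

/// Borrowing handler: the error type is plain `HandlerError`, so `?` works.
fn parse_quantity(state: &mut State, raw: &str) -> Result<String, HandlerError> {
    let quantity: u32 = raw.parse()?;
    state.request_id += 1;
    Ok(format!("quantity = {}", quantity))
}

/// Wrapper: owns the state, lends it to the handler, then hands it back
/// inside the tuple shape the framework expects.
fn call_borrowing(mut state: State, raw: &str) -> HandlerResult {
    match parse_quantity(&mut state, raw) {
        Ok(body) => Ok((state, body)),
        Err(err) => Err((state, err)),
    }
}
```

With synchronous functions this wrapper is trivial to type. The difficulty Alan runs into only appears once the handler is an async fn, because the returned future then borrows the state and that borrow has to be named in the wrapper's signature.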

Shortly afterwards, someone raises a bug about ?, and a few other web framework contributors try to get it to work, but they also get stuck. When Alan tries it, the compiler diagnostics keep sending him around in circles. He can work out how to express the lifetimes for a function that returns a Box<dyn Future + 'a> but not an impl Future because of how where clauses are expressed. Alan longs to be able to say "this function takes an async function as a callback" (fn register_handler(handler: impl async Fn(state: &mut State) -> Result<Response, Error>)) and have Rust elide the lifetimes for him, like how they are elided for async functions.

A month later, one of the contributors finds a forum comment by Barbara explaining how to express what Alan is after (using higher-order lifetimes and a helper trait). They implement this and merge it. The final .to_async_borrowing() implementation ends up looking like this (also from Gotham):


#![allow(unused)]
fn main() {
pub trait AsyncHandlerFn<'a> {
    type Res: IntoResponse + 'static;
    type Fut: std::future::Future<Output = Result<Self::Res, HandlerError>> + Send + 'a;
    fn call(self, arg: &'a mut State) -> Self::Fut;
}

impl<'a, Fut, R, F> AsyncHandlerFn<'a> for F
where
    F: FnOnce(&'a mut State) -> Fut,
    R: IntoResponse + 'static,
    Fut: std::future::Future<Output = Result<R, HandlerError>> + Send + 'a,
{
    type Res = R;
    type Fut = Fut;
    fn call(self, state: &'a mut State) -> Fut {
        self(state)
    }
}

pub trait HandlerMarker {
    fn call_and_wrap(self, state: State) -> Pin<Box<HandlerFuture>>;
}

impl<F, R> HandlerMarker for F
where
    R: IntoResponse + 'static,
    for<'a> F: AsyncHandlerFn<'a, Res = R> + Send + 'static,
{
    fn call_and_wrap(self, mut state: State) -> Pin<Box<HandlerFuture>> {
        async move {
            let fut = self.call(&mut state);
            let result = fut.await;
            match result {
                Ok(data) => {
                    let response = data.into_response(&state);
                    Ok((state, response))
                }
                Err(err) => Err((state, err)),
            }
        }
        .boxed()
    }
}

...
    fn to_async_borrowing<F>(self, handler: F)
    where
        Self: Sized,
        F: HandlerMarker + Copy + Send + Sync + RefUnwindSafe + 'static,
    {
        self.to(move |state: State| handler.call_and_wrap(state))
    }
}

Alan is still not sure whether it can be simplified.
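
To experiment with the helper-trait trick in isolation, the following is a reduced sketch that uses only the standard library. It is an illustration under stated assumptions rather than Gotham's implementation: `State`, `AsyncBorrowFn`, `wrap`, `count_hit`, `demo`, and `block_on` are hypothetical names, the response type is fixed to `Result<String, String>`, and the Send bounds are left out to keep it short. The key line is the `for<'a>` bound, which says "for every lifetime of the borrow, this function returns some future that may hold that borrow".

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical request context.
struct State {
    hits: u32,
}

/// Helper trait: names the future returned for one particular borrow `'a`.
trait AsyncBorrowFn<'a> {
    type Fut: Future<Output = Result<String, String>> + 'a;
    fn call(self, state: &'a mut State) -> Self::Fut;
}

impl<'a, F, Fut> AsyncBorrowFn<'a> for F
where
    F: FnOnce(&'a mut State) -> Fut,
    Fut: Future<Output = Result<String, String>> + 'a,
{
    type Fut = Fut;
    fn call(self, state: &'a mut State) -> Fut {
        self(state)
    }
}

type Wrapped = Pin<Box<dyn Future<Output = (State, Result<String, String>)>>>;

/// Owns the state, lends it to the handler, and returns it with the result.
fn wrap<F>(handler: F, mut state: State) -> Wrapped
where
    F: for<'a> AsyncBorrowFn<'a> + 'static,
{
    Box::pin(async move {
        let result = handler.call(&mut state).await;
        (state, result)
    })
}

/// An async fn that borrows its argument, like Alan's desired handlers.
async fn count_hit(state: &mut State) -> Result<String, String> {
    state.hits += 1;
    Ok(format!("hits = {}", state.hits))
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Tiny busy-polling executor so the sketch runs without a runtime.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

fn demo() -> (u32, Result<String, String>) {
    let (state, result) = block_on(wrap(count_hit, State { hits: 0 }));
    (state.hits, result)
}
```

Passing the async fn `count_hit` by name satisfies the higher-ranked bound. Swapping it for a closure is where this reduced form, like the real one, tends to stop working.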

Later on, other developers on the project attempt to extend this approach to work with closures, but they encounter limitations in rustc that seem to make it not work (rust-lang/rust#70263).

When Alan sees another open source project struggling with the same issue, he notices that Barbara has helped them out as well. Alan wonders how many people in the community would be able to write .to_async_borrowing() without help.

🤔 Frequently Asked Questions

What are the morals of the story?

  • Callback-based APIs with async callbacks are a bit fiddly, because of the impl Future return type forcing you to write where-clause-soup, but not insurmountable.
  • Callback-based APIs with async callbacks that borrow their arguments are almost impossible to write without help.

What are the sources for this story?

Why did you choose Alan/YouBuy to tell this story?

  • Callback-based APIs are a super-common way to interact with web frameworks. I'm not sure how common they are in other fields.

How would this story have played out differently for the other characters?

  • I suspect that even many Barbara-shaped developers would struggle with this problem.

😱 Status quo stories: Status quo of an AWS engineer

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect people's experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

This tells the story of Alan, an engineer who works at AWS.

  • Writing a Java-based service at AWS: Alan is accustomed to using many convenient tools for writing Java-based services.
  • Getting started with Rust: Alan gets tapped to help spin up a new project on a tight timeline. He hasn't used Rust before, so he starts trying to set up an environment and learn the basics.
  • Coming from Java: Alan finds that some of the patterns he's accustomed to from Java don't translate well to Rust.
  • Exploring the ecosystem: The Rust ecosystem has a lot of useful crates, but they're hard to find. "I don't so much find them as stumble upon them by accident."
  • At first, Rust feels quite ergonomic to Alan. The async-await system seems pretty slick. But as he gets more comfortable with Rust, he starts to encounter situations where he can't quite figure out how to get things set up the way he wants, and he has to settle for suboptimal setups:
    • Juggling error handling: Alan tries to use ? to process errors in a stream.
    • Failure to parallelize: Alan can't figure out how to parallelize a loop.
    • Borrow check errors: Alan tries to write code that fills a buffer and returns references into it to the caller, only to learn that Rust's borrow checker makes that pattern difficult.
  • As Alan goes deeper into Async Rust, he learns that its underlying model can be surprising. One particular deadlock takes him quite a long time to figure out.
  • Encountering pin: Wrapping streams, AsyncRead implementations, and other types requires using Pin and it is challenging.
  • Figuring out the best option: Alan often encounters cases where he doesn't know what is the best way to implement something. He finds he has to implement it both ways to tell, and sometimes even then he can't be sure.
  • Testing his service: Alan invents patterns for Dependency Injection in order to write tests.
  • Using the debugger: Alan wishes for a smoother debugging experience.
  • Missed Waker leads to lost performance: Alan finds his service is not as fast as the reference server; the problem is ultimately due to a missed Waker, which was causing his streams to wake up much later than they should have.
  • Debugging performance problems: Alan finds more performance problems and tries to figure out their cause using tooling like perf. It's hard.
  • Getting ready to deploy: Alan prepares to deploy the service.
  • Using JNI: Alan uses JNI to access services that are only available using Java libraries.
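
The "missed Waker" item in the list above comes down to a rule for hand-written futures: a future that returns Poll::Pending is responsible for arranging a wake-up, otherwise the executor has no reason to poll it again. The story does not say what Alan's bug looked like, so the sketch below is only a generic illustration with hypothetical names (`Countdown`, `CountingWaker`, `wakes_after_one_poll`). It polls a future once and counts how many wake-ups were requested.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that only counts how often it was asked to wake the task.
struct CountingWaker {
    wakes: AtomicUsize,
}

impl Wake for CountingWaker {
    fn wake(self: Arc<Self>) {
        self.wakes.fetch_add(1, Ordering::SeqCst);
    }
}

/// Hand-written future that needs several polls before it completes.
struct Countdown {
    remaining: u32,
    /// When false, the future "forgets" to request a wake-up.
    notify: bool,
}

impl Future for Countdown {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.remaining == 0 {
            return Poll::Ready(());
        }
        self.remaining -= 1;
        if self.notify {
            // Ask the executor to poll this future again.
            cx.waker().wake_by_ref();
        }
        Poll::Pending
    }
}

/// Polls a `Countdown` once and reports how many wake-ups it requested.
fn wakes_after_one_poll(notify: bool) -> usize {
    let counter = Arc::new(CountingWaker { wakes: AtomicUsize::new(0) });
    let waker = Waker::from(Arc::clone(&counter));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(Countdown { remaining: 3, notify });
    let first = fut.as_mut().poll(&mut cx);
    assert!(first.is_pending());
    counter.wakes.load(Ordering::SeqCst)
}
```

With `notify` set to false the future returns Pending without any wake-up being recorded. On a real executor such a task would sit idle until something unrelated happened to poll it, which matches the "wakes up much later than it should" symptom.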

🤔 Frequently Asked Questions

What are the morals of the story?

  • Building services in Rust can yield really strong results, but a lot of hurdles remain:
    • 'If it compiles, it works' is not true: there are lots of subtle variations.
    • Debugging correctness and performance problems is hard, and the tooling is not what folks are used to.
    • There are few established patterns for things like DI.
    • The ecosystem has a lot of interesting things in it, but it's hard to navigate.

What are the sources for this story?

This story is compiled from discussions with service engineers in various AWS teams.

Why did you choose Alan to tell this story?

Because Java is a very widely used language at AWS.

How would this story have played out differently for the other characters?

Most parts of it remain the same; the main things that were specific to Java are some of the patterns Alan expected to use. Similarly, few things are specific to AWS apart from some details of the setup.

Status quo of an AWS engineer: Writing a Java-based service

Alan has been working at AWS for the last six years. He's accustomed to a fairly standard workflow for launching Java-based services:

  • Write a description of the service APIs using a modeling language like Smithy.
  • Submit the description to a webpage, which gives a standard service implementation based on netty. Each of the API calls in the modeling language has a function with a /* TODO */ comment to fill in.
  • As Alan works with his team to fill in each of those items, he makes use of a number of standard conventions:
    • Mocking with projects like mockito to allow for unit testing of specific components.
  • Alan uses a variety of nice tools:
    • Advanced IDEs like IntelliJ, which offer him suggestions as he works.
    • Full-featured, if standard, debuggers; he can run arbitrary code, mutate state, step into and out of functions with ease.
    • Tools for introspecting the VM state to get heap usage information and other profiling data.
    • Performance monitoring frameworks
  • As Alan is preparing to launch his service, he has to conduct an Operational Readiness Review (ORR):
    • This consists of a series of detailed questions covering all kinds of nasty scenarios that have arisen in deployments past. For each one, he has to explain how his service will handle it.
    • For most of them, the standard framework has pre-generated code that covers it, or he is able to use standard patterns.

Status quo of an AWS engineer: Getting started with Rust

For his latest project, Alan is rewriting a core component of DistriData. They are trying to move on a tight deadline.

The component that they are rewriting was implemented in Java, but it was having difficulty with high tail latencies and other performance hiccups. The team has an idea for a new architecture that will be more efficient, and they would like to reduce resource usage by adopting Rust.

Getting started with Rust is a bit different than what he is used to. There's not much infrastructure. They still define their service interface using the same modeling language, but there is no tooling to generate a server from it.

IDE setup

Of course, the very first thing Alan does is to tweak his IDE setup. He's happy to learn that IntelliJ has support for Rust, since he is accustomed to the keybindings and it has great integration with Brazil, AWS's internal build system.

Still, as he plays around with Rust code, he realizes that the support is not nearly at the level of Java. Autocomplete often gets confused. For example, when there are two traits with the same name but coming from different crates, IntelliJ often picks the wrong one. It also has trouble with macros, which are very common in async code. Some of Alan's colleagues switch to VSCode, which is sometimes better but has many of the same problems; Alan decides to stick with IntelliJ.

Building the first server

Alan asks around the company to learn more about how Async Rust works and he is told to start with the tokio tutorial and the Rust book. He also joins the company Slack channel, where he can ask questions. The tokio tutorial is helpful and he is feeling relatively confident.

Missing types during Code review

One problem Alan finds has to do with AWS's internal tooling (although it would be the same in most places). When browsing Rust code in the IDE, there are lots of tips to help in understanding, such as tooltips showing the types of variables and the like. In code reviews, though, there is only the plain text. Rust's type inference is super useful and makes the code compact, but it can be hard to tell what's going on when you just read the plain source.

Status quo of an AWS engineer: Coming from Java

At first, Alan is trying to write Rust code as if it were Java. He's accustomed to avoiding direct dependencies between types and instead modeling everything with an interface, so at first he creates a lot of Rust traits. He quickly learns that dyn Trait can be kind of painful to use.

He also learns that Rust doesn't really permit you to add references willy nilly. It was pretty common in Java to have a class that was threaded everywhere with all kinds of references to various parts of the system. This pattern often leads to borrow check errors in Rust.

He gets surprised by parallelism. He wants a concurrent hash map but can't find one in the standard library. There are a lot of crates on crates.io but it's not clear which would be best. He decides to use a mutex-protected hash map.
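
The mutex-protected fallback can be built from the standard library alone. The sketch below is an assumption about what such code might look like, not something taken from Alan's service; the function name `count_in_parallel` and the "requests" key are made up for illustration. Every access takes the lock, so it is simple and correct but serializes all readers and writers.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Several threads bump the same counter stored in a shared, locked map.
fn count_in_parallel(workers: usize, increments_per_worker: usize) -> usize {
    let counts: Arc<Mutex<HashMap<&'static str, usize>>> =
        Arc::new(Mutex::new(HashMap::new()));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            // Each thread gets its own handle to the same map.
            let counts = Arc::clone(&counts);
            thread::spawn(move || {
                for _ in 0..increments_per_worker {
                    let mut map = counts.lock().unwrap();
                    *map.entry("requests").or_insert(0) += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let map = counts.lock().unwrap();
    let total = map.get("requests").copied().unwrap_or(0);
    total
}
```

Calling `count_in_parallel(4, 100)` returns 400, since no increment can be lost while the lock is held.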

He is also surprised that futures in Java correspond to things executed in parallel, whereas in Rust they don't. It takes him some time to get used to this. Eventually he learns that a Rust future is more akin to a Java Callable.
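
A small experiment makes the difference concrete. In the sketch below (standard library only; `block_on`, `NoopWaker`, and `lazy_demo` are hypothetical helper names, and the executor is a deliberately naive busy-polling loop), creating the future runs none of its body. The body only executes once something polls it.

```rust
use std::cell::Cell;
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing, enough for futures that never wait.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Minimal executor: polls the future in a loop until it is ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

/// Returns whether the async body had run (before driving it, after driving it).
fn lazy_demo() -> (bool, bool) {
    let ran = Cell::new(false);

    // Creating the future does not execute its body.
    let fut = async {
        ran.set(true);
    };
    let before = ran.get();

    // Only polling (here via `block_on`) runs the body.
    block_on(fut);
    let after = ran.get();

    (before, after)
}
```

`lazy_demo()` returns `(false, true)`: nothing happens at construction time, much like a Callable that has not yet been submitted to an executor.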

Status quo of an AWS engineer: Exploring the ecosystem

Alan finds that cargo is a super powerful tool, but he finds it very hard to find crates to use. He doesn't really feel he discovers crates so much as "falls upon" them by chance. For example, he happened to see a stray mention of cargo bloat in the internals forum, and that turned out to be exactly what he needed. He finds the async-trait crate in a similar way. He's happy these tools exist, but he wishes he had more assurance of finding them; he wonders what useful things are out there that he doesn't know about.

In some cases, there are a lot of choices and it's really hard to tell which is best. Alan spent some time evaluating crates that do md5 hashing, for example, and found tons of choices. He does some quick performance testing and finds huge differences: openssl seems to be the fastest, so he takes that, but he is worried he may have missed some crates.

He had decided to use tokio because it was the thing that everyone else was using. But he gradually learns that there are more runtimes out there. Sometimes, when he adds a crate, he finds that it is bringing in a new set of dependencies that don't seem necessary.

He also gets confused by the vast array of options. tokio seems to have an AsyncRead trait, for example, but so does futures -- which one should he use?

He's heard of other runtimes and he might like to be able to try them out, but it would be too much work. Occasionally he winds up with multiple versions of the same crate, which can lead either to compilation or runtime errors. For example, when rusoto upgraded to a new version of tokio, this spilled over into confusing huge error messages from the rusoto builder owing to subtle trait and type mismatches. Fortunately the recent tokio 1.0 release promises to solve some of those problems.

Status quo of an AWS engineer: Juggling error handling

For example, one day Alan is writing a loop. In this particular part of DistriData, the data is broken into "shards" and each shard has a number of "chunks". He is connected to various backend storage hosts via HTTP, and he needs to send each chunk out to all of them. He starts by writing some code that uses hyper::body::channel to generate a pair: a channel where data can be sent and a resulting HTTP body. He then creates a future for each of those HTTP bodies that will send it to the appropriate host once it is complete. He wants those sends to be executing in the background as the data arrives on the channel, so he creates a FuturesUnordered to host them:


#![allow(unused)]
fn main() {
let mut host_senders: Vec<hyper::body::Sender> = vec![];
let mut host_futures = FuturesUnordered::new();
for host in hosts {
    let (sender, body) = hyper::body::Body::channel();
    host_senders.push(sender);
    host_futures.push(create_future_to_send_request(body));
}
}

Next, he wants to iterate through each of the shards. For each shard, he will send each chunk to each of the hosts:


#![allow(unused)]
fn main() {
let mut shards = /* generate a stream of Shards */;
while let Some(chunks) = shards.next().await {
    let chunk_futures = chunks
        .into_iter()
        .zip(&mut host_senders)
        .map(|(chunk, sender)| sender.send_data(chunk)?);

    futures::join_all(chunk_futures).await;
}
}

This loop is giving him a bit of trouble. Each of the requests to send a chunk could fail, and he would like to propagate that failure. He's used to writing ? to propagate an error, but when he puts ? in sender.send_data he gets an error:

error[E0277]: the `?` operator can only be applied to values that implement `Try`
  --> src/lib.rs:18:40
   |
18 |                 .map(|(chunk, sender)| sender.send_data(chunk)?);
   |                                        ^^^^^^^^^^^^^^^^^^^^^^^^ the `?` operator cannot be applied to type `impl futures::Future`
   |
   = help: the trait `Try` is not implemented for `impl futures::Future`
   = note: required by `into_result`

"Right," Alan thinks, "I need to await the future." He tries to move the ? to the result of join_all:


#![allow(unused)]
fn main() {
let mut shards = /* generate a stream of Shards */;
while let Some(chunks) = shards.next().await {
    let chunk_futures = chunks
        .into_iter()
        .zip(&mut host_senders)
        .map(|(chunk, sender)| sender.send_data(chunk));

    futures::join_all(chunk_futures).await?;
}
}

But now he sees:

error[E0277]: the `?` operator can only be applied to values that implement `Try`
  --> src/lib.rs:20:9
   |
20 |         join_all(chunk_futures).await?;  
   |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the `?` operator cannot be applied to type `Vec<std::result::Result<(), hyper::Error>>`
   |
   = help: the trait `Try` is not implemented for `Vec<std::result::Result<(), hyper::Error>>`
   = note: required by `into_result`

"Ah," he says, "of course, I have a vector of potential errors, not a single error." He remembers seeing a trick for dealing with this in his Rust training. Pulling up the slides, he finds the example. It takes him a little bit to get the type annotations just right, but he finally lands on:


#![allow(unused)]
fn main() {
while let Some(chunks) = shards.next().await {
    let chunk_futures = chunks
        .into_iter()
        .zip(&mut host_senders)
        .map(|(chunk, sender)| sender.send_data(chunk));

    join_all(chunk_futures)
        .await
        .into_iter()
        .collect::<Result<Vec<_>, _>>()?;
}
}

playground
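
The trick from the slides works on any iterator of Result values, not only on the output of join_all. A minimal synchronous sketch (the function name `all_or_first_error` and the `u32`/`String` types are placeholders chosen for illustration):

```rust
/// Collapses many results into one: either every value, or the first error.
fn all_or_first_error(results: Vec<Result<u32, String>>) -> Result<Vec<u32>, String> {
    // `Result` implements `FromIterator`, so `collect` can build
    // `Result<Vec<_>, _>` and stops at the first `Err` it encounters.
    results.into_iter().collect()
}
```

Here the target type comes from the function's return type. In Alan's loop there is no such hint, which is why he needs the `::<Result<Vec<_>, _>>` annotation before applying ?.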

The loop now works: it sends each chunk from each shard to each host, and propagates errors in a reasonable way. The last step is to wait for those writes to complete. To do this, he has to wait until all the data has actually been sent, keeping in mind that there could be errors in these sends too. He writes a quick loop to iterate over the stream of sending futures host_futures that he created earlier:


#![allow(unused)]
fn main() {
loop {
    match host_futures.next().await {
        Some(Ok(response)) => handle_response(response)?,
        Some(Err(e)) => return Err(e)?,
        None => return Ok(()),
    }
}
}

It takes him a few tries to get this loop right too. The Some(Err(e)) case in particular is a bit finicky. He tried to just return Err(e) but it gave him an error, because the type of e didn't match the more generic Box<dyn Error> type that his function returns. He remembered that the ? operator performs some interconversion, though, and that you can do Err(e)? to work around this particular problem.
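
The conversion Alan is leaning on can be reproduced without any async code. In the sketch below (hypothetical function names, with a parse error standing in for the HTTP error type), the first function uses the same `return Err(e)?` workaround, and the second spells out the conversion with `.into()`, which is what ? does behind the scenes via the From trait.

```rust
use std::error::Error;

/// Alan's workaround: `?` converts the concrete error into `Box<dyn Error>`.
fn parse_port(raw: &str) -> Result<u16, Box<dyn Error>> {
    match raw.parse::<u16>() {
        Ok(port) => Ok(port),
        // A bare `return Err(e)` is a type mismatch here, because `e` is a
        // `ParseIntError`, not a `Box<dyn Error>`.
        Err(e) => return Err(e)?,
    }
}

/// The same conversion written explicitly.
fn parse_port_into(raw: &str) -> Result<u16, Box<dyn Error>> {
    match raw.parse::<u16>() {
        Ok(port) => Ok(port),
        Err(e) => Err(e.into()),
    }
}
```

Both functions behave identically. The explicit `.into()` form is often considered easier to read, since `Err(e)?` hides an early return inside what looks like a plain expression.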

He surveys the final function he has built, feeling a sense of satisfaction that he got it to work. Still, he can't help but think that this was an awful lot of work just to propagate errors. Plus, he knows from experience that the errors in Rust are often less useful for finding problems than the ones he used to get in Java. Rust errors don't capture backtraces, for example. He tried to add some code to capture backtraces at one point but it seemed really slow, taking 20ms or so to snag a backtrace, and he knew that would be a problem in production.


#![allow(unused)]
fn main() {
// Prepare the outgoing HTTP requests to each host:
let mut host_senders: Vec<hyper::body::Sender> = vec![];
let mut host_futures = FuturesUnordered::new();
for host in hosts {
    let (sender, body) = hyper::body::Body::channel();
    host_senders.push(sender);
    host_futures.push(create_future_to_send_request(body));
}

// Send each chunk from each shard to each host:
while let Some(chunks) = shards.next().await {
    let chunk_futures = chunks
        .into_iter()
        .zip(&mut host_senders)
        .map(|(chunk, sender)| sender.send_data(chunk));

    join_all(chunk_futures)
        .await
        .into_iter()
        .collect::<Result<Vec<_>, _>>()?;
}

// Wait for all HTTP requests to complete, aborting on error:
loop {
    match host_futures.next().await {
        Some(Ok(response)) => handle_response(response)?,
        Some(Err(e)) => return Err(e)?,
        None => return Ok(()),
    }
}
}

Status quo of an AWS engineer: Failure to parallelize

As Alan reads the loop he just built, he realizes that he ought to be able to process each shard independently. He decides to try spawning the tasks in parallel. He starts by trying to create a stream that spawns out tasks:


#![allow(unused)]
fn main() {
// Send each chunk from each shard to each host:
while let Some(chunks) = shards.next().await {
    tokio::spawn(async move {
        let chunk_futures = chunks
            .into_iter()
            .zip(&mut host_senders)
            .map(|(chunk, sender)| sender.send_data(chunk));

        join_all(chunk_futures)
            .await
            .into_iter()
            .collect::<Result<Vec<_>, _>>()?;
    })
}
}

But this is giving him errors about the ? operator again:

error[E0277]: the `?` operator can only be used in an async block that returns `Result` or `Option` (or another type that implements `Try`)
  --> src/lib.rs:21:13
   |
15 |            tokio::spawn(async move {
   |   _________________________________-
16 |  |             let chunk_futures = chunks
17 |  |                 .into_iter()
18 |  |                 .zip(&mut host_senders)
...   |
21 | /|             join_all(chunk_futures)
22 | ||                 .await
23 | ||                 .into_iter()
24 | ||                 .collect::<Result<Vec<_>, _>>()?;
   | ||________________________________________________^ cannot use the `?` operator in an async block that returns `()`
25 |  |         });
   |  |_________- this function should return `Result` or `Option` to accept `?`
   |
   = help: the trait `Try` is not implemented for `()`
   = note: required by `from_error`

Annoyed, he decides to convert those to unwrap calls temporarily (which will just abort the process on error) just to see if he can get something working:


#![allow(unused)]
fn main() {
    while let Some(chunks) = shards.next().await {
        tokio::spawn(async move {
            let chunk_futures = chunks
                .into_iter()
                .zip(&mut host_senders)
                .map(|(chunk, sender)| sender.send_data(chunk));
    
            join_all(chunk_futures)
                .await
                .into_iter()
                .collect::<Result<Vec<_>, _>>()
                .unwrap();
        });
    }
}

But now he gets this error (playground):

error[E0382]: use of moved value: `host_senders`
  --> src/lib.rs:15:33
   |
12 |       let mut host_senders: Vec<hyper::body::Sender> = vec![];
   |           ---------------- move occurs because `host_senders` has type `Vec<hyper::body::Sender>`, which does not implement the `Copy` trait
...
15 |           tokio::spawn(async move {
   |  _________________________________^
16 | |             let chunk_futures = chunks
17 | |                 .into_iter()
18 | |                 .zip(&mut host_senders)
   | |                           ------------ use occurs due to use in generator
...  |
24 | |                 .collect::<Result<Vec<_>, _>>().unwrap();
25 | |         });
   | |_________^ value moved here, in previous iteration of loop

He removes the move keyword from async move, but then he sees:

error[E0373]: async block may outlive the current function, but it borrows `host_senders`, which is owned by the current function
  --> src/lib.rs:15:28
   |
15 |           tokio::spawn(async {
   |  ____________________________^
16 | |             let chunk_futures = chunks
17 | |                 .into_iter()
18 | |                 .zip(&mut host_senders)
   | |                           ------------ `host_senders` is borrowed here
...  |
24 | |                 .collect::<Result<Vec<_>, _>>().unwrap();
25 | |         });
   | |_________^ may outlive borrowed value `host_senders`
   |
   = note: async blocks are not executed immediately and must either take a reference or ownership of outside variables they use
help: to force the async block to take ownership of `host_senders` (and any other referenced variables), use the `move` keyword
   |
15 |         tokio::spawn(async move {
16 |             let chunk_futures = chunks
17 |                 .into_iter()
18 |                 .zip(&mut host_senders)
19 |                 .map(|(chunk, sender)| sender.send_data(chunk));
20 |     
 ...

error[E0499]: cannot borrow `host_senders` as mutable more than once at a time
  --> src/lib.rs:15:28
   |
15 |            tokio::spawn(async {
   |   ______________________-_____^
   |  |______________________|
   | ||
16 | ||             let chunk_futures = chunks
17 | ||                 .into_iter()
18 | ||                 .zip(&mut host_senders)
   | ||                           ------------ borrows occur due to use of `host_senders` in generator
...  ||
24 | ||                 .collect::<Result<Vec<_>, _>>().unwrap();
25 | ||         });
   | ||         ^
   | ||_________|
   | |__________`host_senders` was mutably borrowed here in the previous iteration of the loop
   |            argument requires that `host_senders` is borrowed for `'static`

At this point, he gives up and leaves a // TODO comment:


// TODO: This loop should be able to execute in parallel,
// but I can't figure out how to make it work. -Alan
while let Some(chunks) = shards.next().await {
    ...
}

Editorial comment: In this case, the channel to which he is sending the data can only receive data from a single sender at a time (it has an &mut self). Rust is potentially saving Alan from a nasty data race here. He could have used a mutex around the senders, but he would still hit issues trying to spawn parallel threads because he lacks an API that lets him borrow from the stack.
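For comparison, synchronous Rust does have an API for borrowing from the stack across threads: std::thread::scope (stable since Rust 1.63). The sketch below is only an illustration of that idea, not something Alan could drop into his async code. The Sender type and the send_round function are hypothetical stand-ins, with Sender mimicking the &mut self requirement of hyper::body::Sender.

```rust
use std::thread;

/// Hypothetical stand-in for `hyper::body::Sender`: sending needs `&mut self`.
struct Sender {
    sent: Vec<u8>,
}

impl Sender {
    fn send_data(&mut self, chunk: u8) {
        self.sent.push(chunk);
    }
}

/// Sends one chunk per sender, each on its own thread.
///
/// The scope joins every thread before it returns, so the threads may hold
/// plain `&mut` borrows of data owned by the caller's stack frame.
fn send_round(senders: &mut [Sender], chunks: Vec<u8>) {
    thread::scope(|scope| {
        // `iter_mut` hands out disjoint `&mut Sender` borrows, one per thread.
        for (sender, chunk) in senders.iter_mut().zip(chunks) {
            scope.spawn(move || sender.send_data(chunk));
        }
    });
}
```

Because each thread receives a distinct &mut Sender, no mutex is needed, and send_round can be called again for the next shard once the scope has ended.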

Status quo of an AWS engineer: Borrow check errors

Alan has more or less gotten the hang of the borrow checker, but sometimes it still surprises him. One day, he is working on a piece of code in DistriData. There is a set of connections:


struct Connection {
    buffer: Vec<u8>,
}

and each Connection has the ability to iterate through various requests. These requests return subslices of the data in the connection:


struct Request<'a> {
    headers: Vec<&'a u8>,
}

He writes a routine to get the next request from the connection. It begins by reading data into the internal buffer and then parsing from that buffer and returning the request (playground):


impl Connection {
    pub async fn read_next(&mut self) -> Request<'_> {
        loop {
            // The future must be awaited, otherwise it never runs.
            self.read_into_buffer().await;

            // can't borrow self.buffer, even though we only hang on to it in the
            // return branch
            match Request::try_parse(&self.buffer) {
                Some(r) => return r,
                None => continue,
            }
        }
    }

    async fn read_into_buffer(&mut self) {
        self.buffer.push(1u8);
    }
}

This code, however, doesn't build. He gets the following error:

error[E0502]: cannot borrow `*self` as mutable because it is also borrowed as immutable
  --> src/lib.rs:15:12
   |
13 |     pub async fn read_next(&mut self) -> Request<'_> {
   |                            - let's call the lifetime of this reference `'1`
14 |        loop {
15 |            self.read_into_buffer().await;
   |            ^^^^^^^^^^^^^^^^^^^^^^^ mutable borrow occurs here
...
19 |            match Request::try_parse(&self.buffer) {    
   |                                     ------------ immutable borrow occurs here
20 |                Some(r) => return r,
   |                                  - returning this value requires that `self.buffer` is borrowed for `'1`

This is confusing. He can see that there is a mutable borrow occurring, and an immutable one, but it seems like they occur at disjoint periods of time. Why is the compiler complaining?

After asking on #rust in the AWS Slack, he learns that this is a pattern that Rust's borrow checker just can't support. It gets confused when you conditionally return borrowed data from a function and winds up producing errors that aren't necessary. Apparently there's a research project named after a character in Hamlet that might help, but that isn't going to help him now. The Slack channel points him at the ouroboros project and he eventually uses it to work around the problem (playground).
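The story has Alan reach for ouroboros. A different workaround that is sometimes possible, sketched here under simplifying assumptions, is to have the loop produce only an index and to borrow from the buffer once, after the loop. The sketch is synchronous to keep it self-contained; the incoming field, the newline-delimited "request" format, and the Option return type are all hypothetical and not part of the original code.

```rust
use std::collections::VecDeque;

struct Connection {
    buffer: Vec<u8>,
    /// Hypothetical stand-in for bytes arriving from the network.
    incoming: VecDeque<u8>,
}

struct Request<'a> {
    line: &'a [u8],
}

impl Connection {
    /// Moves one byte into the buffer; returns false once the input is exhausted.
    fn read_into_buffer(&mut self) -> bool {
        match self.incoming.pop_front() {
            Some(byte) => {
                self.buffer.push(byte);
                true
            }
            None => false,
        }
    }

    fn read_next(&mut self) -> Option<Request<'_>> {
        // The loop yields a plain `usize`, so no borrow of `self.buffer`
        // is alive when `read_into_buffer` takes `&mut self`.
        let len = loop {
            if let Some(len) = self.buffer.iter().position(|&b| b == b'\n') {
                break len;
            }
            if !self.read_into_buffer() {
                return None;
            }
        };
        // Borrow exactly once, on the path that returns.
        Some(Request { line: &self.buffer[..len] })
    }
}
```

The cost of this shape is that the parser has to be split into a "find the extent" step and a "build the borrowed view" step, which is not always practical for richer request types.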

Status quo of an AWS engineer: Solving a deadlock

Alan logs into work the next morning to see a message in Slack:

Alan, I've noticed that the code to replicate the shards across the hosts is sometimes leading to a deadlock. Any idea what's going on?

This is the same code that Alan tried to parallelize earlier. He pulls up the function, but everything seems correct! It's not obvious what the problem could be.


// Prepare the outgoing HTTP requests to each host:
let mut host_senders: Vec<hyper::body::Sender> = vec![];
let mut host_futures = FuturesUnordered::new();
for host in hosts {
    let (sender, body) = hyper::body::Body::channel();
    host_senders.push(sender);
    host_futures.push(create_future_to_send_request(body));
}

// Send each chunk from each shard to each host:
while let Some(chunks) = shards.next().await {
    let chunk_futures = chunks
        .into_iter()
        .zip(&mut host_senders)
        .map(|(chunk, sender)| sender.send_data(chunk));

    join_all(chunk_futures)
        .await
        .into_iter()
        .collect::<Result<Vec<_>, _>>()?;
}

// Wait for all HTTP requests to complete, aborting on error:
loop {
    match host_futures.next().await {
        Some(Ok(response)) => handle_response(response)?,
        Some(Err(e)) => return Err(e).map_err(box_err)?,
        None => return Ok(()),
    }
}

He tries to reproduce the deadlock. He is able to reproduce the problem readily enough, but only with larger requests. He had always used small tests before. He connects to the process with the debugger but he can't really make heads or tails of what tasks seem to be stuck (see Alan tries to debug a hang or Barbara wants async insights). He resorts to sprinkling logging everywhere.

At long last, he starts to see a pattern emerging. From the logs, he sees the data from each chunk is being sent to the hyper channel, but it never seems to be sent over the HTTP connection to the backend hosts. He is pretty confused by this -- he thought that the futures he pushed into host_futures should be taking care of sending the request body out over the internet. He goes to talk to Barbara -- who, as it happens, has been through this very problem in the past -- and she explains to him what is wrong.

"When you push those futures into FuturesUnordered", she says, "they will only make progress when you are actually awaiting on the stream. With the way the loop is set up now, the actual sending of data won't start until that third loop. Presumably your deadlock is because the second loop is blocked, waiting for some of the data to be sent."

"Huh. That's...weird. How can I fix it?", asks Alan.

"You need to spawn a separate task," says Barbara. "Something like this should work." She modifies the code to spawn a task that performs the third loop. That task is spawned before the second loop starts:


// Prepare the outgoing HTTP requests to each host:
let mut host_senders: Vec<hyper::body::Sender> = vec![];
let mut host_futures = FuturesUnordered::new();
for host in hosts {
    let (sender, body) = hyper::body::Body::channel();
    host_senders.push(sender);
    host_futures.push(create_future_to_send_request(body));
}

// Make sure this runs in parallel with the loop below!
let send_future = tokio::spawn(async move {
    // Wait for all HTTP requests to complete, aborting on error:
    loop {
        match host_futures.next().await {
            Some(Ok(response)) => handle_response(response)?,
            Some(Err(e)) => break Err(e)?,
            None => break Ok(()),
        }
    }
});

// Send each chunk from each shard to each host:
while let Some(chunks) = shards.next().await {
    let chunk_futures = chunks
        .into_iter()
        .zip(&mut host_senders)
        .map(|(chunk, sender)| sender.send_data(chunk));

    join_all(chunk_futures)
        .await
        .into_iter()
        .collect::<Result<Vec<_>, _>>()?;
}

send_future.await
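The underlying rule, that a Rust future does no work until something polls it, can be reproduced without hyper or FuturesUnordered. The following is a minimal sketch using only the standard library. The block_on helper is a deliberately naive busy-polling executor written for this illustration, and all names in it are hypothetical.

```rust
use std::future::Future;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; enough for futures that are ready immediately.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Naive executor for illustration only: polls in a loop until completion.
fn block_on<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut future = Box::pin(future);
    loop {
        if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
            return output;
        }
    }
}

/// Reports whether the "request" had started before and after it was polled.
fn started_before_and_after_polling() -> (bool, bool) {
    let started = Arc::new(AtomicBool::new(false));
    let flag = Arc::clone(&started);

    // Creating the future runs none of its body.
    let send_request = async move { flag.store(true, Ordering::SeqCst) };
    let before = started.load(Ordering::SeqCst);

    // Only polling (here via `block_on`) makes it progress.
    block_on(send_request);
    let after = started.load(Ordering::SeqCst);

    (before, after)
}
```

In Alan's original code the futures in host_futures sat in exactly this "created but not polled" state while the second loop waited on them.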

"Oof", says Alan, "I'll try to remember that!"

Status quo of an AWS engineer: Encountering pin

As Alan is building the server, he encounters a case where he wants to extend a stream of data to track some additional metrics. The stream implements AsyncRead. He thinks, "Ah, I'll just make a wrapper type that can extend any AsyncRead." He opens up the rustdoc, though, and realizes that this may be a bit tricky. "What is this self: Pin<&mut Self> notation?" he thinks. He had vaguely heard of Pin when skimming the docs for futures and things, but it was never something he had to work with directly before.

Alan's experiences here are well documented in Alan hates writing a Stream. Suffice to say that, at long last, he gets it to work, but he does not feel he really understands what is going on. Talking with his coworkers on Slack, he notes, "Mostly I just add Pin and whatever else the compiler asks for until it works; then I pray it doesn't crash." :crossed_fingers:
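To give a feel for the kind of wrapper Alan is writing, here is a minimal sketch that wraps a Future rather than an AsyncRead (AsyncRead is not in the standard library, so Future serves as a stand-in). The CountPolls type and the poll_once helper are hypothetical. The sketch sidesteps the hard part by requiring the inner type to be Unpin; a wrapper for arbitrary inner types would need pin projection, for example via the pin-project crate.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical metrics wrapper: counts how often the inner future is polled.
struct CountPolls<F> {
    inner: F,
    polls: u32,
}

impl<F: Future + Unpin> Future for CountPolls<F> {
    type Output = (F::Output, u32);

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // `F: Unpin` makes `CountPolls<F>` `Unpin` as well, so we may take
        // an ordinary `&mut Self` out of the `Pin`.
        let this = self.get_mut();
        this.polls += 1;
        // Re-pin the inner future before delegating to it.
        match Pin::new(&mut this.inner).poll(cx) {
            Poll::Ready(value) => Poll::Ready((value, this.polls)),
            Poll::Pending => Poll::Pending,
        }
    }
}

/// Waker that does nothing; sufficient for a single manual poll.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls an `Unpin` future exactly once, for demonstration purposes.
fn poll_once<F: Future + Unpin>(future: &mut F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    Pin::new(future).poll(&mut cx)
}
```

Wrapping std::future::ready(7) and calling poll_once on it yields the inner value together with the number of polls so far.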

Status quo of an AWS engineer: Figuring out the best option

Sometime after working on AsyncRead, Alan stumbles over the async-trait crate. This crate offers a macro that will let him add async fn to traits. He's excited about this because it seems like it would allow him to rewrite some of the custom AsyncRead impls in a cleaner way. The only problem is that he can't really judge what the implications are going to be -- will it be faster? Slower? It's hard to tell until it's done. He feels like this comes up a lot in Rust: he is forced to make a choice and see it all the way through to the end before he can decide whether he likes it (or if it will work at all: sometimes he encounters a compiler error part of the way through that he just can't figure out how to resolve). It's particularly frustrating in Async Rust where there seem to be so many options to choose from.

Status quo of an AWS engineer: Testing the service

At first, Alan is content to test by hand. But once the server is starting to really work, he realizes he needs to do unit testing. He wants to do something like Mockito in Rust, so he starts searching the internet to find out what the options are. To his surprise, he learns that there doesn't seem to be any comparable framework in Rust.

One option he considers is making all of his functions generic. For example, he could create a trait to model the network, so that he can insert artificial pauses and other problems during testing:


trait Network {
    ...
}

Writing such a trait is fairly complicated, but even if he wrote it, he would have to make all of his structs and functions generic:


struct MyService<N: Network> {
    ...
}

Alan starts threading these parameters through the code and quickly gets overwhelmed.
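As a rough picture of what this generic approach involves, the sketch below fills in the elided trait and struct with invented details. The send method, the FlakyNetwork test double, and the replicate method are all hypothetical, and the sketch is synchronous, whereas Alan's real trait would also need async methods.

```rust
/// Hypothetical abstraction over the network; the story elides the real trait.
trait Network {
    fn send(&mut self, payload: &[u8]) -> Result<usize, String>;
}

/// Test double that fails a fixed number of times before succeeding.
struct FlakyNetwork {
    failures_left: u32,
}

impl Network for FlakyNetwork {
    fn send(&mut self, payload: &[u8]) -> Result<usize, String> {
        if self.failures_left > 0 {
            self.failures_left -= 1;
            return Err("injected network error".to_string());
        }
        Ok(payload.len())
    }
}

/// Every type that touches the network now carries the `N` parameter.
struct MyService<N: Network> {
    network: N,
}

impl<N: Network> MyService<N> {
    /// Retries until the payload is sent or `max_attempts` is used up.
    fn replicate(&mut self, payload: &[u8], max_attempts: u32) -> Result<usize, String> {
        let mut last_error = "no attempts made".to_string();
        for _ in 0..max_attempts {
            match self.network.send(payload) {
                Ok(sent) => return Ok(sent),
                Err(error) => last_error = error,
            }
        }
        Err(last_error)
    }
}
```

Even in this tiny example the type parameter appears on the struct, on the impl block, and on anything that stores a MyService, which is the threading-through that overwhelms Alan.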

He decides instead to test his real code without any mocking. He and his team start building a load-testing framework that they call "simworld". They need to be able to inject network errors, control timing, and force other unusual situations.

Building simworld takes a lot of time, but it is very useful, and they start to gain some confidence in their code.

Status quo of an AWS engineer: Using the debugger

Even though the code is starting to work, they soon uncover a test that is not behaving as it ought to. Alan decides to try loading the Rust code into the debugger. He quickly realizes that the debugger is showing him the raw threads that are used to implement his service, and not the tasks and things that the service uses at a logical level, but that's not a problem for what he's doing right now. He sets a breakpoint on a particular line of code that corresponds to the place where things seem to be going wrong.

At first, the debugger seems to be working great, but Alan soon realizes that the experience is a far cry from what he is used to with IntelliJ and Java code. Stepping through the code is unpredictable; it's not always obvious which function he will be stepping into. More than once Alan is confronted with a screen full of assembly. "No thank you," he thinks, and just avoids stepping into that function in the future. He finds that he often cannot print the values of variables ('variable optimized out', says the debugger) or execute code dynamically. Sometimes he is able to print them but instead of seeing something useful, he gets a bunch of random pointer values.

Alan gives up on the debugger. He starts to thread printfs and logging statements throughout his code. The tracing crate is pretty useful. Eventually, he is able to find and fix the problem and get his test case passing.

Status quo of an AWS engineer: Missed Waker leads to lost performance

Once the server is working, Alan starts to benchmark it. He is not really sure what to expect, but he is hoping to see an improvement in performance relative to the baseline service they were using before. To his surprise, it seems to be running slower!

After trying a few common tricks to improve performance to no avail, Alan wishes -- not for the first time -- that he had better tools to understand what was happening. He decides instead to add more metrics and logs in his service, to understand where the bottlenecks are. Alan is used to using a well-supported internal tool (or a mature open source project) to collect metrics, where all he needs to do is pull in the library and set up a few configuration parameters.

However, in Rust, there is no widely used, battle-tested library for this, inside or outside his company. Even less so in an async code base! So Alan uses what seem to be the best options, the tracing and metrics crates, but he quickly finds that they can't do a few of the things he wants to do, and somehow invoking the metrics is making his service even slower. Now Alan has to debug and profile his metrics implementation before he can even fix his service. (Cue another story on how that's difficult...)

After a few days of poking at the problem, Alan notices something odd. It seems like there is often a fairly large delay between the completion of a particular event and the execution of the code that is meant to respond to that event. Looking more closely, he realizes that the code for handling that event fails to trigger the Waker associated with the future, and hence the future never wakes up.

As it happens, this problem was hidden from him because that particular future was combined with a number of others. Eventually, the other futures get signalled, and hence the event does get handled -- but less promptly than it should be. He fixes the problem and performance is restored.
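The contract Alan's code violated can be shown with a small hand-written future. In this sketch (standard library only; the Shared, Event, and complete names and the thread-parking block_on are all hypothetical), poll stores the waker whenever it returns Pending, and the code that completes the event must call wake. Dropping that one call is the kind of bug described above.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

#[derive(Default)]
struct Shared {
    done: bool,
    waker: Option<Waker>,
}

/// Future that completes once `complete` has been called on its shared state.
struct Event(Arc<Mutex<Shared>>);

impl Future for Event {
    type Output = ();

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        let mut shared = self.0.lock().unwrap();
        if shared.done {
            Poll::Ready(())
        } else {
            // Remember who to notify before reporting `Pending`.
            shared.waker = Some(cx.waker().clone());
            Poll::Pending
        }
    }
}

/// Marks the event as done and notifies the waiting task.
fn complete(shared: &Arc<Mutex<Shared>>) {
    let waker = {
        let mut state = shared.lock().unwrap();
        state.done = true;
        state.waker.take()
    };
    // Omitting this call is the "missed waker" bug: the task is never told.
    if let Some(waker) = waker {
        waker.wake();
    }
}

/// Waker that unparks the thread running `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Small executor for illustration: sleeps until woken, then polls again.
fn block_on<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    let mut future = Box::pin(future);
    loop {
        match future.as_mut().poll(&mut cx) {
            Poll::Ready(output) => return output,
            Poll::Pending => thread::park(),
        }
    }
}

/// Completes the event from another thread while this thread awaits it.
fn run_demo() -> bool {
    let shared = Arc::new(Mutex::new(Shared::default()));
    let producer = {
        let shared = Arc::clone(&shared);
        thread::spawn(move || complete(&shared))
    };
    block_on(Event(Arc::clone(&shared)));
    producer.join().unwrap();
    let done = shared.lock().unwrap().done;
    done
}
```

With this executor a missing wake would hang forever. In Alan's service the future was polled again only because sibling futures happened to wake the same task, which is why the symptom was latency rather than a hang.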

"I'm glad I had a baseline to compare this against!", he thinks. "I doubt I would have noticed this problem otherwise."

Status quo of an AWS engineer: Debugging overall performance loss

Alan's service is working better and better, but performance is still lagging behind where he hoped it would be. It seems to be about 20% slower than the Java version! After calling in Barbara to help him diagnose the problem, Alan identifies one culprit: some of the types in Alan's system are really large! The system seems to spend a surprising amount of time just copying bytes. Barbara helps Alan diagnose this by showing him some hidden rustc flags, tinkering with his perf setup, and using a few other tricks.
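One way to get a first impression of such sizes, without any special compiler flags, is std::mem::size_of_val. The sketch below is an illustration rather than the technique Barbara used: it compares two hypothetical async functions, one holding a 4 KiB array across an await and one holding a Box of the same array. The exact sizes depend on the compiler version, so no specific numbers are claimed here.

```rust
use std::future::ready;
use std::mem::size_of_val;

/// Holds a 4 KiB array across an `.await`, so the array is stored in the future.
async fn hold_inline(buffer: [u8; 4096]) -> u8 {
    ready(()).await;
    buffer[0]
}

/// Same logic, but only a pointer to the heap allocation is stored in the future.
async fn hold_boxed(buffer: Box<[u8; 4096]>) -> u8 {
    ready(()).await;
    buffer[0]
}

/// Returns the size in bytes of each future (neither is ever polled).
fn future_sizes() -> (usize, usize) {
    let inline = hold_inline([0; 4096]);
    let boxed = hold_boxed(Box::new([0; 4096]));
    (size_of_val(&inline), size_of_val(&boxed))
}
```

Every move of the first future, for example into a spawn call or into a collection, copies at least those 4096 bytes, while the boxed variant moves only a few words.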

There is still a performance gap, though, and Alan's not sure where it could be coming from. There are a few candidates:

  • Perhaps they are not using tokio's scheduler optimally.
  • Perhaps the memory allocation costs introduced by the #[async_trait] macro are starting to add up.

Alan tinkers with jemalloc and finds that it does improve performance, so that's interesting, but he'd like to have a better understanding of why.

Status quo of an AWS engineer: Getting ready to deploy the service

The next morning, Alan is talking to his team. The service is more-or-less working, although there is room to improve performance. It's time to talk about the Operational Readiness Review (ORR). Before any service can be put into production at AWS, it needs to pass an ORR. This is a stringent process where experienced senior engineers grill the team about all kinds of things that could go wrong and how they would handle them. These plans are gathered into a document that can be consulted should the need arise.

Alan has been through a few ORRs in his time at AWS. They're always stressful, but they're usually not that big a deal. A lot of the worst cases are handled automatically by the Java frameworks that Alan is accustomed to working with: for example, they have connection timeouts, or facilities for logging particular kinds of events. For the stuff that is not automatic, there are known "best practices" that can help.

For Rust, there are a lot of unknowns. The standard servers don't exist, and Alan's team has had to roll their own. There aren't nearly as many tools for performance monitoring or other sorts of improvements. Alan's team is treading new ground by deploying a Rust-based service, and they know they have to budget extra time to manage that.

Status quo of an AWS engineer: Using JNI

One other problem that Alan's team has encountered is that some of the standard libraries they would use at AWS are only available in Java. After some tinkering, Alan's team decides to stand up a Java server as part of their project. The idea is that the server can accept the connections and then use JNI to invoke the Rust code; having the Rust code in process means it can communicate directly with the underlying file descriptor and avoid copies.

They stand up the Java side fairly quickly and then spend a bit of time experimenting with different ways to handle the "handoff" to Rust code. The first problem is keeping the tokio runtime alive. Their first attempt to connect using JNI was causing the runtime to get torn down. But they figure out that they can store the Runtime in a static variable.
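The "store the Runtime in a static variable" pattern might look roughly like the following sketch. It uses std::sync::OnceLock (stable since Rust 1.70) and a hypothetical Runtime struct standing in for tokio::runtime::Runtime, so that the example stays within the standard library; the CONSTRUCTED counter exists only to show that initialization happens once.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for `tokio::runtime::Runtime`.
struct Runtime {
    worker_threads: usize,
}

/// Counts constructions, purely to demonstrate one-time initialization.
static CONSTRUCTED: AtomicUsize = AtomicUsize::new(0);

/// Lives for the whole process, so it is never torn down between JNI calls.
static RUNTIME: OnceLock<Runtime> = OnceLock::new();

/// Returns the process-wide runtime, creating it on first use.
fn runtime() -> &'static Runtime {
    RUNTIME.get_or_init(|| {
        CONSTRUCTED.fetch_add(1, Ordering::SeqCst);
        Runtime { worker_threads: 4 }
    })
}
```

Each JNI entry point would then call runtime() instead of building its own instance, so every call shares the same long-lived runtime.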

Next, they find that having Rust code access Java objects is quite expensive; it's cheaper to pass byte buffers across the boundary using protobuf. They try a few options for serialization and deserialization to find which works best.

Overall, the integration with the JNI works fairly smoothly for them, but they wish there had been a documented pattern showing them the best way to set things up, rather than having to discover it themselves.

😱 Status quo stories: Barbara Anguishes Over HTTP

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect people's experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

Barbara is starting a new project, working together with Alan. They want to write a Rust library and as part of it they will need to make a few HTTP calls to various web services. While HTTP is part of the responsibilities of the library it is by no means the only thing the library will need to do.

As they are pair programming, they get to the part of the library where HTTP will be involved and Alan asks Barbara, "OK, how do I make an HTTP request?".

As an experienced async Rust developer, Barbara has been dreading this question from the start of the project. She's tempted to ask "How long do you have?", but she quickly gathers herself and starts to outline the various considerations. She starts with a relatively simple question: "Should we use an HTTP library with a sync interface or an async interface?".

Alan, who comes from a JavaScript background, remembers the transition from callbacks to async/await in that language. He assumes Rust is merely making its transition to async/await, and that it will eventually always be the preferred choice. He hesitates and asks Barbara: "Isn't async/await always better?". Barbara, who can think of many scenarios where a blocking, sync interface would likely be better, weighs whether going down the rabbit hole of async vs sync is the right way to spend their time. She decides instead to try to directly get at the question of whether they should use async for this particular project. She knows that bridging sync and async can be difficult, and so there's another question they need to answer first: "Are we going to expose a sync or an async interface to the users of our library?".

Alan, still confused about when using a sync interface is the right choice, replies as confidently as he can: "Everybody wants to use async these days. Let's do that!". He braces for Barbara's answer as he's not even sure what he said is actually true.

Barbara replies, "If we expose an async API then we need to decide which async HTTP implementation we will use". As she finishes saying this, Barbara feels slightly uneasy. She knows that it is possible to use a sync HTTP library and expose it through an async API, but she fears totally confusing Alan and so decides to not mention this fact.

Barbara looks over at Alan and sees a blank stare on his face. She repeats the question: "So, which async HTTP implementation should we use?". Alan responds with the only thing that comes to his mind: "Which one is the best?", to which Barbara responds, "Well, it depends on which async runtime you're using".

Alan, feeling utterly dejected and hoping that the considerations will soon end, tries a new route out of this conversation: "Can we allow the user of the library to decide?".

Barbara thinks to herself, "Oh boy, we could provide a trait that abstracts over the HTTP request and response and allow the user to provide the implementation for whatever HTTP library they want... BUT, if we ever need any additional functionality that an async runtime needs to expose - like async locks or async timers - we might be forced to pick an actual runtime implementation on behalf of the user... Perhaps we can put the most popular runtime implementations behind feature flags and let the user choose that way... BUT what if we want to allow plugging in of different runtimes?"

Alan, having watched Barbara stare off into the distance for what felt like a half-hour, feels bad for his colleague. All he can think to himself is how Rust is so much more complicated than C#.

🤔 Frequently Asked Questions

What are the morals of the story?

  • What is a very mundane and simple decision in many other languages, picking an HTTP library, requires users to contemplate many different considerations.
  • There is no practical way to choose an HTTP library that will serve most of the ecosystem. Sync/Async, competing runtimes, etc. - someone will always be left out.
  • HTTP is a small implementation detail of this library, but it is a HUGE decision that will ultimately be the biggest factor in who can adopt their library.

What are the sources for this story?

Based on the author's personal experience of taking newcomers to Rust through the decision making process of picking an HTTP implementation for a library.

Why did you choose Barbara to tell this story?

Barbara knows all the considerations and their consequences. A less experienced Rust developer might just make a choice even if that choice isn't the right one for them.

😱 Status quo stories: Barbara wants single threaded optimizations, but not that much

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect people's experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

Barbara is working on operating system services, all of which benefit from concurrency, but only some of which benefit from parallelism. In cases where a service does not benefit from parallelism, a single-threaded executor is used which allows spawning !Send tasks.

Barbara has developed a useful feature as a module within one of her system's single-threaded services. The feature allows for the creation of multiple IPC objects to use within concurrent tasks while caching and reusing some of the heavier computation performed. This is implemented with reference counted interior mutability:


pub struct IpcHandle {
    cache_storage: Rc<RefCell<IpcCache>>,
    // ...
}

struct IpcCache { /* ... */ }

A colleague asks Barbara if she'd be interested in making this code available to other services with similar needs. After Barbara factors the module out into its own crate, her colleague tries integrating it into their service. This fails because the second service needs to hold IpcHandles across yield points and it uses a multi-threaded executor. The multi-threaded executor requires that all tasks implement Send so they can be migrated between threads for work-stealing scheduling.

Rejected: both single- and multi-threaded versions

Barbara considers her options to make the crate usable by the multi-threaded system service. She decides against making IpcHandle available in both single-threaded and multi-threaded versions. To do this generically would require a lot of boilerplate. For example, it would require manually duplicating APIs which would need to have a Send bound in the multi-threaded case:


impl LocalIpcHandle {
    fn spawn_on_reply<F: Future + 'static>(&mut self, to_spawn: impl Fn(IpcReply) -> F) {
        // ...
    }
}

impl SendIpcHandle {
    fn spawn_on_reply<F: Future + Send + 'static>(&mut self, to_spawn: impl Fn(IpcReply) -> F) {
        // ...
    }
}

Accepted: only implement multi-threaded version

Barbara decides it's not worth the effort to duplicate so much of the crate's functionality, and decides to make the whole library thread-safe:


pub struct IpcHandle {
    cache_storage: Arc<Mutex<IpcCache>>,
    // ...
}

struct IpcCache { /* ... */ }

This requires her to migrate her original system service to use multi-threaded types when interacting with the library. Before the change her service uses only single-threaded reference counting and interior mutability:


#[derive(Clone)]
struct ClientBroker {
    state: Rc<RefCell<ClientState>>,
}

impl ClientBroker {
    fn start_serving_clients(self) {
        let mut ipc_handle = self.make_ipc_handle_for_new_clients();
        // Each reply gets its own clone of the broker (and its shared state).
        ipc_handle.spawn_on_reply(move |reply| self.clone().serve_client(reply));
        LocalExecutor::new().run_singlethreaded(ipc_handle.listen());
    }

    fn make_ipc_handle_for_new_clients(&self) -> IpcHandle { /* ... */ }
    async fn serve_client(self, reply: IpcReply) { /* accesses interior mutability... */ }
}

In order to be compatible with her own crate, Barbara needs to wrap the shared state of her service behind multi-threaded reference counting and synchronization:


#[derive(Clone)]
struct ClientBroker {
    state: Arc<Mutex<ClientState>>,
}

impl ClientBroker { /* nothing changed */ }

This incurs some performance overhead when cloning the Arc and when accessing the Mutex. The former is cheap when uncontended on x86 but will have different performance characteristics on e.g. ARM platforms. The latter's overhead varies depending on the kind of Mutex used, e.g. an uncontended parking_lot::Mutex may only need a few atomic instructions to acquire it. Acquiring many platforms' std::sync::Mutex is much more expensive than a few atomics. This overhead is usually not very high, but it does pollute shared resources like cache lines and is multiplied by the number of single-threaded services which make use of such a library.
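The Send requirement behind this change can be checked at compile time with a small helper. The sketch below is illustrative only: the hits field, the assert_send helper, and the hits_from_worker_threads function are hypothetical, and plain OS threads stand in for a multi-threaded executor.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

struct IpcCache {
    hits: u32,
}

#[derive(Clone)]
struct IpcHandle {
    cache_storage: Arc<Mutex<IpcCache>>,
}

/// Compiles only when `T` may be moved to another thread.
fn assert_send<T: Send>() {}

/// Shares one cache between `workers` threads and returns the final hit count.
fn hits_from_worker_threads(workers: u32) -> u32 {
    assert_send::<IpcHandle>();
    // The single-threaded variant would be rejected by the compiler,
    // because `Rc` is not `Send`:
    // assert_send::<std::rc::Rc<std::cell::RefCell<IpcCache>>>();

    let handle = IpcHandle {
        cache_storage: Arc::new(Mutex::new(IpcCache { hits: 0 })),
    };

    let threads: Vec<_> = (0..workers)
        .map(|_| {
            let handle = handle.clone();
            thread::spawn(move || {
                let mut cache = handle.cache_storage.lock().unwrap();
                cache.hits += 1;
            })
        })
        .collect();

    for worker in threads {
        worker.join().unwrap();
    }

    let hits = handle.cache_storage.lock().unwrap().hits;
    hits
}
```

Every clone and every lock in this sketch is an atomic operation, which is the overhead the single-threaded service now pays even though it never needs the cross-thread guarantee.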

🤔 Frequently Asked Questions

What are the morals of the story?

In synchronous Rust, choosing the "Sendness" of a crate is basically a choice about the concurrency it can support. In asynchronous Rust, one can write highly concurrent programs that still execute using only a single thread, but it is difficult to achieve maximum performance with reusable code.

Abstracting over a library's Sendness requires being generic over storage/synchronization types and APIs which need to accept user-defined types/tasks/callbacks.

What are the sources for this story?

As of writing, the Fuchsia operating system had over 1,500 invocations of LocalExecutor::run_singlethreaded. There were fewer than 500 invocations of SendExecutor::run (footnote 1); see https://cs.opensource.google/search?q=file:rs%20%5C.run%5C(&ss=fuchsia%2Ffuchsia for the search. As of writing, the author could not find any widely used support libraries which were not thread-safe.

actix-rt's spawn function does not require Send for its futures, because each task is polled on the thread that spawned it. However it is very common when using actix-rt via actix-web to make use of async crates originally designed for tokio, whose spawn function does require Send.

Popular crates like diesel are still designing async support, and it appears they are likely to require Send.

Footnote 1: There are multiple ways to invoke the different Rust executors for Fuchsia. The other searches for each executor yield a handful of results but not enough to change the relative sample sizes here.

Why did you choose Barbara to tell this story?

As an experienced Rustacean, Barbara is more likely to be responsible for designing functionality to share across teams. She's also going to be more aware of the specific performance implications of her change, and will likely find it more frustrating to encounter these boundaries.

How would this story have played out differently for the other characters?

A less experienced Rustacean may not even be tempted to define two versions, as the approach Barbara took is pretty close to the "just .clone() it" advice often given to beginners.

😱 Status quo stories: Barbara battles buffered streams

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect people's experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

Mysterious timeouts

Barbara is working on her YouBuy server and is puzzling over a strange bug report. Users are reporting that their browser connections time out when they connect to YouBuy. Based on the logs, she can see that they are timing out in the do_select function:


async fn do_select<T>(database: &Database, query: Query) -> Result<Vec<T>> {
    let conn = database.get_conn().await?;
    conn.select_query(query).await
}

This is surprising, because do_select doesn't do much - it does a database query to claim a work item from a queue, but isn't expected to handle a lot of data or hit extreme slowdown on the database side. Some of the time, there is some kind of massive delay in between the get_conn method opening a connection and the call to select_query. But why? She has metrics that show that the CPU is largely idle, so it's not like the cores are all occupied.

She looks at the caller of do_select, which is a function do_work:


async fn do_work(database: &Database) {
    let work = do_select(database, FIND_WORK_QUERY).await?;
    stream::iter(work)
        .map(|item| do_select(database, work_from_item(item)))
        .buffered(5)
        .for_each(|work_item| process_work_item(database, work_item))
        .await;
}

async fn process_work_item(...) { }

The do_work function is invoking do_select as part of a stream; it is buffering up a certain number of do_select instances and, for each one, invoking process_work_item. Everything seems to be in order, and she can see that calls to process_work_item are completing in the logs.

Following a hunch, she adds more logging in and around the process_work_item function and waits a few days to accumulate new logs. She notices that shortly after each time out, there is always a log of a process_work_item call that takes at least 20 seconds. These calls are not related to the connections that time out, they are for other connections, but they always appear afterwards in time.

process_work_item is expected to be slow sometimes because it can end up handling large items, so this is not immediately surprising to Barbara. She is, however, surprised by the correlation - surely the executor ensures that process_work_item can't stop do_select from doing its job?

Barbara thought she understood how async worked

Barbara thought she understood futures fairly well. She thought of async fn as basically "like a synchronous function with more advanced control flow". She knew that Rust's futures were lazy -- that they didn't start executing until they were awaited -- and she knew that she could compose them using utilities like join, FuturesUnordered, or the buffered method (as in this example).
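The laziness Barbara has in mind can be checked with nothing but the standard library. The sketch below is illustrative only: the function name and the do-nothing waker are made up for this example, and the future is polled by hand rather than awaited on a real runtime.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Wake, Waker};

/// A waker that does nothing; enough for a future that finishes on its first poll.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Returns whether the async body had run (before the first poll, after it).
fn body_ran_before_and_after_poll() -> (bool, bool) {
    let ran = Cell::new(false);

    // Creating the future does not execute any of its body.
    let fut = async {
        ran.set(true);
    };
    let before = ran.get();

    // Polling it (which is what `.await` or an executor ends up doing) runs the body.
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let poll = fut.as_mut().poll(&mut cx);
    assert!(poll.is_ready());

    (before, ran.get())
}

fn main() {
    // Prints "(false, true)".
    println!("{:?}", body_ran_before_and_after_poll());
}
```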

Barbara also knows that every future winds up associated with a task, and that if you have multiple futures on the same task (in this case, the futures in the stream, for example) then they would run concurrently, but not in parallel. Based on this, she thinks perhaps that process_work_item is a CPU hog that takes too long to complete, and so she needs to add a call to spawn_blocking. But when she looks more closely, she realizes that process_work_item is an async function, and those 20 seconds that it spends executing are mostly spent waiting on I/O. Huh, that's confusing, because the task ought to be able to execute other futures in that case -- so why are her connections stalling out without making progress?

Barbara goes deep into how poll works

She goes to read the Rust async book and tries to think about the model, but she can't quite see the problem. Then she asks on the rust-lang Discord and someone explains to her what is going on, with the catchphrase "remember, async is about waiting in parallel, not working in parallel". Finally, after reading over what they wrote a few times, and reading some chapters in the async book, she sees the problem.

It turns out that, to Rust, a task is kind of a black box with a "poll" function. When the executor thinks a task can make progress, it calls poll. The task itself then delegates this call to poll down to all the other futures that are composed together. In the case of her buffered stream of connections, the stream gets woken up and it would then delegate down to the various buffered items in its list.
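To make the "black box with a poll function" idea concrete, here is a deliberately tiny executor written against the standard library only. It is a sketch, not how tokio or any real runtime is implemented; `block_on`, `ThreadWaker`, `YieldOnce`, `inner`, and `outer` are names invented for this example.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Wakes the executor by unparking the thread that is blocked in `block_on`.
struct ThreadWaker(Thread);
impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// The whole executor: poll the task, sleep until it is woken, poll again.
fn block_on<F: Future>(task: F) -> F::Output {
    let mut task = pin!(task);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match task.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            Poll::Pending => thread::park(),
        }
    }
}

/// Returns `Pending` once (after asking to be woken), then `Ready`.
struct YieldOnce(bool);
impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

async fn inner() -> u32 {
    YieldOnce(false).await;
    20
}

/// The executor only ever sees this outermost future; polling it is what
/// polls `inner` and `YieldOnce` underneath.
async fn outer() -> u32 {
    inner().await + inner().await + 2
}

fn main() {
    println!("{}", block_on(outer())); // 20 + 20 + 2
}
```

Note that the executor has no idea how many futures are nested inside the task. Whatever the task is currently awaiting is the only thing that gets polled.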

When it executes Stream::for_each, the task is doing something like this:


while let Some(work_item) = stream.next().await {
    process_work_item(database, work_item).await;
}

The task can only "wait" on one "await" at a time. It will execute that await until it completes and only then move on to the rest of the function. When the task is blocked on the first await, it will process all the futures that are part of the stream, and hence the various buffered connections all make progress.

But once a work item is produced, the task will block on the second await -- the one that resulted from process_work_item. This means that, until process_work_item completes, control will never return to the first await. As a result, none of the futures in the stream will make progress, even if they could do so!
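The stall can be reproduced in miniature without any async runtime. The following is a hand-rolled model, not the real `buffered` combinator: `Countdown`, `Buffered`, `Next`, and `run_demo` are hypothetical stand-ins, and the task is polled by a simple loop. It counts how often a still-pending buffered future gets polled before and after the slow inner await.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing; the driver loop below simply polls again.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Returns `Pending` for `remaining` polls and counts every poll it receives.
struct Countdown {
    remaining: u32,
    polls: Rc<Cell<u32>>,
}

impl Future for Countdown {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        self.polls.set(self.polls.get() + 1);
        if self.remaining == 0 {
            return Poll::Ready(());
        }
        self.remaining -= 1;
        Poll::Pending
    }
}

/// Stand-in for a buffered stream holding two in-flight futures.
struct Buffered {
    first: Option<Countdown>,
    second: Countdown,
}

/// Stand-in for `stream.next()`: the only place the buffered futures get polled.
struct Next<'a>(&'a mut Buffered);

impl Future for Next<'_> {
    type Output = Option<()>;
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<()>> {
        let stream = &mut *self.0;
        // Drive the later item as well, like a buffered stream would.
        let _ = Pin::new(&mut stream.second).poll(cx);
        let Some(first) = stream.first.as_mut() else {
            return Poll::Ready(None);
        };
        if Pin::new(first).poll(cx).is_pending() {
            return Poll::Pending;
        }
        stream.first = None;
        Poll::Ready(Some(()))
    }
}

/// Returns (task polls, polls of `second` before the slow step, polls after it).
fn run_demo() -> (u32, u32, u32) {
    let second_polls = Rc::new(Cell::new(0));
    let mut stream = Buffered {
        first: Some(Countdown { remaining: 0, polls: Rc::new(Cell::new(0)) }),
        second: Countdown { remaining: 100, polls: Rc::clone(&second_polls) },
    };
    let mut task = Box::pin(async move {
        let (mut before, mut after) = (0, 0);
        while let Some(()) = Next(&mut stream).await {
            before = second_polls.get();
            // Like `process_work_item(..).await`: pending for three polls.
            let slow_step = Countdown { remaining: 3, polls: Rc::new(Cell::new(0)) };
            slow_step.await;
            after = second_polls.get();
        }
        (before, after)
    });

    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut task_polls = 0;
    loop {
        task_polls += 1;
        if let Poll::Ready((before, after)) = task.as_mut().poll(&mut cx) {
            return (task_polls, before, after);
        }
    }
}

fn main() {
    // Prints "(4, 1, 1)": the task was polled four times, but the buffered
    // future was not polled at all while the slow step was being awaited.
    println!("{:?}", run_demo());
}
```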

The fix

Once Barbara understands the problem, she considers the fix. The most obvious fix is to spawn out tasks for the do_select calls, like so:


async fn do_work(database: &Database) {
    let work = do_select(database, FIND_WORK_QUERY).await?;
    stream::iter(work)
        .map(|item| task::spawn(do_select(database, work_from_item(item))))
        .buffered(5)
        .for_each(|work_item| process_work_item(database, work_item))
        .await;
}

Spawning a task will allow the runtime to keep moving those tasks along independently of the do_work task. Unfortunately, this change results in a compilation error:

error[E0759]: `database` has an anonymous lifetime `'_` but it needs to satisfy a `'static` lifetime requirement
  --> src/main.rs:8:18
   |
8  | async fn do_work(database: &Database) {
   |                  ^^^^^^^^  --------- this data with an anonymous lifetime `'_`...
   |                  |
   |                  ...is captured here...
   |        .map(|item| task::spawn(do_select(database, work_from_item(item))))
   |                    ----------- ...and is required to live as long as `'static` here

"Ah, right," she says, "spawned tasks can't use borrowed data. I wish I had rayon or the scoped threads from crossbeam."

"Let me see," Barbara thinks. "What else could I do?" She has the idea that she doesn't have to process the work items immediately. She could buffer up the work into a FuturesUnordered and process it after everything is ready:


async fn do_work(database: &Database) {
    let work = do_select(database, FIND_WORK_QUERY).await?;
    let mut results = FuturesUnordered::new();
    stream::iter(work)
        .map(|item| do_select(database, work_from_item(item)))
        .buffered(5)
        .for_each(|work_item| {
            results.push(process_work_item(database, work_item));
            futures::future::ready(())
        })
        .await;

    while let Some(_) = results.next().await { }
}

This changes the behavior of her program quite a bit though. The original goal was to have at most 5 do_select calls occurring concurrently with exactly one process_work_item, but now she has all of the process_work_item calls executing at once. Nonetheless, the hack solves her immediate problem. Buffering up work into a FuturesUnordered becomes a kind of "fallback" for those cases where she can't readily insert a task::spawn.

🤔 Frequently Asked Questions

What are the morals of the story?

  • Rust's future model is a 'leaky abstraction' that works quite differently from futures in other languages. It is prone to some subtle bugs that require a relatively deep understanding of its inner workings to understand and fix.
  • "Nested awaits" -- where the task blocks on an inner await while there remain other futures that are still awaiting results -- are easy to do but can cause a lot of trouble.
  • Lack of scoped futures makes it hard to spawn items into separate tasks for independent processing sometimes.
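For comparison, this is what scoping looks like for OS threads, which is the capability Barbara wishes she had for tasks. The sketch uses `std::thread::scope` from the standard library (stable since Rust 1.63); it applies to threads only, not to async tasks, and `sum_in_two_threads` is a name made up for this example.

```rust
use std::thread;

/// Sums a borrowed slice on two threads. The threads may borrow `data`
/// because the scope guarantees they are joined before it returns.
fn sum_in_two_threads(data: &[u64]) -> u64 {
    let (left, right) = data.split_at(data.len() / 2);
    thread::scope(|scope| {
        let left_sum = scope.spawn(|| left.iter().sum::<u64>());
        let right_sum = scope.spawn(|| right.iter().sum::<u64>());
        left_sum.join().unwrap() + right_sum.join().unwrap()
    })
}

fn main() {
    let data = vec![1, 2, 3, 4, 5];
    println!("{}", sum_in_two_threads(&data)); // 15
}
```

With plain `std::thread::spawn`, the same closures would be rejected for borrowing non-`'static` data, which mirrors the error Barbara gets from `task::spawn`.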

What are the sources for this story?

This is based on the bug report Footgun with Future Unordered, but the solution that Barbara came up with is something that was relayed by farnz during a vision doc writing session. farnz mentioned at the time that this pattern was frequently used in their codebase to work around this sort of hazard.

Why did you choose Barbara to tell this story?

To illustrate that knowing Rust -- and even having a decent handle on async Rust's basic model -- is not enough to make it clear what is going on in this particular case.

How would this story have played out differently for the other characters?

Woe be unto them! Identifying and fixing this bug required a lot of fluency with Rust and the async model. Alan in particular was probably relying on his understanding of async-await from other languages, which works very differently. In those languages, every async function is enqueued automatically for independent execution, so hazards like this do not arise (though this comes at a performance cost).

Besides timeouts for clients, what else could go wrong?

The original bug report mentioned the possibility of deadlock:

When using an async friendly semaphore (like Tokio provides), you can deadlock yourself by having the tasks that are waiting in the FuturesUnordered owning all the semaphores, while having an item in a .for_each() block after buffer_unordered() requiring a semaphore.

Is there any way for Barbara to both produce and process work items simultaneously?

Yes, in this case, she could've. For example, she might have written


async fn do_work(database: &Database) {
    let work = do_select(database, FIND_WORK_QUERY).await?;

    stream::iter(work)
        .map(|item| async move {
            let work_item = do_select(database, work_from_item(item)).await;
            process_work_item(database, work_item).await;
        })
        .buffered(5)
        .for_each(|()| std::future::ready(()))
        .await;
}

This would however mean that she would have 5 calls to process_work_item executing at once. In the actual case that inspired this story, process_work_item can take as much as 10 GB of RAM, so having multiple concurrent calls is a problem.

Is there any way for Barbara to both produce and process work items simultaneously, without the buffering and so forth?

Yes, she might use a loop with a select!. This would ensure that she is processing both the stream that produces work items and the FuturesUnordered that consumes them:


async fn do_work(database: &Database) {
    let work = do_select(database, FIND_WORK_QUERY).await?;

    let selects = stream::iter(work)
        .map(|item| do_select(database, work_from_item(item)))
        .buffered(5)
        .fuse();
    tokio::pin!(selects);

    let mut results = FuturesUnordered::new();

    loop {
        tokio::select! {
            Some(work_item) = selects.next() => {
                results.push(process_work_item(database, work_item));
            },
            Some(()) = results.next() => { /* do nothing */ },
            else => break,
        }
    }
}

Note that doing so is producing code that looks quite a bit different than where she started, though. :( This also behaves very differently. There can be a queue of tens of thousands of items that do_select grabs from, and this code will potentially pull far too many items out of the queue, which then would have to be requeued on shutdown. The intent of the buffered(5) call was to grab 5 work items from the queue at most, so that other hosts could pull out work items and share the load when there's a spike.

Barbara begets backpressure and benchmarks async_trait

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect people's experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

Barbara is implementing the network stack for an experimental new operating system in Rust. She loves Rust's combination of performance, expressiveness, and safety. She and her team set off implementing the network protocols, using traits to separate protocol layers, break up the work, and make them testable.

Unlike most operating systems, this network stack is designed to live in a separate process from the driver itself. Barbara eventually realizes a problem: this system architecture will require modeling backpressure explicitly when sending outbound packets.

She starts looking into how to model backpressure without having to rewrite all of her team's code. She realizes that async is actually the perfect model for expressing backpressure implicitly. By using async, she can keep most of her code without explicitly propagating backpressure information.

When she sets off to implement this, Barbara quickly realizes async won't work off the shelf because of the lack of support for async fn in traits.

Barbara is stuck. She has a large codebase that she would like to convert to using async, but core features of the language she was using are not available with async. She starts looking for workarounds.

Barbara begins by writing out requirements for her use case. She needs to

  • Continue using trait abstractions for core protocol implementations
  • Take advantage of the backpressure model implied by async
  • Maintain performance target of at most 4 µs per packet on underpowered hardware

The last requirement is important for sustaining gigabit speeds, a key goal of the network stack and one reason why Rust was chosen.

Barbara thinks about writing down the name of each Future type, but realizes that this wouldn't work with the async keyword. Using Future combinators directly would be extremely verbose and painful.

Barbara finds the async_trait crate. Given her performance constraints, she is wary of the allocations and dynamic dispatch introduced by the crate.

She decides to write a benchmark to simulate the performance impact of async_trait compared to a future where async fn is fully supported in traits. Looking at the async_trait documentation, she sees that it desugars code like


#[async_trait]
impl Trait for Foo {
    async fn run(&self) {
        // ...
    }
}

to


impl Trait for Foo {
    fn run<'a>(
        &'a self,
    ) -> Pin<Box<dyn std::future::Future<Output = ()> + Send + 'a>>
    where
        Self: Sync + 'a,
    {
        async fn run(_self: &Foo) {
            // original body
        }
        Box::pin(run(self))
    }
}

The benchmark Barbara uses constructs a tree of Futures 5 levels deep, using both async blocks and a manual desugaring similar to the one above. She runs the benchmark on hardware that is representative of her use case and finds that while executing a single native async future takes 639 ns, the manual desugaring using boxed futures takes 1.82 µs.
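The two shapes being compared look roughly like the following. This is only a sketch of the structure, not the actual benchmark code (which generated many variants with a macro): the function names, the `u64` payload, and the `run_ready` helper are assumptions made for this example. The leaf is immediately ready and the call depth is 5, as in the benchmark.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// The return type that the boxed desugaring produces.
type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

// Native `async fn`, 5 calls deep: the compiler builds a single state machine
// with no allocation and static dispatch.
async fn native_5(x: u64) -> u64 { x + 1 }
async fn native_4(x: u64) -> u64 { native_5(x).await + 1 }
async fn native_3(x: u64) -> u64 { native_4(x).await + 1 }
async fn native_2(x: u64) -> u64 { native_3(x).await + 1 }
async fn native_1(x: u64) -> u64 { native_2(x).await + 1 }

/// Boxed equivalent: every level allocates a box and is polled through
/// dynamic dispatch.
fn boxed(depth: u32, x: u64) -> BoxFuture<'static, u64> {
    Box::pin(async move {
        if depth <= 1 {
            x + 1
        } else {
            boxed(depth - 1, x).await + 1
        }
    })
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future until it is ready. Only suitable for futures that never
/// really wait, which is the case for the immediately ready leaves here.
fn run_ready<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn main() {
    // Both print 5; only the cost of getting there differs.
    println!("{}", run_ready(native_1(0)));
    println!("{}", run_ready(boxed(5, 0)));
}
```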

Barbara sees that in a real codebase, this performance would not be good enough for writing a network stack capable of sustaining gigabit-level throughput on underpowered hardware. Barbara is disappointed, but knows that support for async fn in traits is in the works.

Barbara looks at her organization's current priorities and decides that hundreds of Mbps will be an acceptable level of performance for the near term. She decides to adopt async_trait with the expectation that the performance penalty will go away in the long term.

🤔 Frequently Asked Questions

What are the morals of the story?

  • Language features that don't work well together can be a major roadblock in the course of development. Developers expect all of a language's features to be at their disposal, and don't expect that using one will cut them off from using another.
  • Allocation and dynamic dispatch aren't acceptable runtime performance costs for all use cases.

What are the sources for this story?

This story is based on actual experience implementing the 3rd-generation network stack for the Fuchsia operating system.

The benchmarks are implemented here.

Why do you need to model backpressure?

The Linux network stack doesn't do this; instead it drops packets as hardware buffers fill up.

Because our network stack lives in a separate process from the driver, paying attention to hardware queue depth directly is not an option. There is a communication channel of bounded depth between the network stack and the driver. Dropping packets when this channel fills up would result in an unacceptable level of packet loss. Instead, the network stack must "push" this backpressure up to the applications using the network. This means each layer of the system has to be aware of backpressure.
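As a rough illustration of the bounded channel, here is a synchronous stand-in using `std::sync::mpsc::sync_channel`. It is not the Fuchsia channel and not async; `offer_packets` and its parameters are invented for this sketch. It shows the point at which a sender must either drop packets or push back on its caller.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

/// Offers `packets` packets to a driver queue with `depth` slots that nobody
/// is draining. Returns (accepted, rejected).
fn offer_packets(depth: usize, packets: u32) -> (u32, u32) {
    // `_driver` keeps the receiving end alive without ever reading from it.
    let (to_driver, _driver) = sync_channel::<u32>(depth);
    let (mut accepted, mut rejected) = (0, 0);
    for packet in 0..packets {
        match to_driver.try_send(packet) {
            Ok(()) => accepted += 1,
            // The queue is full: this is the moment to signal backpressure
            // upward instead of silently dropping the packet.
            Err(TrySendError::Full(_)) => rejected += 1,
            Err(TrySendError::Disconnected(_)) => break,
        }
    }
    (accepted, rejected)
}

fn main() {
    // Prints "(2, 3)": two packets fit, three would have to be dropped or held back.
    println!("{:?}", offer_packets(2, 5));
}
```

In an async design the sender would not see a "full" error at all. Its future would simply stay pending until the driver makes room, which is how the backpressure travels up the stack.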

How would you solve this in other systems languages?

In C++ we would probably model this using callbacks which are passed all the way down the stack (through each layer of the system).

What's nice about async when modelling backpressure?

Futures present a uniform mechanism for communicating backpressure through polling. When requests stack up but their handler futures are not being polled, this indicates backpressure. Using this model means we get backpressure "for free" by simply adding async and .await to our code, at least in theory.

Async is a viral concern in a codebase, but so is backpressure. You can't have a backpressure aware system when one layer of that system isn't made aware of backpressure in some way. So in this case it's actually helpful that there's not an easy way to call an async fn from a sync fn; if there were, we might accidentally "break the chain" of backpressure awareness.

What was the benchmarking methodology?

A macro was used to generate 512 slightly different versions of the same code, to defeat the branch predictor. Each version varied slightly to prevent LLVM from merging duplicate code.

The leaf futures in the benchmark always returned Poll::Ready. The call depth was always 5 async functions deep.

Did you learn anything else from the benchmarks?

In one of the benchmarks we compared the async fn version to the equivalent synchronous code. This helps us see the impact of the state machine transformation on performance.

The results: synchronous code took 311.39 ns while the async fn code took 433.40 ns.

Why did you choose Barbara to tell this story?

The implementation work in this story was done by @joshlf, an experienced Rust developer who was new to async.

How would this story have played out differently for the other characters?

Alan might not have done the benchmarking up front, leading to a surprise later on when the performance wasn't up to par with Rust's promise. Grace might have decided to implement async state machines manually, giving up on the expressiveness of async.

😱 Status quo stories: Barbara bridges sync and async in perf.rust-lang.org

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect people's experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

Barbara is working on the code for perf.rust-lang.org and she wants to do a web request to load various intermediate results. She has heard that the reqwest crate is quite nice, so she decides to give it a try. She writes up an async function that does her web request:


async fn do_web_request(url: &Url) -> Data {
    ...
}

She needs to apply this async function to a number of urls. She wants to use the iterator map function, like so:

async fn do_web_request(url: &Url) -> Data {...}

fn aggregate(urls: &[Url]) -> Vec<Data> {
    urls
        .iter()
        .map(|url| do_web_request(url))
        .collect()
}

fn main() {
    /* do stuff */
    let data = aggregate();
    /* do more stuff */
}

Of course, since do_web_request is an async fn, she gets a type error from the compiler:

error[E0277]: a value of type `Vec<Data>` cannot be built from an iterator over elements of type `impl Future`
  --> src/main.rs:11:14
   |
11 |             .collect();
   |              ^^^^^^^ value of type `Vec<Data>` cannot be built from `std::iter::Iterator<Item=impl Future>`
   |
   = help: the trait `FromIterator<impl Future>` is not implemented for `Vec<Data>`

"Of course," she thinks, "I can't call an async function from a closure."

Introducing block_on

Since she is not overly concerned about performance, she decides she'll just use a call to block_on from the futures crate and execute the function synchronously:

async fn do_web_request(url: &Url) -> Data {...}

fn aggregate(urls: &[Url]) -> Vec<Data> {
    urls
        .iter()
        .map(|url| futures::executor::block_on(do_web_request(url)))
        .collect()
}

fn main() {
    /* do stuff */
    let data = aggregate();
    /* do more stuff */
}

The code compiles, and it seems to work.

Switching to async main

As Barbara works on perf.rust-lang.org, she realizes that she needs to do more and more async operations. She decides to convert her synchronous main function into an async main. She's using tokio, so she is able to do this very conveniently with the #[tokio::main] decorator:

#[tokio::main]
async fn main() {
    /* do stuff */
    let data = aggregate();
    /* do more stuff */
}

Everything seems to work ok on her laptop, but when she pushes the code to production, it deadlocks immediately. "What's this?" she says. Confused, she runs the code on her laptop a few more times, but it seems to work fine. (There's a FAQ explaining what's going on. -ed.)

She decides to try debugging. She fires up a debugger but finds it isn't really giving her useful information about what is stuck (she has basically the same problems that Alan has). She wishes she could get insight into tokio's state.

Frustrated, she starts reading the tokio docs more closely and she realizes that tokio runtimes offer their own block_on method. "Maybe using tokio's block_on will help?" she thinks, "Worth a try, anyway." She changes the aggregate function to use tokio's block_on:


fn block_on<O>(f: impl Future<Output = O>) -> O {
    let rt = tokio::runtime::Runtime::new().unwrap();
    rt.block_on(f)
}

fn aggregate(urls: &[Url]) -> Vec<Data> {
    urls
        .iter()
        .map(|url| block_on(do_web_request(url)))
        .collect()
}

The good news is that the deadlock is gone. The bad news is that now she is getting a panic:

thread 'main' panicked at 'Cannot start a runtime from within a runtime. This happens because a function (like block_on) attempted to block the current thread while the thread is being used to drive asynchronous tasks.'

"Well," she thinks, "I could use the Handle API to get the current runtime instead of creating a new one? Maybe that's the problem."


fn aggregate(urls: &[Url]) -> Vec<Data> {
    let handle = tokio::runtime::Handle::current();
    urls.iter()
        .map(|url| handle.block_on(do_web_request(url)))
        .collect()
}

But this also seems to panic in the same way.

Trying out spawn_blocking

Reading more into this problem, she realizes she is supposed to be using spawn_blocking. She tries replacing block_on with tokio::task::spawn_blocking:


fn aggregate(urls: &[Url]) -> Vec<Data> {
    urls
        .iter()
        .map(|url| tokio::task::spawn_blocking(move || do_web_request(url)))
        .collect()
}

but now she gets a type error again:

error[E0277]: a value of type `Vec<Data>` cannot be built from an iterator over elements of type `tokio::task::JoinHandle<impl futures::Future>`
  --> src/main.rs:22:14
   |
22 |             .collect();
   |              ^^^^^^^ value of type `Vec<Data>` cannot be built from `std::iter::Iterator<Item=tokio::task::JoinHandle<impl futures::Future>>`
   |
   = help: the trait `FromIterator<tokio::task::JoinHandle<impl futures::Future>>` is not implemented for `Vec<Data>`

Of course! spawn_blocking, like map, just takes a regular closure, not an async closure; its purpose is to embed some sync code within an async task, so a sync closure makes sense -- and moreover async closures aren't stable -- but it's all rather frustrating nonetheless. "Well," she thinks, "I can use spawn to get back into an async context!" So she adds a call to spawn inside the spawn_blocking closure:


fn aggregate(urls: &[Url]) -> Vec<Data> {
    urls
        .iter()
        .map(|url| tokio::task::spawn_blocking(move || {
            tokio::task::spawn(async move {
                do_web_request(url).await
            })
        }))
        .collect()
}

But this isn't really helping, as spawn still yields a future. She's getting the same errors.

Trying out join_all

She remembers now that this whole drama started because she was converting her main function to be async. Maybe she doesn't have to bridge between sync and async? She starts digging around in the docs and finds futures::future::join_all. Using that, she can change aggregate to be an async function too:


async fn aggregate(urls: &[Url]) -> Vec<Data> {
    futures::future::join_all(
        urls
            .iter()
            .map(|url| do_web_request(url))
    ).await
}

Things are working again now, so she is happy, although she notes that join_all has quadratic time complexity. That's not great.

Filtering

Later on, she would like to apply a filter to the aggregation operation. She realizes that if she wants to use the fetched data when doing the filtering, she has to filter the vector after the join has completed. She wants to write something like


async fn aggregate(urls: &[Url]) -> Vec<Data> {
    futures::future::join_all(
        urls
            .iter()
            .map(|url| do_web_request(url))
            .filter(|data| test(data))
    ).await
}

but she can't, because data is a future and not the Data itself. Instead she has to build the vector first and then post-process it:


async fn aggregate(urls: &[Url]) -> Vec<Data> {
    let mut data: Vec<Data> = futures::future::join_all(
        urls
            .iter()
            .map(|url| do_web_request(url))
    ).await;
    data.retain(test);
    data
}

This is annoying, but performance isn't critical, so it's ok.

And the cycle begins again

Later on, she wants to call aggregate from another binary. This one doesn't have an async main. This context is deep inside of an iterator chain and was previously entirely synchronous. She realizes it would be a lot of work to change all the intervening stack frames to be async fn, rewrite the iterators into streams, etc. She decides to just call block_on again, even though it makes her nervous.

🤔 Frequently Asked Questions

What are the morals of the story?

  • Some projects don't care about max performance and just want things to work once the program compiles.
    • They would probably be happy with sync but as the most popular libraries for web requests, databases, etc, offer async interfaces, they may still be using async code.
  • There are contexts where you can't easily add an await.
    • For example, inside of an iterator chain.
    • Big block of existing code.
  • Mixing sync and async code can cause deadlocks that are really painful to diagnose, particularly when you have an async-sync-async sandwich.

Why did you choose Barbara to tell this story?

  • Because Mark (who experienced most of it) is a very experienced Rust developer.
  • Because you could experience this story regardless of language background or being new to Rust.

How would this story have played out differently for the other characters?

I would expect it would work out fairly similarly, except that the type errors and things might well have been more challenging for people to figure out, assuming they aren't already familiar with Rust.

Why did Barbara only get deadlocks in production, and not on her laptop?

This is because the production instance she was using had only a single core, but her laptop is a multicore machine. The actual cause of the deadlocks is that block_on basically "takes over" the tokio worker thread, and hence the tokio scheduler cannot run. If that block_on is blocked on another future that will have to execute, then some other thread must take over completing that future. On Barbara's multicore machine, there were more threads available, so the system did not deadlock. But on the production instance, there was only a single thread. Barbara could have encountered deadlocks on her local machine as well if she had enough instances of block_on running at once.
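The same effect can be modelled with plain OS threads and no async runtime at all. The sketch below is an analogy, not tokio's scheduler: `nested_job_completes` and the job queue are invented for this example, and the 200 ms timeout stands in for "waits forever" so that the demo terminates. An outer job occupies a worker while waiting for an inner job that was queued on the same pool.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;
use std::time::Duration;

type Job = Box<dyn FnOnce() + Send + 'static>;

/// Runs an outer job that blocks its worker until an inner job, queued on the
/// same pool, has finished. Returns whether the inner job completed in time.
fn nested_job_completes(workers: usize) -> bool {
    let (job_tx, job_rx) = mpsc::channel::<Job>();
    let job_rx = Arc::new(Mutex::new(job_rx));

    for _ in 0..workers {
        let job_rx = Arc::clone(&job_rx);
        thread::spawn(move || loop {
            // Take the next job; stop once the queue has been closed.
            let job = match job_rx.lock().unwrap().recv() {
                Ok(job) => job,
                Err(_) => break,
            };
            job();
        });
    }

    let (done_tx, done_rx) = mpsc::channel();
    let inner_queue = job_tx.clone();
    let outer: Job = Box::new(move || {
        let (inner_tx, inner_rx) = mpsc::channel();
        inner_queue
            .send(Box::new(move || {
                let _ = inner_tx.send(());
            }))
            .unwrap();
        // The "block_on" step: this worker does nothing else until the inner
        // job reports back (or the demo timeout expires).
        let finished = inner_rx.recv_timeout(Duration::from_millis(200)).is_ok();
        let _ = done_tx.send(finished);
    });
    job_tx.send(outer).unwrap();

    done_rx.recv().unwrap()
}

fn main() {
    // One worker: the inner job can never start while the outer job waits.
    println!("1 worker:  {}", nested_job_completes(1));
    // Two workers: the second worker picks up the inner job.
    println!("2 workers: {}", nested_job_completes(2));
}
```

On a typical machine this reports that the nested job does not complete with one worker and does complete with two, which matches the difference between Barbara's production instance and her laptop.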

Could the runtime have prevented the deadlock?

One way to resolve this problem would be to have a runtime that creates more threads as needed. This is what was proposed in this blog post, for example.

Adapting the number of worker threads has downsides. It requires knowing the right threshold for creating new threads (which is fundamentally unknowable). The result is that the runtime will sometimes observe that some thread seems to be taking a long time and create new threads just before that thread was about to finish. These new threads generate overhead and lower the overall performance. It also requires work stealing and other techniques that can lead to work running on multiple cores and having less locality. Systems tuned for maximal performance tend to prefer a single thread per core for this reason.

If some runtimes are adaptive, that may also lead to people writing libraries which block without caring. These libraries would then be a performance or deadlock hazard when used on a runtime that is not adaptive.

Is there any way to have kept aggregate as a synchronous function?

Yes, Barbara could have written something like this:

fn aggregate(urls: &[Url]) -> Vec<Data> {
    let handle = Handle::current();

    urls.iter()
        .map(|url| handle.block_on(do_web_request(url)))
        .collect()
}

#[tokio::main]
async fn main() {
    let data = task::spawn_blocking(move || aggregate(&[Url, Url]))
        .await
        .unwrap();
    println!("done");
}

This aggregate function can only safely be invoked from inside a tokio spawn_blocking call, however, since Handle::current will only work in that context. She could also have used the original futures variant of block_on, in that case, and things would also work.

Why didn't Barbara just use the sync API for reqwest?

reqwest does offer a synchronous API, but it's not enabled by default; you have to enable an optional feature. Further, not all crates offer synchronous APIs. Finally, Barbara has had some vaguely poor experiences when using synchronous APIs, such as panics, and so she's learned the heuristic of "use the async API unless you're doing something really, really simple".

Regardless, the synchronous reqwest API is itself implemented using block_on, so Barbara would have ultimately hit the same issues. In fact, these same issues are probably the sources of those panics that Barbara encountered in the past.

In general, though, embedding sync within async or vice versa works "ok", once you know the right tricks. Where things become challenging is when you have a "sandwich", with async-sync-async.

Do people mix spawn_blocking and spawn successfully in real code?

Yes! Here is some code from perf.rust-lang.org doing exactly that. The catch is that it winds up giving you a future in the end, which didn't work for Barbara because her code is embedded within an iterator (and hence she can't make things async "all the way down").

What are other ways people could experience similar problems mixing sync and async?

  • Using std::sync::Mutex in async code.
  • Calling the blocking version of an asynchronous API.
    • For example, reqwest::blocking, the synchronous zbus and rumqtt APIs.
    • These are commonly implemented by using some variant of block_on internally.
    • Therefore they can lead to panics or deadlocks depending on what async runtime they are built from and used with.

Why wouldn't Barbara just make everything async from the start?

There are times when converting synchronous code to async is difficult or even impossible. Here are some of the reasons:

  • Asynchronous functions cannot appear in trait impls.
  • Asynchronous functions cannot be called from APIs that take closures for callbacks, like Iterator::map in this example.
  • Sometimes the synchronous functions come from other crates and are not fully under their control.
  • It's just a lot of work!

How many variants of block_on are there?

  • the futures crate offers a runtime-independent block_on (which can lead to deadlocks, as in this story)
  • the tokio crate offers a block_on method (which will panic if used inside of another tokio runtime, as in this story)
  • the pollster crate exists just to offer block_on
  • the futures-lite crate offers a block_on
  • the async-std crate offers block_on
  • the async-io crate offers block_on
  • ...there are probably more, but I think you get the point.
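
To make concrete why every variant "takes over" the calling thread, here is a minimal sketch of a block_on built only on the standard library. It is not the implementation used by any of the crates listed above; ThreadWaker is a name made up for illustration.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Hypothetical waker that unparks the thread which called `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Drives `fut` to completion on the current thread, parking between polls.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            // The calling thread does nothing else until it is woken.
            Poll::Pending => thread::park(),
        }
    }
}
```

The loop never hands control back to a scheduler. If it runs on a runtime's only worker thread and the future depends on other tasks of that runtime, nothing can make progress, which is the deadlock from the story.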

😱 Status quo stories: Barbara builds an async executor

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

The story

Barbara wants to assign priorities to the tasks spawned on the executor. However, she finds that no existing async executor provides such a feature. She would be more than happy to enhance an existing executor and even intends to do so at some point. At the same time, Barbara understands that the process of getting changes merged officially into an executor can be long, and for good reason.

Due to pressure and deadlines at work she needs a first version to be working as soon as possible. She then decides to build her own async executor.

First, Barbara found that crossbeam-deque provides work-stealing deques of good quality. She decides to use it to build task schedulers. She plans for each worker thread to have a loop which repeatedly gets a task from the deque and polls it.

But wait, what should we put into those queues to represent each "task"?

At first, Barbara thought it must contain the Future itself and the additional priority which was used by the scheduler. So she first wrote:


use std::future::Future;
use std::pin::Pin;

pub struct Task {
    future: Pin<Box<dyn Future<Output = ()> + Send + 'static>>,
    priority: u8,
}

And the worker thread loop should run something like:


use std::task::{Context, Waker};

pub fn poll_task(mut task: Task) {
    // How to construct the waker is still an open question at this point.
    let waker: Waker = todo!();
    let mut cx = Context::from_waker(&waker);
    task.future.as_mut().poll(&mut cx);
}

"How do I create a waker?" Barbara asked herself. Quickly, she found the Wake trait. Seeing the wake method takes an Arc<Self>, she realized the task in the scheduler should be stored in an Arc. After some thought, she realizes it makes sense because both the deque in the scheduler and the waker may hold a reference to the task.

To implement Wake, the Task should contain the sender of the scheduler. She changed the code to something like this:


pub struct Task {
    future: Pin<Box<dyn Future<Output = ()> + Send + 'static>>,
    scheduler: SchedulerSender,
    priority: u8,
}

unsafe impl Sync for Task {}

impl Wake for Task {
    fn wake(self: Arc<Self>) {
        self.scheduler.send(self.clone());
    }
}

pub fn poll_task(task: Arc<Task>) {
    let waker = Waker::from(task.clone());
    let mut cx = Context::from_waker(&waker);
    task.future.as_mut().poll(&mut cx);
//  ^^^^^^^^^^^ cannot borrow as mutable
}

The code still needed some change because the future in the Arc<Task> became immutable.

"Okay. I can guarantee Task is created from a Pin<Box<Future>>, and I think the same future won't be polled concurrently in two threads. So let me bypass the safety checks." Barbara changed the future to a raw pointer and confidently used some unsafe blocks to make it compile.


pub struct Task {
    future: *mut (dyn Future<Output = ()> + Send + 'static),
    ...
}

unsafe impl Send for Task {}
unsafe impl Sync for Task {}

pub fn poll_task(task: Arc<Task>) {
    ...
    unsafe {
        Pin::new_unchecked(&mut *task.future).poll(&mut cx);
    }
}

Luckily, a colleague of Barbara's noticed something wrong. The wake method could be called multiple times, so multiple copies of the task could exist in the scheduler. The scheduler might not work correctly because of this. Worse, multiple threads might get copies of the same task from the scheduler and cause a race in polling the future.
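
The duplication the colleague spotted can be reproduced in a few lines. This is a simplified sketch, not Barbara's code: the Task here carries only a hypothetical id, a std mpsc channel stands in for the scheduler, and queued_after_wakes is a made-up helper.

```rust
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, Mutex};
use std::task::{Wake, Waker};

/// Hypothetical stand-in for Barbara's task: it only knows how to re-enqueue itself.
struct Task {
    id: u32,
    // The sender is wrapped in a Mutex so that `Task` is `Sync` without any unsafe impl.
    scheduler: Mutex<Sender<u32>>,
}

impl Wake for Task {
    fn wake(self: Arc<Self>) {
        // Every wake-up pushes another entry; nothing deduplicates them.
        let _ = self.scheduler.lock().unwrap().send(self.id);
    }
}

/// Wakes one task `wakes` times and returns how many entries landed in the queue.
pub fn queued_after_wakes(wakes: usize) -> usize {
    let (tx, rx) = channel();
    let task = Arc::new(Task { id: 7, scheduler: Mutex::new(tx) });
    let waker = Waker::from(task);
    for _ in 0..wakes {
        waker.wake_by_ref();
    }
    // Dropping the waker drops the last sender, so the iterator below terminates.
    drop(waker);
    rx.iter().count()
}
```

Waking the same task three times leaves three entries in the queue, so two worker threads could pick up the same task and poll its future concurrently.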

Barbara soon got an idea to solve it. She added a state field to the Task. By carefully maintaining the state of the task, she could guarantee there are no duplicate tasks in the scheduler and no race can happen when polling the future.


const NOTIFIED: u64 = 1;
const IDLE: u64 = 2;
const POLLING: u64 = 3;
const COMPLETED: u64 = 4;

pub struct Task {
    ...
    state: AtomicU64,
}

impl Wake for Task {
    fn wake(self: Arc<Self>) {
        let mut state = self.state.load(Relaxed);
        loop {
            match state {
                // To prevent a task from appearing in the scheduler twice, only send the task
                // to the scheduler if it is neither notified nor being polled.
                IDLE => match self
                    .state
                    .compare_exchange_weak(IDLE, NOTIFIED, AcqRel, Acquire)
                {
                    Ok(_) => {
                        self.scheduler.send(self.clone());
                        // Stop here; looping again could enqueue the task a second time.
                        break;
                    }
                    Err(s) => state = s,
                },
                // The task is being polled: mark it so that poll_task polls it again.
                POLLING => match self
                    .state
                    .compare_exchange_weak(POLLING, NOTIFIED, AcqRel, Acquire)
                {
                    Ok(_) => break,
                    Err(s) => state = s,
                },
                _ => break,
            }
        }
    }
}

pub fn poll_task(task: Arc<Task>) {
    let waker = Waker::from(task.clone());
    let mut cx = Context::from_waker(&waker);
    loop {
        // We needn't read the task state here because the waker prevents the task from
        // appearing in the scheduler twice. The state must be NOTIFIED now.
        task.state.store(POLLING, Release);
        if let Poll::Ready(()) = unsafe { Pin::new_unchecked(&mut *task.future).poll(&mut cx) } {
            task.state.store(COMPLETED, Release);
            // The future is finished: return so the exchange below never observes COMPLETED.
            return;
        }
        match task.state.compare_exchange(POLLING, IDLE, AcqRel, Acquire) {
            Ok(_) => break,
            Err(NOTIFIED) => continue,
            _ => unreachable!(),
        }
    }
}

Barbara finished her initial implementation of the async executor. Although many more optimizations were possible, Barbara already felt it was a bit complex. She was also confused about why she needed to care so much about polling and waking when her initial requirement was just adding additional information to the task for customizing scheduling.

🤔 Frequently Asked Questions

Here are some standard FAQ to get you started. Feel free to add more!

What are the morals of the story?

  • It is difficult to customize any of the current async executors (to my knowledge). Any special requirement forces you to build an async executor from scratch.
  • It is also not easy to build an async executor. It needs quite some exploration and is error-prone. async-task is a good attempt to simplify the process but it could not satisfy all kinds of needs of customizing the executor (it does not give you the chance to extend the task itself).

What are the sources for this story?

  • The story was from my own experience about writing a new thread pool supporting futures: https://github.com/tikv/yatp.
  • People may find it strange that we want to set priorities for tasks. Currently, the futures in the thread pool are like user-space threads. They are mostly CPU intensive. But I think people doing async I/O may have the same problem.

Why did you choose Barbara to tell this story?

  • At the time of the story, I had written Rust for years but I was new to the concepts for async/await like Pin and Waker.

How would this story have played out differently for the other characters?

  • People with less experience in Rust may be less likely to build their own executor. If they try, I think the story is probably similar.

😱 Status quo stories: Barbara carefully dismisses embedded Future

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect peoples' experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

Barbara is contributing to an OS that supports running multiple applications on a single microcontroller. These microcontrollers have as little as tens of kilobytes of RAM and hundreds of kilobytes of flash memory for code. Barbara is writing a library that is used by multiple applications -- and is linked into each application -- so the library is very resource constrained. The library should support asynchronous operation, so that multiple APIs can be used in parallel within each (single-threaded) application.

Barbara begins writing the library by trying to write a console interface, which allows byte sequences to be printed to the system console. Here is an example sequence of events for a console print:

  1. The interface gives the kernel a callback to call when the print finishes, and gives the kernel the buffer to print.
  2. The kernel prints the buffer in the background while the app is free to do other things.
  3. The print finishes.
  4. The app tells the kernel it is ready for the callback to be invoked, and the kernel invokes the callback.

Barbara tries to implement the API using core::future::Future so that the library can be compatible with the async Rust ecosystem. The OS kernel does not expose a Future-based interface, so Barbara has to implement Future by hand rather than using async/await syntax. She starts with a skeleton:


/// Passes `buffer` to the kernel, and prints it to the console. Returns a
/// future that returns `buffer` when the print is complete. The caller must
/// call kernel_ready_for_callbacks() when it is ready for the future to return.
fn print_buffer(buffer: &'static mut [u8]) -> PrintFuture {
    // TODO: Set the callback
    // TODO: Tell the kernel to print `buffer`
}

struct PrintFuture;

impl core::future::Future for PrintFuture {
    type Output = &'static mut [u8];

    fn poll(self: Pin<&mut Self>, cx: &mut Context) -> Poll<Self::Output> {
        // TODO: Detect when the print is done, retrieve `buffer`, and return
        // it.
    }
}

Note: All error handling is omitted to keep things understandable.

Barbara begins to implement print_buffer:


fn print_buffer(buffer: &'static mut [u8]) -> PrintFuture {
    kernel_set_print_callback(callback);
    kernel_start_print(buffer);
    PrintFuture {}
}

// New! The callback the kernel calls.
extern fn callback() {
    // TODO: Wake up the currently-waiting PrintFuture.
}

So far so good. Barbara then works on poll:


    fn poll(self: Pin<&mut Self>, cx: &mut Context) -> Poll<Self::Output> {
        if kernel_is_print_done() {
            return Poll::Ready(kernel_get_buffer_back());
        }
        Poll::Pending
    }

Of course, there's something missing here. How does the callback wake the PrintFuture? She needs to store the Waker somewhere! Barbara puts the Waker in a global variable so the callback can find it (this is fine because the app is single threaded and callbacks do NOT interrupt execution the way Unix signals do):


static mut PRINT_WAKER: Option<Waker> = None;

extern fn callback() {
    if let Some(waker) = unsafe { PRINT_WAKER.as_ref() } {
        waker.wake_by_ref();
    }
}

She then modifies poll to set PRINT_WAKER:


    fn poll(self: Pin<&mut Self>, cx: &mut Context) -> Poll<Self::Output> {
        if kernel_is_print_done() {
            return Poll::Ready(kernel_get_buffer_back());
        }
        // cx.waker() returns a &Waker, so it must be cloned before it can be stored.
        unsafe { PRINT_WAKER = Some(cx.waker().clone()); }
        Poll::Pending
    }

PRINT_WAKER is stored in .bss, which occupies space in RAM but not flash. It is two words in size. It points to a RawWakerVTable that is provided by the executor. RawWakerVTable's design is a compromise that supports environments both with and without alloc. In no-alloc environments, drop and clone are generally no-ops, and wake/wake_by_ref seem like duplicates. Looking at RawWakerVTable makes Barbara realize that even though Future was designed to work in embedded contexts, it may have too much overhead for her use case.
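
For illustration, here is roughly what a waker for a no-alloc environment can look like. This is a sketch under assumptions, not the executor Barbara used: noop_waker and waker_size_in_words are hypothetical helpers, and the size check reflects what current compilers produce, not a documented guarantee.

```rust
use core::mem::size_of;
use core::ptr;
use core::task::{RawWaker, RawWakerVTable, Waker};

// With no heap data to manage, clone only rebuilds the pointer pair and the
// other three entries do nothing (a real executor would do work in wake).
fn noop_clone(_: *const ()) -> RawWaker {
    RawWaker::new(ptr::null(), &NOOP_VTABLE)
}

fn noop(_: *const ()) {}

// Entry order: clone, wake, wake_by_ref, drop.
static NOOP_VTABLE: RawWakerVTable = RawWakerVTable::new(noop_clone, noop, noop, noop);

/// Builds a waker that owns no data at all.
pub fn noop_waker() -> Waker {
    // Safety: none of the vtable functions dereference the data pointer.
    unsafe { Waker::from_raw(RawWaker::new(ptr::null(), &NOOP_VTABLE)) }
}

/// Sizes of `Waker` and `Option<Waker>`, measured in machine words.
pub fn waker_size_in_words() -> (usize, usize) {
    let word = size_of::<usize>();
    (size_of::<Waker>() / word, size_of::<Option<Waker>>() / word)
}
```

Both sizes come out as two words (a data pointer plus a vtable pointer), which matches the RAM cost of PRINT_WAKER described above. The four function pointers in the vtable are what make wake and wake_by_ref look like duplicates when clone and drop do nothing.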

Barbara decides to do some benchmarking. She comes up with a sample application -- an app that blinks an LED and responds to button presses -- and implements it twice. One implementation does not use Future at all, the other does. Both implementations have two asynchronous interfaces: a timer interface and a GPIO interface, as well as an application component that uses the interfaces concurrently. In the Future-based app, the application component functions like a future combinator, as it is a Future that is almost always waiting for a timer or GPIO future to finish.

To drive the application future, Barbara implements an executor. The executor functions like a background thread. Because alloc is not available, this executor contains a single future. The executor has a spawn function that accepts a future and starts running that future (overwriting the existing future in the executor if one is already present). Once started, the executor runs entirely in kernel callbacks.

Barbara identifies several factors that add branching and error handling code to the executor:

  1. spawn should be a safe function, because it is called by high-level application code. However, that means it can be called by the future it contains. If handled naively, this would result in dropping the future while it executes. Barbara adds runtime checks to identify this situation.
  2. Waker is Sync, so on a multithreaded system, a future could give another thread access to its Waker and the other thread could wake it up. This could happen while the poll is executing, before poll returns Poll::Pending. Therefore, Barbara concludes that if wake is called while a future is being polled then the future should be re-polled, even if the current poll returns Poll::Pending. This requires putting a retry loop into the executor.
  3. A kernel callback may call Waker::wake after its future returns Poll::Ready. After poll returns Poll::Ready, the executor should not poll the future again, so Barbara adds code to ignore those wakeups. This duplicates the "ignore spurious wakeups" functionality that exists in the future itself.
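
The retry loop from factor 2 can be sketched as follows. This is a simplified illustration, not Barbara's executor: it uses Arc and the std Wake trait for brevity (which her no-alloc environment would not have), and Flag, run_until_stalled, and YieldOnce are hypothetical names.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Records that a wake-up arrived; stands in for the executor's "woken" bit.
struct Flag(AtomicBool);

impl Wake for Flag {
    fn wake(self: Arc<Self>) {
        self.0.store(true, Ordering::Release);
    }
}

/// Polls `future` until it is ready, or pending with no wake-up outstanding.
/// Returns the output (if finished) and the number of polls performed.
pub fn run_until_stalled<F: Future + Unpin>(future: &mut F) -> (Option<F::Output>, u32) {
    let flag = Arc::new(Flag(AtomicBool::new(false)));
    let waker = Waker::from(flag.clone());
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        match Pin::new(&mut *future).poll(&mut cx) {
            Poll::Ready(out) => return (Some(out), polls),
            // A wake-up that arrived during the poll means "poll me again".
            Poll::Pending if flag.0.swap(false, Ordering::AcqRel) => continue,
            Poll::Pending => return (None, polls),
        }
    }
}

/// Test future: wakes itself during its first poll, then completes on the second.
pub struct YieldOnce(pub bool);

impl Future for YieldOnce {
    type Output = &'static str;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.0 {
            return Poll::Ready("done");
        }
        self.0 = true;
        // The wake-up arrives before this poll returns Poll::Pending.
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}
```

Running run_until_stalled on YieldOnce(false) takes two polls: the first returns Poll::Pending but has already been woken, so the loop polls again instead of going idle. The extra branch and flag are the kind of code the list above refers to.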

Ultimately, this made the executor logic nontrivial, and it compiled into 96 bytes of code. The executor logic is monomorphized for each future, which allows the compiler to make inlining optimizations, but results in a significant amount of duplicate code. Alternatively, it could be adapted to use function pointers or vtables to avoid the code duplication, but then the compiler definitely cannot inline Future::poll into the kernel callbacks.

Barbara publishes an analysis of the relative sizes of the two app implementations, finding a large percentage increase in both code size and RAM usage (note: stack usage was not investigated). Most of the code size increase is from the future combinator code.

In the no-Future version of the app, a kernel callback causes the following:

  1. The kernel callback calls the application logic's event-handling function for the specific event type.
  2. The application handles the event.

The call in step 1 is inlined, so the compiled kernel callback consists only of the application's event-handling logic.

In the Future-based version of the app, a kernel callback causes the following:

  1. The kernel callback updates some global state to indicate the event happened.
  2. The kernel callback invokes Waker::wake.
  3. Waker::wake calls poll on the application future.
  4. The application future has to look at the state saved in step 1 to determine what event happened.
  5. The application future handles the event.

LLVM is unable to devirtualize the call in step 2, so the optimizer is unable to simplify the above steps. Steps 1-4 only exist in the future-based version of the code, and add over 200 bytes of code (note: Barbara believes this could be reduced to between 100 and 200 bytes at the expense of execution speed).

Barbara concludes that Future is not suitable for highly-resource-constrained environments due to the amount of code and RAM required to implement executors and combinators.

Barbara redesigns the library she is building to use a different, much lighter-weight concept for implementing async APIs in Rust. She has moved on from Future and is refining her async traits instead. Here are some ways in which these APIs are lighter weight than a Future implementation:

  1. After monomorphization, kernel callbacks directly call application code. This allows the application code to be inlined into the kernel callback.
  2. The callback invocation is more precise: these APIs don't make spurious wakeups, so application code does not need to handle spurious wakeups.
  3. The async traits lack an equivalent of Waker. Instead, all callbacks are expected to be 'static (i.e. they modify global state) and passing pointers around is replaced by static dispatch.
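
A rough sketch of the general shape such an API could take is below. The names (PrintClient, Console, App, PRINTED_BYTES, demo) are invented for illustration and are not Barbara's actual traits; Box::leak is used only to obtain a 'static buffer in a hosted demo.

```rust
use core::marker::PhantomData;
use core::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical client trait: implemented by application code, called by the driver.
pub trait PrintClient {
    /// Receives the finished buffer directly, so no "who woke me up?" lookup is needed.
    fn print_done(buffer: &'static mut [u8]);
}

/// Hypothetical console driver, generic over its client so calls are statically dispatched.
pub struct Console<C: PrintClient>(PhantomData<C>);

impl<C: PrintClient> Console<C> {
    /// Stand-in for the function the kernel would invoke when a print completes.
    pub fn kernel_callback(buffer: &'static mut [u8]) {
        // After monomorphization this is a direct call that the compiler may inline.
        C::print_done(buffer);
    }
}

// Application side: the callback modifies global state instead of waking a future.
static PRINTED_BYTES: AtomicUsize = AtomicUsize::new(0);

pub struct App;

impl PrintClient for App {
    fn print_done(buffer: &'static mut [u8]) {
        PRINTED_BYTES.store(buffer.len(), Ordering::Relaxed);
    }
}

/// Simulates one completed print and reports what the application observed.
pub fn demo() -> usize {
    let buffer: &'static mut [u8] = Box::leak(vec![0u8; 5].into_boxed_slice());
    Console::<App>::kernel_callback(buffer);
    PRINTED_BYTES.load(Ordering::Relaxed)
}
```

There is no Waker, no vtable, and no stored event state in this shape: the callback carries the buffer straight to the application's handler.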

🤔 Frequently Asked Questions

What are the morals of the story?

  • core::future::Future isn't suitable for every asynchronous API in Rust. Future has a lot of capabilities, such as the ability to spawn dynamically-allocated futures, that are unnecessary in embedded systems. These capabilities have a cost, which is unavoidable without backwards-incompatible changes to the trait.
  • We should look at embedded Rust's relationship with Future so we don't fragment the embedded Rust ecosystem. Other embedded crates use Future -- Future certainly has a lot of advantages over lighter-weight alternatives, if you have the space to use it.

Why did you choose Barbara to tell this story?

  • This story is about someone who is an experienced systems programmer and an experienced Rust developer. All the other characters have "new to Rust" or "new to programming" as a key characteristic.

How would this story have played out differently for the other characters?

  • Alan would have found the #![no_std] crate ecosystem lacking async support. He would have moved forward with a Future-based implementation, unaware of its impact on code size and RAM usage.
  • Grace would have handled the issue similarly to Barbara, but may not have tried as hard to use Future. Barbara has been paying attention to Rust long enough to know how significant the Future trait is in the Rust community and ecosystem.
  • Niklaus would really have struggled. If he asked for help, he probably would've gotten conflicting advice from the community.

Future has a lot of features that Barbara's traits don't have -- aren't those worth the cost?

  • Future has many additional features that are nice-to-have:
    1. Future works smoothly in a multithreaded environment. Futures can be Send and/or Sync, and do not need to have interior mutability, which avoids the need for internal locking.
      • Manipulating arbitrary Rust types without locking allows async fn to be efficient.
    2. Futures can be spawned and dropped in a dynamic manner: an executor that supports dynamic allocation can manage an arbitrary number of futures at runtime, and futures may easily be dropped to stop their execution.
      • Dropping a future will also drop futures it owns, conveniently providing good cancellation semantics.
      • A future that creates other futures (e.g. an async fn that calls other async fns) can be spawned with only a single memory allocation, whereas callback-based approaches need to allocate for each asynchronous component.
    3. Community and ecosystem support. This isn't a feature of Future per se, but the Rust language has special support for Future (async/await) and practically the entire async Rust ecosystem is based on Future. The ability to use existing async crates is a very strong reason to use Future over any alternative async abstraction.
  • However, the code size impact of Future is a deal-breaker, and no number of nice-to-have features can outweigh a deal-breaker. Barbara's traits have every feature she needs.
  • Using Future saves developer time relative to building your own async abstractions. Developers can use the time they saved to minimize code size elsewhere in the project. In some cases, this may result in a net decrease in code size for the same total effort. However, code size reduction efforts have diminishing returns, so projects that expect to optimize code size regardless likely won't find the tradeoff beneficial.

Is the code size impact of Future fundamental, or can the design be tweaked in a way that eliminates the tradeoff?

  • Future isolates the code that determines a future should wake up (the code that calls Waker::wake) from the code that executes the future (the executor). The only information transferred via Waker::wake is "try waking up now" -- any other information has to be stored somewhere. When polled, a future has to run logic to identify how it can make progress -- in many cases this requires answering "who woke me up?" -- and retrieve the stored information. Most completion-driven async APIs allow information about the event to be transferred directly to the code that handles the event. According to Barbara's analysis, the code required to determine what event happened was the majority of the size impact of Future.

I thought Future was a zero-cost abstraction?

  • Aaron Turon described futures as zero-cost abstractions. In the linked post, he elaborated on what he meant by zero-cost abstraction, and eliminating their impact on code size was not part of that definition. Since then, the statement that Future is a zero-cost abstraction has been repeated many times, mostly without the context that Aaron provided. Rust has many zero-cost abstractions, most of which do not impact code size (assuming optimization is enabled), so it is easy for developers to see "futures are zero-cost" and assume that makes them lighter-weight than they are.

How does Barbara's code handle thread-safety? Is her executor unsound?

  • The library Barbara is writing only works in Tock OS' userspace environment. This environment is single-threaded: the runtime does not provide a way to spawn another thread, hardware interrupts do not execute in userspace, and there are no interrupt-style callbacks like Unix signals. All kernel callbacks are invoked synchronously, using a method that is functionally equivalent to a function call.

😱 Status quo stories: Barbara compares some code (and has a performance problem)

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect peoples' experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

Barbara is recreating some code that has been written in other languages she has some familiarity with. These include C++, but also GC'd languages like Python.

This code collates a large number of requests to network services, with each response containing a large amount of data. To speed this up, Barbara uses buffer_unordered, and writes code like this:


let mut queries = futures::stream::iter(...)
    .map(|query| async move {
        let d: Data = self.client.request(&query).await?;
        d
    })
    .buffer_unordered(32);

use futures::stream::StreamExt;
let results = queries.collect::<Vec<Data>>().await;

Barbara thinks this is similar in function to things she has seen using Python's asyncio.wait, as well as some code her coworkers have written using C++20's coroutines, like this:

std::vector<folly::coro::Task<Data>> tasks;
for (const auto& query : queries) {
    tasks.push_back(
        folly::coro::co_invoke([this, &query]() -> folly::coro::Task<Data> {
            co_return co_await client_->co_request(query);
        }));
}
auto results = co_await folly::coro::collectAllWindowed(
    std::move(tasks), 32);

However, the Rust code performs quite poorly compared to the other impls, appearing to effectively complete the requests serially, despite on the surface looking like effectively identical code.

While investigating, Barbara looks at top, and realises that her coworker's C++20 code sometimes results in her 16 core laptop using 1600% CPU; her Rust async code never exceeds 100% CPU usage. She spends time investigating her runtime setup, but Tokio is configured to use enough worker threads to keep all her CPU cores busy. This feels to her like a bug in buffer_unordered or tokio, needing more time to investigate.

Barbara goes deep into investigating this, spends time reading how buffer_unordered is implemented, how its underlying FuturesUnordered is implemented, and even thinks about how polling and the tokio runtime she is using work. She even tries to figure out if the upstream service is doing some sort of queueing.

Eventually Barbara starts reading more about C++20 coroutines, looking closer at the folly implementation used above, and notices that it works primarily with tasks, which are not exactly equivalent to Rust's Futures.

Then it strikes her! request is implemented something like this:


impl Client {
    async fn request(&self) -> Result<Data> {
        let bytes = self.inner.network_request().await?;
        Ok(serialization_libary::from_bytes(&bytes)?)
    }
}

The results from the network service are sometimes (but not always) VERY large, and the BufferUnordered stream is contained within a single tokio task. The request future does non-trivial CPU work to deserialize the data. This causes significant slowdowns in wall-clock time, as the process CAN BE bounded by the time it takes the single thread running the tokio task to deserialize all the data. This problem hadn't shown up in test cases, where the results from the mocked network service are always small; many common uses of the network service only ever have small results, so it takes a specific production load, or a large-scale test, to trigger this issue.

The solution is to spawn tasks (note this requires 'static futures):


let mut queries = futures::stream::iter(...)
    .map(|query| async move {
        let d: Data = tokio::spawn(self.client.request(&query)).await??;
        d
    })
    .buffer_unordered(32);

use futures::stream::StreamExt;
let results = queries.collect::<Vec<Data>>().await;

Barbara was able to figure this out by reading enough and trying things out, but had that not worked, it would have probably required figuring out how to use perf or some similar tool.

Later on, Barbara gets surprised by this code again. It's now being used as part of a system that handles a very high number of requests per second, but sometimes the system stalls under load. She enlists Grace to help debug, and the two of them identify via perf that all the CPU cores are busy running serialization_libary::from_bytes. Barbara revisits this solution, and discovers tokio::task::block_in_place which she uses to wrap the calls to serialization_libary::from_bytes:


impl Client {
    async fn request(&self) -> Result<Data> {
        let bytes = self.inner.network_request().await?;
        Ok(tokio::task::block_in_place(move || serialization_libary::from_bytes(&bytes))?)
    }
}

This resolves the problem as seen in production, but leads to Niklaus's code review suggesting the use of tokio::task::spawn_blocking inside request, instead of spawn inside buffer_unordered. This discussion is challenging, because the tradeoffs between spawning a Future that includes block_in_place, and using spawn_blocking without spawning the containing Future, are subtle and tricky to explain. Also, both block_in_place and spawn_blocking are heavyweight, and Barbara would prefer to avoid them when the cost of serialization is low, which is usually a runtime property of the system.

🤔 Frequently Asked Questions

Are any of these actually the correct solution?

  • Only in part. It may cause other kinds of contention or blocking on the runtime. As mentioned above, the deserialization work probably needs to be wrapped in something like block_in_place, so that other tasks are not starved on the runtime, or might want to use spawn_blocking. There are some important caveats/details that matter:
    • This is dependent on how the runtime works.
    • block_in_place + tokio::spawn might be better if the caller wants to control concurrency, as spawning is heavyweight when the deserialization work happens to be small. However, as mentioned above, this can be complex to reason about, and in some cases, may be as heavyweight as spawn_blocking.
    • spawn_blocking, at least in some executors, cannot be cancelled, a departure from the prototypical cancellation story in async Rust.
    • "Dependently blocking work" in the context of async programming is a hard problem to solve generally. https://github.com/async-rs/async-std/pull/631 was an attempt, but the details of making blocking runtime-agnostic are extremely complex.
    • The way this problem manifests may be subtle, and it may take a specific production load to trigger it.
    • The outlined solutions have tradeoffs that each only make sense for certain kinds of workloads. It may be better to expose the I/O aspect of the request and the deserialization aspect as separate APIs, but that complicates the library's usage and lays the burden of choosing the tradeoff on the caller (which may not be generally possible).
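As a rough, runtime-free illustration of the idea behind spawn_blocking, the sketch below moves a CPU-bound closure onto its own thread and hands back a future that completes when the thread is done. The names offload, Offloaded, and block_on are hypothetical helpers written for this example. This is not how tokio implements spawn_blocking, and it spawns one thread per call instead of using a pool.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread;

/// State shared between the worker thread and the future waiting on it.
struct Shared<T> {
    result: Option<T>,
    waker: Option<Waker>,
}

/// Hypothetical future returned by `offload`; ready once the worker has finished.
struct Offloaded<T> {
    shared: Arc<Mutex<Shared<T>>>,
}

impl<T> Future for Offloaded<T> {
    type Output = T;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<T> {
        let mut shared = self.shared.lock().unwrap();
        match shared.result.take() {
            Some(value) => Poll::Ready(value),
            None => {
                // Not done yet: remember whom to wake, then yield to other tasks.
                shared.waker = Some(cx.waker().clone());
                Poll::Pending
            }
        }
    }
}

/// Runs `work` on a fresh thread so that the polling thread is never blocked by it.
fn offload<T, F>(work: F) -> Offloaded<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    let shared = Arc::new(Mutex::new(Shared { result: None, waker: None }));
    let worker_side = Arc::clone(&shared);
    thread::spawn(move || {
        let value = work();
        let mut shared = worker_side.lock().unwrap();
        shared.result = Some(value);
        if let Some(waker) = shared.waker.take() {
            waker.wake();
        }
    });
    Offloaded { shared }
}

/// Wakes the executor thread by unparking it.
struct ThreadWaker(thread::Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Minimal single-future executor: poll, then park until woken.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(),
        }
    }
}
```

Calling block_on(offload(|| expensive_parse())) with some hypothetical expensive_parse keeps the polling thread free while the closure runs. Note that dropping the returned future does not stop the worker thread, which mirrors the cancellation caveat listed above.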

What are the morals of the story?

  • Producing concurrent, performant code in Rust async is not always trivial. Debugging performance issues can be difficult.
  • Rust's async model, particularly the blocking nature of polling, can be complex to reason about, and in some cases is different from other languages' choices in meaningful ways.
  • CPU-bound code can be easily hidden.

What are the sources for this story?

  • This is an issue I personally hit while writing code required for production.

Why did you choose Barbara to tell this story?

That's probably the person in the cast that I am most similar to, but Alan and to some extent Grace make sense for the story as well.

How would this story have played out differently for the other characters?

  • Alan: May have taken longer to figure out.
  • Grace: Likely would have been as interested in the details of how polling works.
  • Niklaus: Depends on their experience.

😱 Status quo stories: Template

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect peoples' experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

Barbara is working on the YouBuy server. In one particular part of the story, she has a process that has to load records from a database on the disk. As she receives data from the database, the data is sent into a channel for later processing. She writes an async fn that looks something like this:


#![allow(unused)]
fn main() {
async fn read_send(db: &mut Database, channel: &mut Sender<...>) {
  loop {
    let data = read_next(db).await;
    let items = parse(&data);
    for item in items {
      channel.send(item).await;
    }
  }
}
}

This database load has to take place while also fielding requests from the user. The routine that invokes read_send uses select! for this purpose. It looks something like this:


#![allow(unused)]
fn main() {
let mut db = ...;
let mut channel = ...;
loop {
    futures::select! {
        _ = read_send(&mut db, &mut channel) => {},
        some_data = socket.read_packet() => {
            // ...
        }
    }
}
}

This setup seems to work well a lot of the time, but Barbara notices that the data getting processed is sometimes incomplete. It seems to be randomly missing some of the rows from the middle of the database, or individual items from a row.

Debugging

She's not sure what could be going wrong! She starts debugging with print-outs and logging. Eventually she realizes the problem. Whenever a packet arrives on the socket, the select! macro will drop the other futures. This can sometimes cause the read_send function to be canceled in between reading the data from the disk and sending the items over the channel. Ugh!

Barbara has a hard time figuring out the best way to fix this problem.
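The data loss described above can be reproduced without any runtime. The following is a minimal sketch, not the YouBuy code: read_send here is a simplified stand-in that pops rows from an in-memory Vec, YieldOnce is a hypothetical future that simulates a channel send that is not ready on the first poll, and poll_once_then_drop imitates what select! does when the other branch finishes first.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Returns `Pending` once before completing; stands in for a send that must wait.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// A waker that does nothing; enough for polling by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Simplified stand-in for `read_send`: take a row from `db`, then "send" it.
async fn read_send(db: &RefCell<Vec<u32>>, sent: &RefCell<Vec<u32>>) {
    loop {
        let row = match db.borrow_mut().pop() {
            Some(row) => row,
            None => return,
        };
        // From here on, `row` lives only inside this future's state.
        YieldOnce(false).await;
        sent.borrow_mut().push(row);
    }
}

/// Polls `read_send` once and then drops it, as `select!` does with the losing branch.
/// Returns what is left in the database and what was actually sent.
fn poll_once_then_drop() -> (Vec<u32>, Vec<u32>) {
    let db = RefCell::new(vec![3, 2, 1]);
    let sent = RefCell::new(Vec::new());
    {
        let waker = Waker::from(Arc::new(NoopWake));
        let mut cx = Context::from_waker(&waker);
        let mut fut = Box::pin(read_send(&db, &sent));
        // The future reads a row and then suspends at the simulated send.
        assert!(fut.as_mut().poll(&mut cx).is_pending());
    } // `fut` is dropped here, together with the row it was holding.
    (db.into_inner(), sent.into_inner())
}
```

After poll_once_then_drop returns, one row has been removed from the database Vec but nothing has been sent: the row was held only by the dropped future.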

🤔 Frequently Asked Questions

What are the morals of the story?

  • Cancellation doesn't always cancel the entire task; particularly with select!, it sometimes cancels just a small piece of a given task.
    • This is in tension with Rust's original design, which was meant to tear down an entire thread or task at once, precisely because of the challenge of writing exception-safe code.
  • Cancellation in Async Rust therefore can require fine-grained recovery.

What are the sources for this story?

This was based on tomaka's blog post, which also includes a number of possible solutions, all of them quite grungy.

Why did you choose Barbara to tell this story?

The problem described here could strike anyone, including veteran Rust users. It's a subtle interaction that is independent of source language. Also, the original person who reported it, tomaka, is a veteran Rust user.

How would this story have played out differently for the other characters?

They would likely have a hard time diagnosing the problem. It really depends on how well they have come to understand the semantics of cancellation. This is fairly independent from programming language background; knowing non-async Rust doesn't help in particular, as this concept is specific to async code.

What is different between this story and other cancellation stories?

There is already a story, "Alan builds a cache" that covers some of the challenges around cancellation. It is quite plausible that those stories could be combined, but the focus of this story is different. The key moral of this story is that certain combinators, notably select!, can cause small pieces of a single task to be torn down and canceled. This cancellation can occur for any reason -- it is not always associated with (for example) clients timing out or closing sockets. It might be (as in this story) the result of clients sending data!

This is one key point that makes cancellation in async Rust rather different from panics in sync Rust. Panics in sync Rust generally occur because of bugs, to start, and they are typically not meant to be recovered from except at a coarse-grained level. In contrast, as this story shows, cancellation can require fine-grained recovery, even for events that are not bugs.

😱 Status quo stories: Barbara makes their first foray into async

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

Barbara's first big project in Rust: a journey marred by doubt

It's Barbara's last year at their university and for their master's thesis, they have chosen to create a distributed database. They have chosen to use their favorite language, Rust, because Rust is a suitable language for low latency applications that they have found very pleasant to work in. Their project presents quite a challenge since they have only written some small algorithms in Rust, and it's also their first foray into creating a big distributed system.

Deciding to use Async

Up until now, Barbara has followed the development of Async from afar by reading the occasional Boats blog post, and celebrating the release announcements with the rest of the happy community. Due to never having worked with async in other languages, and not having had a project suitable for async experimentation, their understanding of async and its ecosystem remained superficial. However, since they have heard that async is suitable for fast networked applications, they decide to try using async for their distributed database. After all, a fast networked application is exactly what they are trying to make.

To further solidify the decision of using async, Barbara goes looking for some information and opinions on async in Rust. Doubts created by reading some tweets about how most people should be using threads instead of async for simplicity reasons are quickly washed away by helpful conversations on the Rust discord.

Learning about Async

Still enamored with the first edition of the Rust book, they decide to go looking for an updated version, hoping that it will teach them async in the same manner that it taught them so much about the language and design patterns for Rust. Disappointed, they find no mention of async in the book, aside from a note that it exists as a keyword.

Not to be deterred, they go looking further, and start looking for similarly great documentation about async. After stumbling upon the async book, their disappointment is briefly replaced with relief as the async book does a good job at solidifying what they have already learned in various blog posts about async, why one would use it and even a bit about how it all works under the hood. They skim over the parts that seem a bit too in-depth for now like pinning, as they're looking to quickly get their hands dirty. Chapter 8: The Async Ecosystem teaches them what they already picked up on through blog posts and contentious tweets: the choice of the runtime has large implications on what libraries they can use.

The wrong time for big decisions

Barbara's dreams to quickly get their hands dirty with async Rust are shattered as they discover that they first need to make a big choice: what executor to use. Having had quite a bit of exposure to the conversations surrounding the incompatible ecosystems, Barbara is perhaps a bit more paranoid about making the wrong choice than the average newcomer. This feels like a big decision to them, as it would influence the libraries they could use and switching to a different ecosystem would be all but impossible after a while. Since they would like to choose what libraries they use before having to choose an executor, Barbara feels like the decision-making is turned on its head.

Their paranoia about choosing the right ecosystem is eased after a few days of research, and some more conversations on the Rust subreddit, after which they discover that most of the RPC libraries they might want to use are situated within the most popular Tokio ecosystem anyways. Tokio also has a brief tutorial, which teaches them some basic concepts within Tokio and talks a bit more about async in general.

Woes of a newcomer to async

Being reasonably confident in their choice of ecosystem, Barbara starts building their distributed system. After a while, they want to introduce another networking library whose API isn't async. Luckily, Barbara had picked up from some blog posts that blocking is not allowed in async code (or at least not in any of the currently existing executors). More Reddit discussions point them towards spawn_blocking in Tokio, and even rayon. But they're none the wiser about how to apply these paradigms in a neat manner.

Previously the design patterns learned in other languages, combined with the patterns taught in the book, were usually sufficient to come to reasonably neat designs. But neither their previous experience, nor the async book nor the Tokio tutorial were of much use when trying to neatly incorporate blocking code into their previously fully async project.

Confused ever after

To this day the lack of a blessed approach leaves Barbara unsure about the choices they've made so far and misconceptions they might still have, evermore wondering if the original tweets they read about how most people should just stick to threads were right all along.

🤔 Frequently Asked Questions

What are the morals of the story?

  • When entering Rust's async world without previous async experience, and no benchmarks for what good async design patterns look like, getting started with async can be a bit overwhelming.
  • Other languages which only have a single ecosystem seem to have a much better story for beginners since there's no fear of lock-in, or ecosystem FOMO about making the wrong choices early on.
  • This lack of documentation on design patterns, and of solid guidance about the async ecosystem for unopinionated newcomers, is partially made up for by Rust's community, which often provides educated opinions on the design and technical choices one should make. Because of this, getting started in async favors those who know where to find answers about Rust: blogs, Discord, Reddit, etc.

What are the sources for their story?

This is based on the author's personal experience.

What documentation did the character read during this story?

  • Various blog posts of withoutboats
  • A blog post which spurred a lot of discussion about blocking in async: https://async.rs/blog/stop-worrying-about-blocking-the-new-async-std-runtime/
  • A nice blog post about blocking in Tokio, which still doesn't have any nice design patterns: https://ryhl.io/blog/async-what-is-blocking/
  • An example of design patterns being discussed for sync Rust in the book: https://doc.rust-lang.org/book/ch17-03-oo-design-patterns.html#trade-offs-of-the-state-pattern
  • Perhaps I should've read a bit more of Niko's blogs and his async interviews.

Why did you choose Barbara to tell their story?

Like the author of this story, Barbara had previous experience with Rust. Knowing where to find the community also played a significant part in this story. This story could be construed as how Barbara got started with async while starting to maintain some async projects.

How would their story have played out differently for the other characters?

  • Characters with previous async experience would probably have had a better experience getting started with async in Rust since they might know what design patterns to apply to async code. On the other hand, since Rust's async story is noticeably different from other languages, having async experience in other languages might even be harmful by requiring the user to unlearn certain habits. I don't know if this is actually the case since I don't have any experience with async in other languages.
  • Characters which are less in touch with Rust's community than Barbara might have had a much worse time, since just skimming over the documentation might leave some lost, and unaware of common pitfalls. On the other hand, not having learned a lot about async through blog posts and other materials, might compel someone to read the documentation more thoroughly.

😱 Status quo stories: Barbara needs Async Helpers

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect peoples' experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

Barbara, an experienced Rust user, is prototyping an async Rust service for work. To get things working quickly, she decides to prototype in tokio, since it is unclear which runtime her work will use.

She starts by adding warp and tokio to her dependencies list. She notices that warp suggests using tokio with the full feature. She's a bit concerned about how this might affect the compile times, and about whether all of tokio is really needed for her little project, but she pushes forward.

As she builds out functionality, she's pleased to see tokio provides a bunch of helpers like join! and async versions of the standard library types like channels and mutexes.

After completing one endpoint, she moves to a new one which requires streaming HTTP responses to the client. Barbara quickly finds out from the tokio docs that it does not provide a stream type, and so she adds tokio-stream to her dependencies.

Moving on, she tries to make some functions generic over the underlying web framework by abstracting the functionality into a trait. She writes an async function inside a trait, just like a normal function.


#![allow(unused)]
fn main() {
trait Client {
    async fn get();
}
}

Then she gets a helpful error message.

error[E0706]: functions in traits cannot be declared `async`
 --> src/lib.rs:2:5
  |
2 |     async fn get();
  |     -----^^^^^^^^^^
  |     |
  |     `async` because of this
  |
  = note: `async` trait functions are not currently supported
  = note: consider using the `async-trait` crate: https://crates.io/crates/async-trait

She then realizes that Rust doesn't support async functions in traits yet, so she adds async-trait to her dependencies.

Some of her functions are recursive, and she wants them to be async functions, so she sprinkles some async/.await keywords in those functions.


#![allow(unused)]
fn main() {
async fn sum(n: usize) -> usize {
    if n == 0 {
        0
    } else {
        n + sum(n - 1).await
    }
}
}

Then she gets an error message.

error[E0733]: recursion in an `async fn` requires boxing
 --> src/lib.rs:1:27
  |
1 | async fn sum(n: usize) -> usize {
  |                           ^^^^^ recursive `async fn`
  |
  = note: a recursive `async fn` must be rewritten to return a boxed `dyn Future`

So to make these functions async she starts boxing her futures the hard way, fighting with the compiler. She knows that the async keyword is sort of sugar for impl Future, so she tries the following at first.


#![allow(unused)]
fn main() {
fn sum(n: usize) -> Box<dyn Future<Output = usize>> {
    Box::new(async move {
        if n == 0 {
            0
        } else {
            n + sum(n - 1).await
        }
    })
}
}

The compiler gives the following error.

error[E0277]: `dyn Future<Output = usize>` cannot be unpinned
  --> src/main.rs:11:17
   |
11 |             n + sum(n - 1).await
   |                 ^^^^^^^^^^^^^^^^ the trait `Unpin` is not implemented for `dyn Future<Output = usize>`
   |
   = note: required because of the requirements on the impl of `Future` for `Box<dyn Future<Output = usize>>`
   = note: required by `poll`

She then reads about Unpin and Pin, and finally comes up with a solution.


#![allow(unused)]
fn main() {
fn sum(n: usize) -> Pin<Box<dyn Future<Output = usize>>> {
    Box::pin(async move {
        if n == 0 {
            0
        } else {
            n + sum(n - 1).await
        }
    })
}
}

The code works!
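The reason Box::pin succeeds where Box::new failed is that Box<F> only implements Future when F is Unpin, and dyn Future makes no such promise, whereas Pin<Box<F>> implements Future for any F. The sketch below runs the same sum function without a runtime. NoopWake and run_to_completion are hypothetical helpers for this example, and the busy-polling loop is only reasonable because sum never waits on I/O.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Recursive async function, boxed and pinned so that its type has a known size.
fn sum(n: usize) -> Pin<Box<dyn Future<Output = usize>>> {
    Box::pin(async move {
        if n == 0 {
            0
        } else {
            n + sum(n - 1).await
        }
    })
}

/// A waker that does nothing; enough for polling by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical minimal driver: polls in a loop until the future is ready.
fn run_to_completion<T>(mut fut: Pin<Box<dyn Future<Output = T>>>) -> T {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

Each recursive call allocates one box, so the depth of the recursion translates into that many heap allocations.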

She searches online for better methods and finds the async book. She reads about recursion and finds a cleaner way using the futures crate.


#![allow(unused)]
fn main() {
use futures::future::{BoxFuture, FutureExt};

fn sum(n: usize) -> BoxFuture<'static, usize> {
    async move {
        if n == 0 {
            0
        } else {
            n + sum(n - 1).await
        }
    }.boxed()
}
}

She also asks one of her peers for a code review asynchronously, and after awaiting their response, she learns about the async-recursion crate. Then she adds async-recursion to the dependencies. Now she can write the following, which seems reasonably clean:


#![allow(unused)]
fn main() {
use async_recursion::async_recursion;

#[async_recursion]
async fn sum(n: usize) -> usize {
    if n == 0 {
        0
    } else {
        n + sum(n - 1).await
    }
}
}

As she is working, she realizes that what she really needs is to write a Stream of data. She starts trying to write her Stream implementation and spends several hours banging her head against her desk in frustration (her challenges are pretty similar to what Alan experienced). Ultimately she's stuck trying to figure out why her &mut self.foo call is giving her errors:

error[E0277]: `R` cannot be unpinned
  --> src/main.rs:52:26
   |
52 |                 Pin::new(&mut self.reader).poll_read(cx, buf)
   |                          ^^^^^^^^^^^^^^^^ the trait `Unpin` is not implemented for `R`
   |
   = note: required by `Pin::<P>::new`
help: consider further restricting this bound
   |
40 |     R: AsyncRead + Unpin,
   |                  ^^^^^^^

Fortunately, that weekend, @fasterthanlime publishes a blog post covering the gory details of Pin. Reading that post, she learns about pin-project, which she adds as a dependency. She's able to get her code working, but it's kind of a mess. Feeling quite proud of herself, she shows it to a friend, and they suggest that maybe she ought to try the async-stream crate. Reading that, she realizes she can use this crate to simplify some of her streams, though not all of them fit.

"Finally!", Barbara says, breathing a sigh of relief. She is done with her prototype, and shows it off at work, but to her dismay, the team decides that they need to use a custom runtime for their use case. They're building an embedded system and it has relatively limited resources. Barbara thinks, "No problem, it should be easy enough to change runtimes, right?"

So now Barbara starts the journey of replacing tokio with a myriad of off-the-shelf and custom helpers. She can't use warp, so now she has to find an alternative. She also has to find a new channel implementation, and there are a few:

  • In futures
  • async-std has one, but it seems to be tied to another runtime so she can't use that.
  • smol has one that is independent.

This process of "figure out which alternative is an option" is repeated many times. She also tries to use the select! macro from futures but it requires more pinning and workarounds (not to mention a stack overflow or two).

But Barbara fights through all of it. In the end, she gets it to work, but she realizes that she has a ton of random dependencies and associated compilation time. She wonders if all those dependencies will have a negative effect on the binary size. She also had to rewrite some bits of functionality on her own.

🤔 Frequently Asked Questions

What are the morals of the story?

  • Functionality is found either in "framework"-like crates (e.g., tokio) or spread around many different ecosystem crates.
  • It's sometimes difficult to discover where this functionality lives.
  • Additionally, the trouble of non runtime-agnostic libraries becomes very apparent.
  • Helpers and utilities might have analogues across the ecosystem, but they are different in subtle ways.
  • Some patterns are clean if you know the right utility crate and very painful otherwise.

What are the sources for this story?

Issue 105

What are helper functions/macros?

They are functions/macros that help with certain basic pieces of functionality. Examples include awaiting multiple futures concurrently (join! in tokio), or racing the futures and taking the result of the one that finishes first.
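To give a feel for what such a helper does internally, here is a minimal, runtime-free sketch of a "race" combinator. Race, race, Steps, and block_on are hypothetical names invented for this example; real helpers such as the select! macros are considerably more general.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical combinator: completes with whichever inner future is ready first.
struct Race<A, B> {
    a: Pin<Box<A>>,
    b: Pin<Box<B>>,
}

fn race<A, B>(a: A, b: B) -> Race<A, B>
where
    A: Future,
    B: Future<Output = A::Output>,
{
    Race { a: Box::pin(a), b: Box::pin(b) }
}

impl<A, B> Future for Race<A, B>
where
    A: Future,
    B: Future<Output = A::Output>,
{
    type Output = A::Output;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // Poll `a` first, so `a` wins if both are ready on the same poll.
        if let Poll::Ready(value) = self.a.as_mut().poll(cx) {
            return Poll::Ready(value);
        }
        self.b.as_mut().poll(cx)
    }
}

/// Test future: reports `Pending` `remaining` times, then yields its label.
struct Steps {
    remaining: u32,
    label: &'static str,
}

impl Future for Steps {
    type Output = &'static str;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.remaining == 0 {
            Poll::Ready(self.label)
        } else {
            self.remaining -= 1;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// A waker that does nothing; enough for a busy-polling driver.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal driver that polls in a loop until the future is ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

When the Race value is dropped after one side finishes, the losing future is dropped with it, whatever state it was in.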

Will there be a difference if lifetimes are involved in async recursion functions?

Lifetimes would make it a bit more difficult. Although for simple functions it shouldn't be much of a problem.


#![allow(unused)]
fn main() {
fn concat<'a>(string: &'a mut String, slice: &'a str) -> Pin<Box<dyn Future<Output = ()> + 'a>> {
    Box::pin(async move {
        if !slice.is_empty() {
            string.push_str(&slice[0..1]);
            concat(string, &slice[1..]).await;
        }
    })
}
}

Why did you choose Barbara to tell this story?

This particular issue impacts all users of Rust, even (and sometimes especially) experienced ones.

How would this story have played out differently for the other characters?

Other characters may not know all their options and hence might have fewer problems as a result.

😱 Status quo stories: Barbara plays with async

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect peoples' experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

Barbara has been following async rust for a long time, in eager anticipation of writing some project using async. The last time she tried to do anything with futures in rust was more than a year ago (before async functions), when you had to chain futures together with many calls to then (often leading to inscrutable error messages hundreds of characters long). This was not a pleasant experience for Barbara.

After watching the development of rust async/await (by following discussions on /r/rust and the internals forums), she wants to start to play around with writing async code. Before starting on any real project, she starts with a "playground" where she can try to write some simple async rust code to see how it feels and how it compares to how async code feels in other languages she knows (like C# and JavaScript).

She starts by opening a blank project in VSCode with rust-analyzer. Because she's been following the overall state of rust async, she knows that she needs a runtime, and quickly decides to use tokio, because she knows it's quite popular and well documented.

After looking at the length of the tokio tutorial, she decides not to read most of it right now, and tries to dive right into writing code. But she does look at the "Hello Tokio" section that shows what feature flags are required by tokio:

[dependencies]
tokio = { version = "1", features = ["full"] }

Poking around the tokio API docs in search of something to play with, she sees a simple future that looks interesting: the sleep future that will wait for a certain duration to elapse before resolving.

Borrowing again from the "Hello Tokio" tutorial to make sure she has the correct spelling for the tokio macros, she writes up the following code:

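// Imports the snippet relies on; the rand paths assume the rand 0.8 API.
use rand::distributions::Uniform;
use rand::{thread_rng, Rng};
use std::time::Duration;

#[tokio::main]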
pub async fn main() {
    let mut rng = thread_rng();
    let t = Uniform::new(100, 5000);

    let mut futures = Vec::new();
    for _ in 0..10 {
        let delay = rng.sample(t);
        futures.push(tokio::time::sleep(Duration::from_millis(delay)));
    }
    println!("Created 10 futures");

    for f in futures {
        f.await;
    }

    println!("Done waiting for all futures");
}

This very first version she wrote compiled on the first try and had no errors when running it. Barbara was pleased about this.

However, this example is pretty boring. The program just sits there for a few seconds doing nothing, and giving no hints about what it's actually doing. So for the next iteration, Barbara wants to have a message printed out when each future is resolved. She tries this code at first:


#![allow(unused)]
fn main() {
let mut futures = Vec::new();
for _ in 0..10 {
    let delay = rng.sample(t);
    futures.push(tokio::time::sleep(Duration::from_millis(delay)).then(|_| {
        println!("Done!");
    }));
}
println!("Created 10 futures");
}

But the compiler gives this error:

error[E0277]: `()` is not a future
  --> src\main.rs:13:71
   |
13 |         futures.push(tokio::time::sleep(Duration::from_millis(delay)).then(|_| {
   |                                                                       ^^^^ `()` is not a future
   |
   = help: the trait `futures::Future` is not implemented for `()`

Even though the error is pointing at the then function, Barbara pretty quickly recognizes the problem -- her closure needs to return a future, but () is not a future (though she wonders "why not?"). Looking at the tokio docs is not very helpful. The Future trait isn't even defined in the tokio docs, so she looks at the docs for the Future trait in the rust standard library docs and she sees it only has 5 implementors; one of them is called Ready which looks interesting. Indeed, this struct is a future that will resolve instantly, which is what she wants:


#![allow(unused)]
fn main() {
for _ in 0..10 {
    let delay = rng.sample(t);
    futures.push(tokio::time::sleep(Duration::from_millis(delay)).then(|_| {
        println!("Done!");
        std::future::ready(())
    }));
}
}

This compiles without error, but when Barbara goes to run the code, the output surprises her a little bit: after starting the program, nothing happened for about 4 seconds. Then the first "Done!" message was printed, followed very quickly by the other 9 messages. Based on the code she wrote, she expected 10 "Done!" messages to be printed to the console over the span of about 5 seconds, with roughly a uniform distribution.

After running the program a few more times, she always observes that while the first few messages are printed after some delay, the last few messages are always printed all at once.

Barbara has experience writing async code in JavaScript, and so she thinks for a moment about what this toy code might have looked like if she was using JS:

async function main() {
    const futures = [];
    for (let idx = 0; idx < 10; idx++) {
        const delay = 100 + (Math.random() * 4900);
        const f = new Promise((resolve) => {
            // Resolve the promise once the timer fires, so Promise.all can complete.
            setTimeout(() => {
                console.log("Done!");
                resolve();
            }, delay);
        });
        futures.push(f);
    }

    await Promise.all(futures);
}

After imagining this code, Barbara has an "ah-ha!" moment, and realizes the problem is likely how she is waiting for the futures in her rust code. In her rust code, she is waiting for the futures one-by-one, but in the JavaScript code she is waiting for all of them simultaneously.
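The difference between the two waiting strategies can be sketched without tokio. In the example below, Countdown is a hypothetical stand-in for a timer that needs a fixed number of polls, and one_by_one and all_together are made-up drivers that imitate awaiting each future in a loop versus polling all of them together. It measures completion order only, not wall-clock time.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing; enough for polling by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Stand-in for a timer: reports `Pending` `polls_left` times, then completes.
struct Countdown {
    polls_left: u32,
}

impl Future for Countdown {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.polls_left == 0 {
            Poll::Ready(())
        } else {
            self.polls_left -= 1;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

type Task = Pin<Box<dyn Future<Output = ()>>>;

/// Builds three tasks needing 3, 1, and 2 polls; each logs its number when done.
fn make_tasks(log: &Rc<RefCell<Vec<u32>>>) -> Vec<Task> {
    vec![3u32, 1, 2]
        .into_iter()
        .map(|polls| {
            let log = Rc::clone(log);
            let task: Task = Box::pin(async move {
                Countdown { polls_left: polls }.await;
                log.borrow_mut().push(polls);
            });
            task
        })
        .collect()
}

/// Like `for f in futures { f.await }`: finish each task before touching the next.
fn one_by_one() -> Vec<u32> {
    let log = Rc::new(RefCell::new(Vec::new()));
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    for mut task in make_tasks(&log) {
        while task.as_mut().poll(&mut cx).is_pending() {}
    }
    let order = log.borrow().clone();
    order
}

/// Roughly what a join-all helper does: keep polling every unfinished task in turn.
fn all_together() -> Vec<u32> {
    let log = Rc::new(RefCell::new(Vec::new()));
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut tasks = make_tasks(&log);
    while !tasks.is_empty() {
        // Keep only the tasks that are still pending after this round.
        tasks.retain_mut(|task| task.as_mut().poll(&mut cx).is_pending());
    }
    let order = log.borrow().clone();
    order
}
```

With one_by_one the tasks finish in the order they were created, while all_together lets the shortest task finish first.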

So Barbara looks for a way to wait for a Vec of futures. After a bunch of searching in the tokio docs, she finds nothing. The closest thing she finds is a join! macro, but this appears to only work on individually specified futures, not a Vec of futures.

Disappointed, she then looks at the future module from the rust standard library, but the module is tiny and very clearly doesn't have what she wants. Then Barbara has another "ah-ha!" moment and remembers that there's a 3rd-party crate called "futures" on crates.io that she's seen mentioned in some /r/rust posts. She checks the docs and finds the join_all function which looks like what she wants:


#![allow(unused)]
fn main() {
let mut futures = Vec::new();
for _ in 0..10 {
    let delay = rng.sample(t);
    futures.push(tokio::time::sleep(Duration::from_millis(delay)).then(|_| {
        println!("Done!");
        std::future::ready(())
    }));
}
println!("Created 10 futures");

futures::future::join_all(futures).await;
println!("Done");
}

It works exactly as expected now! After having written the code, Barbara begins to remember an important detail about rust futures that she once read somewhere: rust futures are lazy, and won't make progress unless you await them.
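The laziness she remembers is easy to observe directly. In this small sketch (lazy_demo and NoopWake are hypothetical names for the example), creating an async block runs none of its body; the side effect only happens once the future is polled, which is what .await or an executor does on your behalf.

```rust
use std::cell::Cell;
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Wake, Waker};

/// A waker that does nothing; enough for polling by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Returns whether the async block's body had run (before polling, after polling).
fn lazy_demo() -> (bool, bool) {
    let ran = Cell::new(false);

    // Creating the future does not execute its body.
    let fut = async {
        ran.set(true);
    };
    let before = ran.get();

    // Polling is what actually drives the body forward.
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let _ = fut.as_mut().poll(&mut cx);
    let after = ran.get();

    (before, after)
}
```

This is the opposite of the JavaScript Promise constructor, which runs its executor function immediately when the promise is created.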

Happy with this success, Barbara continues to expand her toy program by making a few small adjustments:


#![allow(unused)]
fn main() {
for counter in 0..10 {
    let delay = rng.sample(t);
    let delay_future = tokio::time::sleep(Duration::from_millis(delay));

    if counter < 9 {
        futures.push(delay_future.then(|_| {
            println!("Done!");
            std::future::ready(())
        }));
    } else {
        futures.push(delay_future.then(|_| {
            println!("Done with the last future!");
            std::future::ready(())
        }));
    }
}
}

This fails to compile:

error[E0308]: mismatched types

   = note: expected closure `[closure@src\main.rs:16:44: 19:14]`
              found closure `[closure@src\main.rs:21:44: 24:14]`
   = note: no two closures, even if identical, have the same type
   = help: consider boxing your closure and/or using it as a trait object

This error doesn't actually surprise Barbara that much, as she is familiar with the idea of having to box objects sometimes. She does notice the "consider boxing your closure" error, but thinks that this is not likely the correct solution. Instead, she thinks that she should box the entire future.

She first adds explicit type annotations to the Vec:


#![allow(unused)]
fn main() {
let mut futures: Vec<Box<dyn Future<Output=()>>> = Vec::new();
}

She then notices that her IDE (VSCode + rust-analyzer) has a new error on each call to push. The code assist on each error says Store this in the heap by calling 'Box::new'. This is exactly what she wants, and she is happy that rust-analyzer perfectly handled this case.

Now each future is boxed up, but there is one final error still, this time on the call to join_all(futures).await:

error[E0277]: `dyn futures::Future<Output = ()>` cannot be unpinned
  --> src\main.rs:34:31
   |
34 |     futures::future::join_all(futures).await;

Barbara has been around rust for long enough to know that there is a Box::pin API, but she doesn't really understand what it does, nor does she have a good intuition about what this API is for. But she is accustomed to just trying things in rust to see if they work. And indeed, after changing Box::new to Box::pin:


futures.push(Box::pin(delay_future.then(|_| {
    println!("Done!");
    std::future::ready(())
})));

and adjusting the type of the Vec:


let mut futures: Vec<Pin<Box<dyn Future<Output=()>>>> = Vec::new();

the code compiles and runs successfully.

But even though the program now runs correctly, she wishes she had a better idea why pinning is necessary here and feels a little uneasy having to use something she doesn't yet understand well.
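One way to look at the pinning requirement: `Future::poll` takes `self: Pin<&mut Self>`, and `Box<dyn Future>` only implements `Future` when its contents are `Unpin`, which a bare trait object does not promise. `Pin<Box<dyn Future>>` is a future regardless. The sketch below illustrates this with the standard library only; `BoxedFuture`, `run_to_completion` and `collect_results` are hypothetical names, and the tiny polling loop stands in for a real executor such as tokio.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// The same shape as the element type Barbara ended up with.
type BoxedFuture = Pin<Box<dyn Future<Output = u32>>>;

// Stand-in for an executor: polls in a loop until the future is ready.
// Only suitable for futures that never wait on external events.
fn run_to_completion(mut fut: BoxedFuture) -> u32 {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        // `poll` wants `Pin<&mut _>`; `Pin<Box<_>>::as_mut` provides it.
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

fn collect_results() -> Vec<u32> {
    let mut futures: Vec<BoxedFuture> = Vec::new();
    // Two async blocks have two distinct types, just like two closures,
    // but both coerce to the same pinned, boxed trait object.
    futures.push(Box::pin(async { 1_u32 }));
    futures.push(Box::pin(async { 2_u32 }));
    futures.into_iter().map(run_to_completion).collect()
}

fn main() {
    println!("{:?}", collect_results());
}
```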

As one final task, Barbara wants to try to replace the chained call to then with an async block. She remembers that these were a big deal in a recent release of rust, and that they looked a lot nicer than a long chain of then calls. She doesn't remember the exact syntax for this, but she read a blog post about async rust a few weeks ago, and has a vague idea of how it looks.

She tries writing this:


futures.push(Box::pin(async || {
    tokio::time::sleep(Duration::from_millis(delay)).await;
    println!("Done after {}ms", delay);
}));

The compiler gives an error:

error[E0658]: async closures are unstable
  --> src\main.rs:14:31
   |
14 |         futures.push(Box::pin(async || {
   |                               ^^^^^
   |
   = note: see issue #62290 <https://github.com/rust-lang/rust/issues/62290> for more information
   = help: add `#![feature(async_closure)]` to the crate attributes to enable
   = help: to use an async block, remove the `||`: `async {`

Barbara knows that async is stable and using this nightly feature isn't what she wants. So she tries the suggestion made by the compiler and removes the || bars:


futures.push(Box::pin(async {
    tokio::time::sleep(Duration::from_millis(delay)).await;
    println!("Done after {}ms", delay);
}));

A new error this time:

error[E0597]: `delay` does not live long enough
15 | |             tokio::time::sleep(Duration::from_millis(delay)).await;
   | |                                                      ^^^^^ borrowed value does not live long enough

This is an error that Barbara is very familiar with. If she were working with a closure, she knows she could use a move closure (since her delay type is Copy). She is not using a closure (she just tried, and the compiler told her to switch to an async block), but Barbara's experience with rust tells her that it's a very consistent language. Maybe the same keyword used in move closures will work here? She tries it:


futures.push(Box::pin(async move {
    tokio::time::sleep(Duration::from_millis(delay)).await;
    println!("Done after {}ms", delay);
}));

It works! Satisfied but still thinking about async rust, Barbara takes a break to eat a cookie.
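Why `move` helps can be sketched on its own: without it, the async block borrows the loop-local `delay`, but the future is stored in a Vec that outlives each loop iteration. With `move`, the `Copy` value is copied into the future. The example below uses only the standard library and does not actually sleep; `run_to_completion` and `delayed_messages` are names invented for this sketch.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Stand-in for an executor: polls in a loop until the future is ready.
fn run_to_completion<T>(mut fut: Pin<Box<dyn Future<Output = T>>>) -> T {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

fn delayed_messages() -> Vec<String> {
    let mut futures: Vec<Pin<Box<dyn Future<Output = String>>>> = Vec::new();
    for counter in 0..3_u64 {
        let delay = counter * 10;
        // `move` copies `delay` into the future, so the future no longer
        // borrows a variable that goes away at the end of this iteration.
        futures.push(Box::pin(async move { format!("Done after {delay}ms") }));
    }
    futures.into_iter().map(run_to_completion).collect()
}

fn main() {
    for message in delayed_messages() {
        println!("{message}");
    }
}
```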

🤔 Frequently Asked Questions

Here are some standard FAQ to get you started. Feel free to add more!

Why did you choose Barbara to tell this story?

Barbara has years of rust experience that she brings to bear in her async learning experiences.

What are the morals of the story?

  • Due to Barbara's long experience with rust, she knows most of the language pretty well (except for things like async, and advanced concepts like pinned objects). She generally trusts the rust compiler, and she's learned over the years that she can learn how to use an unfamiliar library by reading the API docs. As long as she can get the types to line up and the code to compile, things generally work as she expects.

    But this is not the case with rust async:

    • There can be new syntax to learn (e.g. async blocks)
    • It can be hard to find basic functionality (like futures::future::join_all)
    • It's not always clear how the ecosystem all fits together (what functionality is part of tokio? What is part of the standard library? What is part of other crates like the futures crate?)
    • Sometimes it looks like there are multiple ways to do something:
      • What's the difference between futures::future::Future and std::future::Future?
      • What's the difference between tokio::time::Instant and std::time::Instant?
      • What's the difference between std::future::ready and futures::future::ok?
  • Barbara has a lot to learn. Her usual methods of learning how to use new crates don't really work when learning tokio and async. She wonders if she actually should have read the long tokio tutorial before starting. She realizes it will take her a while to build up the necessary foundation of knowledge before she can be proficient in async rust.

  • There were several times where the compiler or the IDE gave helpful error messages and Barbara appreciated these a lot.

What are the sources for this story?

Personal experiences of the author

How would this story have played out differently for the other characters?

Other characters would likely have written all the same code as Barbara, and probably would have run into the same problems. But other characters might have needed quite a bit longer to get to the solution.

For example, it was Barbara's experience with move-closures that led her to try adding the move keyword to the async block. And it was her general "ambient knowledge" of things that allowed her to remember that things like the futures crate exist. Other characters would have likely needed to resort to an internet search or asking on a rust community.

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect peoples' experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

Barbara polls a Mutex

Brief summary

Barbara is implementing an interpreter for a scripting language. This language has implicit asynchronicity, so all values in the language can potentially be futures underneath.

Barbara wants to store a namespace which maps variable names to their values. She chooses to use a HashMap and finds that the async_lock crate provides an async mutex, which she can use for concurrency. She determines she'll need a lock around the namespace itself to protect it against concurrent modification.

For the entries in her map, Barbara decides to implement a two-variant enum. One variant indicates that there is no implicit asynchronicity to resolve and the value is stored directly here. The other variant indicates that this value is being computed asynchronously and polling will be required to resolve it. Because an asynchronous task might want to change one of these entries from the asynchronous variant to the ready variant, she'll need to wrap the entries in an Arc and a Mutex to allow an asynchronous task to update them.

Barbara wants to be able to derive a future from the entries in her namespace that will allow her to wait until the entry becomes ready and read the value. She decides to implement the Future trait directly. She's done this before for a few simple cases, and is somewhat comfortable with the idea, but she runs into significant trouble trying to deal with the mutex in the body of her poll function. Here are her attempts:


use std::collections::HashMap;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Waker};

use async_lock::{Mutex, MutexGuard, MutexGuardArc};
use futures::future::{BoxFuture, FutureExt};

#[derive(Clone)]
enum Value {
    Int(i32),
}

enum NSEntry {
    Ready(Value),
    Waiting(Vec<Waker>),
}

type Namespace = Mutex<HashMap<String, Arc<Mutex<NSEntry>>>>;

// Attempt 1: This compiles!!
struct NSValueFuture(Arc<Mutex<NSEntry>>);
impl Future for NSValueFuture {
    type Output = Value;
    fn poll(
        self: Pin<&mut Self>,
        cx: &mut Context<'_>
    ) -> Poll<Self::Output> {
        let entry = match self.0.lock().poll() {
            Poll::Ready(ent) => ent,

            // When this returns, it will drop the future created by lock(),
            // which drops our position in the lock's queue.
            // The task might never be woken up, or it could be starved under
            // contention, which destroys the fairness properties of the lock.
            Poll::Pending => return Poll::Pending,
        };

        ...
    }
}

// Attempt 2
struct NSValueFuture {
    ent: Arc<Mutex<NSEntry>>,
    lock_fut: Option<MutexGuard<'_, NSEntry>>,
}
impl Future for NSValueFuture {
    type Output = Value;
    fn poll(
        self: Pin<&mut Self>,
        cx: &mut Context<'_>
    ) -> Poll<Self::Output> {
        if self.lock_fut.is_none() {
            self.lock_fut = Some(self.ent.lock());
        }
        // match self.lock_fut.unwrap().poll(cx)
        // Pulled out pin-project, got confused, decided to just use unsafe.
        match unsafe { Pin::new_unchecked(&mut self).lock_fut.unwrap() }.poll(cx) {
            ...
        }
        // ??? lifetime for MutexLockFuture ???
        // try async-std, async-lock
    }
}

// Realize `lock_arc()` is a thing
// Realize you need `BoxFuture` to await it, since you can't name the type

// Working code:
struct NSValueFuture {
    target: Arc<Mutex<NSEntry>>,
    // Stored across polls so that the position in the lock's queue is kept.
    lock_fut: Option<BoxFuture<'static, MutexGuardArc<NSEntry>>>,
}

impl Future for NSValueFuture {
    type Output = Value;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.lock_fut.is_none() {
            let target = Arc::clone(&self.target);
            let lock = async move { target.lock_arc().await }.boxed();
            self.lock_fut = Some(lock);
        }

        if let Poll::Ready(mut value) = self.lock_fut.as_mut().unwrap().as_mut().poll(cx) {
            self.lock_fut = None;
            match &mut *value {
                NSEntry::Ready(x) => {
                    Poll::Ready(x.clone())
                }
                NSEntry::Waiting(w) => {
                    // Register for a wake-up once the entry becomes ready.
                    w.push(cx.waker().clone());
                    Poll::Pending
                }
            }
        } else {
            Poll::Pending
        }
    }
}

🤔 Frequently Asked Questions

What are the morals of the story?

  • Trying to compose futures manually without an enclosing async block/function is extremely difficult and may even be dangerous.
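The difference between re-creating an inner future on every poll (as in the first attempt) and storing it between polls (as in the working code) can be sketched without any lock crate. Everything below is a hypothetical stand-in: `ReadyOnSecondPoll` imitates a lock future that needs two polls, which exaggerates the problem compared to a real mutex, where an uncontended lock may well succeed on the first poll.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Stand-in for a lock future: pending on the first poll, ready on the second.
struct ReadyOnSecondPoll {
    polled_before: bool,
}

impl Future for ReadyOnSecondPoll {
    type Output = u32;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        if self.polled_before {
            Poll::Ready(7)
        } else {
            self.polled_before = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

type Inner = Pin<Box<dyn Future<Output = u32>>>;

// Like attempt 1: builds a fresh inner future on every poll and drops it
// again when it returns `Pending`, losing all progress.
struct Recreating;

impl Future for Recreating {
    type Output = u32;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        let mut inner: Inner = Box::pin(ReadyOnSecondPoll { polled_before: false });
        inner.as_mut().poll(cx)
    }
}

// Like the working code: keeps the inner future alive between polls.
struct Storing {
    inner: Option<Inner>,
}

impl Future for Storing {
    type Output = u32;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        if self.inner.is_none() {
            let fresh: Inner = Box::pin(ReadyOnSecondPoll { polled_before: false });
            self.inner = Some(fresh);
        }
        let result = self.inner.as_mut().unwrap().as_mut().poll(cx);
        if result.is_ready() {
            self.inner = None;
        }
        result
    }
}

// Polls `fut` at most `max_polls` times; returns the output and the number
// of polls it took, or `None` if the future never became ready.
fn poll_up_to<F: Future + Unpin>(mut fut: F, max_polls: usize) -> Option<(F::Output, usize)> {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    for attempt in 1..=max_polls {
        if let Poll::Ready(value) = Pin::new(&mut fut).poll(&mut cx) {
            return Some((value, attempt));
        }
    }
    None
}

fn main() {
    println!("recreating: {:?}", poll_up_to(Recreating, 100));
    println!("storing:    {:?}", poll_up_to(Storing { inner: None }, 100));
}
```

Inside an `async fn`, the compiler generates the "store the inner future between polls" bookkeeping automatically, which is one reason hand-written `poll` functions are so much harder to get right.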

What are the sources for this story?

Talk about what the story is based on, ideally with links to blog posts, tweets, or other evidence.

Why did you choose Barbara to tell this story?

  • It's possible to be fairly comfortable with Rust and even some of the internals of async and still be stopped in your tracks by this issue.

How would this story have played out differently for the other characters?

In some cases, there are problems that only occur for people from specific backgrounds, or which play out differently. This question can be used to highlight that.

😱 Status quo stories: Barbara tries async streams

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

The story

Barbara has years of experience in Rust and was looking forward to using some of that experience with the brand-new async functionality. Async/await had been a dream of Rust for so long, and it was finally here!

As she began her next side project, she would quickly partner up with other experienced Rust developers. One of these Rust developers, who had more async experience than Barbara, suggested they use 'async streams' as the core abstraction for this project. Barbara trusted the experience of this other developer. Though she didn't yet understand how async streams worked, she was happy to go along with the decision and build her experience over time.

Month after month, the side project grew in scope and number of users. Potential contributors would try to contribute, but some would leave because they found the combination of concepts and the additional set of borrowchecker-friendly code patterns difficult to understand and master. Barbara was frustrated to lose potential contributors but kept going.

Users also began to discover performance bottlenecks as they pushed the system harder. Barbara, determined to help the users as best she could, pulled her thinking cap tight and started to probe the codebase.

In her investigations, she experimented with adding parallelism to the async stream. She knew that if she called .next() twice, that in theory she should have two separate futures. There were a few ways to run multiple futures in parallel, so this seemed like it might pan out to be a useful way of leveraging the existing architecture.

Unfortunately, to Barbara's chagrin, async streams do not support this kind of activity. Each .next() must be awaited before the ownership system allows her to get the next value in the stream. Effectively, this collapsed the model to being a synchronous iterator with a more modern scent. Barbara was frustrated and started to clarify her understanding of what asynchrony actually meant, looking through the implementations for these abstractions.

When she was satisfied, she took a step back and thought for a moment. If optional parallelism was a potential win and the core data processing system actually was going to run synchronously anyway -- despite using async/await extensively in the project -- perhaps it would make more sense to redesign the core abstraction.

With that, Barbara set off to experiment with a new engine for her project. The new engine focused on standard iterators and rayon instead of async streams. As a result, the code was much easier for new users, as iterators are well-understood and have good error messages. Just as importantly, the code was noticeably faster than its async counterpart. Barbara benchmarked a variety of cases to be sure, and always found that the new, simpler approach performed better than the async stream original.

To help those who followed after her, Barbara sat down to write out her experiences to share with the rest of the world. Perhaps future engineers might learn from the twists and turns her project took.

🤔 Frequently Asked Questions

Here are some standard FAQ to get you started. Feel free to add more!

What are the morals of the story?

  • Easy to get the wrong idea. The current state of documentation does not make the use cases clear, so it's easy to grab this as an abstraction because it's the closest that fits.
  • Async streams are just iterators. Async streams do not offer useful asynchrony in and of themselves. A possible help here might be renaming "async streams" to "async iterators" to help underscore their use case and help developers more quickly understand their limitations.
  • A single async stream cannot be operated on in parallel. It opens up asynchrony only during the .next() step and is unable to offer asynchrony between steps (e.g. by calling .next() twice and operating on the resulting futures).
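The ownership restriction behind the last point can be sketched with a hand-written, stream-like type. `Countdown`, `drain` and `block_on` below are hypothetical names for this illustration; the only assumption carried over from the real world is that `next` takes `&mut self`, as `StreamExt::next` in the futures crate does.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// A stream-like type that yields remaining - 1, ..., 1, 0 and then `None`.
struct Countdown {
    remaining: u32,
}

impl Countdown {
    // The returned future holds the `&mut self` borrow until it completes.
    async fn next(&mut self) -> Option<u32> {
        if self.remaining == 0 {
            None
        } else {
            self.remaining -= 1;
            Some(self.remaining)
        }
    }
}

async fn drain(mut stream: Countdown) -> Vec<u32> {
    let mut seen = Vec::new();
    // Each `next()` future must finish, releasing its mutable borrow,
    // before the following one can be created.
    while let Some(value) = stream.next().await {
        seen.push(value);
    }
    // Rejected by the borrow checker (E0499, second mutable borrow):
    //     let first = stream.next();
    //     let second = stream.next();
    //     let _ = (first.await, second.await);
    seen
}

// Stand-in for an executor: polls in a loop until the future is ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

fn main() {
    println!("{:?}", block_on(drain(Countdown { remaining: 3 })));
}
```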

What are the sources for this story?

Why did you choose Barbara to tell this story?

Barbara is an experienced engineer who may come to async streams and async/await in general with a partially-incorrect set of baseline understanding. It may take her time to understand and see more clearly where her model was wrong because there are things similar to other experiences she's had. For example, Rust futures differ from C++ futures and do not offer the same style of asynchrony. Terms like "streams" sound like they may have more internal functionality, and it would be easy for an experienced developer to trip up with the wrong starting assumption.

How would this story have played out differently for the other characters?

  • Alan may have come to a similar idea for an architecture, as async/await is popular in languages like JavaScript and C#. Once Alan attempted to use asynchrony between units of work, namely using async streams, this is where Alan may have failed. The amount of Rust one has to know to succeed here is quite high and includes understanding Arc, Pin, Streams, traits/adapters, the borrowchecker and dealing with different types of errors, and more.
  • Grace may have chosen a different core abstraction from the start. She has a good chance of thinking through how she'd like the data processing system to work. It's possible she would have found threads and channels a better fit. This would have had different trade-offs.
  • Niklaus may have also tried to go down the async stream path. The information available is mixed and hype around async/await is too strong. This makes it shine brighter than it should. Without experience with different systems languages to temper the direction, the most likely path would be to experiment with asynchrony and hope that "underneath the surface it does the right thing."

😱 Status quo stories: Barbara tries Unix socket

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect peoples' experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

Content of Cargo.toml for reproducibility:

Cargo.toml
[dependencies]
futures = "0.3.14"
hyper = { version = "0.14.7", features = ["full"] }
pretty_env_logger = "0.4.0"
reqwest = "0.11.3"
tokio = { version = "1.5.0", features = ["macros", "rt-multi-thread"] }

There is an HTTP server written with hyper which Barbara has to query.

Server code
use hyper::server::conn::Http;
use hyper::service::service_fn;
use hyper::{Body, Request, Response};
use std::convert::Infallible;
use tokio::net::TcpListener;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let listener = TcpListener::bind("127.0.0.1:3000").await?;
 
    loop {
        let (stream, _) = listener.accept().await?;
 
        tokio::spawn(async move {
            let _ = Http::new()  
                .serve_connection(stream, service_fn(serve))
                .await;
        });
    }
}
 
async fn serve(_req: Request<Body>) -> Result<Response<Body>, Infallible> {
    let res = Response::builder()
        .header("content-type", "text/plain; charset=utf-8")
        .body(Body::from("Hello World!"))
        .unwrap();
    Ok(res)
}

Nice simple query with high-level reqwest

Barbara makes an HTTP GET request over a TCP socket with reqwest and it works fine; everything is easy.

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let res = reqwest::get("http://127.0.0.1:3000").await?;
    println!("{}", res.text().await?);
    Ok(()) 
}

Unix sockets for performance

One day, Barbara hears that a Unix socket can provide better performance when the server and the client are on the same machine, so she decides to try it out.

Barbara starts by porting the server code to use a Unix socket, which is a no-brainer for her. She changes TcpListener::bind("127.0.0.1:3000").await? to UnixListener::bind("/tmp/socket")? and it works like a charm.

Barbara searches through the reqwest docs and GitHub issues to see how to use a Unix socket with reqwest. She finds https://github.com/seanmonstar/reqwest/issues/39#issuecomment-778716774, which says that reqwest does not support Unix sockets but that hyper, a lower-level library, does, and which includes an example. Since reqwest is so easy, and porting the hyper server to a Unix socket was easy, Barbara thinks the low-level hyper library should be easy too.

The screen stares at Barbara

use hyper::{body::HttpBody, client::conn, Body, Request};
use tokio::net::UnixStream;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    pretty_env_logger::init();
    let stream = UnixStream::connect("/tmp/socket").await?;

    let (mut request_sender, connection) = conn::handshake(stream).await?;
 
    let request = Request::get("/").body(Body::empty())?;
    let mut response = request_sender.send_request(request).await?;
    println!("{:?}", response.body_mut().data().await);
 
    let request = Request::get("/").body(Body::empty())?;
    let mut response = request_sender.send_request(request).await?;
    println!("{:?}", response.body_mut().data().await);
 
    Ok(())
}

Barbara writes some code based on the comment she found and reads some docs, such as those for handshake, to get a rough idea of what it does. She compiles the code and gets a warning that the connection variable is unused:

warning: unused variable: `connection`
 --> src/main.rs:9:30
  |
9 |     let (mut request_sender, connection) = conn::handshake(stream).await?;
  |                              ^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_connection`
  |
  = note: `#[warn(unused_variables)]` on by default

Barbara then runs the program. Barbara stares at the screen and the screen stares at her. She waits and ... it is stuck. So she decides to look at the logs and runs it again with env RUST_LOG=trace cargo r. It is indeed stuck, but she cannot tell where.

 TRACE mio::poll > registering event source with poller: token=Token(0), interests=READABLE | WRITABLE

Barbara tries adding println! all over the code, but the program is still stuck, so she asks for help. Thanks to the welcoming Rust community, she gets help quickly. It turns out that the unused connection is the culprit; how to use it is explained in the docs of the parent module of the page Barbara had read.

Barbara adds the missing piece: an .await on the connection. All along she had assumed that awaiting the request would be enough, so having to await something else for the request to make progress comes as a surprise. Someone suggests to Barbara that she follow the example in the docs and insert a tokio::spawn, so she winds up with:


    let (mut request_sender, connection) = conn::handshake(stream).await?;

    // The connection future does the actual I/O, so it has to be polled
    // (here: on its own task) for any request to make progress.
    tokio::spawn(async move {
        if let Err(e) = connection.await {
            eprintln!("error: {}", e);
        }
    });

    let request = ...

Barbara runs the code and it works now, yay! She wants to reuse the connection to send subsequent HTTP requests, so she duplicates the last block, and it runs.
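Why the request hangs unless the connection is polled can be illustrated with a toy model. This is only a sketch of the general "one future queues work, another future performs it" pattern and not how hyper is implemented; `Shared`, `SendRequest`, `Connection` and `run` are hypothetical names, and the manual polling loop stands in for the tokio runtime.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Queues shared by the two halves.
#[derive(Default)]
struct Shared {
    requests: VecDeque<String>,
    responses: VecDeque<String>,
}

// Stand-in for the request future: it queues a request and waits for a response.
struct SendRequest {
    shared: Rc<RefCell<Shared>>,
    request: String,
    queued: bool,
}

impl Future for SendRequest {
    type Output = String;

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<String> {
        if !self.queued {
            let request = self.request.clone();
            self.shared.borrow_mut().requests.push_back(request);
            self.queued = true;
        }
        let response = self.shared.borrow_mut().responses.pop_front();
        match response {
            Some(response) => Poll::Ready(response),
            None => Poll::Pending,
        }
    }
}

// Stand-in for the connection future: the only place where "I/O" happens.
struct Connection {
    shared: Rc<RefCell<Shared>>,
}

impl Future for Connection {
    type Output = ();

    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        let mut shared = self.shared.borrow_mut();
        while let Some(request) = shared.requests.pop_front() {
            shared.responses.push_back(format!("response to {request}"));
        }
        Poll::Pending // the connection stays open
    }
}

// Polls the request up to `max_polls` times, optionally driving the connection too.
fn run(drive_connection: bool, max_polls: usize) -> Option<String> {
    let shared = Rc::new(RefCell::new(Shared::default()));
    let mut request = SendRequest {
        shared: Rc::clone(&shared),
        request: String::from("GET /"),
        queued: false,
    };
    let mut connection = Connection { shared };
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);

    for _ in 0..max_polls {
        if let Poll::Ready(response) = Pin::new(&mut request).poll(&mut cx) {
            return Some(response);
        }
        if drive_connection {
            let _ = Pin::new(&mut connection).poll(&mut cx);
        }
    }
    None
}

fn main() {
    println!("without driving the connection: {:?}", run(false, 10));
    println!("with the connection driven: {:?}", run(true, 10));
}
```

In the real program, `tokio::spawn` is what keeps the connection future polled while `main` awaits the response.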

Mysterious second request

Some time later, Barbara is told that the program does not work on the second request. She tries it and it works. To double-check, she tries it again, and this time it does not work. Rather than getting stuck, the program now prints an error message, which is somewhat better, but Barbara does not understand it.

The second request is so mysterious that it seems to be playing hide and seek with Barbara. Sometimes it works and sometimes it does not.


 TRACE mio::poll > registering event source with poller: token=Token(0), interests=READABLE | WRITABLE
Some(Ok(b"Hello World!"))
 TRACE want      > signal: Want
 TRACE mio::poll > deregistering event source from poller
 TRACE want      > signal: Closed
Error: hyper::Error(Canceled, "connection was not ready")

As is typical when debugging an asynchronous issue, Barbara adds prints at every await boundary in the source code to understand what is going on.

use hyper::{body::HttpBody, client::conn, Body, Request};
use tokio::net::UnixStream;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    pretty_env_logger::init();
    let stream = UnixStream::connect("/tmp/socket").await?;

    let (mut request_sender, connection) = conn::handshake(stream).await?;
    println!("connected"); 
                        
    tokio::spawn(async move {
        if let Err(e) = connection.await {
            println!("closed"); 
            eprintln!("error: {}", e);
        }
        println!("closed"); 
    });
                  
    let request = Request::get("/").body(Body::empty())?;
    let mut response = request_sender.send_request(request).await?;
    println!("{:?}", response.body_mut().data().await);
                  
    let request = Request::get("/").body(Body::empty())?;
    println!("sending 2");
    let mut response = request_sender.send_request(request).await?;
    println!("sent 2"); 
    println!("{:?}", response.body_mut().data().await);
                     
    Ok(())
}                    

The logs are now more detailed. Barbara can see that the connection was closed, but why? She has no idea and has to seek help again.

 TRACE mio::poll > registering event source with poller: token=Token(0), interests=READABLE | WRITABLE
connected
Some(Ok(b"Hello World!"))
sending 2
 TRACE want      > signal: Want
 TRACE mio::poll > deregistering event source from poller
 TRACE want      > signal: Closed
closed
Error: hyper::Error(Canceled, "connection was not ready")

This time as well, Barbara is lucky enough to get a quick reply from the welcoming Rust community. Other users say there is a trick for this kind of case: a tracing stream.


use std::pin::Pin;
use std::task::{Context, Poll};
use tokio::io::{AsyncRead, AsyncWrite, ReadBuf};

pub struct TracingStream<S> {
    pub inner: S,
}

impl<S: AsyncRead + AsyncWrite + Unpin> AsyncRead for TracingStream<S> {
    fn poll_read(
        mut self: Pin<&mut Self>,
        cx: &mut Context<'_>,
        buf: &mut ReadBuf<'_>,
    ) -> Poll<Result<(), std::io::Error>> {
        let poll_result = Pin::new(&mut self.inner).poll_read(cx, buf);
        for line in String::from_utf8_lossy(buf.filled()).into_owned().lines() {
            println!("> {}", line);
        }
        poll_result
    }
}

impl<S: AsyncRead + AsyncWrite + Unpin> AsyncWrite for TracingStream<S> {
    fn poll_flush(
        mut self: Pin<&mut Self>,
        cx: &mut Context<'_>,
    ) -> Poll<Result<(), std::io::Error>> {
        Pin::new(&mut self.inner).poll_flush(cx)
    }

    fn poll_shutdown(
        mut self: Pin<&mut Self>,
        cx: &mut Context<'_>,
    ) -> Poll<Result<(), std::io::Error>> {
        Pin::new(&mut self.inner).poll_shutdown(cx)
    }

    fn poll_write(
        mut self: Pin<&mut Self>,
        cx: &mut Context<'_>,
        buf: &[u8],
    ) -> Poll<Result<usize, std::io::Error>> {
        let poll_result = Pin::new(&mut self.inner).poll_write(cx, buf);
        for line in String::from_utf8_lossy(buf).into_owned().lines() {
            println!("< {}", line);
        }
        poll_result
    }
}

Barbara happily copies and pastes the code and wraps the stream in TracingStream. Running it with logs gives the following (as before, it sometimes works and sometimes does not):

 TRACE mio::poll > registering event source with poller: token=Token(0), interests=READABLE | WRITABLE
connected
< GET / HTTP/1.1
< 
> HTTP/1.1 200 OK
> content-type: text/plain; charset=utf-8
> content-length: 12
> date: Tue, 04 May 2021 17:02:49 GMT
> 
> Hello World!
Some(Ok(b"Hello World!"))
sending 2
 TRACE want      > signal: Want
 TRACE want      > signal: Want
 TRACE mio::poll > deregistering event source from poller
 TRACE want      > signal: Closed
closed
Error: hyper::Error(Canceled, "connection was not ready")

Barbara thinks this probably only affects Unix sockets, but no: swapping back to a TCP socket does not help either. Now it is not just Barbara who is confused; even the other developers who offered help are confused.

The single magical line

After some time, a developer finds a solution: just a single line. Barbara adds the line and it works like a charm, but it still feels like magic.


use futures::future;

    // this new line below was added for second request
    future::poll_fn(|cx| request_sender.poll_ready(cx)).await?;
    let request = Request::get("/").body(Body::empty())?;
    println!("sending 2");
    let mut response = request_sender.send_request(request).await?;
    println!("sent 2");
    println!("{:?}", response.body_mut().data().await);

Barbara still has no idea why it needs to be done this way, but at least it works now.
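What the magical line does mechanically can be sketched in isolation: `poll_fn` turns a poll-style method into something that can be awaited, so the task waits until the sender reports readiness before sending. The sketch below uses `std::future::poll_fn` instead of the futures crate version from the story, and a made-up `Sender` type whose `poll_ready` and `try_send` only imitate the shape of the behaviour seen in the logs; it is not hyper's implementation.

```rust
use std::future::{poll_fn, Future};
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Hypothetical sender that needs a few polls before it can accept a request.
struct Sender {
    polls_until_ready: u32,
}

impl Sender {
    // Poll-style readiness check: `Pending` until the sender can accept work.
    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), String>> {
        if self.polls_until_ready == 0 {
            Poll::Ready(Ok(()))
        } else {
            self.polls_until_ready -= 1;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }

    // Fails immediately if the sender is not ready yet.
    fn try_send(&self, request: &str) -> Result<String, String> {
        if self.polls_until_ready == 0 {
            Ok(format!("sent {request}"))
        } else {
            Err(String::from("connection was not ready"))
        }
    }
}

async fn send_without_waiting(sender: Sender) -> Result<String, String> {
    sender.try_send("GET /")
}

async fn send_after_ready(mut sender: Sender) -> Result<String, String> {
    // Wraps the poll-style method in a future and waits for readiness.
    poll_fn(|cx| sender.poll_ready(cx)).await?;
    sender.try_send("GET /")
}

// Stand-in for an executor: polls in a loop until the future is ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

fn main() {
    let hasty = block_on(send_without_waiting(Sender { polls_until_ready: 2 }));
    let patient = block_on(send_after_ready(Sender { polls_until_ready: 2 }));
    println!("{hasty:?}");
    println!("{patient:?}");
}
```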

🤔 Frequently Asked Questions

What are the morals of the story?

Barbara is not able to see the problem right away. Usually the issue is a missing await on the value at hand, but in this case, not awaiting a different value, or not polling for readiness when using a low-level library, may make the program incorrect. It is also hard to debug and to figure out what the correct solution is.

In a way, some of the fixes "feel like magic". Sometimes polling is required, but where? This may make people afraid of using async/.await and lead them to write safety-net code (much like writing type checks on every line of code in a weakly typed language, just to be safe).

With these pitfalls in mind, one can easily relate this back to unsafe. If there are unsafe blocks, the user needs to manually audit each of those code blocks for undefined behavior. In the case of async, the situation is somewhat similar: the user needs to audit all of the async code (which is a lot compared to unsafe) for "undefined behavior", rather than getting the usual "if it compiles, it works" experience.

What are the sources for this story?

pickfire was experimenting with an HTTP client over a Unix socket and ran into this issue after assuming it would be easy. Many thanks to Programatik for helping out with a quick and helpful response.

Why did you choose Barbara to tell this story?

Barbara has some experience with synchronous and high-level asynchronous rust libraries but not with low-level asynchronous libraries.

How would this story have played out differently for the other characters?

Most likely anyone would have faced the same issue unless they were lucky.

😱 Status quo stories: Barbara trims a stacktrace

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect peoples' experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

Barbara is triaging the reported bugs for her SLOW library. For each bug, she tries to quickly see if she can diagnose the basic area of code that is affected so she knows which people to ping to help fix it. She opens a bug report from a user complaining about a panic when too many connections arrive at the same time. The bug report includes a backtrace from the user's code, and it looks like this:

thread 'main' panicked at 'something bad happened here', src/main.rs:16:5
stack backtrace:
   0: std::panicking::begin_panic
             at /home/serg/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:519:12
   1: slow_rs::process_one::{{closure}}
             at ./src/main.rs:16:5
   2: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /home/serg/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/future/mod.rs:80:19
   3: slow_rs::process_many::{{closure}}
             at ./src/main.rs:10:5
   4: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /home/serg/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/future/mod.rs:80:19
   5: slow_rs::main::{{closure}}::{{closure}}
             at ./src/main.rs:4:9
   6: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /home/serg/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/future/mod.rs:80:19
   7: slow_rs::main::{{closure}}
             at ./src/main.rs:3:5
   8: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /home/serg/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/future/mod.rs:80:19
   9: tokio::park::thread::CachedParkThread::block_on::{{closure}}
             at /home/serg/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.3.0/src/park/thread.rs:263:54
  10: tokio::coop::with_budget::{{closure}}
             at /home/serg/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.3.0/src/coop.rs:106:9
  11: std::thread::local::LocalKey<T>::try_with
             at /home/serg/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/local.rs:272:16
  12: std::thread::local::LocalKey<T>::with
             at /home/serg/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/local.rs:248:9
  13: tokio::coop::with_budget
             at /home/serg/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.3.0/src/coop.rs:99:5
  14: tokio::coop::budget
             at /home/serg/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.3.0/src/coop.rs:76:5
  15: tokio::park::thread::CachedParkThread::block_on
             at /home/serg/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.3.0/src/park/thread.rs:263:31
  16: tokio::runtime::enter::Enter::block_on
             at /home/serg/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.3.0/src/runtime/enter.rs:151:13
  17: tokio::runtime::thread_pool::ThreadPool::block_on
             at /home/serg/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.3.0/src/runtime/thread_pool/mod.rs:71:9
  18: tokio::runtime::Runtime::block_on
             at /home/serg/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.3.0/src/runtime/mod.rs:452:43
  19: slow_rs::main
             at ./src/main.rs:1:1
  20: core::ops::function::FnOnce::call_once
             at /home/serg/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:227:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Barbara finds the text overwhelming. She can't just browse it to figure out what code is affected. Instead, she pops up a new tab with gist.github.com, copies the text into that handy text box, and starts deleting stuff. To start, she deletes the first few lines until her code appears, then she deletes:

  • the extra lines from calls to poll that are introduced by the async fn machinery;
  • the bits of code that come from tokio that don't affect her;
  • the intermediate wrappers from the standard library pertaining to thread-local variables.

She's a bit confused by the ::{closure} lines on her symbols, but she has learned by now that this is normal for async fn. After some work, she has reduced her stack to this:

thread 'main' panicked at 'something bad happened here', src/main.rs:16:5
stack backtrace:
   1: slow_rs::process_one::{{closure}} at ./src/main.rs:16:5
   3: slow_rs::process_many::{{closure}} at ./src/main.rs:10:5
   5: slow_rs::main::{{closure}}::{{closure}} at ./src/main.rs:4:9
   7: slow_rs::main::{{closure}} at ./src/main.rs:3:5
  13: <tokio stuff> 
  19: slow_rs::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Based on this, she is able to figure out who to ping about the problem. She pastes her reduced stack trace into the issue and pings Alan, who is responsible for that module. Alan thanks her for reducing the stack trace and mentions, "Oh, when I used to work in C#, this is what the stack traces always looked like. I miss those days."

Fin.

🤔 Frequently Asked Questions

Here are some standard FAQ to get you started. Feel free to add more!

What are the morals of the story?

  • Rust stack traces -- but async stack traces in particular -- reveal lots of implementation details to the user:
    • Bits of the runtime and intermediate libraries whose source code is likely not of interest to the user (but it might be);
    • Intermediate frames from the stdlib;
    • ::{closure} symbols on async functions and blocks (even though they don't appear to be closures to the user);
    • calls to poll.
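The trimming Barbara does by hand can be approximated mechanically. The following is a minimal sketch, assuming the short backtrace format shown in the story (a numbered symbol line, optionally followed by an "at ..." location line). The prefix list and the helper names (is_noise, parse_frame, trim_backtrace) are hypothetical, chosen for illustration; a real tool would need a more careful notion of which frames are "not of interest".

```rust
/// Returns true for symbols that come from the standard library, from tokio,
/// or from the `poll` glue that the async fn desugaring introduces.
/// The prefix list is an assumption made for this sketch.
fn is_noise(symbol: &str) -> bool {
    const NOISY_PREFIXES: [&str; 4] = ["std::", "core::", "<core::", "tokio::"];
    NOISY_PREFIXES.iter().any(|prefix| symbol.starts_with(prefix))
}

/// Parses a frame header such as "   3: slow_rs::process_many::{{closure}}"
/// into its frame index and symbol. Other lines yield `None`.
fn parse_frame(line: &str) -> Option<(usize, &str)> {
    let (index, symbol) = line.trim().split_once(": ")?;
    Some((index.parse().ok()?, symbol))
}

/// Keeps only the frames that are not noise, one line per frame.
fn trim_backtrace(backtrace: &str) -> String {
    let mut kept = Vec::new();
    let mut lines = backtrace.lines().peekable();
    while let Some(line) = lines.next() {
        if let Some((index, symbol)) = parse_frame(line) {
            // The location, when present, is on the following "at ..." line.
            let location = lines
                .next_if(|next| next.trim_start().starts_with("at "))
                .map(|next| next.trim())
                .unwrap_or("");
            if !is_noise(symbol) {
                let frame = format!("{index:>4}: {symbol} {location}");
                kept.push(frame.trim_end().to_string());
            }
        }
    }
    kept.join("\n")
}

fn main() {
    // Abbreviated sample in the same shape as the backtrace in the story.
    let report = "\
stack backtrace:
   0: std::panicking::begin_panic
   1: slow_rs::process_one::{{closure}}
             at ./src/main.rs:16:5
   2: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
  13: tokio::coop::with_budget
  19: slow_rs::main
             at ./src/main.rs:1:1";
    println!("{}", trim_backtrace(report));
}
```

A prefix filter like this throws away exactly the frames the first moral says "might" be of interest, which is one reason a fixed filter is only a partial answer.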

What are the sources for this story?

Sergey Galich reported this problem, among many others.

Why did you choose Barbara to tell this story?

She knows about the desugarings that give rise to symbols like ::{closure}, but she still finds them annoying to deal with in practice.

How would this story have played out differently for the other characters?

  • Other characters might have wasted a lot of time trying to read through the stack trace in place before editing it.
  • They might not have known how to trim down the stack trace to something that focused on their code, or it might have taken them much longer to do so.

How does this compare to other languages?

  • Rust's async model does have some advantages, because the complete stack trace is available unless there is an intermediate spawn.
  • Other languages have developed special tools to connect async functions to their callers, however, which gives them a nice experience. For example, Chrome has a UI for enabling stacktraces that cross await points.

Why doesn't Barbara view this in a debugger?

  • Because it came in an issue report (or, frequently, as a crash report or email).
  • But also, that isn't necessarily an improvement! Here is how a backtrace looks in lldb:
* thread #1, name = 'foo', stop reason = breakpoint 1.1
  * frame #0: 0x0000555555583d24 foo`foo::main::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h617d49d0841ffc0d((null)=closure-0 @ 0x00007fffffffae38, (null)=<unavailable>) at main.rs:11:13
    frame #1: 0x0000555555583d09 foo`_$LT$T$u20$as$u20$futures_util..fns..FnOnce1$LT$A$GT$$GT$::call_once::hc559b1f3f708a7b0(self=closure-0 @ 0x00007fffffffae68, arg=<unavailable>) at fns.rs:15:9
    frame #2: 0x000055555557f300 foo`_$LT$futures_util..future..future..map..Map$LT$Fut$C$F$GT$$u20$as$u20$core..future..future..Future$GT$::poll::hebf5b295fcc0837f(self=(pointer = 0x0000555555700e00), cx=0x00007fffffffcf50) at map.rs:57:73
    frame #3: 0x00005555555836ac foo`_$LT$futures_util..future..future..Map$LT$Fut$C$F$GT$$u20$as$u20$core..future..future..Future$GT$::poll::h482f253651b968e6(self=Pin<&mut futures_util::future::future::Map<tokio::time::driver::sleep::Sleep, closure-0>> @ 0x00007fffffffb268, cx=0x00007fffffffcf50) at lib.rs:102:13
    frame #4: 0x000055555557995a foo`_$LT$futures_util..future..future..flatten..Flatten$LT$Fut$C$$LT$Fut$u20$as$u20$core..future..future..Future$GT$..Output$GT$$u20$as$u20$core..future..future..Future$GT$::poll::hd62d2a2417c0f2ea(self=(pointer = 0x0000555555700d80), cx=0x00007fffffffcf50) at flatten.rs:48:36
    frame #5: 0x00005555555834fc foo`_$LT$futures_util..future..future..Then$LT$Fut1$C$Fut2$C$F$GT$$u20$as$u20$core..future..future..Future$GT$::poll::hf60f05f9e9d6f307(self=Pin<&mut futures_util::future::future::Then<tokio::time::driver::sleep::Sleep, core::future::ready::Ready<()>, closure-0>> @ 0x00007fffffffc148, cx=0x00007fffffffcf50) at lib.rs:102:13
    frame #6: 0x000055555558474a foo`_$LT$core..pin..Pin$LT$P$GT$$u20$as$u20$core..future..future..Future$GT$::poll::h4dad267b4f10535d(self=Pin<&mut core::pin::Pin<alloc::boxed::Box<Future, alloc::alloc::Global>>> @ 0x00007fffffffc188, cx=0x00007fffffffcf50) at future.rs:119:9
    frame #7: 0x000055555557a693 foo`_$LT$futures_util..future..maybe_done..MaybeDone$LT$Fut$GT$$u20$as$u20$core..future..future..Future$GT$::poll::hdb6db40c2b3f2f1b(self=(pointer = 0x00005555557011b0), cx=0x00007fffffffcf50) at maybe_done.rs:95:38
    frame #8: 0x0000555555581254 foo`_$LT$futures_util..future..join_all..JoinAll$LT$F$GT$$u20$as$u20$core..future..future..Future$GT$::poll::ha2472a9a54f0e504(self=Pin<&mut futures_util::future::join_all::JoinAll<core::pin::Pin<alloc::boxed::Box<Future, alloc::alloc::Global>>>> @ 0x00007fffffffc388, cx=0x00007fffffffcf50) at join_all.rs:101:16
    frame #9: 0x0000555555584095 foo`foo::main::_$u7b$$u7b$closure$u7d$$u7d$::h6459086fc041943f((null)=ResumeTy @ 0x00007fffffffcc40) at main.rs:17:5
    frame #10: 0x0000555555580eab foo`_$LT$core..future..from_generator..GenFuture$LT$T$GT$$u20$as$u20$core..future..future..Future$GT$::poll::h272e2b5e808264a2(self=Pin<&mut core::future::from_generator::GenFuture<generator-0>> @ 0x00007fffffffccf8, cx=0x00007fffffffcf50) at mod.rs:80:19
    frame #11: 0x00005555555805a0 foo`tokio::park::thread::CachedParkThread::block_on::_$u7b$$u7b$closure$u7d$$u7d$::hbfc61d9f747eef7b at thread.rs:263:54
    frame #12: 0x00005555555795cc foo`tokio::coop::with_budget::_$u7b$$u7b$closure$u7d$$u7d$::ha229cfa0c1a2e13f(cell=0x00007ffff7c06712) at coop.rs:106:9
    frame #13: 0x00005555555773cc foo`std::thread::local::LocalKey$LT$T$GT$::try_with::h9a2f70c5c8e63288(self=0x00005555556e2a48, f=<unavailable>) at local.rs:272:16
    frame #14: 0x0000555555576ead foo`std::thread::local::LocalKey$LT$T$GT$::with::h12eeed0906b94d09(self=0x00005555556e2a48, f=<unavailable>) at local.rs:248:9
    frame #15: 0x000055555557fea6 foo`tokio::park::thread::CachedParkThread::block_on::h33b270af584419f1 [inlined] tokio::coop::with_budget::hcd477734d4970ed5(budget=(__0 = core::option::Option<u8> @ 0x00007fffffffd040), f=closure-0 @ 0x00007fffffffd048) at coop.rs:99:5
    frame #16: 0x000055555557fe73 foo`tokio::park::thread::CachedParkThread::block_on::h33b270af584419f1 [inlined] tokio::coop::budget::h410dced2a7df3ec8(f=closure-0 @ 0x00007fffffffd008) at coop.rs:76
    frame #17: 0x000055555557fe0c foo`tokio::park::thread::CachedParkThread::block_on::h33b270af584419f1(self=0x00007fffffffd078, f=<unavailable>) at thread.rs:263
    frame #18: 0x0000555555578f76 foo`tokio::runtime::enter::Enter::block_on::h4a9c2602e7b82840(self=0x00007fffffffd0f8, f=<unavailable>) at enter.rs:151:13
    frame #19: 0x000055555558482b foo`tokio::runtime::thread_pool::ThreadPool::block_on::h6b211ce19db8989d(self=0x00007fffffffd280, future=(__0 = foo::main::generator-0 @ 0x00007fffffffd200)) at mod.rs:71:9
    frame #20: 0x0000555555583324 foo`tokio::runtime::Runtime::block_on::h5f6badd2dffadf55(self=0x00007fffffffd278, future=(__0 = foo::main::generator-0 @ 0x00007fffffffd968)) at mod.rs:452:43
    frame #21: 0x0000555555579052 foo`foo::main::h3106d444f509ad81 at main.rs:5:1
    frame #22: 0x000055555557b69b foo`core::ops::function::FnOnce::call_once::hba86afc3f8197561((null)=(foo`foo::main::h3106d444f509ad81 at main.rs:6), (null)=<unavailable>) at function.rs:227:5
    frame #23: 0x0000555555580efe foo`std::sys_common::backtrace::__rust_begin_short_backtrace::h856d648367895391(f=(foo`foo::main::h3106d444f509ad81 at main.rs:6)) at backtrace.rs:125:18
    frame #24: 0x00005555555842f1 foo`std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::h24c58cd1e112136f at rt.rs:66:18
    frame #25: 0x0000555555670aca foo`std::rt::lang_start_internal::h965c28c9ce06ee73 [inlined] core::ops::function::impls::_$LT$impl$u20$core..ops..function..FnOnce$LT$A$GT$$u20$for$u20$$RF$F$GT$::call_once::hbcc915e668c7ca11 at function.rs:259:13
    frame #26: 0x0000555555670ac3 foo`std::rt::lang_start_internal::h965c28c9ce06ee73 [inlined] std::panicking::try::do_call::h6b0f430d48122ddf at panicking.rs:379
    frame #27: 0x0000555555670ac3 foo`std::rt::lang_start_internal::h965c28c9ce06ee73 [inlined] std::panicking::try::h6ba420e2e21b5afa at panicking.rs:343
    frame #28: 0x0000555555670ac3 foo`std::rt::lang_start_internal::h965c28c9ce06ee73 [inlined] std::panic::catch_unwind::h8366719d1f615eee at panic.rs:431
    frame #29: 0x0000555555670ac3 foo`std::rt::lang_start_internal::h965c28c9ce06ee73 at rt.rs:51
    frame #30: 0x00005555555842d0 foo`std::rt::lang_start::ha8694bc6fe5182cd(main=(foo`foo::main::h3106d444f509ad81 at main.rs:6), argc=1, argv=0x00007fffffffdc88) at rt.rs:65:5
    frame #31: 0x00005555555790ec foo`main + 28
    frame #32: 0x00007ffff7c2f09b libc.so.6`__libc_start_main(main=(foo`main), argc=1, argv=0x00007fffffffdc88, init=<unavailable>, fini=<unavailable>, rtld_fini=<unavailable>, stack_end=0x00007fffffffdc78) at libc-start.c:308:16

Doesn't Rust have backtrace trimming support?

Yes, this is the reduced backtrace. You don't even want to know what the full one looks like. Don't click it. Don't!

😱 Status quo stories: Barbara wants Async Insights

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect people's experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

Barbara has an initial prototype of a new service she wrote in sync Rust. Since the service is extremely I/O bound and her benchmarks have led her to believe that performance is being left on the table, she decides to port it to async Rust.

She does this by sprinkling async/.await everywhere, picking an executor, and moving dependencies from sync to async.

Once she has the program compiling, she thinks "oh that was easy". She runs it for the first time and surprisingly she finds out that when hitting an endpoint, nothing happens.

Barbara, always prepared, has already added logging to her service and she checks the logs. As she expected, she sees here that the endpoint handler has been invoked but then... nothing. Barbara exclaims, "Oh no! This was not what I was expecting, but let's dig deeper."

She checks the code and sees that the endpoint spawns several tasks, but unfortunately those tasks don't have much logging in them.

Barbara knows that debugging with a traditional debugger is not very fruitful in async Rust. She does a deep dive into the source code and doesn't find anything. Then she adds much more logging, but to her dismay she finds that a particular task seems stuck, but she has no idea why.

She really wishes that there was a way to get more insight into why the task is stuck. These were the thoughts inside her head at that moment:

  • Is it waiting on I/O?
  • Is there a deadlock?
  • Did she miss some sync code that might still be there, messing with the executor?

For the I/O question she knows to use some tools on her operating system (lsof). This reveals some open sockets but she's not sure how to act on this.

She scans the code for any std lib imports that might be blocking, but doesn't find anything.

After hours of crawling through the code, she notices that her task is receiving a message from a bounded async channel. She changes this to be an unbounded channel and then things start working.

She wants to know why the code was not working, but unfortunately she has no way to gain insight into this issue. She fears that her task might use too much memory knowing that the channel is unbounded, but she can't really tell.
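The story does not say why the bounded channel stalled the task, so the following is only a sketch of the general trade-off, not a diagnosis. It uses the standard library's synchronous channels as a stand-in for async ones (an assumption made to keep the example dependency-free). A bounded channel refuses further messages once it is full, which in blocking or async form means the sender waits, while an unbounded channel accepts everything and lets the queue, and with it memory use, grow. The function names are hypothetical.

```rust
use std::sync::mpsc::{channel, sync_channel, TrySendError};

/// Tries to queue `n` messages into a bounded channel of the given capacity
/// while nobody is receiving; returns how many messages were accepted.
fn accepted_by_bounded(capacity: usize, n: usize) -> usize {
    // `_rx` is kept alive so the channel is full rather than disconnected.
    let (tx, _rx) = sync_channel::<usize>(capacity);
    let mut accepted = 0;
    for i in 0..n {
        match tx.try_send(i) {
            Ok(()) => accepted += 1,
            // A blocking `send` would wait here until the receiver catches up.
            Err(TrySendError::Full(_)) => break,
            Err(TrySendError::Disconnected(_)) => break,
        }
    }
    accepted
}

/// Queues `n` messages into an unbounded channel while nobody is receiving;
/// returns how many messages were sitting in the queue afterwards.
fn queued_in_unbounded(n: usize) -> usize {
    let (tx, rx) = channel::<usize>();
    for i in 0..n {
        // Never waits: every message is buffered in memory.
        tx.send(i).expect("receiver is still alive");
    }
    drop(tx); // close the channel so the iterator below terminates
    rx.iter().count()
}

fn main() {
    println!("bounded accepted: {}", accepted_by_bounded(2, 5));
    println!("unbounded queued: {}", queued_in_unbounded(1000));
}
```

Neither number tells Barbara which side of her channel stopped making progress, which is the kind of insight the story says is missing.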

She thinks, "Anyhow, it is working now, let's see if we got some performance gains." After thorough benchmarking she finds out that she didn't quite get the performance gain she was expecting. "Something is not working as intended," she thinks.

🤔 Frequently Asked Questions

What are the morals of the story?

  • There are very few ways to get insights into running systems. Tracing is state of the art. console.log #ftw
  • Tracing is a static activity and there's no way to dynamically gain insights.
  • While it's possible to find solutions to these issues, often you have no insight into whether those solutions bring new problems.
  • The debugging process for non-trivial issues is almost guaranteed to be painful and expensive.

What are the sources for this story?

Issue 75

What are examples of the kinds of things a user might want to have insight into?

  • Custom Events - logging/tracing (Per task?)
  • Memory consumption per task.
  • I/O handles in waiting state per task.
  • Number of tasks and their states over time.
  • Wake and drop specific tasks.
  • Denoised stack traces and/or stack traces that are task aware.
  • Who spawned the task?
  • Worker threads that are blocked from progressing tasks forward.
  • Tasks that are not progressing.

Why did you choose Barbara to tell this story?

Barbara knows what she's doing, but still there is little way to get insights.

How would this story have played out differently for the other characters?

Depending on what languages he was using before, Alan would likely have had experience with a stronger tooling story.

Barbara wants to use GhostCell-like cell borrowing with futures

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect people's experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

Barbara quite likes using statically-checked cell borrowing. "Cell" in Rust terminology refers to types like Cell or RefCell that enable interior mutability, i.e. modifying or mutably borrowing stuff even if you've only got an immutable reference to it. Statically-checked cell borrowing is a technique whereby one object (an "owner") acts as a gatekeeper for borrow-access to a set of other objects ("cells"). So if you have mutable borrow access to the owner, you can temporarily transfer that mutable borrow access to a cell in order to modify it. This is all checked at compile-time, hence "statically-checked".

In comparison, RefCell does borrow-checking, but it is checked at runtime and it will panic if you make a coding mistake. The advantage of statically-checked borrowing is that it cannot panic at runtime, i.e. all your borrowing bugs show up at compile time. The history goes way back, and the technique has been reinvented at least 2-3 times as far as Barbara is aware. This is implemented in various forms in GhostCell and qcell.
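To make the runtime nature of the RefCell check concrete, here is a minimal sketch using only the standard library. It uses try_borrow_mut, which reports the conflict as an error in exactly the situation where borrow_mut would panic. The function names are hypothetical.

```rust
use std::cell::RefCell;

/// Attempts a second mutable borrow while the first is still alive.
/// Returns true if the second attempt was rejected.
fn second_borrow_fails() -> bool {
    let cell = RefCell::new(0_u32);
    let first = cell.borrow_mut();
    // `cell.borrow_mut()` would panic at this point. This code compiles
    // without complaint; the conflict is only detected when it runs.
    let rejected = cell.try_borrow_mut().is_err();
    drop(first);
    rejected
}

/// Releases the first borrow before taking the second one.
/// Returns true if the second attempt succeeded.
fn borrow_after_release_succeeds() -> bool {
    let cell = RefCell::new(0_u32);
    {
        let mut first = cell.borrow_mut();
        *first += 1;
    } // `first` is dropped here, ending the borrow
    let succeeded = cell.try_borrow_mut().is_ok();
    succeeded
}

fn main() {
    println!("second borrow rejected: {}", second_borrow_fails());
    println!("borrow after release ok: {}", borrow_after_release_succeeds());
}
```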

Barbara would like to use statically-checked cell borrowing within futures, but there is no way to get the owner borrow through the Future::poll call, i.e. there is no argument or object that the runtime could save the borrow in. Mostly this does not cause a problem, because there are other ways for a runtime to share data, e.g. data can be incorporated into the future when it is created. However in this specific case, for the specific technique of statically-checked cell borrows, we need an active borrow to the owner to be passed down the call stack through all the poll calls.

So Barbara is forced to use RefCell instead and be very careful not to cause panics. This seems like a step back. It feels dangerous to use RefCell and to have to manually verify that her cell borrows are panic-free.

There are good habits that you can adopt to offset the dangers, of course. If you are very careful to make sure that you call no other method or function which might in turn call code which might attempt to get another borrow on the same cell, then the RefCell::borrow_mut panics can be avoided. However this is easy to overlook, and it is easy to fail to anticipate what indirect calls will be made by a given call, and of course this may change later on due to maintenance and new features. A borrow may stay active longer than expected, so calls which appear safe might actually panic. Sometimes it's necessary to manually drop the borrow to be sure. In addition you'll never know what indirect calls might be made until all the possible code-paths have been explored, either through testing or through running in production.

So Barbara prefers to avoid all these problems, and use statically-checked cell borrowing where possible.

Example 1: Accessing an object shared outside the runtime

In this minimized example of code to interface a stream to code outside of the async/await system, the buffer has to be accessible from both the stream and the outside code, so it is handled as a Rc<RefCell<StreamBuffer<T>>>.


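use std::cell::RefCell;
use std::pin::Pin;
use std::rc::Rc;
use std::task::{Context, Poll};

// Assumption: the `Stream` trait comes from the futures-core crate.
use futures_core::Stream;

// Assumed shape of the shared buffer; the minimized example omits it.
pub struct StreamBuffer<T> {
    value: Option<T>, // next item, if one is ready
    end: bool,        // set once the producer has finished
}

pub struct StreamPipe<T> {
    buf: Rc<RefCell<StreamBuffer<T>>>,
    req_more: Rc<dyn Fn()>,
}

impl<T> Stream for StreamPipe<T> {
    type Item = T;

    fn poll_next(self: Pin<&mut Self>, _: &mut Context<'_>) -> Poll<Option<T>> {
        // This mutable borrow stays alive until the function returns.
        let mut buf = self.buf.borrow_mut();
        if let Some(item) = buf.value.take() {
            return Poll::Ready(Some(item));
        }
        if buf.end {
            return Poll::Ready(None);
        }
        (self.req_more)();  // Callback to request more data
        Poll::Pending
    }
}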

Probably req_more() has to schedule some background operation, but if it doesn't and attempts to modify the shared buf immediately then we get a panic, because buf is still borrowed. The real life code could be a lot more complicated, and the required combination of conditions might be harder to hit in testing.
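The hazard can be reproduced without any async machinery. The following is a simplified, std-only analogue of the example above (not the original code): poll_like borrows the buffer and then invokes a callback, and the callback tries to refill the buffer immediately. It uses try_borrow_mut so that the conflict is recorded instead of panicking. All names here are hypothetical, and the release_first flag exists only to contrast the two orderings.

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

struct Buffer {
    value: Option<u32>,
}

/// Mimics the shape of `poll_next`: borrow the buffer, then call back.
/// With `release_first` set, the borrow is dropped before the callback runs.
fn poll_like(buf: &Rc<RefCell<Buffer>>, req_more: &dyn Fn(), release_first: bool) -> Option<u32> {
    let mut guard = buf.borrow_mut();
    if let Some(item) = guard.value.take() {
        return Some(item);
    }
    if release_first {
        drop(guard);
        req_more();
    } else {
        req_more(); // `guard` is still alive while the callback runs
    }
    None
}

/// Runs `poll_like` with a callback that refills the buffer immediately.
/// Returns true if the callback found the buffer already borrowed.
fn callback_conflicts(release_first: bool) -> bool {
    let buf = Rc::new(RefCell::new(Buffer { value: None }));
    let conflict = Rc::new(Cell::new(false));
    let (buf_cb, conflict_cb) = (Rc::clone(&buf), Rc::clone(&conflict));
    let req_more = move || {
        // `borrow_mut` would panic where `try_borrow_mut` returns `Err`.
        match buf_cb.try_borrow_mut() {
            Ok(mut b) => b.value = Some(1),
            Err(_) => conflict_cb.set(true),
        };
    };
    let _ = poll_like(&buf, &req_more, release_first);
    conflict.get()
}

fn main() {
    println!("borrow held during callback, conflict: {}", callback_conflicts(false));
    println!("borrow released first, conflict: {}", callback_conflicts(true));
}
```

Nothing in the signature of the callback hints at which ordering is required, which is why the mistake is easy to make and hard to catch in review.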

With statically-checked borrowing, the borrow would be something like let mut buf = self.buf.rw(cx);, and the req_more call would either have to take the cx as an argument (forcing the previous borrow to end) or would not take cx, meaning that it would always have to defer the access to the buffer to other code, because without the cx there is no possible way to access the buffer.

Example 2: Shared monitoring data

In this example, the app keeps tallies of various things in a Monitor structure. This might be data in/out, number of errors detected, maybe a hashmap of current links, etc. Since it is accessed from various components, it is kept behind an Rc<RefCell<_>>.

// Dependency: futures-lite = "1.11.3"
use std::cell::RefCell;
use std::rc::Rc;

fn main() {
    let monitor0 = Rc::new(RefCell::new(Monitor { count: 0 }));
    let monitor1 = monitor0.clone();

    let fut0 = async move {
        let mut borrow = monitor0.borrow_mut();
        borrow.count += 1;
    };

    let fut1 = async move {
        let mut borrow = monitor1.borrow_mut();
        borrow.count += 1;
        fut0.await;
    };

    futures_lite::future::block_on(fut1);
}

struct Monitor {
    count: usize,
}

The problem is that this panics with a borrowing error because the borrow is still active when the fut0.await executes and attempts another borrow. The solution is to remember to drop the borrow before awaiting.
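A sketch of that fix follows. To stay dependency-free it replaces futures_lite::future::block_on with a small busy-polling block_on written for this example (an assumption, adequate only because these futures complete on the first poll). The only change to the story's code is the extra scope around the borrow in fut1, plus a third Rc clone so the result can be read back.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct Monitor {
    count: usize,
}

/// Waker that does nothing; sufficient for futures that never wait.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal stand-in for an executor's `block_on`: polls in a loop.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// Same structure as the example above, with the borrow scoped.
fn run_fixed() -> usize {
    let monitor0 = Rc::new(RefCell::new(Monitor { count: 0 }));
    let monitor1 = monitor0.clone();
    let monitor2 = monitor0.clone();

    let fut0 = async move {
        let mut borrow = monitor0.borrow_mut();
        borrow.count += 1;
    };

    let fut1 = async move {
        {
            let mut borrow = monitor1.borrow_mut();
            borrow.count += 1;
        } // borrow is dropped here, before the await
        fut0.await;
    };

    block_on(fut1);
    let count = monitor2.borrow().count;
    count
}

fn main() {
    println!("count = {}", run_fixed());
}
```

The compiler accepts both the scoped and the unscoped version, so the fix relies entirely on the programmer remembering it.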

In this example code the bug is obvious, but in real life maybe fut0 only borrows in rare situations, e.g. when an error is detected. Or maybe the future that borrows is several calls away down the callstack.

With statically-checked borrowing, there is a slight problem in that currently there is no way to access the poll context from async {} code. But if there were, the borrow would be something like let mut borrow = monitor1.rw(cx);, and since the fut0.await implicitly requires the cx in order to poll, the borrow would be forced to end at that point.

Further investigation by Barbara

The mechanism

Barbara understands that statically-checked cell borrows work by having an owner held by the runtime, and various instances of a cell held by things running on top of the runtime (these cells would typically be behind Rc references). A mutable borrow on the owner is passed down the stack, which enables safe borrows on all the cells, since a mutable borrow on a cell is enabled by temporarily holding onto the mutable borrow of the owner, which is all checked at compile-time.

So the mutable owner borrow needs to be passed through the poll call, and Barbara realizes that this would require support from the standard library.
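The owner/cell mechanism can be illustrated with a small hand-rolled sketch. This is not how GhostCell or qcell are implemented; it is a simplified design written for this example, in which each cell remembers the id of its owner and the id is checked at runtime when the cell is accessed. The part that matters for the story is the signature of rw: because it takes &mut Owner for the lifetime of the returned reference, the compiler rejects any second borrow through the same owner while the first is alive. Owner, OwnedCell, and NEXT_OWNER_ID are hypothetical names.

```rust
use std::cell::UnsafeCell;
use std::rc::Rc;
use std::sync::atomic::{AtomicU64, Ordering};

// Source of unique owner ids (hypothetical helper for this sketch).
static NEXT_OWNER_ID: AtomicU64 = AtomicU64::new(0);

/// The gatekeeper: holding `&mut Owner` grants mutable access to one cell.
pub struct Owner {
    id: u64,
}

/// A cell whose contents can only be reached through its owner.
pub struct OwnedCell<T> {
    owner_id: u64,
    value: UnsafeCell<T>,
}

impl Owner {
    pub fn new() -> Self {
        Owner { id: NEXT_OWNER_ID.fetch_add(1, Ordering::Relaxed) }
    }

    /// Creates a cell tied to this owner.
    pub fn cell<T>(&self, value: T) -> OwnedCell<T> {
        OwnedCell { owner_id: self.id, value: UnsafeCell::new(value) }
    }

    /// Read-only access; many of these may coexist.
    pub fn ro<'a, T>(&'a self, cell: &'a OwnedCell<T>) -> &'a T {
        assert_eq!(self.id, cell.owner_id, "cell belongs to a different owner");
        // SAFETY: owner ids are unique, so only this owner can reach the
        // cell, and the shared borrow of the owner excludes any live `rw`.
        unsafe { &*cell.value.get() }
    }

    /// Read-write access; the exclusive borrow of the owner lasts as long
    /// as the returned reference, so overlapping borrows do not compile.
    pub fn rw<'a, T>(&'a mut self, cell: &'a OwnedCell<T>) -> &'a mut T {
        assert_eq!(self.id, cell.owner_id, "cell belongs to a different owner");
        // SAFETY: as above, and `&mut self` excludes every other `ro`/`rw`.
        unsafe { &mut *cell.value.get() }
    }
}

fn main() {
    let mut owner = Owner::new();
    let count = Rc::new(owner.cell(0_usize));
    let alias = Rc::clone(&count);

    *owner.rw(&count) += 1;
    *owner.rw(&alias) += 1; // fine: the previous borrow has already ended
    println!("count = {}", owner.ro(&count));
}
```

In a runtime, the &mut Owner is what would have to travel down the call stack through every poll call, which is why the question of where Context could carry it comes up next.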

Right now a &mut Context<'_> is passed to poll, and so within Context would be the ideal place to hold a borrow on the cell owner. However as far as Barbara can see there are difficulties with all the current implementations:

  • GhostCell (or qcell::LCell) may be the best available solution, because it doesn't have any restrictions on how many runtimes might be running or how they might be nested. But Rust insists that the lifetimes <'id> on methods and types are explicit, so it seems like that would force a change to the signature of poll, which would break the ecosystem.

    Here Barbara experiments with a working example of a modified Future trait and a future implementation that makes use of LCell:

// Requires dependency: qcell = "0.4"
use qcell::{LCell, LCellOwner};
use std::pin::Pin;
use std::rc::Rc;
use std::task::Poll;

struct Context<'id, 'a> {
    cell_owner: &'a mut LCellOwner<'id>,
}

struct AsyncCell<'id, T>(LCell<'id, T>);
impl<'id, T> AsyncCell<'id, T> {
    pub fn new(value: T) -> Self {
        Self(LCell::new(value))
    }
    pub fn rw<'a, 'b: 'a>(&'a self, cx: &'a mut Context<'id, 'b>) -> &'a mut T {
        cx.cell_owner.rw(&self.0)
    }
}

trait Future<'id> {
    type Output;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'id, '_>) -> Poll<Self::Output>;
}

struct MyFuture<'id> {
    count: Rc<AsyncCell<'id, usize>>,
}
impl<'id> Future<'id> for MyFuture<'id> {
    type Output = ();
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'id, '_>) -> Poll<Self::Output> {
        *self.count.rw(cx) += 1;
        Poll::Ready(())
    }
}

fn main() {
    LCellOwner::scope(|mut owner| {
        let mut cx = Context { cell_owner: &mut owner };
        let count = Rc::new(AsyncCell::new(0_usize));
        let mut fut = Box::pin(MyFuture { count: count.clone() });
        let _ = fut.as_mut().poll(&mut cx);
        assert_eq!(1, *count.rw(&mut cx));
    });
}

  • The other qcell types (QCell, TCell and TLCell) have various restrictions or overheads which might make them unsuitable as a general-purpose solution in the standard library. However they do have the positive feature of not requiring any change in the signature of poll. It looks like they could be added to Context without breaking anything.

    Here Barbara tries using TLCell, and finds that the signature of poll doesn't need to change:

// Requires dependency: qcell = "0.4"
use qcell::{TLCell, TLCellOwner};
use std::pin::Pin;
use std::rc::Rc;
use std::task::Poll;

struct AsyncMarker;
struct Context<'a> {
    cell_owner: &'a mut TLCellOwner<AsyncMarker>,
}

struct AsyncCell<T>(TLCell<AsyncMarker, T>);
impl<T> AsyncCell<T> {
    pub fn new(value: T) -> Self {
        Self(TLCell::new(value))
    }
    pub fn rw<'a, 'b: 'a>(&'a self, cx: &'a mut Context<'b>) -> &'a mut T {
        cx.cell_owner.rw(&self.0)
    }
}

trait Future {
    type Output;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>;
}

struct MyFuture {
    count: Rc<AsyncCell<usize>>,
}
impl Future for MyFuture {
    type Output = ();
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        *self.count.rw(cx) += 1;
        Poll::Ready(())
    }
}

fn main() {
    let mut owner = TLCellOwner::new();
    let mut cx = Context { cell_owner: &mut owner };
    let count = Rc::new(AsyncCell::new(0_usize));
    let mut fut = Box::pin(MyFuture { count: count.clone() });
    let _ = fut.as_mut().poll(&mut cx);
    assert_eq!(1, *count.rw(&mut cx));
}

(For comparison, TCell only allows one owner per marker type in the whole process. QCell allows many owners, but requires a runtime check to make sure you're using the right owner to access a cell. TLCell allows only one owner per thread per marker type, but also lets cells migrate between threads and be borrowed locally, which the others don't -- see qcell docs.)

So the choice is GhostCell/LCell and lifetimes everywhere, or various other cell types that may be too restrictive.

Right now Barbara thinks that none of these solutions is likely to be acceptable for the standard library. However, it is still a desirable feature, so maybe someone can think of a way around the problems. Or maybe someone has a different perspective on what would be acceptable.

Proof of concept

The Stakker runtime makes use of qcell-based statically-checked cell borrowing. It uses this to get zero-cost access to actors, guaranteeing at compile time that no actor can access any other actor's state. It also uses it to allow inter-actor shared state to be accessed safely and zero-cost, without RefCell.

(For example, within a Stakker actor, you can access the contents of a Share<T> via the actor context cx as follows: share.rw(cx), which blocks borrowing or accessing cx until that borrow on share has been released. Share<T> is effectively an Rc<ShareCell<T>> and cx has access to an active borrow on the ShareCellOwner, just as in the long examples above.)

Stakker doesn't use GhostCell (LCell) because of the need for <'id> annotations on methods and types. Instead it uses the other three cell types according to how many Stakker instances will be run, either one Stakker instance only, one per thread, or multiple per thread. This is selected by cargo features.

Switching implementations like this doesn't seem like an option for the standard library.

Way forward

Barbara wonders whether there is any way this can be made to work. For example, could the compiler derive all those <'id> annotations automatically for GhostCell/LCell?

Or for multi-threaded runtimes, would qcell::TLCell be acceptable? This allows a single cell-owner in every thread. So it would not allow nested runtimes of the same type. However it does allow borrows to happen at the same time independently in different threads, and it also allows the migration of cells between threads, which is safe because that kind of cell isn't Sync.

Or is there some other form of cell-borrowing that could be devised that would work better for this?

The interface between cells and Context should be straightforward once a particular cell type is demonstrated to be workable with the poll interface and futures ecosystem. For example copying the API style of Stakker:

let rc = Rc::new(AsyncCell::new(1_u32));
*rc.rw(cx) = 2;

So logically you obtain read-write access to a cell by naming the authority by which you claim access, in this case the poll context. In this case it really is naming rather than accessing since the checks are done at compile time and the address that cx represents doesn't actually get passed anywhere or evaluated, once inlining and optimisation is complete.

🤔 Frequently Asked Questions

What are the morals of the story?

The main problem is that Barbara has got used to a safer environment and it feels dangerous to go back to RefCell and have to manually verify that her cell borrows are panic-free.

What are the sources for this story?

The author of Stakker is trying to interface it to async/await and futures.

Why did you choose Barbara to tell this story?

Barbara has enough Rust knowledge to understand the benefits that GhostCell/qcell-like borrowing might bring.

How would this story have played out differently for the other characters?

The other characters perhaps wouldn't have heard of statically-checked cell borrows so would be unaware of the possibility of making things safer.

😱 Status quo stories: Barbara wishes for an easy runtime switch

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

The story

Barbara has been working on an async codebase for the past 5 years. It is extremely mature and quite large (in the millions of lines of code). They've been relying on tokio as their async runtime and the codebase makes heavy use of its rich API. It has served them well over the years and they're very happy with it.

Barbara knows about async-std but has never used it. She has wondered for a while how her application would work and perform if she had used async-std instead. She decides to test it out by porting her projects from tokio to async-std.

To their disappointment, they discover many areas where their choice of runtime permeates the code base:

  • tokio provides its own variants of helper macros and types, like tokio::select! and tokio::sync::Mutex. These helpers can be used without the rest of tokio, and there are also alternatives from the futures crate and elsewhere (albeit with subtle differences).
  • tokio uses a custom version of AsyncRead and AsyncWrite traits which differ from the ones used by other parts of the ecosystem.
  • The tokio API is needed to create core runtime operations like timers (tokio::time::sleep) and to launch tasks; there doesn't seem to be a standard way to abstract over those kinds of things in a runtime-independent way.
  • Some of their dependencies (e.g., hyper and reqwest) are tied to tokio. In some cases, there are configuration options or ways to use those dependencies that don't depend on tokio, but there is no standard mechanism for that.

These things aren't specific to tokio. There just doesn't seem to be a lot of consensus in the ecosystem on how to write "runtime-independent" code and in some cases there aren't any great options available (e.g., spawning tasks).

They investigate providing some sort of compatibility layer between tokio and their new runtime of choice, but this does not seem like the right way to go, as the compatibility layer would add too much overhead.

Realizing that porting the entire code base to async-std will take a lot of effort and time, Barbara decides to give up. She is very disappointed.

🤔 Frequently Asked Questions

What are the morals of the story?

  • Using a certain executor often means using a certain run-time ecosystem. This often locks the user into that ecosystem.
  • Tying yourself to a certain executor means that you are tied to the priorities of that executor. You may be happy with the run-time ecosystem, but have special needs that the default executor does not provide. If the executor doesn't have an extensibility model, you're stuck. Note: It is perfectly reasonable for a general purpose executor to not be able or willing to cater for specialized needs.
  • All of this is made worse by the fact that run-time agnostic libraries are difficult and sometimes even impossible to write.

What are the sources for this story?

This story is more of a thought experiment than a recounting of a true story. We simply asked what would logically happen if a team working on a code base that assumed a specific runtime decided to use a different one.

Why did you choose Barbara to tell this story?

The story assumes a Rust programmer that has worked for several years on a large and complex Rust codebase, so Barbara is the natural choice here.

How would this story have played out differently for the other characters?

It wouldn't. If this story happens to them, they're on the same level of Rust expertise as Barbara is.

😱 Status quo stories: Barbara writes a runtime-agnostic library

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect people's experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

Barbara and Alan work at AmoolgeSoft, where many teams are switching from Java to Rust. These teams have many different use cases and various adoption stories. Some teams are happy users of tokio, others happy users of async-std, and others still are using custom runtimes for highly specialized use cases.

Barbara is tasked with writing a library for a custom protocol, SLOW (only in use at AmoolgeSoft), and enlists the help of Alan in doing so. Alan is already aware that not all libraries in Rust work with all runtimes. Alan and Barbara start by writing a parser which works on std::io::Read and get their tests working with Strings. After this they contemplate the question of how to accept a TCP connection.

Incompatible AsyncRead traits

Alan asks Barbara what the async equivalent of std::io::Read is, and Barbara sighs and says that there isn't one. Barbara brings up tokio's and the futures crate's versions of AsyncRead. Barbara decides not to talk about AsyncBufRead for now.

Barbara and Alan decide to use the futures crate's AsyncRead for no reason other than that it is runtime-agnostic. Barbara tells Alan not to worry, as they can translate between the two. With some effort they convert their parser to use AsyncRead.

Alan, excited about the progress they've made, starts working on hooking this up to actual TCP streams. Alan looks at async-std and tokio and notices their interfaces for TCP are quite different. Alan waits for Barbara to save the day.

Barbara helps abstract over TCP listener and TCP stream (TODO: code example). One big hurdle is that tokio uses AsyncRead from its own crate and not the one from the futures crate.

Task spawning

After getting the TCP handling part working, they now want to spawn tasks for handling each incoming TCP connection. Again, to their disappointment, they find that there's no runtime-agnostic way to do that.

Unsure how to do this, they do some searching and find the agnostik crate. They reject it because it only supports a fixed set of runtimes and their custom runtime is not one of them. However, it gives them the idea to provide a trait for specifying how to spawn tasks on the runtime. Barbara points out that this has the disadvantage of working against the orphan rules, meaning that either they have to implement the trait for all known runtimes (defeating the purpose of the exercise) or force the user to use newtypes.

They punt on this question by implementing the trait for each of the known runtimes. They're disappointed that this means their library actually isn't runtime agnostic.
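A rough, std-only sketch of what such a spawning trait might look like follows. The names Spawner, BoxFuture, ThreadPerTask and handle_connections are hypothetical, and ThreadPerTask is a toy stand-in for a real runtime (it drives each task on its own OS thread), not how tokio or async-std spawn tasks. With a real runtime, the orphan rules mean the user would typically implement Spawner for their own newtype wrapping the runtime's handle.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::mpsc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread;

/// Type-erased task, so the trait stays object-safe and runtime-neutral.
pub type BoxFuture = Pin<Box<dyn Future<Output = ()> + Send + 'static>>;

/// Hypothetical trait the library asks each runtime (or its user) to implement.
pub trait Spawner {
    fn spawn(&self, task: BoxFuture);
}

/// Waker that unparks the thread driving the task.
struct ThreadWaker(thread::Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Toy "runtime" for illustration only: one OS thread per task.
pub struct ThreadPerTask;

impl Spawner for ThreadPerTask {
    fn spawn(&self, mut task: BoxFuture) {
        thread::spawn(move || {
            let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
            let mut cx = Context::from_waker(&waker);
            // Poll until done, sleeping between wake-ups.
            loop {
                match task.as_mut().poll(&mut cx) {
                    Poll::Ready(()) => break,
                    Poll::Pending => thread::park(),
                }
            }
        });
    }
}

/// Library code: generic over the spawner, unaware of the concrete runtime.
/// Each "connection" is just an id here; the task doubles it and reports back.
pub fn handle_connections<S: Spawner>(spawner: &S, ids: Vec<u32>) -> Vec<u32> {
    let (tx, rx) = mpsc::channel();
    for id in ids {
        let tx = tx.clone();
        spawner.spawn(Box::pin(async move {
            let _ = tx.send(id * 2);
        }));
    }
    drop(tx); // the receiver stops once every task has dropped its sender
    let mut results: Vec<u32> = rx.iter().collect();
    results.sort();
    results
}
```

Boxing every task costs an allocation per spawn, which is one example of the performance trade-offs the FAQ below mentions.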

The need for timers

To complicate things further, they also need a timer API. They could abstract over runtime-specific timer APIs in the trait they already use for spawning, but they find a runtime-agnostic library instead. It works but is pretty heavy in that it spawns an OS thread (from a pool) every time they want to sleep. They become sadder.
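To illustrate why a runtime-agnostic timer tends to be heavy, here is a minimal std-only sketch of a sleep future that does not rely on any runtime's timer wheel. ThreadSleep and block_on are hypothetical names, and as a simplification this version starts a fresh OS thread per sleep rather than borrowing one from a pool as the library in the story does.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread;
use std::time::Duration;

/// State shared between the future and its helper thread.
struct Shared {
    done: bool,
    waker: Option<Waker>,
}

/// Hypothetical runtime-agnostic sleep: works on any executor, but costs a thread.
pub struct ThreadSleep {
    duration: Duration,
    shared: Option<Arc<Mutex<Shared>>>, // None until the first poll
}

impl ThreadSleep {
    pub fn new(duration: Duration) -> Self {
        ThreadSleep { duration, shared: None }
    }
}

impl Future for ThreadSleep {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if let Some(shared) = &self.shared {
            let mut guard = shared.lock().unwrap();
            if guard.done {
                return Poll::Ready(());
            }
            // Keep the most recent waker in case the task moved executors.
            guard.waker = Some(cx.waker().clone());
            return Poll::Pending;
        }
        // First poll: register the waker, then start the helper thread.
        let shared = Arc::new(Mutex::new(Shared {
            done: false,
            waker: Some(cx.waker().clone()),
        }));
        let for_thread = Arc::clone(&shared);
        let duration = self.duration;
        thread::spawn(move || {
            thread::sleep(duration);
            let mut guard = for_thread.lock().unwrap();
            guard.done = true;
            if let Some(waker) = guard.waker.take() {
                waker.wake();
            }
        });
        self.shared = Some(shared);
        Poll::Pending
    }
}

/// Waker that unparks the thread blocked in `block_on`.
struct ThreadWaker(thread::Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Tiny executor, included only so the sketch can be exercised without a runtime.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
        thread::park();
    }
}
```

A runtime's built-in timer can instead share one timer heap among all tasks, which is the kind of deep integration a runtime-agnostic crate cannot reach.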

Channels

They need channels as well. After long searches and discussions on help channels, they learn of a few runtime-agnostic implementations: async-channel, futures-channel, and trimmed-down (through feature flags) async-std/tokio. They pick one and it seems to work well. They become less sad.

First release

They get things working but it was a difficult journey to get to the first release. Some of their users find the APIs harder to use than their runtime-specific libs.

🤔 Frequently Asked Questions

Here are some standard FAQ to get you started. Feel free to add more!

Why did you choose Barbara to tell this story?

Barbara has years of Rust experience that she brings to bear in her async learning experiences.

What are the morals of the story?

  • People have to roll their own implementations, which can lead to subtle differences between runtimes (for example, the TCP listeners in async-std and tokio).
  • Orphan rules and no standard traits guarantee that a truly agnostic library is not possible.
  • Takes way more time than writing synchronous protocols.
  • It's a hard goal to achieve.
  • Leads to poorer APIs sometimes (both in ease of use and performance).
  • More API design considerations need to go into making a generic async library than a generic sync library.

What are the sources for this story?

Personal experiences of the author from adding an async API to the zbus crate, except for AsyncRead, which is based on common knowledge in the async Rust community.

How would this story have played out differently for the other characters?

Alan, Grace, and Niklaus would be overwhelmed and would likely want to give up.

TODO:

What are the downsides of using runtime-agnostic crates?

Some things can be implemented very efficiently in a runtime-agnostic way but even then you can't integrate deeply into the runtime. For example, see tokio's preemption strategy, which relies on deep integration with the runtime.

What other runtime utilities are generally needed?

😱 Status quo stories: Grace debugs a crash dump

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect people's experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

Grace is an engineer working on a hosted DistriData service, similar to Azure Cosmos DB or Amazon DynamoDB. Sometimes one of the DistriData nodes panics. There is a monitor system that catches these panics, saves a crash dump, and restarts the service. The crash dumps can be analyzed after the fact to try to debug the issue.

After a recent version push, there has been an increase in the number of panics. This represents a threat to the service's overall reliability, so Grace has been tasked to investigate. Grace is known as one of the team's best debuggers, with years of experience diagnosing tricky issues from crash dumps. With C and C++ code, Grace can see raw hex dumps and decode the underlying data structures in her head.

Despite this, Grace is relatively new to Rust and is still developing this intuition for Rust code. She hopes her debugger will help her get started. What executors are running? What tasks are running? What state were they in?

She starts by looking at a backtrace:

[dbg] bt
  0 0x407e5a7cae11 • syscalls.inc:675
  1 _zx_port_wait(…) • syscalls.inc:675
      handle = 1569127495
      deadline = 9223372036854775807
      packet = (*)0x1aea3201dc8

  2 distridata_zircon::port::Port::wait(…) • src/port.rs:323
      self = (*)0x3f0f481a580 ➔ Port(Handle(…))
      deadline = Time(9223372036854775807)

  3 λ(…) • default/../../src/lib/distridata-async/src/executor.rs:397
      timer_heap = (*)0x2116e3c3a00 ➔ BinaryHeap<distridata_async::executor::…>[]

  4 λ(…) • default/../../src/lib/distridata-async/src/executor.rs:316
      e = (*)0x2116e3c39f0 ➔ RefCell<core::option::Option<(all…>{borrow: Cell<isize>{…}, value: UnsafeCell<core::option::Option<(all…>{…}}

  5 std::thread::local::LocalKey<…>::try_with<…>(…) • thread/local.rs:262
      self = (*)0x3816da0c9b0 ➔ LocalKey<core::cell::RefCell<core:…>{inner: &distridata_async::executor::EXECUTOR::__getit}
      f = $(closure-0)($(closure-0)((*)0x1aea32022a0))

  6 std::thread::local::LocalKey<…>::with<…>(…) + 0x27 (no line info)
      self = (*)0x3816da0c9b0 ➔ LocalKey<core::cell::RefCell<core:…>{inner: &distridata_async::executor::EXECUTOR::__getit}
      f = $(closure-0)($(closure-0)((*)0x1aea32022a0))

  7 distridata_async::executor::with_local_timer_heap<…>(…) + 0x2a (no line info)
      f = $(closure-0)((*)0x1aea32022a0 ➔ (*)0x1aea3202758)

▶ 8 distridata_async::executor::Executor::run_singlethreaded<…>(…) • default/../../src/lib/distridata-async/src/executor.rs:393
      self = (*)0x1aea3202758 ➔ Executor{inner: (*)0x3f0f481a380, next_packet: …}
      main_future = GenFuture<generator-0>(Unresumed)

  9 distridata_pkg_testing_lib_test::serve::tests::test_serve_empty() • serve.rs:345

  10 λ(…) • serve.rs:345
      (*)0x1aea3202b80 ➔ $(closure-0)

  11 core::ops::function::FnOnce::call_once<…>(…) • function.rs:232
      $(closure-0)
      <Value has no data.>

The backtrace shows a lot of detail about the executor, but none of this is really relevant to Grace's code. She will have to inspect the executor manually in order to find the information she needs. Frame 8 looks promising, so she finds the local variables there and sees one called main_future. Inspecting the code, she sees this has a pointer field, which might tell her something about the task that's running. She takes a look:

[dbg] print -t --max-array=2 main_future.pointer
(std::future::GenFuture<generator-0>*) 0x1aea32022a8 ➔ std::future::GenFuture<generator-0>(
  (distridata_pkg_testing_lib_test::serve::tests::test_serve_empty::func::generator-0) distridata_pkg_testing_lib_test::serve::tests::test_serve_empty::func::$(generator-0)::Suspend6{
    packages: alloc::vec::Vec<distridata_pkg_testing_lib_test::repo::PackageEntry>[]
    bytes: alloc::vec::Vec<u8>[123, 34, ]
    (alloc::string::String) url: "ht"...
    also_bytes: alloc::vec::Vec<u8>[123, 34, ]
    pinned: std::future::GenFuture<generator-0>(
      distridata_pkg_testing_lib_test::serve::get::$(generator-0){
        (alloc::string::String) __0: "ht"...
      }
    )
  }
)

This has some more information, but it is still not as helpful as Grace was hoping for.

Grace quickly realizes her tools are not going to give her as much help as she'd like. She does manage to find the executor in memory, so she starts reading the code to understand how tasks are laid out in memory, etc. Even once she finds the list of tasks, she can only see the opaque contents of the closure. It is hard even to track these back to a line number, or to what operating system resource the task is blocked on (IOCP handle, io_uring event, etc.).

She realizes this is going to take a lot longer than it would if this were a C++ service, so she gets up to grab another cup of coffee and then settles in for a long debugging session.

🤔 Frequently Asked Questions

What are the morals of the story?

While much of the focus for async debugging is on the live debugging case, where a developer is running a build on their own machine, there will also be a need to debug crashes after the fact. For example, an application running on a consumer's device may upload crash dumps automatically, or a service running in a cloud environment may collect a crash dump before restarting the server. Often the bugs that show up in these scenarios are hard to reproduce on a developer's machine, so the more information it's possible to glean from a crash dump, the better.

Even just an accurate and complete stack trace can help a lot. Many error reporting systems cluster crashes by stack trace, so having an incomplete stack trace can lead to unrelated crashes being grouped together.

What are the sources for this story?

This is inspired by requests from internal teams looking to expand the use of Rust in services they develop.

This story also includes some input from Fuchsia developers, including a bug they have about getting async backtraces in the debugger.

Why did you choose Grace to tell this story?

Grace is part of a team of experienced systems hackers who have recently migrated to Rust because of its safety guarantees while still maintaining high performance. Grace is used to debugging these kinds of issues in a certain way, and would like to transfer these skills to Rust.

How would this story have played out differently for the other characters?

This could happen to Alan or Barbara as well. In Alan's case, he may be used to C# and Visual Studio's async debugger tools. He'd probably miss those tools and wish support for something similar could be added to his IDE.

In Niklaus's case, he would probably need to ask one of his more experienced team mates to help him debug the issue. With better tooling, he'd probably be able to get further on his own.

  • In Alan tries to debug a hang, Alan misses some of the strong debugging tools he's used in the past. Grace would enjoy using those same tools if they worked on crash dumps in addition to live processes.

  • In Barbara wants async insights, Barbara wants to use a debugger to inspect a running process. Most of the insights Barbara is looking for in that situation would also be relevant to Grace in a post-hoc debugging situation.

  • In Barbara gets burned by select, Barbara has trouble debugging an issue where not all database updates are processed. Similar debugging tools would help both Barbara and Grace.

  • In Grace deploys her service and hits obstacles, Grace finds a tricky issue in production that only appears at high load. Because she doesn't have the right tooling to debug, she resorts to ad hoc logging, combined with some operating system tools. She could have benefited from the ability to inspect what is blocking tasks in an executor as well.

  • In Grace waits for gdb next, Grace finds that her usual debugging techniques do not work well with async programs.

  • This is tangentially related to the story Alan iteratively regresses performance, because there Alan was used to applying existing native tools to Rust, even though there is sometimes an impedance mismatch. The mismatch is likely to be even more challenging for async debugging, since this scenario is already not well supported in a lot of existing tools.

😱 Status quo stories: Grace deploys her service and hits obstacles

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

The story

When examining her service metrics, Grace notices tail latencies in the P99 that exceed their target. She identifies GC in the routing layer as the culprit. Grace follows industry trends and is already aware of Rust and its ecosystem at a high level. She decides to investigate rewriting the routing service in Rust.

To meet throughput requirements, Grace has already decided to use a thread-per-core model and minimize cross-thread communication. She explores available ecosystem options and finds no option that gets her exactly what she is looking for out of the box. However, she can use Tokio with minimal configuration to achieve her architecture.

A few months of frantic hacking follow.

montage of cats typing

Soon enough, she and her team have a proof of concept working. They run some local stress tests and notice that 5% of requests hang and fail to respond. The client eventually times out. She cannot reproduce this problem when running one-off requests locally. It only shows up when sending above 200 requests-per-second.

She realizes that she doesn't have any tooling to give her insight into what's going on. She starts to add lots of logging, attempting to tie log entries to specific connections. Using an operating system tool, she can identify the socket addresses for the hung connections, so she also includes the socket addresses in each log message. She then filters the logs to find entries associated with hung connections. Of course, the logs only tell her what the connection managed to do successfully; they don't tell her why it stopped -- so she keeps going back to add more logging until she can narrow down the exact call that hangs.

Eventually, she identifies that the last log message is right before authenticating the request. An existing C library performs authentication, integrated with the routing service using a custom future implementation. She eventually finds a bug in the implementation that resulted in occasional lost wake-ups.
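The story does not say what the bug was. As one hedged illustration, hand-written futures that bridge callback-style code commonly lose wake-ups in two ways: they check for completion and store the waker in separate, unsynchronized steps, or they keep only the waker from the first poll. The std-only sketch below shows a pattern that avoids both; AuthFuture, AuthCompleter, auth_pair and demo are hypothetical names, not part of tokio or of the service in the story.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

/// State shared between the future and the callback side.
#[derive(Default)]
struct Shared {
    result: Option<bool>,
    waker: Option<Waker>,
}

/// Hypothetical future wrapping a callback-style authentication call.
pub struct AuthFuture {
    shared: Arc<Mutex<Shared>>,
}

/// Hypothetical handle the callback side uses to deliver the result.
pub struct AuthCompleter {
    shared: Arc<Mutex<Shared>>,
}

pub fn auth_pair() -> (AuthFuture, AuthCompleter) {
    let shared = Arc::new(Mutex::new(Shared::default()));
    (AuthFuture { shared: Arc::clone(&shared) }, AuthCompleter { shared })
}

impl AuthCompleter {
    pub fn complete(self, ok: bool) {
        // Publish the result and take the waker under one lock.
        let waker = {
            let mut guard = self.shared.lock().unwrap();
            guard.result = Some(ok);
            guard.waker.take()
        };
        if let Some(waker) = waker {
            waker.wake();
        }
    }
}

impl Future for AuthFuture {
    type Output = bool;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<bool> {
        // The completion check and the waker registration share one critical
        // section, so `complete` cannot slip in between them unnoticed.
        let mut guard = self.shared.lock().unwrap();
        if let Some(ok) = guard.result {
            return Poll::Ready(ok);
        }
        // Replace the waker on every poll; the task may have been moved.
        guard.waker = Some(cx.waker().clone());
        Poll::Pending
    }
}

/// Waker that only counts how often it was woken, for the demonstration.
struct CountingWaker(AtomicUsize);

impl Wake for CountingWaker {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

/// Polls once, completes, and polls again; returns
/// (first poll was pending, number of wake-ups, second poll was ready).
pub fn demo() -> (bool, usize, bool) {
    let (mut fut, completer) = auth_pair();
    let counter = Arc::new(CountingWaker(AtomicUsize::new(0)));
    let waker = Waker::from(Arc::clone(&counter));
    let mut cx = Context::from_waker(&waker);
    let first_pending = Pin::new(&mut fut).poll(&mut cx).is_pending();
    completer.complete(true);
    let wakes = counter.0.load(Ordering::SeqCst);
    let ready = Pin::new(&mut fut).poll(&mut cx) == Poll::Ready(true);
    (first_pending, wakes, ready)
}
```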

She fixes the bug. The service is now working as expected and meeting Grace's performance goals.

🤔 Frequently Asked Questions

What are the morals of the story?

  • When coming from a background of network engineering, users will bring their own design choices around architecture.
  • There is a lack of debugging tools for async.
  • Writing futures by hand is error prone.

What are the sources for this story?

This is based on the experiences of helping a tokio user to diagnose a bug in their code.

Why did you choose Grace to tell this story?

  • The actual user who experienced this problem fit the profile of Grace.
  • The story is focused on the experience of people aiming to use workflows they are familiar with from C in a Rust setting.

How would this story have played out differently for the other characters?

Alan or Niklaus may well have had a much harder time diagnosing the problem due to not having as much of a background in systems programming. For example, they may not have known about the system tool that allowed them to find the list of dangling connections.

Could Grace have used another runtime to achieve the same objectives?

  • Maybe! But in this instance the people this story is based on were using tokio, so that's the one we wrote into the story.
  • (If folks want to expand this answer with details of how to achieve similar goals on other runtimes that would be welcome!)

😱 Status quo stories: Grace tries new libraries

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

The story

While searching crates.io, Grace found an interesting library that she wants to use. Its code examples use a map/reduce style. Because Grace is more familiar with C and C++, her first step is to convert them from this style to loops.

Controller::new(root_kind_api, ListParams::default())
    .owns(child_kind_api, ListParams::default())
    .run(reconcile, error_policy, context)
    .for_each(|res| async move {
        match res {
            Ok(o) => info!("reconciled {:?}", o),
            Err(e) => warn!("reconcile failed: {}", Report::from(e)),
        }
    })
    .await;

(Example code taken from https://github.com/clux/kube-rs)

So she takes the naive approach and converts it as follows:

let controller = Controller::new(root_kind_api, ListParams::default())
    .owns(child_kind_api, ListParams::default())
    .run(reconcile, error_policy, context);

while let Ok(o) = controller.try_next().await {
    info!("reconciled {:?}", o);
}

When she compiles her source code, she ends up with a wall of error messages like the following:

$ cargo run
   Compiling kube-rs-test v0.1.0 (/home/project-gec/src/kube-rs-test)
error[E0277]: `from_generator::GenFuture<[static generator@watcher<Secret>::{closure#0}::{closure#0} for<'r, 's, 't0, 't1> {ResumeTy, kube::Api<Secret>, &'r kube::Api<Secret>, ListParams, &'s ListParams, watcher::State<Secret>, impl futures::Future, ()}]>` cannot be unpinned
  --> src/main.rs:23:41
   |
23 |     while let Ok(o) = controller.try_next().await {
   |                                  ^^^^^^^^ within `futures_util::unfold_state::_::__Origin<'_, (kube::Api<Secret>, ListParams, watcher::State<Secret>), impl futures::Future>`, the trait `Unpin` is not implemented for `from_generator::GenFuture<[static generator@watcher<Secret>::{closure#0}::{closure#0} for<'r, 's, 't0, 't1> {ResumeTy, kube::Api<Secret>, &'r kube::Api<Secret>, ListParams, &'s ListParams, watcher::State<Secret>, impl futures::Future, ()}]>`
   |
   = note: required because it appears within the type `impl futures::Future`
   = note: required because it appears within the type `futures_util::unfold_state::_::__Origin<'_, (kube::Api<Secret>, ListParams, watcher::State<Secret>), impl futures::Future>`
   = note: required because of the requirements on the impl of `Unpin` for `futures_util::unfold_state::UnfoldState<(kube::Api<Secret>, ListParams, watcher::State<Secret>), impl futures::Future>`
   = note: required because it appears within the type `futures::stream::unfold::_::__Origin<'_, (kube::Api<Secret>, ListParams, watcher::State<Secret>), [closure@watcher<Secret>::{closure#0}], impl futures::Future>`
   = note: required because of the requirements on the impl of `Unpin` for `futures::stream::Unfold<(kube::Api<Secret>, ListParams, watcher::State<Secret>), [closure@watcher<Secret>::{closure#0}], impl futures::Future>`
   = note: required because it appears within the type `impl std::marker::Send+futures::Stream`
   = note: required because it appears within the type `futures::stream::try_stream::into_stream::_::__Origin<'_, impl std::marker::Send+futures::Stream>`
   = note: required because of the requirements on the impl of `Unpin` for `futures::stream::IntoStream<impl std::marker::Send+futures::Stream>`
   = note: required because it appears within the type `futures::stream::stream::map::_::__Origin<'_, futures::stream::IntoStream<impl std::marker::Send+futures::Stream>, futures_util::fns::InspectFn<futures_util::fns::InspectOkFn<[closure@reflector<Secret, impl std::marker::Send+futures::Stream>::{closure#0}]>>>`
   = note: required because of the requirements on the impl of `Unpin` for `futures::stream::Map<futures::stream::IntoStream<impl std::marker::Send+futures::Stream>, futures_util::fns::InspectFn<futures_util::fns::InspectOkFn<[closure@reflector<Secret, impl std::marker::Send+futures::Stream>::{closure#0}]>>>`
   = note: required because it appears within the type `futures::stream::stream::_::__Origin<'_, futures::stream::IntoStream<impl std::marker::Send+futures::Stream>, futures_util::fns::InspectOkFn<[closure@reflector<Secret, impl std::marker::Send+futures::Stream>::{closure#0}]>>`
   = note: required because of the requirements on the impl of `Unpin` for `futures::stream::Inspect<futures::stream::IntoStream<impl std::marker::Send+futures::Stream>, futures_util::fns::InspectOkFn<[closure@reflector<Secret, impl std::marker::Send+futures::Stream>::{closure#0}]>>`
   = note: required because it appears within the type `futures::stream::try_stream::_::__Origin<'_, impl std::marker::Send+futures::Stream, [closure@reflector<Secret, impl std::marker::Send+futures::Stream>::{closure#0}]>`
   = note: required because of the requirements on the impl of `Unpin` for `futures::stream::InspectOk<impl std::marker::Send+futures::Stream, [closure@reflector<Secret, impl std::marker::Send+futures::Stream>::{closure#0}]>`
   = note: required because it appears within the type `impl futures::Stream`

error[E0277]: `from_generator::GenFuture<[static generator@watcher<Secret>::{closure#0}::{closure#0} for<'r, 's, 't0, 't1> {ResumeTy, kube::Api<Secret>, &'r kube::Api<Secret>, ListParams, &'s ListParams, watcher::State<Secret>, impl futures::Future, ()}]>` cannot be unpinned
  --> src/main.rs:23:27
   |
23 |     while let Ok(o) = controller.try_next().await {
   |                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^ within `futures_util::unfold_state::_::__Origin<'_, (kube::Api<Secret>, ListParams, watcher::State<Secret>), impl futures::Future>`, the trait `Unpin` is not implemented for `from_generator::GenFuture<[static generator@watcher<Secret>::{closure#0}::{closure#0} for<'r, 's, 't0, 't1> {ResumeTy, kube::Api<Secret>, &'r kube::Api<Secret>, ListParams, &'s ListParams, watcher::State<Secret>, impl futures::Future, ()}]>`
   |
   = note: required because it appears within the type `impl futures::Future`
   = note: required because it appears within the type `futures_util::unfold_state::_::__Origin<'_, (kube::Api<Secret>, ListParams, watcher::State<Secret>), impl futures::Future>`
   = note: required because of the requirements on the impl of `Unpin` for `futures_util::unfold_state::UnfoldState<(kube::Api<Secret>, ListParams, watcher::State<Secret>), impl futures::Future>`
   = note: required because it appears within the type `futures::stream::unfold::_::__Origin<'_, (kube::Api<Secret>, ListParams, watcher::State<Secret>), [closure@watcher<Secret>::{closure#0}], impl futures::Future>`
   = note: required because of the requirements on the impl of `Unpin` for `futures::stream::Unfold<(kube::Api<Secret>, ListParams, watcher::State<Secret>), [closure@watcher<Secret>::{closure#0}], impl futures::Future>`
   = note: required because it appears within the type `impl std::marker::Send+futures::Stream`
   = note: required because it appears within the type `futures::stream::try_stream::into_stream::_::__Origin<'_, impl std::marker::Send+futures::Stream>`
   = note: required because of the requirements on the impl of `Unpin` for `futures::stream::IntoStream<impl std::marker::Send+futures::Stream>`
   = note: required because it appears within the type `futures::stream::stream::map::_::__Origin<'_, futures::stream::IntoStream<impl std::marker::Send+futures::Stream>, futures_util::fns::InspectFn<futures_util::fns::InspectOkFn<[closure@reflector<Secret, impl std::marker::Send+futures::Stream>::{closure#0}]>>>`
   = note: required because of the requirements on the impl of `Unpin` for `futures::stream::Map<futures::stream::IntoStream<impl std::marker::Send+futures::Stream>, futures_util::fns::InspectFn<futures_util::fns::InspectOkFn<[closure@reflector<Secret, impl std::marker::Send+futures::Stream>::{closure#0}]>>>`
   = note: required because it appears within the type `futures::stream::stream::_::__Origin<'_, futures::stream::IntoStream<impl std::marker::Send+futures::Stream>, futures_util::fns::InspectOkFn<[closure@reflector<Secret, impl std::marker::Send+futures::Stream>::{closure#0}]>>`
   = note: required because of the requirements on the impl of `Unpin` for `futures::stream::Inspect<futures::stream::IntoStream<impl std::marker::Send+futures::Stream>, futures_util::fns::InspectOkFn<[closure@reflector<Secret, impl std::marker::Send+futures::Stream>::{closure#0}]>>`
   = note: required because it appears within the type `futures::stream::try_stream::_::__Origin<'_, impl std::marker::Send+futures::Stream, [closure@reflector<Secret, impl std::marker::Send+futures::Stream>::{closure#0}]>`
   = note: required because of the requirements on the impl of `Unpin` for `futures::stream::InspectOk<impl std::marker::Send+futures::Stream, [closure@reflector<Secret, impl std::marker::Send+futures::Stream>::{closure#0}]>`
   = note: required because it appears within the type `impl futures::Stream`
   = note: required because of the requirements on the impl of `futures::Future` for `TryNext<'_, impl futures::Stream>`
   = note: required by `futures::Future::poll`

error: aborting due to 2 previous errors

For more information about this error, try `rustc --explain E0277`.
error: could not compile `kube-rs-test`

To learn more, run the command again with --verbose.

From her background, she has an idea of what could be going wrong. She remembers that she could box the values, solving the issue by calling .boxed() on the controller. On the other hand, she can see no reason why this while loop should fail when the original .for_each() example works as expected.
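The difference comes down to how the two methods receive the stream. In the futures crate, for_each takes the stream by value and can pin it internally, while try_next borrows it mutably and therefore requires the stream to be Unpin, which futures generated from async blocks are not. The std-only sketch below reproduces the same situation with a plain Future instead of a Stream; poll_by_ref, poll_by_value, answer and demo are hypothetical names chosen for illustration and are not part of kube or futures.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; enough to poll futures that complete immediately.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Like `try_next`: only borrows the future, so it must demand `F: Unpin`
/// to be allowed to build a `Pin<&mut F>` from a plain `&mut F`.
pub fn poll_by_ref<F: Future + Unpin>(fut: &mut F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    Pin::new(fut).poll(&mut cx)
}

/// Like `for_each`: takes ownership, so it can pin the future itself
/// and places no `Unpin` requirement on the caller.
pub fn poll_by_value<F: Future>(fut: F) -> Poll<F::Output> {
    let mut pinned = Box::pin(fut);
    poll_by_ref(&mut pinned)
}

/// The future returned by an `async fn` does not implement `Unpin`.
pub async fn answer() -> u32 {
    42
}

pub fn demo() -> (Poll<u32>, Poll<u32>) {
    // `poll_by_ref(&mut answer())` would be rejected with an "Unpin is not
    // implemented" error, much like Grace's loop.
    // Boxing first gives a `Pin<Box<_>>`, which is always `Unpin`.
    let mut boxed = Box::pin(answer());
    (poll_by_ref(&mut boxed), poll_by_value(answer()))
}
```

Calling .boxed() on a stream plays the same role as Box::pin here: it moves the value to the heap and hands back a pinned, Unpin handle at the cost of an allocation.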

🤔 Frequently Asked Questions

What are the morals of the story?

  • Working with async can produce huge error messages from fairly commonplace transformations, and fixing them requires knowing some "not entirely obvious" workarounds.

What are the sources for this story?

  • Personal experience.

Why did you choose Grace to tell this story?

  • Reflects the background of the author.

How would this story have played out differently for the other characters?

  • Ultimately the only way to know how to solve this problem is to have seen it before and learned how to solve it. The compiler doesn't help and the result is not obvious.
  • So it probably doesn't matter that much which character is used, except that Barbara may be more likely to have seen how to solve it.

Status quo: Grace waits for gdb next

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect peoples' experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

Grace wants to walk through the behavior of a toy program.

She first fires up cargo run --verbose to remind herself what the path to the target binary is. Part of the resulting Cargo output is:

     Running `target/debug/toy`

From that, Grace tries running gdb on the printed path.

    gdb target/debug/toy

and then

(gdb) start

to start the program and set a breakpoint on the main function.

Grace hits Ctrl-x a and gets a TUI mode view that includes this:

│   52          }
│   53
│   54          #[tokio::main]
│B+>55          pub(crate) async fn main() -> Result<(), Box<dyn Error + Send + Sync + 'static>> {
│   56              println!("Hello, world!");
│   57              let record = Box::new(Mutex::new(Record::new()));
│   58              let record = &*Box::leak(record);
│   59

Excitedly Grace types next to continue to the next line of the function.

And waits. And the program does not stop anywhere.

...

Eventually Grace remembers that #[tokio::main] injects a different main function that isn't the one that she wrote as an async fn, and so the next operation in gdb isn't going to set a breakpoint within Grace's async fn main.
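To make the "different main function" concrete, here is a minimal sketch of the shape of the code the attribute leaves behind. The real macro builds a Tokio runtime and calls its block_on; the tiny block_on below and the name async_main are hypothetical stand-ins so that the sketch needs only the standard library.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; enough for a future that never has to wait.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical stand-in for the runtime's `block_on`: polls until ready.
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = pin!(future);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
            return output;
        }
    }
}

/// The body Grace wrote. Her breakpoint belongs in here.
async fn async_main() -> Result<(), String> {
    println!("Hello, world!");
    Ok(())
}

/// Roughly what remains as the real `main`: a synchronous function that
/// hands the async body to a runtime and waits for it to finish.
fn main() -> Result<(), String> {
    block_on(async_main())
}
```

Seen this way, the function gdb stops in after start is the synchronous wrapper, and stepping with next from there treats the whole block_on call, including the async body, as a single call to step over.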

So Grace restarts the debugger, and then asks for a breakpoint on the first line of her function:

(gdb) start
(gdb) break 56
(gdb) continue

And now it stops on the line that she expected:

│   53
│   54          #[tokio::main]
│   55          pub(crate) async fn main() -> Result<(), Box<dyn Error + Send + Sync + 'static>> {
│B+>56              println!("Hello, world!");
│   57              let record = Box::new(Mutex::new(Record::new()));
│   58              let record = &*Box::leak(record);
│   59
│   60              let (tx, mut rx) = channel(100);

Grace is now able to use next to walk through the main function. She does notice that the calls to tokio::spawn are skipped over by next, but that's not as much of a surprise to her, since those are indeed function calls that are taking async blocks. She sets breakpoints on the first line of each async block so that the debugger will stop when control reaches them as she steps through the code.

🤔 Frequently Asked Questions

Here are some standard FAQ to get you started. Feel free to add more!

What are the morals of the story?

  • A common usage pattern, hitting next to go to what seems like the next statement, breaks down due to implementation details of #[tokio::main] and async fn.
  • This is one example of where next breaks down relative to what a user is likely to want. The other common scenario where the behavior of next is non-ideal is higher-order functions, like option.and_then(|t| { ... }), where someone stepping through the code probably wants next to set a temporary breakpoint in the ... of the closure.

What are the sources for this story?

Personal experience. I haven't acquired the muscle memory to stop using next, even though it breaks down in such cases.

Why did you choose Grace to tell this story?

I needed someone who, like me, would actually be tempted to use gdb even when println debugging is so popular.

How would this story have played out differently for the other characters?

  • Alan might have used whatever debugger is offered by his IDE, which might have the same problem (via a toolbar button that has the same semantics as next); but many people who debug in an IDE naturally set breakpoints by hand on the lines in their editor, and thus would not run into this.
  • Most characters would probably have abandoned gdb much sooner. E.g. Grace might have started out by adding println or tracing instrumentation to the code, rather than trying to open it up in a debugger.

😱 Status quo stories: Grace wants a zero-copy API

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

The story

Grace had written lots of operating system code in the past, and up until recently was working on a project using DPDK for zero-copy networking. The vast majority of the bugs that Grace found were related to memory (mis)management, so she is excited for the prospect of trying Rust as part of her new job.

However, Grace has a hard time getting zero-copy networking to work in Rust without resorting heavily to unsafe constructs. As her understanding of Rust grows, she looks hopefully at the signature of poll_read:


fn poll_read(
    self: Pin<&mut Self>,
    cx: &mut Context,
    buf: &mut [u8]
) -> Poll<Result<usize, Error>>

She notices that the buffer is always passed in by the caller, but she cannot hand that buffer down to the operating system. In async Rust a task can be cancelled at any time simply by dropping its future, at which point the caller is free to release the buffer. The buffer is therefore not guaranteed to stay alive for the entire operation, and there is no good way to extend its lifetime. There needs to be at least one copy!
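The cancellation hazard can be reproduced with the standard library alone. The following is a minimal sketch, not code from Grace's project: TrackedBuf and PendingRead are hypothetical types, and the future owns its buffer only to keep the example short. With a borrowed &mut [u8] the borrow ends at the same point, leaving the caller free to release or reuse the memory.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical buffer that records when its memory is released.
struct TrackedBuf {
    data: Vec<u8>,
    freed: Arc<AtomicBool>,
}

impl Drop for TrackedBuf {
    fn drop(&mut self) {
        self.freed.store(true, Ordering::SeqCst);
    }
}

/// Hypothetical read that never completes, standing in for an operation
/// that the operating system still has in flight.
struct PendingRead {
    buf: TrackedBuf,
}

impl Future for PendingRead {
    type Output = usize;

    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<usize> {
        // A real implementation would hand `self.buf.data` to the OS here.
        let _capacity = self.buf.data.len();
        Poll::Pending
    }
}

/// Polls the read once, then cancels it by dropping the future.
/// Returns whether the buffer was freed by that cancellation.
fn buffer_freed_by_cancellation() -> bool {
    let freed = Arc::new(AtomicBool::new(false));
    let buf = TrackedBuf { data: vec![0; 4096], freed: Arc::clone(&freed) };
    let mut read = Box::pin(PendingRead { buf });

    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    assert!(read.as_mut().poll(&mut cx).is_pending());

    // Cancellation is just a drop: nothing tells the "OS" to stop writing.
    drop(read);
    freed.load(Ordering::SeqCst)
}

fn main() {
    println!("freed while pending: {}", buffer_freed_by_cancellation());
}
```

If the kernel had been given a pointer into that buffer, it would now be writing into freed memory, which is why poll-based read traits effectively force the data to be copied out of a buffer that the I/O layer owns.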

Grace hears from her coworkers that they are all using Tokio anyway. But the Tokio traits, although different from the standard traits, are not much better:


fn poll_read(
    self: Pin<&mut Self>,
    cx: &mut Context<'_>,
    buf: &mut ReadBuf<'_>
) -> Poll<Result<()>>;

There's a specialized type for the buffer, but its management and lifetime are still not suitable for zero-copy I/O.

Grace then came across a famous blog post from a seasoned developer that mentions another trait, AsyncBufRead, but she immediately identifies two issues with that:

  • There is not a similar trait for writes, which suffer from much the same problem
  • Grace's team is already using a plethora of convenience traits built upon these base traits, including AsyncReadExt and AsyncBufReadExt, and they all pass a buffer, forcing a copy.

Grace now has no good choices: she can live with the performance penalty of the copies, which lets her down since she now has the feeling she could do more with C++, or she can come up with her own specialized traits, which will make her work harder for her team to consume.

🤔 Frequently Asked Questions

What are the morals of the story?

  • The cancellation problem and buffer lifetimes make it impossible to guarantee that a user-provided buffer stays alive for the whole operation. That makes zero-copy I/O much harder than it could be.

What are the sources for this story?

  • Personal experience.

Why did you choose Grace to tell this story?

  • Grace has experience with C/C++, which is still the de-facto language for very low level things like zero-copy. The author had a similar experience when trying to expose zero-copy APIs.

How would this story have played out differently for the other characters?

  • Zero-copy I/O is an important, but fairly niche use case that requires specialized prior knowledge that usually is only found among system-level programmers.
  • That is usually done in C/C++, and Grace is the only one that is very likely to have this experience.
  • There is a chance Barbara would have ventured into similar problems. She would likely have had an experience similar to Grace's.

😱 Status quo stories: Grace wants to integrate a C-API

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

The story

Grace is integrating a camera into an embedded project. Grace has done similar projects in the past, and has even used this particular hardware before. Fortunately, the camera manufacturer provides a library in C to interface with the driver.

Grace knows that Rust provides strong memory safety guarantees, and the library provided by the manufacturer sports an API that is easy to misuse. In particular, ownership concerns are tricky and Grace and her team have often complained in the past that making memory mistakes is very easy and one has to be extremely careful to manage lifetimes. Therefore, for this project, Grace opts to start with Rust as many of the pitfalls of the manufacturer's library can be automatically caught by embedding the lifetimes into a lightweight wrapper over code bridged into Rust with bindgen.

Grace's team manages to write a thin Rust wrapper over the manufacturer's library with little complication. This library fortunately offers two interfaces for grabbing frames from the camera: a blocking interface that waits for the next frame, and a non-blocking interface that polls to check if there are any frames currently available and waiting. Grace is tempted to write a callback-based architecture by relying on the blocking interface that waits; however, early the next morning the customer comes back and informs her that they are scaling up the system, and that there will now be 5 cameras instead of 1.

She knows from experience that she cannot rely on having 5 threads blocking just for getting camera frames, because the embedded system she is deploying to only has 2 cores total! Her team would be introducing a lot of overhead into the system with the continuous context switching of every thread. Some folks were unsure of Rust's asynchronous capabilities, and with the requirements changing there were some that argued maybe they should stick to the tried and true in pure C. However, Grace eventually convinced them that the benefits of memory safety were still applicable, and that a lot of bugs that have taken weeks to diagnose in the past have already been completely wiped out. The team decided to stick with Rust, and dig deeper into implementing this project in async Rust.

Fortunately, Grace notices the similarities between the polling interface in the underlying C library and the Poll type returned by Rust's Future trait. "Surely," she thinks, "I can asynchronously interleave polls to each camera over a single thread, and process frames as they become available!" Such a thing would be quite difficult in C while guaranteeing memory safety was maintained. However, Grace's team has already dodged that bullet thanks to writing a thin wrapper in Rust that manages these tricky lifetimes!

The first problem: polls and wake-ups

Grace sets out to start writing the pipeline to get frames from the cameras. She realizes that while the polling call that the manufacturer provided in their library is similar in nature to a future, it doesn't quite encompass everything. In C, one might have to set some kind of heartbeat timer for polling. Grace explains to her team that this heartbeat is similar to how the Waker object works in a Future's Context type, in that it is how often the execution environment should re-try the future if the call to poll returns Poll::Pending.

A member of Grace's team asks her how she was able to understand all this. After all, Grace had been writing Rust about as long as the rest of her team. The main difference was that she had many more years of systems programming under C and C++ under her belt than they had. Grace responded that for the most part she had just read the documentation for the Future trait, and that she had intuited how async-await de-sugars itself into a regular function that returns a future of some kind. The de-sugaring process was, after all, very similar to how lambda objects in C++ were de-sugared as well. She leaves her teammate with an article she once found online that explained the process in a lot more detail for a problem much harder than they were trying to solve.

Something Grace and her team learn to love immediately about Rust is that writing the Future here does not require her team to write their own execution environment. In fact, the future can be entirely written independently of the execution environment. She quickly writes an async method to represent the polling process:


/// Gets the next frame from the camera, waiting `retry_after` time until polling again if it fails.
///
/// Returns Some(frame) if a frame is found, or None if the camera is disconnected or goes down before a frame is
/// available.
async fn next_frame(camera: &Camera, retry_after: Duration) -> Option<Frame> {
    while camera.is_available() {
        if let Some(frame) = camera.poll() {
            return Some(frame);
        } else {
            task::sleep_for(retry_after).await;
        }
    }

    None
}

The underlying C API doesn't provide any hooks that can be used to wake the Waker object on this future up, so Grace and her team decide that it is probably best if they just choose a sufficiently balanced retry_after period in which to try again. It does feel somewhat unsatisfying, as calling sleep_for feels about as hacky as calling std::this_thread::sleep_for in C++. However, there is no way to directly interoperate with the waker without having a separate thread of execution wake it up, and the underlying C library doesn't have any interface offering a notification for when that should be. In the end, this is the same kind of code that they would write in C, just without having to implement a custom execution loop themselves, so the team decides it is not a total loss.
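For comparison, here is a minimal sketch of what the same polling looks like as a hand-written Future that talks to the waker directly. It is not the team's code: FakeCamera, try_grab, NextFrame and the spinning block_on are hypothetical, and the camera is simulated. Because nothing external can signal readiness, the future wakes itself immediately, which amounts to busy-polling; the sleep in next_frame is what spaces those polls out.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical camera: reports "no frame yet" a fixed number of times.
struct FakeCamera {
    polls_until_ready: u32,
}

impl FakeCamera {
    /// Stand-in for the C library's non-blocking call.
    fn try_grab(&mut self) -> Option<u32> {
        if self.polls_until_ready == 0 {
            Some(42) // the "frame"
        } else {
            self.polls_until_ready -= 1;
            None
        }
    }
}

struct NextFrame<'a> {
    camera: &'a mut FakeCamera,
}

impl Future for NextFrame<'_> {
    type Output = u32;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        match self.camera.try_grab() {
            Some(frame) => Poll::Ready(frame),
            None => {
                // No hook from the library, so ask to be polled again
                // straight away. Without this, the task would never wake.
                cx.waker().wake_by_ref();
                Poll::Pending
            }
        }
    }
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical spinning executor; a real one would park until woken.
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = pin!(future);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
            return output;
        }
    }
}

fn main() {
    let mut camera = FakeCamera { polls_until_ready: 3 };
    let frame = block_on(NextFrame { camera: &mut camera });
    println!("got frame {frame}");
}
```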

The second problem: doing this many times

Doing this a single time is fine, but an end goal of the project is to be able to stream frames from the camera for unspecified lengths of time. Grace spends some time searching, and realizes that what she actually wants is a Stream of some kind. Stream objects are the asynchronous equivalent of iterators, and her team wants to eventually write something akin to:


// `next()` requires an `Unpin` stream, so pin the stream on the heap first.
let mut frame_stream = Box::pin(stream_from_camera(camera, Duration::from_millis(5)));

while let Some(frame) = frame_stream.next().await {
    // process frames
}

println!("Frame stream closed.");

She scours existing crates, looking in particular for a way to transform the above future into a stream that can be executed many times. The only option she finds for transforming a future into a series of futures is stream::unfold, which seems to do exactly what Grace is looking for. Grace begins by adding a small intermediate type, and then plugging in the remaining holes:


struct StreamState {
    camera: Camera,
    retry_after: Duration,
}

fn stream_from_camera(camera: Camera, retry_after: Duration) -> Unfold<StreamState, ??, ??> {
    let initial_state = StreamState { camera, retry_after };

    stream::unfold(initial_state, |state| async move {
        let frame = next_frame(&state.camera, state.retry_after).await;
        // `unfold` expects `Option<(item, next_state)>`; `None` ends the stream.
        frame.map(|frame| (frame, state))
    })
}

This looks like it mostly hits the mark, but Grace is left with a couple of questions for how to get the remainder of this building:

  1. What is the type that fills in the third template parameter in the return type? It should be the type of the future returned by the async closure passed into stream::unfold, but we don't know the type of a closure!
  2. What is the type that fills in the second template parameter in the return type, that is, the type of the closure itself?

Grace spends a lot of time trying to figure out how she might find those types! She asks Barbara what the idiomatic way to get around this in Rust would be. Barbara explains again that closures have types that cannot be written out by name, and that the simplest way to do this is to use the impl keyword in the return type.


fn stream_from_camera(camera: Camera, retry_after: Duration) -> impl Stream<Item = Frame> {
    // same as before
}
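The same situation arises with ordinary iterators, which makes for a small self-contained illustration. The sketch below uses std::iter::from_fn as a synchronous stand-in for stream::unfold; Camera and frames_from_camera are hypothetical names, and the "frames" are just integers. The concrete return type is FromFn wrapping a closure type that cannot be named, so impl Iterator is used instead.

```rust
/// Hypothetical camera that has a fixed number of frames left to give.
struct Camera {
    frames_left: u32,
}

/// The concrete type here is `std::iter::FromFn<{closure}>`. The closure
/// type has no name we can write, so the signature only promises
/// "some iterator of u32" and lets the compiler fill in the rest.
fn frames_from_camera(mut camera: Camera) -> impl Iterator<Item = u32> {
    std::iter::from_fn(move || {
        if camera.frames_left == 0 {
            None // ends the iteration, like `None` from `unfold`
        } else {
            camera.frames_left -= 1;
            Some(camera.frames_left)
        }
    })
}

fn main() {
    for frame in frames_from_camera(Camera { frames_left: 3 }) {
        println!("frame {frame}");
    }
}
```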

While Grace was on the correct path, and her team is now able to write the code they want, she realizes that writing types out explicitly can sometimes be very hard. She reflects on what it would have taken to write the type of an equivalent function pointer in C, and slightly laments that Rust cannot express such a type as clearly.

🤔 Frequently Asked Questions

What are the morals of the story?

  • Rust was the correct choice for the team across the board thanks to its memory safety and ownership. The underlying C library was just too complex for any single programmer to be able to maintain in their head all at once while also trying to accomplish other tasks.
  • Evolving requirements meant that the team would have had to either start over in plain C, giving up a lot of the safety they would gain from switching to Rust, or exploring async code in a more rigorous way.
  • The async code is actually much simpler than writing the entire execution loop in C themselves. However, the assumption that you would write the entire execution loop is baked into the underlying library which Grace's team cannot rewrite entirely from scratch. Integrating Rust async code with other languages which might have different mental models can sometimes lead to unidiomatic or unsatisfying code, even if the intent of the code in Rust is clear.
  • Grace eventually discovered that the problem was best modeled as a stream, rather than as a single future. However, converting a future into a stream was not necessarily something that was obvious for someone with a C/C++ background.
  • Closures and related types can be very hard to write in Rust, and if you are used to being very explicit with your types, tricks such as the impl trick above for Streams aren't immediately obvious at first glance.

What are the sources for this story?

My own personal experience trying to incorporate the Intel RealSense library into Rust.

Why did you choose Grace to tell this story?

  • I am a C++ programmer who has written many event / callback based systems for streaming from custom camera hardware. I mirror Grace in that I am used to using other systems languages, and even rely on libraries in those languages as I've moved to Rust. I did not want to give up the memory and lifetime benefits of Rust because of evolving runtime requirements.
  • In particular, C and C++ do not encourage async-style code, and often involve threads heavily. However, some contexts cannot make effective use of threads. In such cases, C and C++ programmers are often oriented towards writing custom execution loops and writing a lot of logic to do so. Grace discovered the benefit of not having to choose an executor upfront, because the async primitives let her express most of the logic without relying on a particular executor's behaviour.

How would this story have played out differently for the other characters?

  • Alan would have struggled with understanding the embedded context of the problem, where GC'd languages don't see much use.
  • Niklaus and Barbara may not have approached the problem with the same assimilation biases from C and C++ as Grace. Some of the revelations in the story such as discovering that Grace's team didn't have to write their own execution loop were unexpected benefits when starting down the path of using Rust!

Could Grace have used another runtime to achieve the same objectives?

Grace can use any runtime, which was an unexpected benefit of her work!

How did Grace know to use Unfold as the return type in the first place?

She saw it in the rustdoc for stream::unfold.

😱 Status quo stories: Template

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect peoples' experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

This tells the story of Grace, an engineer working at Facebook on C++ services.

  • Grace writes C++ services at Facebook, built upon many libraries and support infrastructure
  • Grace's last project had several bad bugs related to memory safety, and she is motivated to give Rust a shot on a new service she's writing
  • First, she must determine if there are Rust bindings to the other FB services her new service will depend on
  • She determines that she'll need to write a binding to the FooDB service using cxx
  • She also determines that several crates she'll need from crates.io aren't vendored in the FB monorepo, so she'll need to get them and their dependencies imported. She'll need to address any version conflicts and special build rules since FB uses Buck and not Cargo to build all code
  • While developing her service, Grace discovers that IDE features she's used to in VS Code don't always work for Rust code
  • Grace writes up the performance and safety benefits of her new service after its first month of deployment. Despite the tooling issues, the end result is a success

🤔 Frequently Asked Questions

Here are some standard FAQ to get you started. Feel free to add more!

What are the morals of the story?

  • Building successful Rust services in a company that has lots of existing tooling and infrastructure can be difficult, as Grace must do extra work whenever she breaks new ground
    • Big companies like Facebook have large monorepos and custom build systems, and the standard Rust tooling may not be usable in that environment
    • Facebook has a large team making developers' lives easier, but it is focused on the most common workflows, and Grace must work a little harder for now as Rust support is in its early days
    • Integrating with existing C++ code is quite important as Grace cannot rewrite existing services

What are the sources for this story?

This story is compiled from internal discussions with Facebook engineers and from internal reports of successful Rust projects.

Why did you choose Grace to tell this story?

Either Alan or Grace could be appropriate, but I chose Grace in order to focus on tooling and C++ service integration issues.

How would this story have played out differently for the other characters?

Had I chosen Alan, a Python programmer at Facebook, there would probably have been a much steeper learning curve with Rust's async mechanics. Python programmers using async don't necessarily have analogs for things like Pin, for example.

😱 Status quo stories: Niklaus Builds a Hydrodynamics Simulator

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect peoples' experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

Problem

Niklaus is a professor of physics at the University of Rustville. He needed to build a tool to solve hydrodynamics simulations; there is a common method for this that subdivides a region into a grid and computes the solution for each grid patch. All the patches in a grid for a point in time are independent and can be computed in parallel, but they are dependent on neighboring patches in the previously computed frame in time. This is a well known computational model and the patterns for basic parallelization are well established.
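The computational model can be sketched on a one-dimensional grid. This is an illustration only, not Niklaus' solver: the function step, the averaging rule and the workers parameter are hypothetical, and scoped threads from the standard library stand in for whatever parallel framework is used. Each patch of the new frame is written by one thread and reads only from the previous frame, which is why patches within a time step are independent.

```rust
use std::thread;

/// Advances a 1-D grid by one time step using at most `workers` threads.
/// Each new cell is the average of the cell and its two neighbours in
/// the previous frame (edge cells reuse their own value as the missing
/// neighbour).
fn step(prev: &[f64], workers: usize) -> Vec<f64> {
    let n = prev.len();
    let mut next = vec![0.0; n];
    if n == 0 {
        return next;
    }

    // Split the new frame into contiguous patches, one per worker.
    let workers = workers.max(1);
    let patch_len = (n + workers - 1) / workers;

    thread::scope(|scope| {
        for (patch_idx, patch) in next.chunks_mut(patch_len).enumerate() {
            let start = patch_idx * patch_len;
            // Each thread owns its patch of `next` and only reads `prev`.
            scope.spawn(move || {
                for (offset, cell) in patch.iter_mut().enumerate() {
                    let i = start + offset;
                    let left = prev[i.saturating_sub(1)];
                    let right = prev[(i + 1).min(n - 1)];
                    *cell = (left + prev[i] + right) / 3.0;
                }
            });
        }
    });

    next
}

fn main() {
    let mut frame = vec![0.0, 3.0, 0.0];
    for _ in 0..2 {
        frame = step(&frame, 2);
    }
    println!("{frame:?}");
}
```

The next time step can only start once every patch of the current one has finished, which is the dependency between frames described above.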

Niklaus wanted to write a performant tool to compute the solutions to the simulations of his research. He chose Rust because he needed high performance but he also wanted something that could be maintained by his students, who are not professional programmers. Rust's safety guarantees give him confidence that his results are not going to be corrupted by data races or other programming errors. After implementing the core mathematical formulas, Niklaus began implementing the parallelization architecture.

His first attempt was to emulate a common CFD design pattern: using message passing to communicate between processes that are each assigned a specific patch in the grid. So he assigned one thread to each patch and used messages to communicate solution state to dependent patches. With one thread per patch, this usually meant that there were 5-10x more threads than CPU cores.

This solution worked, but Niklaus had two problems with it. First, it gave him no control over CPU usage, so the solution would greedily use all available CPU resources. Second, using messages to communicate solution values between patches did not scale: when his team added a new feature (tracer particles), the additional messages caused by this change created so much overhead that parallel processing was no faster than serial. So, Niklaus decided to find a better solution.

Solution Path

To address the first problem, Niklaus' new design decoupled the work that needed to be done (solving physics equations for each patch in the grid) from the workers (threads). This would allow him to set the number of threads and avoid using all the CPU resources. So, he began looking for a tool in Rust that would fit this design pattern. When he read about async and how it allowed the user to define units of work and send those to an executor which would manage the execution of those tasks across a set of workers, he thought he'd found exactly what he needed. He also thought that the .await semantics would give a much better way of coordinating dependencies between patches. Further reading indicated that tokio was the runtime of choice for async in the community and, so, he began building a new CFD solver with async and tokio.

After making some progress, Niklaus ran into his first problem. Niklaus had been under a false impression about what async executors do. He had assumed that a multi-threaded executor could automatically move the execution of an async block to a worker thread. When this turned out to be wrong, he went to Stack Overflow and learned that async tasks must be explicitly spawned into a thread pool if they are to be executed on a worker thread. This meant that the algorithm to be parallelized became strongly coupled to both the spawner and the executor. Code that used to cleanly express a physics algorithm now had interspersed references to the task spawner, not only making it harder to understand, but also making it impossible to try different execution strategies, since with Tokio the spawner and executor are the same object (the Tokio runtime). Niklaus felt that a better design for data parallelism would enable better separation of concerns: a group of interdependent compute tasks, and a strategy to execute them in parallel.
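The behaviour that surprised Niklaus can be observed without any runtime. In this minimal sketch, block_on is a hypothetical single-threaded executor and a plain std::thread::spawn stands in for a runtime's spawner; it is not Tokio code. An async block runs on whichever thread polls it, and only an explicit spawn moves the work elsewhere.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread;

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical minimal executor: polls the future on the calling thread.
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = pin!(future);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
            return output;
        }
    }
}

/// An async block executes on the thread that polls it.
fn runs_on_polling_thread() -> bool {
    let caller = thread::current().id();
    let inside = block_on(async { thread::current().id() });
    caller == inside
}

/// Only an explicit spawn puts the work on another thread.
fn runs_elsewhere_when_spawned() -> bool {
    let caller = thread::current().id();
    let inside = thread::spawn(|| block_on(async { thread::current().id() }))
        .join()
        .unwrap();
    caller != inside
}

fn main() {
    println!("same thread when awaited in place: {}", runs_on_polling_thread());
    println!("different thread when spawned: {}", runs_elsewhere_when_spawned());
}
```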

Niklaus' second problem came as he tried to fully replace the message passing from the first design: sharing data between tasks. He used the async API to coordinate computation of patches so that a patch would only go to a worker when all its dependencies had completed. But he also needed to account for the solution data which had been passed in the messages. He set up a shared data structure to track the solutions for each patch, now that messages would no longer carry that data. Learning how to properly use shared data with async was a new challenge. The initial design:


let mut stage_primitive_and_scalar = |index: BlockIndex, state: BlockState<C>, hydro: H, geometry: GridGeometry| {
    let stage = async move {
        let p = state.try_to_primitive(&hydro, &geometry)?;
        let s = state.scalar_mass / &geometry.cell_volumes / p.map(P::lorentz_factor);
        Ok::<_, HydroError>( ( p.to_shared(), s.to_shared() ) )
    };
    stage_map.insert(index, runtime.spawn(stage).map(|f| f.unwrap()).shared());
};

lacked performance because he needed to clone the value for every task. So, Niklaus switched over to using Arc to keep a thread-safe reference-counted pointer to the shared data. But this change introduced a lot of .map and .unwrap function calls, making the code much harder to read. He realized that managing the dependency graph was not intuitive when using async for concurrency.
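
The difference between cloning the data and cloning an Arc can be sketched with plain threads standing in for tasks. `PatchSolution` and `share_with_workers` are hypothetical names, not taken from Niklaus' solver. Each worker receives its own Arc handle, yet every handle points at the same buffer, so the cell data itself is never copied.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for the solution data of one patch.
struct PatchSolution {
    cells: Vec<f64>,
}

/// Shares one solution with `workers` threads. Returns the address of the
/// original cell buffer plus, per worker, the address it saw and its sum.
fn share_with_workers(solution: PatchSolution, workers: usize) -> (usize, Vec<(usize, f64)>) {
    let shared = Arc::new(solution);
    let buffer_addr = shared.cells.as_ptr() as usize;

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            // Cloning the Arc copies a pointer and bumps a counter;
            // the Vec<f64> inside is not duplicated.
            let local = Arc::clone(&shared);
            thread::spawn(move || {
                let sum: f64 = local.cells.iter().sum();
                (local.cells.as_ptr() as usize, sum)
            })
        })
        .collect();

    let seen: Vec<(usize, f64)> = handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .collect();
    (buffer_addr, seen)
}
```

Every address reported by a worker matches the original buffer address, which is what distinguishes this from handing each task its own copy.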

As the program matured, a new problem arose: a steep learning curve with error handling. The initial version of his design used panic!s to fail the program if an error was encountered, but the stack traces were almost unreadable. He asked his teammate Grace to migrate over to using the Result idiom for error handling, and Grace found a major inconvenience: Rust's type inference breaks down inconsistently when propagating Result in async blocks. Grace frequently found that she had to specify the type of the error when creating a result value:


Ok::<_, HydroError>( ( p.to_shared(), s.to_shared() ) )

And she could not figure out why she had to add the ::<_, HydroError> to some of the Result values.
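
One likely explanation for what Grace saw, offered here as a sketch rather than a diagnosis of her code: an async block has no place to write a return type, and `?` converts errors through `From`, so `?` alone does not fix the block's error type. If nothing else constrains it, the type has to be named on an `Ok`. `HydroError` below is a stand-in defined for this example, and `block_on`, `parse_density`, and `doubled_density` are hypothetical helpers.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Stand-in error type for this example only.
#[derive(Debug)]
struct HydroError(String);

/// Waker that unparks the thread blocked in `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Hypothetical minimal executor: polls one future on the calling thread.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            Poll::Pending => thread::park(),
        }
    }
}

fn parse_density(text: &str) -> Result<f64, HydroError> {
    text.trim()
        .parse::<f64>()
        .map_err(|e| HydroError(e.to_string()))
}

fn doubled_density(text: &str) -> Option<f64> {
    let stage = async {
        // `?` could convert HydroError into any type that implements
        // `From<HydroError>`, so it does not settle the error type.
        let d = parse_density(text)?;
        // `.ok()` below does not reveal the error type either,
        // so it is named here with the turbofish.
        Ok::<_, HydroError>(2.0 * d)
    };
    block_on(stage).ok()
}
```

When the surrounding code does pin the type down, for example by returning the block's output from a function with a declared `Result` type, the annotation may be unnecessary, which could be why it looked inconsistent.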

Finally, once Niklaus' team began using the new async design for their simulations, they noticed an important issue that impacted productivity: compilation time had now increased to between 30 and 60 seconds. The nature of their work requires frequent changes to code and recompilation and 30-60 seconds is long enough to have a noticeable impact on their quality of life. What he and his team want is for compilation to be 2 to 3 seconds. Niklaus believes that the use of async is a major contributor to the long compilation times.

This new solution works, but Niklaus is not satisfied with how complex his code became after the move to async, nor with compilation times of 30-60 seconds. State sharing added a large amount of cruft with Arc, and async is not well suited to using a dependency graph to schedule tasks, so implementing this solution created a key component of his program that was both pervasive and difficult to understand. Ultimately, his conclusion was that async is not appropriate for parallelizing computational tasks. He will be trying a new design based upon Rayon in the next version of his application.
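
For comparison, the separation Niklaus wanted (a set of independent patch computations on one side, a worker count on the other) can be sketched with scoped threads from the standard library. This is not Rayon and not Niklaus' solver; `solve_patch` and `solve_grid` are hypothetical, and the arithmetic is a placeholder for the physics. Patches with dependencies between them would need more coordination than this shows.

```rust
use std::thread;

/// Placeholder for the per-patch physics update.
fn solve_patch(patch: &mut [f64]) {
    for cell in patch.iter_mut() {
        *cell = *cell * 2.0 + 1.0;
    }
}

/// Splits `grid` into patches of `patch_len` cells and processes them
/// on at most `workers` threads. The algorithm (`solve_patch`) never
/// mentions threads; only this function knows about the workers.
fn solve_grid(grid: &mut [f64], patch_len: usize, workers: usize) {
    let patch_len = patch_len.max(1);
    let workers = workers.max(1);

    // The units of work: disjoint mutable slices of the grid.
    let mut patches: Vec<&mut [f64]> = grid.chunks_mut(patch_len).collect();

    // How many patches each worker receives (rounded up, never zero).
    let per_worker = ((patches.len() + workers - 1) / workers).max(1);

    // Scoped threads may borrow from the caller's stack because the
    // scope joins every thread before it returns.
    thread::scope(|scope| {
        for batch in patches.chunks_mut(per_worker) {
            scope.spawn(move || {
                for patch in batch.iter_mut() {
                    solve_patch(patch);
                }
            });
        }
    });
}
```

Changing the worker count or the patch size only touches the call to `solve_grid`, not the computation itself.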

🤔 Frequently Asked Questions

What are the morals of the story?

  • async looks to be the wrong choice for parallelizing compute bound/computational work
  • There is a lack of guidance to help people solving such problems get started on the right foot
  • Quality of Life issues (compilation time, type inference on Result) can create a drag on users' ability to focus on their domain problem

What are the sources for this story?

This story is based on the experience of building the kilonova hydrodynamics simulation solver.

Why did you choose Niklaus and Grace to tell this story?

I chose Niklaus as the primary character in this story because this work was driven by someone who only uses programming for a small part of their work. Grace was chosen as a supporting character because of that person's experience with C/C++ programming and to avoid repeating characters.

How would this story have played out differently for the other characters?

  • Alan: there's a good chance he would have already had experience working with either async workflows in another language or doing parallelization of compute bound tasks; and so would already know from experience that async was not the right place to start.
  • Grace: likewise, might already have experience with problems like this and would know what to look for when searching for tools.
  • Barbara: the experience would likely be fairly similar, since the actual subject of this story is quite familiar with Rust by now.

😱 Status quo stories: Niklaus Wants to Share Knowledge

🚧 Warning: Draft status 🚧

This is a draft "status quo" story submitted as part of the brainstorming period. It is derived from real-life experiences of actual Rust users and is meant to reflect some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as they reflect people's experiences, status quo stories cannot be wrong, only inaccurate). Alternatively, you may wish to add your own status quo story!

The story

Niklaus, who sometimes goes by the pen name "Starol Klichols", has authored some long-form documentation about Rust that people have found helpful. One could even go so far as to call this documentation a "book".

Niklaus has typically minimized the use of crates in documentation like this as much as possible. Niklaus has limited time to dedicate to keeping the documentation up to date, and given the speed at which the ecosystem sometimes evolves, it's hard to keep up when crates are involved. Also, Niklaus would like to avoid limiting the readership of the documentation to the users of a particular crate only, and would like to avoid any accusations of favoritism.

But Niklaus would really really like to document async to avoid disappointing people like Barbara!

Niklaus was excited about the RFC proposing that block_on be added to the stdlib, because it seemed like that would solve Niklaus' problems. Niklaus would really like to include async in a big update to the documentation. No pressure.

🤔 Frequently Asked Questions

What are the morals of the story?

Writing documentation to go with the language/stdlib for something that is half in the language/stdlib and half in the ecosystem is hard. This is related to Barbara's story about wanting to get started without needing to pick an executor. There are topics of async that apply no matter what executor you pick, but it's hard to explain those topics without picking an executor to demonstrate with. We all have too much work to do and not enough time.

What are the sources for this story?

Why did you choose Niklaus to tell this story?

Niko said I couldn't add new characters.

How would this story have played out differently for the other characters?

I happen to know that the next version of Programming Rust, whose authors might be described as different characters, includes async and uses async-std. So it's possible to just pick an executor and add async to the book, but I don't wanna.

✨ Shiny future: Where we want to get to

🚧 Under construction! Help needed! 🚧

We are still in the process of drafting the vision document. The stories you see on this page are examples meant to give a feeling for how a shiny future story looks; you can expect them to change. See the "How to vision" page for instructions and details.

What is this

The "shiny future" is here to tell you what we are trying to build over the next 2 to 3 years. That is, it presents our "best guess" as to what Async Rust will look like a few years from now. When describing specific features, it also embeds links to design notes that describe the constraints and general plans around that feature.

🧐 You may also enjoy reading the blog post announcing the brainstorming effort.

Think big -- too big, if you have to

You'll notice that the ideas in this document are maximalist and ambitious. They stake out an opinionated position on how the ergonomics of Async I/O should feel. This position may not, in truth, be attainable, and for sure there will be changes along the way. Sometimes the realities of how computers actually work may prevent us from doing all that we'd like to. That's ok. This is a dream and a goal.

We fully expect that the designs and stories described in this document will change as we work towards realizing them. When there are areas of particular uncertainty, we use the Frequently Asked Questions and the design docs to call them out.

Where are the stories?

We haven't written these yet!

✨ Shiny future stories: template

This is a template for adding new "shiny future" stories. To propose a new shiny future PR, do the following:

  • Create a new file in the shiny_future directory named something like Alan_loves_foo.md or Grace_does_bar_and_its_great.md, and start from the raw source from this template. You can replace all the italicized stuff. :)
  • Do not add a link to your story to the SUMMARY.md file; we'll do it after merging, otherwise there will be too many conflicts.

For more detailed instructions, see the How To Vision: Shiny Future page!

🚧 Warning: Draft status 🚧

This is a draft "shiny future" story submitted as part of the brainstorming period. It is derived from what actual Rust users wish async Rust should be, and is meant to deal with some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as people's needs and desires for async Rust may differ greatly, shiny future stories cannot be wrong. At worst they are only useful for a small set of people or their problems might be better solved with alternative solutions). Alternatively, you may wish to add your own shiny vision story!

The story

Write your story here! Feel free to add subsections, citations, links, code examples, whatever you think is best.

🤔 Frequently Asked Questions

NB: These are generic FAQs. Feel free to customize them to your story or to add more.

What status quo stories are you retelling?

Link to status quo stories if they exist. If not, that's ok, we'll help find them.

What are the key attributes of this shiny future?

Summarize the main attributes of the design you were trying to convey.

What is the "most shiny" about this future?

Think about Rust's core "value propositions": performance, safety and correctness, productivity. Which benefit the most relative to today?

What are some of the potential pitfalls about this future?

Think about Rust's core "value propositions": performance, safety and correctness, productivity. Are any of them negatively impacted? Are there specific application areas that are impacted negatively? You might find the sample projects helpful in this regard, or perhaps looking at the goals of each character.

Did anything surprise you when writing this story? Did the story go any place unexpected?

The act of writing shiny future stories can uncover things we didn't expect to find. Did you have any new and exciting ideas as you were writing? Realize some complications that you didn't foresee?

What are some variations of this story that you considered, or that you think might be fun to write? Have any variations of this story already been written?

Often when writing stories, we think about various possibilities. Sketch out some of the turning points here -- maybe someone will want to turn them into a full story! Alternatively, if this is a variation on an existing story, link back to it here.

What are some of the things we'll have to figure out to realize this future? What projects besides Rust itself are involved, if any? (Optional)

Often the 'shiny future' stories involve technical problems that we don't really know how to solve yet. If you see such problems, list them here!

✨ Shiny future stories: Alan learns async on his own

🚧 Warning: Draft status 🚧

This is a draft "shiny future" story submitted as part of the brainstorming period. It is derived from what actual Rust users wish async Rust should be, and is meant to deal with some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as people's needs and desires for async Rust may differ greatly, shiny future stories cannot be wrong. At worst they are only useful for a small set of people or their problems might be better solved with alternative solutions). Alternatively, you may wish to add your own shiny vision story!

The story

Alan is trying to pick up Rust, and wants to build a command-line web scraper since it's a project he's recently written in Go. The program takes a URL, and recursively downloads all URLs named in all fetched pages.

Alan goes to crates.io and searches for "http client", and finds a library called reqwest. He opens its documentation, and sees that the library has him choose between an "async" and a "blocking" client. Confused, Alan types in "rust async" in his favorite search engine, and finds the Rust async book. On the very first page there's a summary of where async is useful and where it's not, as well as some of the downsides of each approach. Alan sees that for "make a single web request", async is not generally necessary, whereas for "making many network requests concurrently" async is recommended. Since Alan expects his crawler to make many requests, he decides he probably wants async for this application.

The async book tells Alan that he should mark his main function as async fn, so he does. He then follows the reqwest async examples, and is able to successfully make his crawler download a single web page. Next, he wants to parse each page to extract additional URLs to fetch. So, he finds a library that can parse HTML, quick-xml. He sets up his application with a HashSet to store all the yet-to-be-parsed URLs, and then writes a loop that pulls out a URL from the set, issues an HTTP request, awaits the response bytes, and passes them to quick-xml. Alan first tried to give the http::Response directly to quick_xml::Reader::from_reader, but the compiler told him:

error: This type does not implement `Read`, which is required by `Reader::from_reader`.

    let page = Reader::from_reader(request.await?);
                                   ^^^^^^^^^^^^^^

      help: The type does implement `AsyncRead`, but the method does not support asynchronous inputs.
suggestion: Use a method that supports asynchronous readers or read the data to a `Vec<u8>` first,
            and then pass that to `Reader::from_reader` instead (`Vec<u8>` implements `Read`).

Alan has his program iterate over all the links on the fetched page, and add any URLs he finds to the HashSet, before he then goes around the loop again. He is pretty satisfied -- the program seems to work well. However, it's fairly slow, as it only fetches one page at a time. Alan looks in the async book he discovered earlier, and sees a chapter titled "Doing many things at once". The chapter tells Alan that he has three options:

  • use select to wait for the first of many futures to complete;
  • use join to wait on many futures to all complete; and
  • use spawn to run a future in the background.

Alan figures that his program should keep many requests in flight at the same time, and then parse each one as it finishes, so he goes for the select approach. He writes:


let mut requests = Select::new();
requests.insert(client.get(start_url).send());
while !requests.is_empty() {
    let response = requests.await;
    // Use quick-xml to extract urls from response.
    // For each url:
        if seen_urls.insert(url.clone()) {
            requests.insert(client.get(url).send());
        }
}

This works, and Alan is delighted. But it seems to work a bit too well -- his crawler is so fast that it starts getting rate-limited by the servers he runs it against. So, Alan decides to make his crawler a bit less aggressive, and adds a call to std::thread::sleep after he parses each page. He compiles his application again, and sees a new warning from the compiler:

warning: blocking call in asynchronous code

    std::thread::sleep(Duration::from_secs(1));
    ^^^^^^^^^^^^^^^^^^

      help: If the thread is put to sleep, other asynchronous code running
            on the same thread does not get to run either.
suggestion: Use the asynchronous std::future::sleep method instead of std::thread::sleep in async code.
   reading: See the "Blocking in async code" chapter in the Rust async book for more details.

Alan is happy that the compiler told him about this problem up front, rather than his downloads being held up during the entire sleep period! He does as the compiler instructs, and replaces thread::sleep with its asynchronous alternative and an await. He then runs his code again, and the warning is gone, and everything seems to work correctly.

While looking at his code in his editor, however, Alan notices a little yellow squiggly line next to his while loop. Hovering over it, he sees a warning from a tool called "Clippy", that says:

warning: 

    while !requests.is_empty() {
    ^^^^^^^^^^^^^^^^^^^^^^^^^^ this loop

        let response = requests.await;
                       ^^^^^^^^^^^^^^ awaits one future from a `Select`
    
    
        std::future::sleep(Duration::from_secs(1)).await;
        ^^^^^^^^^^^^^^^^^^ and then pauses, which prevents progress on the `Select`
    

      help: Futures do nothing when they're not being awaited,
            so while the task is asleep, the `Select` cannot make progress.
suggestion: Consider spawning the futures in the `Select` so they can run in the background.
   reading: See the "Doing many things at once" chapter in the Rust async book for more details.

Alan first searches for "rust clippy" on his search engine of choice, and learns that it is a linter for Rust that checks for common mistakes and cases where code can be more idiomatic. He makes a mental note to always run Clippy from now on.

Alan recognizes the recommended chapter title from before, and sure enough, when he looks back on the page that made him choose select, he sees a box explaining that, as the warning suggests, a Select only makes progress on the asynchronous tasks it contains when it is being awaited. The same box also suggests spawning the tasks before placing them in the Select to have them continue to run even after the Select has yielded an item.
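
The box Alan read describes a general property of Rust futures: they are inert until something polls them. The following sketch uses only the standard library; `block_on` is a hypothetical minimal executor and `lazy_demo` is a made-up name. Creating the future has no effect, and its body runs only once it is driven.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Waker that unparks the thread blocked in `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Hypothetical minimal executor: polls one future on the calling thread.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            Poll::Pending => thread::park(),
        }
    }
}

/// Returns whether the future's body had run (before, after) it was driven.
fn lazy_demo() -> (bool, bool) {
    let started = Cell::new(false);

    // Building the future runs none of its body.
    let fut = async {
        started.set(true);
    };
    let before = started.get();

    // Only polling (here via `block_on`) makes the body execute.
    block_on(fut);
    let after = started.get();

    (before, after)
}
```

A collection of such futures behaves the same way: while the task that owns them is busy elsewhere (for example, sleeping), none of them advances, which is why handing them to a spawner helps.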

So, Alan modifies his code to spawn each request:


// For each url:
if seen_urls.insert(url.clone()) {
    requests.insert(std::future::spawn(async { 
        client.get(url).send().await
    }));
}

But now his code doesn't compile any more:

error: borrow of `client` does not live long enough:

    let client = reqwest::Client::new();
        ^^^^^^ client is created here

    requests.insert(std::future::spawn(async {
                    ^^^^^^^^^^^^^^^^^^ spawn requires F: 'static

        client.get(url).send().await
        ^^^^^^ this borrow of client makes the `async` block have lifetime 'a

    }
    ^ the lifetime 'a ends here when `client` is dropped.

      help: An async block that needs access to local variables cannot be spawned,
            since spawned tasks may run past the end of the current function.
suggestion: Consider using `async move` to move `client` if it isn't needed elsewhere,
            or keep `client` around forever by using `Arc` for reference-counting,
            and then `clone` it before passing it into each call to `spawn`.
   reading: See the "Spawning and 'static" chapter in the Rust async book for more details.

Author note: the recommendation of Arc above should be inferred from the Send bound on spawn. If such a bound isn't present, we should recommend Rc instead. Ideally we would also tailor the suggestion to whether changing async to async move would actually make the code compile.

Alan is amazed at how comprehensive the compiler errors are, and is glad to see a reference to the async book, which he now realizes he should probably just make time to read start-to-finish, as it covers everything he's running into. Alan first tries to change async to async move as the compiler suggests, but the compiler then tells him that client may be used again in the next iteration of the loop, which makes Alan facepalm. Instead, he does as the compiler tells him, and puts the client in an Arc and clones that Arc for each spawn.

At this point, the code looks a little messy, so Alan decides to open the referenced chapter in the async book as well. It suggests that while the pattern he's used is a good fallback, it's often possible to construct the future outside the spawn, and then await it inside the spawn. Alan gives that a try by removing the Arc again and writing:


let fut = client.get(url).send();
requests.insert(std::future::spawn(async move {
    fut.await
}));

Author note: how would the compiler tell Alan about this transformation rather than him having to discover it in the book?

This works, and Alan is happy! Doubly so when he notices the yellow Clippy squiggles telling him that the async move { fut.await } can be simplified to just fut.

Alan runs his crawler again, and this time it doesn't run afoul of any rate limiting. However, Alan notices that it's still just parsing one page's HTML at a time, and wonders if he can parallelize that part too. He figures that since each spawned future runs in the background, he can just do the XML parsing in there too! So, he refactors the code for going from a URL to a list of URLs into its own async fn urls, and then writes:


async fn urls(client: &Client, url: Url) -> Vec<Url> { /* .. */ }

let mut requests = Select::new();
requests.insert(spawn(urls(&client, start_url)));
while !requests.is_empty() {
    // Named `found` so it does not shadow the `urls` function called below.
    let found = requests.await;
    for url in found {
        if seen_urls.insert(url.clone()) {
            requests.insert(spawn(urls(&client, url)));
        }
    }
    sleep(Duration::from_secs(1)).await;
}

However, to Alan's surprise, this no longer compiles, and is back to the old 'static error:

error: borrow of `client` does not live long enough:

    let client = reqwest::Client::new();
        ^^^^^^ client is created here

    requests.insert(spawn(urls(&client, start_url)));
                    ^^^^^ spawn requires F: 'static

    requests.insert(spawn(urls(&client, start_url)));
                               ^^^^^^^ but the provided argument is tied to the lifetime of this borrow

    }
    ^ which ends here when `client` is dropped.

      help: When you call an `async fn`, it does nothing until it is first awaited.
            For that reason, the `Future` that it returns borrows all of the `async fn`'s arguments.
suggestion: If possible, write the `async fn` (`urls`) as a regular `fn() -> impl Future` that
            first uses any arguments that aren't needed after the first `await`, and then
            returns an `async move {}` with the remainder of the function body.

            Otherwise, consider making the arguments reference-counted with `Arc` so that the async
            function's return value does not borrow anything from its caller.
   reading: See the "Spawning and 'static" chapter in the Rust async book for more details.

With the compiler's helpful explanation, Alan realizes that this is another instance of the same problem he had earlier, and changes his async fn to:


fn urls(client: &Client, url: Url) -> impl Future<Output = Vec<Url>> {
    let fut = client.get(url).send();
    async move {
        let response = fut.await;
        // Use quick-xml to extract URLs to return.
    }
}

At which point the code once again compiles, and runs faster than ever before! However, when Alan runs his crawler against a website with particularly large pages, he notices a new warning in his terminal when the crawler is running:

******************** [ Scheduling Delay Detected ] *********************
The asynchronous runtime has detected that asynchronous tasks are
occasionally prevented from running due to a long-running synchronous
operation holding up the executing thread.

In particular, the task defined at src/lib.rs:88 can make progress, but
the executor thread that would run it hasn't executed a new asynchronous
task in a while. It was last seen executing at src/lib.rs:96.

This warning suggests that your program is running a long-running or
blocking operation somewhere inside of an `async fn`, which prevents
that thread from making progress on concurrent asynchronous tasks. In
the worst instance, this can lead to deadlocks if the blocking code
blocks waiting on some asynchronous task that itself cannot make
progress until the thread continues running asynchronous tasks.

You can find more details about this error in the "Blocking in async
code" chapter of the Rust async book.

This warning is only displayed in debug mode.
************************************************************************

Looking at the indicated lines, Alan sees that line 88 is:


requests.insert(spawn(urls(&client, url)));

And line 96 is the loop around:


match html_reader.read_event(&mut buf) {
    // ...
}

Alan thinks he understands what the warning is trying to tell him, but he's not quite sure what he should do to fix it. So he goes to the indicated chapter in the async book, which says:

If you have to run a long-running synchronous operation, or issue a blocking system call, you risk holding up the execution of asynchronous tasks that the current thread is responsible for managing until the long-running operation completes. You have many options for mitigating the impact of such synchronous code, each with its own set of trade-offs.

It then suggests:

  • Try to make the synchronous code asynchronous if possible. This could even just consist of inserting occasional voluntary scheduling points into long-running loops using std::future::yield().await to allow the thread to continue to make progress on asynchronous tasks.
  • Run the synchronous code in a dedicated thread using spawn_blocking and simply await the resulting JoinHandle in the asynchronous code.
  • Inform the runtime (with block_in_place) that the current thread should give away all of its background tasks to other runtime threads (if applicable), and only then execute the synchronous code.

The document goes into more detail about the implications of each choice, but Alan likes the first option the best for this use-case, and augments his HTML reading loop to occasionally call std::future::yield().await. The runtime warning goes away.
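
The first option can be sketched as follows. The story's `std::future::yield()` belongs to the imagined future, so this example defines its own hypothetical `yield_now`, along with a toy `run_counting_yields` driver that counts how often the loop hands the thread back. `parse_events` is a placeholder for the HTML reading loop.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Future that returns `Pending` exactly once, giving the executor a
/// chance to run other tasks before this one continues.
struct YieldNow {
    yielded: bool,
}

impl Future for YieldNow {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            // Ask to be polled again; we are not waiting on anything.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Hypothetical helper, named for this example.
fn yield_now() -> YieldNow {
    YieldNow { yielded: false }
}

/// Long-running loop with a voluntary scheduling point after every
/// `every` items (`every` must be non-zero).
async fn parse_events(total: usize, every: usize) -> usize {
    let mut parsed = 0;
    for i in 0..total {
        parsed += 1; // stand-in for parsing one event
        if (i + 1) % every == 0 {
            yield_now().await;
        }
    }
    parsed
}

/// Waker that does nothing; the toy driver below simply polls again.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Toy driver: polls to completion and counts how many times the future
/// gave up the thread. A real executor could run another task each time.
fn run_counting_yields<F: Future>(fut: F) -> (F::Output, usize) {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut yields = 0;
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return (out, yields),
            Poll::Pending => yields += 1,
        }
    }
}
```

Without the `yield_now().await`, the whole loop would finish inside a single poll, which is the situation the runtime warning in the story describes.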

🤔 Frequently Asked Questions

What status quo stories are you retelling?

What are the key attributes of this shiny future?

  • Not every use-case requires async, and users should be told that early on, with enough information to make the decision themselves!
  • Compiler errors and warnings should recognize specific common mistakes and recommend good general patterns for solutions.
  • Warnings and errors should refer users to more comprehensive documentation for in-depth explanations and best practices.
  • A shared terminology (AsyncRead) and standard locations for key primitives (sleep, spawn, Select) are needed to be able to provide truly helpful, actionable error messages.
  • Async Rust has some very particular problem patterns which are important to handle correctly. Misleading error messages like "add 'static to your &mut" or "add move" can really throw developers for a loop by sending them down the wrong rabbit hole.
  • Detecting known cases of blocking (even if imperfect) could help users significantly in avoiding foot-guns. Some cases are: using std::thread::sleep, loops without .await in them (or where all the .awaits are on poll_fn futures), calling methods that transitively call block_on.

What is the "most shiny" about this future?

The ability to detect at compile time issues that would otherwise be performance problems at runtime.

What are some of the potential pitfalls about this future?

Detecting blocking is tricky, and likely subject to both false-positives and false-negatives. Users hate false-positive warnings, so we'll have to be careful about when we give warnings based on what might happen at runtime.

Did anything surprise you when writing this story? Did the story go any place unexpected?

I wasn't expecting it to end up this long and detailed!

I also wasn't expecting to have to get into the fact that async fns capture their arguments, but got there very quickly by just walking through what I imagine Alan's thought process and development would be like.
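
The capture behavior mentioned here, and the fix the story's compiler suggests, can be sketched with today's standard library. Assumptions: `Client`, `request_for`, and `fetch_len` are hypothetical stand-ins, `std::thread::spawn` plays the role of the story's `spawn` because it also requires `'static`, `block_on` is a minimal executor written for this example, and a boxed future is returned instead of `impl Future` so that the `'static` bound is spelled out in the signature.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Waker that unparks the thread blocked in `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Hypothetical minimal executor: polls one future on the calling thread.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            Poll::Pending => thread::park(),
        }
    }
}

/// Hypothetical stand-in for an HTTP client.
struct Client {
    base: String,
}

impl Client {
    /// Builds an owned request description from borrowed inputs.
    fn request_for(&self, path: &str) -> String {
        format!("{}/{}", self.base, path)
    }
}

/// A future that borrows nothing from its creator.
type StaticFuture<T> = Pin<Box<dyn Future<Output = T> + Send + 'static>>;

/// Regular `fn`: the borrowed arguments are used up front, and only
/// owned data moves into the returned future.
fn fetch_len(client: &Client, path: &str) -> StaticFuture<usize> {
    let request = client.request_for(path); // synchronous part
    Box::pin(async move { request.len() }) // asynchronous part owns `request`
}

fn run_after_client_is_gone() -> usize {
    let client = Client {
        base: String::from("https://example.com"),
    };
    let fut = fetch_len(&client, "index.html");
    drop(client); // fine: `fut` holds no borrow of `client`
    thread::spawn(move || block_on(fut))
        .join()
        .expect("worker thread panicked")
}
```

Had `fetch_len` been written as an `async fn` taking `&Client`, the returned future would hold that reference until it completed, and neither the `drop` nor the `thread::spawn` above would be accepted.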

What are some variations of this story that you considered, or that you think might be fun to write? Have any variations of this story already been written?

  • How does Alan realize the difference between Select (really FuturesUnordered) and select! (where the branches are known statically)?
  • Another common pain-point is forgetting to pin futures when using constructs like select!. Can the compiler detect this and suggest std::task::pin! (and can we have that in std please)?
  • Tools that allow the user to introspect the program state at runtime and detect things like blocking that way are great, but don't help newcomers too much. They won't know about the tools, or what to look for.
  • How can we detect and warn about async code that transitively ends up calling block_on?
  • This story didn't get into taking a Mutex and holding it across an .await, and the associated problems. Nor how a user finds other, better design patterns to deal with that situation.
  • A story where Alan uses the docs to decide he shouldn't use async would be nice. Including if he then needs to use some library that is itself async -- how does he bridge that gap? And perhaps one where he then later changes his mind and has to move from sync to async.
  • Barbara plays with async could also use a similar-style "shiny future" story.

What are some of the things we'll have to figure out to realize this future? What projects besides Rust itself are involved, if any? (Optional)

  • Detecting the async "color" of functions to warn about crossing.
  • Detecting long-running code in runtimes.
  • Standardizing enough core terminology and mechanisms that the compiler can both detect specific problems and propose actionable solutions.

✨ Shiny future stories: Alan's trust in the compiler is rewarded

🚧 Warning: Draft status 🚧

This is a draft "shiny future" story submitted as part of the brainstorming period. It is derived from what actual Rust users wish async Rust should be, and is meant to deal with some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as people's needs and desires for async Rust may differ greatly, shiny future stories cannot be wrong. At worst they are only useful for a small set of people or their problems might be better solved with alternative solutions). Alternatively, you may wish to add your own shiny vision story!

The story

Trust the compiler

Alan has a lot of experience in C#, but in the meantime has created some successful projects in Rust. He has dealt with his fair share of race conditions/thread safety issues during runtime in C#, but is now starting to trust that if his Rust code compiles, he won't have those annoying runtime problems to deal with.

This allows him to try to squeeze his programs for as much performance as he wants, because the compiler will stop him when he tries things that could result in runtime problems. After seeing the performance and the lack of runtime problems, he starts to trust the compiler more and more with each project finished.

He knows what he can do with external libraries: he does not need to fear concurrency issues if a library cannot be used from multiple threads, because the compiler would tell him.

His trust in the compiler solidifies further the more he codes in Rust.

The first async project

Alan now starts with his first async project. He opens up the Rust book to the "Async I/O" chapter and it guides him to writing his first program. He starts by writing some synchronous code to write to the file system:

use std::fs::File;
use std::io::Write; // brings `write_all` into scope

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut file = File::create("a.txt")?;
    file.write_all(b"Hello, world!")?;
    Ok(())
}

Next, he adapts that to run in an async fashion. He starts by converting main into async fn main:

use std::fs::File;

async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut file = File::create("a.txt")?;
    file.write_all(b"Hello, world!")?;
    Ok(())
}

The code compiles, but he gets a warning:

warning: using a blocking API within an async function
 --> src/main.rs:4:20
1 | use std::fs::File;
  |     ------------- try changing to `std::async_fs::File`
  | ...
4 |     let mut file = File::create("a.txt")?;
  |                    ^^^^^^^^^^^^ blocking functions should not be used in async fn
help: try importing the async version of this type
 --> src/main.rs:1
1 | use std::async_fs::File;

"Oh, right," he says, "I am supposed to use the async variants of the APIs." He applies the suggested fix in his IDE, and now his code looks like:

use std::async_fs::File;

async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut file = File::create("a.txt")?;
    file.write_all(b"Hello, world!")?;
    Ok(())
}

His IDE recompiles instantaneously and he now sees two little squiggles, one under each ?. Clicking on the errors, he sees:

error: missing await
 --> src/main.rs:4:41
4 |     let mut file = File::create("a.txt")?;
  |                                         ^ returns a future, which requires an await
help: try adding an await
 --> src/main.rs:4
4 |     let mut file = File::create("a.txt").await?;

He again applies the suggested fix, and his code now shows:

use std::async_fs::File;

async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut file = File::create("a.txt").await?;
    file.write_all(b"Hello, world!").await?;
    Ok(())
}

Happily, it compiles, and when he runs it, everything works as expected. "Cool," he thinks, "this async stuff is pretty easy!"

Making some web requests

Next, Alan decides to experiment with some simple web requests. This isn't part of the standard library, but the fetch_rs package is listed in the Rust book. He runs cargo add fetch_rs to add it to his Cargo.toml and then writes:

use std::async_fs::File;
use fetch_rs;

async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut file = File::create("a.txt").await?;
    file.write_all(b"Hello, world!").await?;

    let body = fetch_rs::get("https://www.rust-lang.org")
        .await?
        .text()
        .await?;
    println!("{}", body);

    Ok(())
}

This feels pretty easy!

🤔 Frequently Asked Questions

What status quo story or stories are you retelling?

What are the key points you were trying to convey with this status quo story?

  • Getting started with async should be as automated as possible:
    • change main to an async fn;
    • use the APIs found in modules like std::async_foo, which should map as closely as possible to their non-async equivalents.
  • You should get some sort of default runtime that is decent
  • Lints should guide you in using async:
    • identifying blocking functions
    • identifying missing await
  • You should be able to grab libraries from the ecosystem and they should integrate with the default runtime without fuss

Is there a "one size fits all" runtime in this future?

This particular story doesn't talk about what happens when the default runtime isn't suitable. But you may want to read its sequel, "Alan Switches Runtimes".

What is Alan most excited about in this future? Is he disappointed by anything?

Alan is excited about how easy it is to get async programs up and running. He also finds that the performance is good, so he's happy.

What is Grace most excited about in this future? Is she disappointed by anything?

Grace is happy because she is getting strong safety guarantees and isn't getting surprising runtime panics when composing libraries. The question of whether she's able to use the tricks she knows and loves is a good one, though. The default scheduler may not optimize for maximum performance -- this is something to explore in future stories. The sequel, "Alan Switches Runtimes", for example, talks more about the ability to change runtimes.

What is Niklaus most excited about in this future? Is he disappointed by anything?

Niklaus is quite happy. Async Rust is fairly familiar and usable for him. Further, the standard library includes "just enough" infrastructure to enable a vibrant crates.io ecosystem without centralizing everything.

What is Barbara most excited about in this future? Is she disappointed by anything?

Barbara quite likes that the std APIs for sync and async fit together, and that there is a consistent naming scheme across them. She likes that there is a flourishing ecosystem of async crates that she can choose from.

What projects benefit the most from this future?

A number of projects benefit:

  • Projects like YouBuy are able to get up and going faster.
  • Libraries like SLOW become easier because they can target the std APIs and there is a defined plan for porting across runtimes.

Are there any projects that are hindered by this future?

It depends on the details of how we integrate other runtimes. If we wound up with a future where most libraries are "hard-coded" to a single default runtime, this could very well hinder any number of projects, but nobody wants that.

What are the incremental steps towards realizing this shiny future?

This question can't really be answered in isolation, because so much depends on the story for how we integrate with other runtimes. I don't think we can accept a future where there is literally a single runtime that everyone has to use, but I wanted to pull out the question of "non-default runtimes" (as well as more details about the default) to other stories.

Does realizing this future require cooperation between many projects?

Yes. For external libraries like fetch_rs to interoperate they will want to use the std APIs (and probably traits).

✨ Shiny future stories: Alan switches runtimes

🚧 Warning: Draft status 🚧

This is a draft "shiny future" story submitted as part of the brainstorming period. It is derived from what actual Rust users wish async Rust should be, and is meant to deal with some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as people's needs and desires for async Rust may differ greatly, shiny future stories cannot be wrong. At worst they are only useful for a small set of people or their problems might be better solved with alternative solutions). Alternatively, you may wish to add your own shiny vision story!

The story

Since his early adventures with Async I/O went so well, Alan has been looking for a way to learn more. He finds a job working in Rust. One of the projects he works on is DistriData. Looking at their code, he sees an annotation he has never seen before:

#[humboldt::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let result = std::async_thread::spawn(async move {
        do_something()
    });
    Ok(())
}

He asks Barbara, one of his coworkers, "What is this humboldt::main annotation? What's humboldt?" She answers by explaining to him that Rust's support for async I/O is actually based around an underlying runtime. "Rust gives you a pretty decent runtime by default," she says, "but it's not tuned for our workloads. We wrote our own runtime, which we call humboldt."

Alan asks, "What happens with the various std APIs? For example, I see we are calling std::async_thread::spawn -- when I used that before, it spawned tasks into the default runtime. What happens now?"

Barbara explains that the "async" APIs in std generally execute relative to the current runtime that is in use. "When you call std::async_thread::spawn, it will spawn a task onto the current runtime. It's the same with the routines in std::async_io and so forth. The humboldt::main annotation actually just creates a synchronous main function that initializes the humboldt runtime and launches the first future. When you just write an async fn main without any annotation, the compiler synthesizes the same main function with the default runtime."
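As a rough sketch of what Barbara is describing, here is how such a synthesized main might look using only today's standard library. Everything here is an assumption made for illustration: `block_on` is a tiny stand-in for a runtime's entry point, `async_main` is a made-up name for the user's function, and no humboldt initialization is modelled.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Wakes the thread that is parked inside `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Hypothetical stand-in for a runtime entry point: drives one future to completion.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            // Sleep until the waker unparks this thread, then poll again.
            Poll::Pending => thread::park(),
        }
    }
}

/// What the user wrote as `async fn main` (hypothetical name).
async fn async_main() -> Result<(), Box<dyn std::error::Error>> {
    Ok(())
}

/// What the annotation, or the compiler, might synthesize: a plain synchronous main.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    block_on(async_main())
}
```

A real runtime would also set up I/O and timer drivers before launching the first future; the sketch only shows the sync-to-async handoff.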

Learning more about Humboldt

Alan sees that some of the networking code that is being used in their application is creating network connections using humboldt APIs:


use humboldt::network;

He asks Barbara, "Why don't we use the std::async_io APIs for that?" She explains that Humboldt makes use of some custom kernel extensions that, naturally enough, aren't part of the std library. "TCP is for rubes," she says, "we are using TTCP -- Turbo TCP." Her mind wanders briefly to Turbo Pascal and she has a brief moment of yearning for the days when computers had a "Turbo" button that changed them from 8 MHz to 12 MHz. She snaps back into the present day. "Anyway, the std::async_io APIs just call into humboldt's APIs via various traits. But we can code directly against humboldt when we want to access the extra capabilities it offers. That does make it harder to change to another runtime later, though."

Integrating into other event loops

Later on, Alan is working on a visualizer front-end that integrates with DistriData to give more details about their workloads. To do it, he needs to integrate with Cocoa APIs and he wants to run certain tasks on Grand Central Dispatch. He approaches Barbara and asks, "If everything is running on humboldt, is there a way for me to run some things on another event loop? How does that work?"

Barbara explains, "That's easy. You just have to use the gcd wrapper crate -- you can find it on crates.io. It implements the runtime traits for gcd and it has a spawn method. Once you spawn your task onto gcd, everything you run within gcd will be running in that context."

Alan says, "And so, if I want to get things running on humboldt again, I spawn a task back on humboldt?"

"Exactly," says Barbara. "Humboldt has a global event loop, so you can do that by just doing humboldt::spawn. You can also just use the humboldt::io APIs directly. They will always use the Humboldt I/O threads, rather than using the current runtime."

Alan winds up with some code that looks like this:


async fn do_something_on_humboldt() {
    gcd::spawn(async move {
        let foo = do_something_on_gcd();

        let bar = humboldt::spawn(async move {
            do_a_little_bit_of_stuff_on_humboldt();
        });

        combine(foo.await, bar.await);
    });
}

🤔 Frequently Asked Questions

What status quo story or stories are you retelling?

Good question! I'm not entirely sure! I have to go looking and think about it. Maybe we'll have to write some more.

What are the key points you were trying to convey with this status quo story?

  • There is some way to seamlessly change to a different default runtime to use for async fn main.
  • There is no global runtime, just the current runtime.
  • When you are using this different runtime, you can write code that is hard-coded to it and which exposes additional capabilities.
  • You can integrate multiple runtimes relatively easily, and the std APIs work with whichever is the current runtime.

How do you imagine the std APIs and so forth know the current runtime?

I was imagining that we would add fields to the Context<'_> struct that is supplied to each async fn when it runs. Users don't have direct access to this struct, but the compiler does. If the std APIs return futures, they would gain access to it that way as well. If not, we'd have to create some other mechanism.
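To make that concrete, here is a small sketch of how the `Context` reaches code today. A hand-written leaf future sees it as a parameter of `poll`, while an `async fn` never names it. The `CountingWaker`, `YieldOnce`, and `run_counting` names are made up for this example, and the only "runtime handle" that `Context` carries at present is the waker; the extra fields imagined above do not exist.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that only counts wake-ups; it stands in for a runtime (hypothetical).
struct CountingWaker(AtomicUsize);

impl Wake for CountingWaker {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

/// A leaf future: it receives the `Context` directly in `poll`.
struct YieldOnce {
    yielded: bool,
}

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            // Today the waker is the only runtime-provided handle in the context.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// An `async fn` never names the `Context`; the compiler threads it through `.await`.
async fn uses_context_implicitly() -> u32 {
    YieldOnce { yielded: false }.await;
    7
}

/// Polls `fut` to completion and reports how many wake-ups the waker saw.
fn run_counting<F: Future>(fut: F) -> (F::Output, usize) {
    let counter = Arc::new(CountingWaker(AtomicUsize::new(0)));
    let waker = Waker::from(counter.clone());
    let mut cx = Context::from_waker(&waker);
    let mut fut = std::pin::pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return (out, counter.0.load(Ordering::SeqCst));
        }
    }
}

fn main() {
    let (value, wakes) = run_counting(uses_context_implicitly());
    println!("value = {value}, wake-ups observed = {wakes}");
}
```

The wake-up issued inside `YieldOnce::poll` is observed by whoever built the `Context`, which is the same path a "current runtime" field would have to travel.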

What happens for runtimes that don't support all the features that std supports?

That feels like a portability question. See the (yet to be written) sequel story, "Alan runs some things on WebAssembly". =)

What is Alan most excited about in this future? Is he disappointed by anything?

Alan is excited about how easy it is to get async programs up and running, and he finds that they perform pretty well once he does so, so he's happy.

What is Grace most excited about in this future? Is she disappointed by anything?

Grace is concerned with memory safety and being able to deploy the tricks she knows from other languages. Memory safety works fine here. In terms of tricks she knows and loves, she's happy that she can easily switch to another runtime. The default runtime is good and works well for most things, but for the DistriData project, they really need something tailored just for them. She is also happy she can use the extended APIs offered by humboldt.

What is Niklaus most excited about in this future? Is he disappointed by anything?

Niklaus finds async Rust quite accessible, for the same reasons cited in "Alan's Trust in the Rust Compiler is Rewarded".

What is Barbara most excited about in this future? Is she disappointed by anything?

Depending on the technical details, Barbara may be a bit disappointed by the details of how std interfaces with the runtimes, as that may introduce some amount of overhead. This may not matter in practice, but it could also lead to library authors avoiding the std APIs in favor of writing generics or other mechanisms that are "zero overhead".

What projects benefit the most from this future?

Projects like DistriData really benefit from being able to customize their runtime.

Are there any projects that are hindered by this future?

We have to pay careful attention to embedded projects like MonsterMesh. Some of the most obvious ways to implement this future would lean on dyn types and perhaps boxing, and that would rule out some embedded projects. Embedded runtimes like embassy are also the most different in their overall design and they would have the hardest time fitting into the std APIs (of course, many embedded projects are already no-std, but many of them make use of some subset of the std capabilities through the facade). In general, traits and generic functions in std could lead to larger code size, as well.

What are the incremental steps towards realizing this shiny future?

There are a few steps required to realize this future:

  • We have to determine the core mechanism that is used for std types to interface with the current scheduler.
    • Is it based on dynamic dispatch? Delayed linking? Some other tricks we have yet to invent?
    • Depending on the details, language changes may be required.
  • We have to hammer out the set of traits or other interfaces used to define the parts of a runtime (see below for some of the considerations).
    • We can start with easier cases and proceed to more difficult ones, however.
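To give a feel for the dynamic-dispatch option, here is a minimal sketch of a "current runtime" stored in a thread-local and reached through a trait object. It illustrates just one of the mechanisms listed above and is not a proposal: the `Runtime` trait, `enter`, `spawn`, and the `Queueing` toy runtime are all hypothetical names, and nothing like them exists in std.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;

/// A boxed, type-erased task.
type Task = Pin<Box<dyn Future<Output = ()>>>;

/// Hypothetical interface that a runtime would implement.
trait Runtime {
    fn name(&self) -> &'static str;
    fn spawn(&self, task: Task);
}

thread_local! {
    /// The runtime that std-like APIs dispatch to on this thread.
    static CURRENT: RefCell<Option<Rc<dyn Runtime>>> = RefCell::new(None);
}

/// Runs `f` with `rt` installed as the current runtime, then restores the previous one.
fn enter<R>(rt: Rc<dyn Runtime>, f: impl FnOnce() -> R) -> R {
    let prev = CURRENT.with(|c| c.borrow_mut().replace(rt));
    let out = f();
    CURRENT.with(|c| *c.borrow_mut() = prev);
    out
}

/// Stand-in for something like `std::async_thread::spawn`: it never names a runtime.
fn spawn(fut: impl Future<Output = ()> + 'static) -> Result<&'static str, &'static str> {
    let rt = CURRENT.with(|c| c.borrow().clone());
    match rt {
        Some(rt) => {
            rt.spawn(Box::pin(fut));
            Ok(rt.name())
        }
        None => Err("no current runtime"),
    }
}

/// A toy runtime that only queues tasks and never runs them.
struct Queueing {
    name: &'static str,
    queue: RefCell<Vec<Task>>,
}

impl Runtime for Queueing {
    fn name(&self) -> &'static str {
        self.name
    }

    fn spawn(&self, task: Task) {
        self.queue.borrow_mut().push(task);
    }
}

fn main() {
    let humboldt = Rc::new(Queueing { name: "humboldt", queue: RefCell::new(Vec::new()) });
    let used = enter(humboldt.clone(), || spawn(async {}));
    println!("{:?}, queued = {}", used, humboldt.queue.borrow().len());
}
```

Note that this shape relies on `dyn` and boxing for every spawned task, which is exactly the overhead that may be unacceptable for embedded projects.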

Does realizing this future require cooperation between many projects?

Yes. We will need to collaborate to define traits that std can use to interface with each runtime, and the runtimes will need to implement those traits. This is going to be non-trivial, because we want to preserve the ability for independent runtimes to experiment, while also preserving the ability to "mix and match" and re-use components. For example, it'd probably be useful to have a bunch of shared I/O infrastructure, or to have utility crates for locks, for running threadpools, and the like. On the other hand, tokio takes advantage of the fact that it owns the I/O types and the locks and the scheduler to do some nifty tricks and we would want to ensure that remains an option.

✨ Shiny future stories: Barbara appreciates great performance analysis tools

🚧 Warning: Draft status 🚧

This is a draft "shiny future" story submitted as part of the brainstorming period. It is derived from what actual Rust users wish async Rust should be, and is meant to deal with some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as people's needs and desires for async Rust may differ greatly, shiny future stories cannot be wrong. At worst they are only useful for a small set of people or their problems might be better solved with alternative solutions). Alternatively, you may wish to add your own shiny vision story!

The story

Barbara has built an initial system prototype in sync Rust. She notes that it's completely I/O bound, and benchmarking shows that most of her CPU consumption is thread switch overheads. She decides to rewrite it in async Rust, using an executor that she believes will fix her bottlenecks.

She sprinkles async/.await in all the right places, switches her sync dependencies to async libraries, and gets the code compiling. When she runs it, she discovers that the service no longer responds when she sends a request to the endpoint. Her logging shows her that the endpoint handler has been invoked, many tasks have been spawned, but that something isn't working as she expected.

Fortunately, there are great tracing tools available for async Rust. Barbara turns on tracing, and immediately gets interesting information in her trace viewer. She can see all the tasks she has spawned, the lines of code where a .await returns control to the executor, and delays between a Waker being invoked and the corresponding .await resuming execution.

With this information in hand, she finds a decompression path that is unexpectedly CPU-bound, because she can see a stack trace for the task that is running and blocking a woken up future from getting invoked again. The memory use of this future tells her that the compressed blobs are larger than she thought, but inspecting shows that this is reasonable. She thus puts the decompression onto its own blocking task, which doesn't fix things, but makes it clear that there is a deadlock passing data between two bounded channels; the trace shows the Waker for a rx.next().await being invoked, but the corresponding .await never runs. Looking into the code, she notes that the task is waiting on a tx.send().await call, and that the channel it is trying to send to is full. When Barbara reads this code, she identifies a classic AB-BA deadlock; the task that would consume items from the channel this task is waiting on is itself waiting on a transmit to the queue that this task will drain.
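The shape of that deadlock can be sketched with the standard library's blocking bounded channels. This is an illustration under assumptions, not Barbara's code: her channels are async, and the sketch uses `try_send` instead of a real send so that it reports the stuck state rather than hanging. The function name and the channel capacities are made up.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

/// Returns true when both sends would have to wait, so neither side can make progress.
fn both_sends_would_block() -> bool {
    // Two bounded channels of capacity 1: one from A to B and one from B to A.
    let (to_b, _b_inbox) = sync_channel::<u32>(1);
    let (to_a, _a_inbox) = sync_channel::<u32>(1);

    // Each channel already holds one item that its consumer has not read yet.
    to_b.send(1).unwrap();
    to_a.send(2).unwrap();

    // Task A insists on sending to B before draining its own inbox, and B does
    // the mirror image. With a blocking or awaiting send, both would wait forever.
    let a_stuck = matches!(to_b.try_send(3), Err(TrySendError::Full(_)));
    let b_stuck = matches!(to_a.try_send(4), Err(TrySendError::Full(_)));
    a_stuck && b_stuck
}

fn main() {
    println!("deadlock shape reproduced: {}", both_sends_would_block());
}
```

Breaking the cycle means letting at least one side drain its inbox while its send is pending, or sizing the channels so that one direction can never fill up.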

She refactors her code to resolve this issue, and then re-checks traces. This time, the endpoint behaves as expected, but she's not seeing the wall clock time she expects; the trace shows that she's waiting on a network call to another service (also written in async Rust), and it's taking about 10x longer to reply than she would expect. She looks into the tracing libraries, and finds two useful features:

  1. She can annotate code with extra information that appears on the traces.
  2. Every point in the code has access to a unique ID that can be passed to external services to let her correlate traces.

Barbara adds annotations that let her know how many bytes she's sending to the external service; it's not unreasonable, so she's still confused. A bit of work with the service owner, and she can now get traces from the external service that have IDs she sends with a request in them. The tooling combines traces nicely, so that she can now trace across the network into the external service, and she realises that it's going down a slow code path because she set the wrong request parameters.

With the extra insights from the external service's trace, she's able to fix up her code to run perfectly, and she gets the desired wins from async Rust. Plus, she's got a good arsenal of tooling to use when next she sees an unidentified problem.

🤔 Frequently Asked Questions

What status quo story or stories are you retelling?

What is Alan most excited about in this future? Is he disappointed by anything?

Alan is excited about how easy it is to find out when his projects don't work as expected. He's happy.

What is Grace most excited about in this future? Is she disappointed by anything?

Grace is happy because the performance tools give her all the low level insights she wants into her code, and show her what's going on "behind the scenes" in the executor. As a C++ developer, she is also excited when she sees that Rust developers who see an issue with her services can give her useful information about exactly what they see her C++ doing - which she can correlate with her existing C++ performance tools via the unique ID.

What is Niklaus most excited about in this future? Is he disappointed by anything?

Niklaus is content. The tooling tells him what he needs to know, and allows him to add interesting information to places where he'd otherwise be stuck trying to extract it via println!(). He's not entirely sure how to use some of the detailed information, but he can ignore it easily because the tools let him filter down to just the information he added to the traces - getting timestamps and task identifiers "for free" is just gravy to Niklaus.

What is Barbara most excited about in this future? Is she disappointed by anything?

Barbara is impressed at how easy it is to spot problems and handle them; she is especially impressed when the tooling is able to combine traces from two services and show her their interactions in a useful fashion, as if they were one process. She kinda wishes that the compiler would spot more of the mistakes she made - the decompression path should be something the compiler should get right for her - but at least the tooling made the problems easy to find.

What projects benefit the most from this future?

All the projects benefit; there's a useful amount of tracing "for free", and places where you can add your own data as needed.

Are there any projects that are hindered by this future?

MonsterMesh needs to be able to remove a lot of the tracing because the CPU and memory overhead is too high in release builds.

What are the incremental steps towards realizing this shiny future?

The tracing crate has a starting point for a useful API; combined with tracing-futures, we have a prototype.

Next steps are to make integrating that with executors trivial (minimal code change), and to add in extra information to tracing-futures so that we can output the best possible traces. In parallel to that, we'll want to work on tooling to display, combine, and filter traces so that we can always extract just what we need from any given trace.

Does realizing this future require cooperation between many projects?

Yes. We need an agreed API for tracing that all async projects use - both to add tracing information, and to consume it in a useful form.

✨ Shiny future stories: Barbara enjoys her async-sync-async sandwich 🥪

Alternative titles:

  • Barbara enjoys her async-sync-async sandwich 🥪
  • Barbara recursively blocks
  • Barbara blocks and blocks and blocks

🚧 Warning: Draft status 🚧

This is a draft "shiny future" story submitted as part of the brainstorming period. It is derived from what actual Rust users wish async Rust should be, and is meant to deal with some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as people's needs and desires for async Rust may differ greatly, shiny future stories cannot be wrong. At worst they are only useful for a small set of people or their problems might be better solved with alternative solutions). Alternatively, you may wish to add your own shiny vision story!

The story

Barbara wants to customize a permissions lookup when accepting requests. The library defines a trait PermitRequest, to allow the user to define their own rules. Nice!


trait PermitRequest {
    fn permit(&self, req: &Request) -> bool;
}

She starts small, to get her feet wet.


struct Always;

impl PermitRequest for Always {
    fn permit(&self, _: &Request) -> bool {
        true
    }
}

All requests are permitted! Simple, but now to actually implement the permissions logic.

One of the basic rules Barbara has is to check the request for the existence of a header, but the function is written as async, since Barbara figured it might need to be eventually.


async fn req_has_header(req: &Request) -> bool {
    req.headers().contains_key("open-sesame")
}

When Barbara goes to implement the PermitRequest trait, she realizes a problem: the trait did not think permissions would require an async lookup, so its method is not async. Barbara tries the easiest thing first, hoping that she can just block on the future.


struct HasHeader;

impl PermitRequest for HasHeader {
    fn permit(&self, req: &Request) -> bool {
        task::block_on(req_has_header(req))
    }
}

When Barbara goes to run the code, it works! Even though she was already running an async runtime at the top level, trying to block on this task didn't panic or deadlock. This is because the runtime optimistically hoped the future would be available without needing to go to sleep, and so when it found the currently running runtime, it re-used it to run the future.

The compiler does emit a warning, thanks to a blocking lint (link to shiny future when written). It let Barbara know this could have performance problems, but she accepts the trade offs and just slaps a #[allow(async_blocking)] attribute in there.

Barbara, now energized that things are looking good, writes up the other permission strategy for her application. It needs to fetch some configuration from another server based on a request header, and to keep it snappy, she limits it with a timeout.


struct FetchConfig;

impl PermitRequest for FetchConfig {
    fn permit(&self, req: &Request) -> bool {
        let token = req.headers().get("authorization");

        #[allow(async_blocking)]
        task::block_on(async {
            select! {
                resp = fetch::get(CONFIG_SERVER).param("token", token) => {
                    resp.status() == 200
                },
                _ = time::sleep(2.seconds()) => {
                    false
                }
            }
        })
    }
}

This time, there's no compiler warning, since Barbara was ready for that. And running the code, it works as expected. The runtime was able to reuse the IO and timer drivers, and not need to disrupt other tasks.

However, the runtime chose to emit a runtime log at the warning level, informing her that while it was able to make the code work, it could have degraded behavior if the same parent async code were waiting on this and another async block, such as via join!. In the first case, since the async code was ready immediately, no actual harm could have happened. But this time, since it had to block the task waiting on a timer and IO, the log was emitted.

Thanks to the runtime warning, Barbara does some checking that the surrounding code won't be affected, and once sure, is satisfied that it was easier than she thought to make an async-sync-async sandwich.

🤔 Frequently Asked Questions

What status quo stories are you retelling?

While this story isn't an exact re-telling of an existing status quo, it covers the morals of a couple:

What are the key attributes of this shiny future?

  • block_on tries to be forgiving and optimistic of nested usage.
    • It does a best effort to "just work".
  • But at the same time, it provides information to the user that it might not always work out.
      • A compile-time lint warns about the problem in general.
      • This prods a user to try to use .await instead of block_on if they can.
    • A runtime log warns when the usage could have reacted badly with other code.
      • This gives the user some more information if a specific combination degrades their application.

What is the "most shiny" about this future?

It significantly increases the areas where block_on "just works", which should improve productivity.

What are some of the potential pitfalls about this future?

  • While this shiny future tries to be more forgiving when nesting block_on, the author couldn't think of a way to completely remove the potential dangers therein.
  • By making it easier to nest block_on, it might increase the times a user writes code that degrades in performance.
    • Some runtimes would purposefully panic early to try to encourage users to pick a different design that wouldn't degrade.
    • However, by keeping the warnings, hopefully users can evaluate the risks themselves.

Think about Rust's core "value propositions": performance, safety and correctness, productivity. Are any of them negatively impacted? Are there specific application areas that are impacted negatively? You might find the sample projects helpful in this regard, or perhaps looking at the goals of each character.

Did anything surprise you when writing this story? Did the story go any place unexpected?

No.

What are some variations of this story that you considered, or that you think might be fun to write? Have any variations of this story already been written?

A variation would be an even more optimistic future, where we are able to come up with a technique to completely remove all possible bad behaviors with nested block_on. The author wasn't able to think of how, and it seems like the result would be similar to just being able to .await in every context, possibly implicitly.

What are some of the things we'll have to figure out to realize this future? What projects besides Rust itself are involved, if any? (Optional)

  • A runtime would need to be modified to be able to look up, through a thread-local or similar, whether a runtime instance is already running.
  • A runtime would need some sort of block_in_place mechanism.
  • We could make a heuristic to guess when block_in_place would be dangerous.
    • If the runtime knows the task's waker has been cloned since the last time it was woken, then probably the task is doing something like join! or select!.
    • Then we could emit a warning like "nested block_on may cause problems when used in combination with join! or select!"
    • The heuristic wouldn't work if the nested block_on were part of the first call of a join!/select!.
    • Maybe a warning regardless is a good idea.
    • Or a lint, that a user can #[allow(nested_block_on)], at their own peril.
  • This story uses a generic task::block_on, to not name any specific runtime. It doesn't specifically assume that this could work cross-runtimes, but maybe a shinier future would assume it could?
  • This story refers to a lint in a proposed different shiny future, which is not yet written.
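The thread-local lookup from the first bullet can be sketched as follows. This is only an assumption-laden illustration: `block_on`, `DEPTH`, and `sync_permit` are made-up names, the nested call simply runs the future in place on the same thread rather than reusing any I/O or timer drivers, and the warning goes to stderr where a real runtime would presumably use its logging facility.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

thread_local! {
    /// How many `block_on` calls are currently active on this thread.
    static DEPTH: Cell<usize> = Cell::new(0);
}

/// Wakes the thread that is parked inside `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Hypothetical nesting-tolerant `block_on`: returns the output and the depth it ran at.
fn block_on<F: Future>(fut: F) -> (F::Output, usize) {
    let depth = DEPTH.with(|d| {
        d.set(d.get() + 1);
        d.get()
    });
    if depth > 1 {
        // A runtime is already running on this thread: warn instead of panicking.
        eprintln!("warning: nested block_on (depth {depth}) may degrade other tasks");
    }
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    let out = loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => break out,
            Poll::Pending => thread::park(),
        }
    };
    DEPTH.with(|d| d.set(d.get() - 1));
    (out, depth)
}

/// The synchronous layer in the middle of the sandwich, like `PermitRequest::permit`.
fn sync_permit() -> usize {
    block_on(async {}).1
}

fn main() {
    let (inner_depth, outer_depth) = block_on(async { sync_permit() });
    println!("outer depth = {outer_depth}, inner depth = {inner_depth}");
}
```

The waker-cloning heuristic from the later bullets would need cooperation from the runtime's task type and is not modelled here.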

✨ Shiny future stories: Barbara makes a wish

🚧 Warning: Draft status 🚧

This is a draft "shiny future" story submitted as part of the brainstorming period. It is derived from what actual Rust users wish async Rust should be, and is meant to deal with some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as people's needs and desires for async Rust may differ greatly, shiny future stories cannot be wrong. At worst they are only useful for a small set of people or their problems might be better solved with alternative solutions). Alternatively, you may wish to add your own shiny vision story!

The story

Barbara has an initial prototype of a new service she wrote in sync Rust. Since the service is extremely I/O bound, and her benchmarks have led her to believe that performance is being left on the table, she decides to port it to async Rust.

She does this by sprinkling async/.await everywhere, picking an executor, and moving dependencies from sync to async.

Once she has the program compiling, she thinks "oh that was easy". She runs it for the first time and, to her surprise, finds that nothing happens when she hits an endpoint.

Barbara, always prepared, has already added logging to her service and she checks the logs. As she expected, she sees here that the endpoint handler has been invoked but then... nothing. Barbara exclaims, "Oh no! This was not what I was expecting, but let's dig deeper."

She checks the code and sees that the endpoint spawns several tasks, but unfortunately those tasks don't have much logging in them.

Barbara now remembers hearing something about a wish4-async-insight crate, which has gotten some buzz on her Rust-related social media channels. She decides to give that a shot.

She adds the crate as a dependency to her Cargo.toml, renaming it to just insight to make it easier to reference in her code, and then initializes it in her main async function.


async fn accept_loop(addr: impl ToSocketAddrs) -> Result<()> {
    insight::init(); // new code
    ...
}

Barbara rebuilds and runs her program again. She doesn't see anything different in the terminal output for the program itself though, and the behavior is the same as before: hitting an endpoint, nothing happens. She double-checks the readme for the wish4-async-insight crate, and realizes that she needs to connect other programs to her service to observe the insights being gathered. Barbara decides that she wants to customize the port that insight is listening on before she starts her experiments with those programs.


async fn accept_loop(addr: impl ToSocketAddrs) -> Result<()> {
    insight::init(listen_port => 8080); // new code, leveraging keyword arguments feature added in 2024
    ...
}

While her code rebuilds, Barbara investigates what programs she might use to connect to the insight crate.

One such program, consolation, can run in the terminal. Barbara is currently just deploying her service locally on her development box, so she opts to try that out and see what it tells her.

% rustup install wish4-consolation
...
% consolation --port 8080

This brings up a terminal window that looks similar to the Unix top program, except that instead of a list of OS processes, this offers a list of tasks, with each task having a type, ID, and status history (i.e. percentage of time spent in running, ready to poll, or blocked). Barbara skims the output in the list, and sees that one task is listed as currently blocked.

Barbara taps the arrow-keys and sees that this causes a cursor to highlight different tasks in the list. She highlights the blocked task and hits the Enter key. This causes the terminal to switch to a Task view, describing more details about that task and its status.

The Task view here says that the task is blocked, references a file and line number, and also includes the line from the source code, which says chan.send(value).await. The blocked task also lists the resources that the task is waiting on: prototype_channel, and next to that there is text on a dark red background: "waiting on channel capacity." Again, Barbara taps the arrow-keys and sees that she can select the line for the resource.

Barbara notices that this whole time, at the bottom of the terminal, there was a line that says "For help, hit ? key"; she taps question mark. This brings up a help message in a scrollable subwindow explaining the task view in general, as well as a link to online documentation. The help message notes that the user can follow the chain: one can go from the blocked task to the resource it's waiting on, and from that resource to a list of tasks responsible for freeing up the resource.

Barbara hits the Escape key to close the help window. The highlight is still on the line that says "prototype_channel: waiting on channel capacity"; Barbara hits Enter, and this brings up a list with just one task on it: The channel reader task. Barbara realizes what this is saying: The channel resource is blocking the sender because it is full, and the only way that can be resolved is if the channel reader manages to receive some inputs from the channel.

Barbara opens the help window again, and brings up the link to the online documentation. There, she sees discussion of resource starvation and the specific case of a bounded channel being filled up before its receiver makes progress. The main responses outlined there are 1. decrease the send rate, 2. increase the receive rate, or 3. increase the channel's internal capacity, noting the extreme approach of changing to an unbounded channel (with the caveat that this risks resource exhaustion).
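The situation the documentation describes can be reproduced in a few lines. The sketch below uses the standard library's synchronous bounded channel as a stand-in for whichever async channel Barbara's service uses (an assumption; the story does not name one), and the helper name `sends_accepted_before_full` is made up for illustration. It counts how many sends a bounded channel accepts when nothing is receiving.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

/// Tries to send `attempts` values into a bounded channel of size `capacity`
/// while nothing receives, and returns how many sends were accepted before
/// the channel reported that it was full.
fn sends_accepted_before_full(capacity: usize, attempts: u32) -> u32 {
    // `_rx` keeps the channel connected, but it is never read from.
    let (tx, _rx) = sync_channel::<u32>(capacity);
    let mut accepted = 0;
    for value in 0..attempts {
        match tx.try_send(value) {
            Ok(()) => accepted += 1,
            // A blocking (or awaited) `send` would stall at this point.
            Err(TrySendError::Full(_)) => break,
            Err(TrySendError::Disconnected(_)) => break,
        }
    }
    accepted
}
```

With a capacity of 2 and five attempted sends, only two are accepted; raising the capacity to 8 lets all five through, which corresponds to the third response listed above. It only postpones the stall if the receiver never catches up.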

Barbara skims the task view for the channel reader, since she wants to determine why it is not making progress. However, she is eager to see if her service as a whole is workable apart from this issue, so she also adopts the quick fix of swapping in an unbounded channel. Barbara is betting that if this works, she can use the data from wish4-async-insight about the channel sizes to put a bounded channel with an appropriate size in later.

Barbara happily moves along to some initial performance analysis of her "working" code, eager to see what other things wish4-async-insight will reveal during her explorations.

Alternate History

The original status quo story just said that Barbara's problem was resolved (sort of) by switching to an unbounded channel. I, much like Barbara, could not tell why this resolved her problem. In particular, I could not tell whether there was an outright deadlock due to a cycle in the task-resource dependency chain, or whether there was something more subtle happening. In the story above, I assumed it was the second case: something subtle.

Here's an important alternate history though, for the first case of a cycle. It's ... the same story, right up to when Barbara first runs consolation:

% rustup install wish4-consolation
...
% consolation --port 8080

This brings up a terminal window that looks similar to the Unix top program, except that instead of a list of OS processes, this offers a list of tasks, and shows their status (i.e. running, ready to poll, or blocked), as well as some metrics about how long the tasks spend in each state.

At the top of the screen, Barbara sees a highlighted warning: "deadlock cycle was detected. hit P for more info."

Barbara types capital P. The terminal switches to "problem view," which shows

  • The task types, ID, and attributes for each type.
  • The resources being awaited on
  • The location / backtrace of the await.
  • A link to a documentation page expanding on the issue.

The screen also says "hit D to generate a graphviz .dot file to disk describing the cycle."

Barbara hits D and stares at the resulting graph, which shows a single circle (labelled "task"), and an arc to a box (labelled "prototype_channel"), and an arc from that box back to the circle. The arc from the circle to the box is labelled "send: waiting on channel capacity", and the arc from the box to the circle is labelled "sole receiver (mpsc channel)".

graph TD
  task -- send: waiting on channel capacity --> prototype_channel
  prototype_channel -- "sole receiver (mpsc channel)" --> task
  task((task))
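One way a tool like consolation could arrive at such a warning is to search the task/resource "waits on" graph for a cycle. The following is only a sketch of that idea, not how any existing tool works: the `WaitGraph` type, `graph_from_edges`, `find_cycle`, and `to_dot` are hypothetical names, and the search is a plain depth-first traversal.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Directed "waits on" edges. A task points at the resource it is blocked on;
/// a resource points at the tasks that are able to free it.
type WaitGraph = BTreeMap<&'static str, Vec<&'static str>>;

fn graph_from_edges(edges: &[(&'static str, &'static str)]) -> WaitGraph {
    let mut graph = WaitGraph::new();
    for &(from, to) in edges {
        graph.entry(from).or_default().push(to);
    }
    graph
}

/// Returns the nodes of one cycle reachable from `start`, if there is one.
fn find_cycle(graph: &WaitGraph, start: &'static str) -> Option<Vec<&'static str>> {
    let mut path = Vec::new();
    let mut finished = BTreeSet::new();
    visit(graph, start, &mut path, &mut finished)
}

fn visit(
    graph: &WaitGraph,
    node: &'static str,
    path: &mut Vec<&'static str>,
    finished: &mut BTreeSet<&'static str>,
) -> Option<Vec<&'static str>> {
    // If `node` is already on the current path, the tail of the path is a cycle.
    if let Some(pos) = path.iter().position(|n| *n == node) {
        return Some(path[pos..].to_vec());
    }
    // Nodes that were fully explored earlier cannot lead to a new cycle.
    if finished.contains(node) {
        return None;
    }
    path.push(node);
    if let Some(targets) = graph.get(node) {
        for &next in targets {
            if let Some(cycle) = visit(graph, next, path, finished) {
                return Some(cycle);
            }
        }
    }
    path.pop();
    finished.insert(node);
    None
}

/// Renders the graph in Graphviz DOT syntax.
fn to_dot(graph: &WaitGraph) -> String {
    let mut out = String::from("digraph waits {\n");
    for (from, targets) in graph {
        for to in targets {
            out.push_str(&format!("  \"{from}\" -> \"{to}\";\n"));
        }
    }
    out.push_str("}\n");
    out
}
```

Feeding it the two edges from Barbara's graph (`task` to `prototype_channel` and back) yields the cycle `["task", "prototype_channel"]`, while a graph in which the channel is drained by a separate reader task yields `None`.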

Barbara suddenly realizes her mistake: She had constructed a single task that was sometimes enqueuing work (by sending messages on the channel), and sometimes dequeuing work, but she had not put any controls into place to ensure that the dequeuing (via recv) would get prioritized as the channel filled up.

Barbara reflects on the matter: she knows that she could swap in an unbounded channel to resolve this, but she thinks that she would be better off thinking a bit more about her system design, to see if she can figure out a way to supply back-pressure so that the send rate will go down as the channel fills up.
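A minimal sketch of the kind of control Barbara has in mind, under the assumption that a single worker both enqueues and dequeues: when the bounded channel reports that it is full, the worker receives one job before retrying the send, rather than waiting on a send that only it could unblock. The standard library's synchronous channel stands in for an async one here, and `run_single_worker` is a hypothetical name.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

/// Runs one worker that both enqueues and dequeues `total` jobs through a
/// bounded channel. When the channel is full, the worker dequeues one job
/// before retrying the send. Returns the jobs in the order they were processed.
fn run_single_worker(capacity: usize, total: u32) -> Vec<u32> {
    // A zero-capacity (rendezvous) channel could never accept a send here.
    let (tx, rx) = sync_channel::<u32>(capacity.max(1));
    let mut processed = Vec::new();
    for job in 0..total {
        let mut pending = job;
        loop {
            match tx.try_send(pending) {
                Ok(()) => break,
                Err(TrySendError::Full(returned)) => {
                    // Channel is full: prioritize dequeuing to free a slot.
                    if let Ok(done) = rx.try_recv() {
                        processed.push(done);
                    }
                    pending = returned;
                }
                Err(TrySendError::Disconnected(_)) => return processed,
            }
        }
    }
    // Drain whatever is still queued once every job has been enqueued.
    while let Ok(done) = rx.try_recv() {
        processed.push(done);
    }
    processed
}
```

With a capacity of 2 and five jobs, every job is still processed in order. A version that used a blocking `send` in the same position would hang on the third job, which is the cycle shown in the graph.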

🤔 Frequently Asked Questions

What status quo story or stories are you retelling?

Barbara wants Async Insights

What is Alan most excited about in this future? Is he disappointed by anything?

Alan is happy to see a tool that gives one a view into the internals of the async executor.

Alan is not so thrilled about using the consolation terminal interface; but luckily there are other options, namely IDE/editor plugins as well as a web-browser based client, that offer even richer functionality, such as renderings of the task/resource dependency graph.

What is Grace most excited about in this future? Is she disappointed by anything?

Grace is happy to see a tool, but wonders whether it could have been integrated into gdb.

Grace is not so thrilled to learn that this tool is not going to try to provide specific insight into performance issues that arise solely from computational overheads in her own code. (The readme for wish4-async-insight says on this matter "for that, use perf," which Grace finds unsatisfying.)

What is Niklaus most excited about in this future? Is he disappointed by anything?

Niklaus is happy to learn that wish4-async-insight is supported by both async-std and tokio, since he relies on friends in both communities to help him learn more about Async Rust.

Niklaus is happy about the tool's core presentation oriented around abstractions he understands (tasks and resources). Niklaus is also happy about the integrated help.

However, Niklaus is a little nervous about some of the details in the output that he doesn't understand.

What is Barbara most excited about in this future? Is she disappointed by anything?

Barbara is thrilled with how this tool has given her insight into the innards of the async executor she is using.

She is disappointed to learn that not every async executor supports the wish4-async-insight crate. The crate works by monitoring state changes within the executor, instrumented via the tracing crate. Not every async-executor is instrumented in a fashion compatible with wish4-async-insight.

What projects benefit the most from this future?

Any async codebase that can hook into the wish4-async-insight crate and supply its data via a network port during development would benefit from this. So, I suspect any codebase that uses a sufficiently popular (i.e. appropriately instrumented) async executor will benefit.

The main exception I can imagine right now is MonsterMesh: its resource constraints and #![no_std] environment are almost certainly incompatible with the needs of the wish4-async-insight crate.

Are there any projects that are hindered by this future?

The only "hindrance" is that there is an expectation that the async-executor be instrumented appropriately to feed its data to the wish4-async-insight crate once it is initialized.

What are the incremental steps towards realizing this shiny future? (Optional)

  • Get tracing crate to 1.0 so that async executors can rely on it.

  • Prototype an insight console atop a concrete async executor (e.g. tokio)

  • Develop a shared protocol atop tracing that compatible async executors will use to provide the insightful data.
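As a rough illustration of what such a shared protocol might carry, the sketch below models an executor reporting task state changes and a console turning them into the "status history" percentages from the story. Everything in it is an assumption made for illustration: the `StateChange` record, the `TaskState` enum, and `status_history` are hypothetical and are not part of tracing or of any executor. Events are assumed to arrive in time order.

```rust
use std::time::Duration;

/// The three states shown in the story's task list.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum TaskState {
    Running = 0,
    Ready = 1,
    Blocked = 2,
}

/// One state-change event, as a hypothetical executor might report it.
#[derive(Clone, Copy, Debug)]
struct StateChange {
    task_id: u64,
    at: Duration, // time since the executor started
    new_state: TaskState,
}

fn change(task_id: u64, at_ms: u64, new_state: TaskState) -> StateChange {
    StateChange { task_id, at: Duration::from_millis(at_ms), new_state }
}

/// Whole-number percentage of time one task spent as
/// [Running, Ready, Blocked] between its first event and `now`.
fn status_history(events: &[StateChange], task_id: u64, now: Duration) -> Option<[u8; 3]> {
    let mut totals = [Duration::ZERO; 3];
    let mut first_seen: Option<Duration> = None;
    let mut current: Option<(Duration, TaskState)> = None;
    for ev in events.iter().filter(|ev| ev.task_id == task_id) {
        // Close the interval spent in the previous state.
        if let Some((since, state)) = current {
            totals[state as usize] += ev.at.saturating_sub(since);
        }
        if first_seen.is_none() {
            first_seen = Some(ev.at);
        }
        current = Some((ev.at, ev.new_state));
    }
    // The task is still in its last reported state at `now`.
    let (since, state) = current?;
    totals[state as usize] += now.saturating_sub(since);
    let span = now.saturating_sub(first_seen?).as_micros();
    if span == 0 {
        return None;
    }
    Some(totals.map(|t| (t.as_micros() * 100 / span) as u8))
}
```

For a task that becomes ready at 0 ms, starts running at 10 ms, and blocks at 30 ms, querying at 100 ms gives 20% running, 10% ready, and 70% blocked.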

Does realizing this future require cooperation between many projects? (Optional)

Yes. Yes it does.

At the very least, as mentioned among the "incremental steps", we will need a common protocol that the async executors use to communicate their internal state.

✨ Barbara Wants Async Read Write

🚧 Warning: Draft status 🚧

This is a draft "shiny future" story submitted as part of the brainstorming period. It is derived from what actual Rust users wish async Rust should be, and is meant to deal with some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as people's needs and desires for async Rust may differ greatly, shiny future stories cannot be wrong. At worst they are only useful for a small set of people or their problems might be better solved with alternative solutions). Alternatively, you may wish to add your own shiny vision story!

The story

Character: Barbara.

Barbara is the creator of a sans-io library for Rust. She designed her library to integrate with async, and her goal was to make it runtime agnostic so that it could be used as broadly as possible. Unfortunately, when she first wrote the library, async did not have a standard abstraction for buffered IO, so her first implementation did not use buffered IO. When she tried to update her library to use buffered IO to improve performance, she was confronted with the problem that each runtime had its own implementation and abstractions. The result was several unavoidable compromises on her runtime-agnostic design goals. She was able to achieve her performance improvements, but only with runtime-specific implementations, leaving her with a larger, more complex code base.

But today is a fantastic day for Barbara. The Rust async team has recently released the latest version of async and part of that release was a standard Buffered Async Read/Write abstraction. Since then, several runtimes have been updated to implement the new abstraction and Barbara refactored the buffered IO module to use this new abstraction and she deprecated the runtime specific solutions. Today is the day that Barbara gets to release her new version of sans-io which takes full advantage of the buffered Async Read/Write abstractions now defined in async. The result is a library that maintains the same performance gains that it had with the runtime specific modules while greatly reducing the amount of code.

🤔 Frequently Asked Questions

NB: These are generic FAQs. Feel free to customize them to your story or to add more.

What status quo stories are you retelling?

Link to status quo stories if they exist. If not, that's ok, we'll help find them.

What are the key attributes of this shiny future?

  • Just like AsyncRead/AsyncWrite, there are no standard traits for buffered I/O.

    • This is made worse by the fact that there aren’t even ecosystem traits for buffered writes.
  • There are no standard (or even present in futures-io) concrete types for async buffered I/O.

    • Each major runtime has its own async BufReader and BufWriter types.
  • All the issues with creating runtime-agnostic libraries are very present here. (TODO: link with runtime agnostic lib story)
  • std::io doesn’t have a BufWrite trait for sync I/O.

    • It’s less of an issue than in async Rust because of the existence of the standardized std::io::BufWriter.
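To make the missing abstraction more concrete, here is a sketch of what a runtime-neutral buffered read trait could look like, and how library code could be written against it without naming a runtime. The trait `BufferedAsyncRead` is hypothetical; its two methods are modeled loosely on `AsyncBufRead` from futures-io, and nothing here is an actual or proposed std API. The in-memory implementation for `&[u8]` and the helper `read_line_in_memory` exist only so the sketch can be exercised without an executor.

```rust
use std::io;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical runtime-neutral buffered read trait (not a real std API).
trait BufferedAsyncRead {
    /// Returns the currently buffered bytes, filling the buffer if needed.
    fn poll_fill_buf(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<io::Result<&[u8]>>;
    /// Marks `amt` bytes of the buffer as consumed.
    fn consume(self: Pin<&mut Self>, amt: usize);
}

/// An in-memory source: always ready, so it needs no runtime at all.
impl BufferedAsyncRead for &[u8] {
    fn poll_fill_buf(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<io::Result<&[u8]>> {
        let this = self.get_mut();
        Poll::Ready(Ok(*this))
    }

    fn consume(self: Pin<&mut Self>, amt: usize) {
        let this = self.get_mut();
        let rest: &[u8] = *this;
        *this = &rest[amt..];
    }
}

/// Library code written only against the trait: appends bytes to `line`
/// up to and including the next newline (or until end of input).
fn poll_read_line<R: BufferedAsyncRead + Unpin>(
    reader: &mut R,
    cx: &mut Context<'_>,
    line: &mut Vec<u8>,
) -> Poll<io::Result<usize>> {
    loop {
        let (used, done) = match Pin::new(&mut *reader).poll_fill_buf(cx) {
            Poll::Pending => return Poll::Pending,
            Poll::Ready(Err(e)) => return Poll::Ready(Err(e)),
            Poll::Ready(Ok(buf)) => match buf.iter().position(|b| *b == b'\n') {
                Some(i) => {
                    line.extend_from_slice(&buf[..=i]);
                    (i + 1, true)
                }
                None => {
                    line.extend_from_slice(buf);
                    // An empty buffer means end of input.
                    (buf.len(), buf.is_empty())
                }
            },
        };
        Pin::new(&mut *reader).consume(used);
        if done {
            return Poll::Ready(Ok(line.len()));
        }
    }
}

/// A waker that does nothing, so the sketch can poll without an executor.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn read_line_in_memory(mut input: &[u8]) -> Vec<u8> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut line = Vec::new();
    match poll_read_line(&mut input, &mut cx, &mut line) {
        Poll::Ready(Ok(_)) => line,
        _ => Vec::new(),
    }
}
```

The point of the sketch is the shape of `poll_read_line`: it depends only on the trait, so any runtime whose buffered reader implemented a shared trait like this could be plugged in without runtime-specific modules.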

What is the "most shiny" about this future?

Think about Rust's core "value propositions": performance, safety and correctness, productivity. Which benefit the most relative to today?

This benefits productivity and correctness the most. The problem is not performance in particular, as each runtime provides buffered IO solutions. The problem is that they are inconsistent and not compatible. This means that writing code that is compatible with any async runtime becomes both much more difficult and much more likely to be wrong when runtimes change.

What are some of the potential pitfalls about this future?

Think about Rust's core "value propositions": performance, safety and correctness, productivity. Are any of them negatively impacted? Are there specific application areas that are impacted negatively? You might find the sample projects helpful in this regard, or perhaps looking at the goals of each character.

  • Having a design which makes it difficult for existing runtimes to make their buffered IO types compatible or to migrate their runtimes over to the new designs.

Did anything surprise you when writing this story? Did the story go any place unexpected?

The act of writing shiny future stories can uncover things we didn't expect to find. Did you have any new and exciting ideas as you were writing? Realize some complications that you didn't foresee?

The most surprising thing is that there is a buffered read trait in futures but no buffered write trait. I would expect both or neither.

What are some variations of this story that you considered, or that you think might be fun to write? Have any variations of this story already been written?

Often when writing stories, we think about various possibilities. Sketch out some of the turning points here -- maybe someone will want to turn them into a full story! Alternatively, if this is a variation on an existing story, link back to it here.

No variations.

What are some of the things we'll have to figure out to realize this future? What projects besides Rust itself are involved, if any? (Optional)

Often the 'shiny future' stories involve technical problems that we don't really know how to solve yet. If you see such problems, list them here!

✨ Barbara wants async tracing

🚧 Warning: Draft status 🚧

This is a draft "shiny future" story submitted as part of the brainstorming period. It is derived from what actual Rust users wish async Rust should be, and is meant to deal with some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as people's needs and desires for async Rust may differ greatly, shiny future stories cannot be wrong. At worst they are only useful for a small set of people or their problems might be better solved with alternative solutions). Alternatively, you may wish to add your own shiny vision story!

The story

The problem: When you have a complex network of async tasks it can be difficult to debug issues or investigate behavior because it’s hard to reason through the path of execution just by reading the code. Adding async tracing helps solve this by letting you trace an event through the network and see which async tasks the event executed and when and in what order.

Character is Barbara: Barbara’s team works on a set of services that power the API behind her company’s website and all the features their customers use. They’ve built the backend for these services in Rust and make heavy use of async to manage IO-bound operations and make concurrency easier to leverage. However, the services have grown quite a bit, and there are a large number of features, data requirements, and internal systems with which they must interact. The result is a very complex network of async expressions that does the job well and performs great, but is too complex to reason about easily and can be extraordinarily intimidating when trying to fix small transient issues. Issues such as infrequent slow requests, or a very small number of requests executing certain actions out of order, are very hard to resolve when the network of async expressions is complex.

Recently, Barbara and her team have been notified about some customers experiencing slow responses on some features. The lag events are rare, but Barbara and her team are determined to fix them. With some work, Barbara is able to recreate the lag reliably in the QA environment, but now she must figure out where in the complex code base this lag could be coming from and why it’s happening. Fortunately, Rust’s async framework now provides a built-in tracing tool. By building her service with the tracing flag on, her code is automatically instrumented and will start logging trace data to a file for later analysis.

Barbara runs the instrumented code in QA and recreates the laggy event several times. Then she takes the generated trace file and looks through the data. When she views the trace data with the analysis tools, she is given a list of all the requests from her test, along with a timestamp and duration for each. She very quickly identifies the slow requests and chooses to view more detail on one of them. Here she can view a graph of the request's execution: each async expression is a vertex, and edges connect parents to children. Each vertex shows the duration of the expression, and the vertices are arranged vertically by when they started according to the system time.

She immediately sees where each of the slow requests actually lagged. Each request experienced a slowdown in a different async expression, but the expressions had one thing in common: they each queried the same database table. She also notices a pattern in when the latency occurred: the laggy requests tended to occur in clusters. From this she is able to identify the root cause: some updates made to the database caused certain queries, if they arrived together, to run relatively slowly. With tracing, Barbara was saved the effort of meticulously working through the code to deduce the cause, and she didn’t have to add a large amount of logging or other instrumentation. All the instrumentation and analysis was provided out of the box and required no development work for Barbara to isolate the cause.

Barbara can’t believe how much time she saved having this debugging tool provided out of the box.

🤔 Frequently Asked Questions

NB: These are generic FAQs. Feel free to customize them to your story or to add more.

What status quo stories are you retelling?

Link to status quo stories if they exist. If not, that's ok, we'll help find them.

  • Alan Builds A Cache
  • Alan Iteratively Regresses Performance
  • Alan Tries To Debug A Hang

What are the key attributes of this shiny future?

  • Provide a protocol for linking events across async expressions.
  • Provide an output that allows a user to understand the path of execution of a program through a network of async expressions.

What is the "most shiny" about this future?

Think about Rust's core "value propositions": performance, safety and correctness, productivity. Which benefit the most relative to today?

  • This will benefit the productivity of a developer. Providing a whole new way of debugging Rust programs and giving a way to view the actual execution of code in a human readable form can make it significantly faster to debug programs. This also saves time for a developer from having to write a tracer themselves.
  • This can also help with correctness. Working with asynchronous code can be difficult; having a built-in means to trace a flow of execution makes it much easier to verify that specific inputs are following the correct paths in the correct order.

What are some of the potential pitfalls about this future?

Think about Rust's core "value propositions": performance, safety and correctness, productivity. Are any of them negatively impacted? Are there specific application areas that are impacted negatively? You might find the sample projects helpful in this regard, or perhaps looking at the goals of each character.

  • Figuring out how to propagate a trace ID in a way that’s compatible with any use of async could be difficult.
  • Recording trace data will have some impact on performance.
  • We could output too much data for a person to be able to use it.

Did anything surprise you when writing this story? Did the story go any place unexpected?

The act of writing shiny future stories can uncover things we didn't expect to find. Did you have any new and exciting ideas as you were writing? Realize some complications that you didn't foresee?

No.

What are some variations of this story that you considered, or that you think might be fun to write? Have any variations of this story already been written?

Another variation of this story is tracking down functional bugs, where the program is not always executing the expected code paths. An example of this is from the status quo story Alan Builds A Cache. In this type of story, a developer uses tracing to see the execution flow of an event as it is fully processed by the application. This can then be used to make sure that every expected or required action is completed and done in the correct order and, if actions were missed, to determine why.

What are some of the things we'll have to figure out to realize this future? What projects besides Rust itself are involved, if any? (Optional)

Often the 'shiny future' stories involve technical problems that we don't really know how to solve yet. If you see such problems, list them here!

  • There will need to be some form of protocol for how to trace data as it moves through a graph of async expressions, perhaps by weaving a trace ID through the execution of async workflows. We will also have to provide a way to "inject" or "wrap" this protocol around the user's data in a way that can be done automatically as a compile-time option (or is always done behind the scenes).
  • A protocol or standard for recording this information and decorating logs or metrics with this data would need to be provided.
  • Collecting entry and exit events for async expressions and linking them together in a graph
  • How to store the traces
  • How to identify each async expression so that a user knows what each step in the trace refers to.
  • How to show this information to the user.
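Several of the items above (a trace ID, entry and exit events, and linking them into a graph) can be sketched together. The following is one possible shape under stated assumptions, not a proposed design: the `Event` enum, its field names, `build_trace`, and `slowest_leaf` are all hypothetical, span IDs are assumed to be unique within a trace, and timestamps are plain microsecond counters.

```rust
use std::collections::BTreeMap;

/// Entry or exit of one async expression. All names here are hypothetical.
#[derive(Clone, Copy, Debug)]
enum Event {
    Enter { trace_id: u64, span_id: u32, parent: Option<u32>, at_us: u64 },
    Exit { trace_id: u64, span_id: u32, at_us: u64 },
}

/// One vertex of the execution graph for a single request.
#[derive(Debug, PartialEq)]
struct SpanNode {
    parent: Option<u32>,
    start_us: u64,
    duration_us: Option<u64>, // `None` until the matching exit is seen
    children: Vec<u32>,
}

/// Links the events that carry trace ID `wanted` into a parent/child graph.
fn build_trace(events: &[Event], wanted: u64) -> BTreeMap<u32, SpanNode> {
    let mut spans: BTreeMap<u32, SpanNode> = BTreeMap::new();
    for ev in events {
        match *ev {
            Event::Enter { trace_id, span_id, parent, at_us } if trace_id == wanted => {
                let node = SpanNode { parent, start_us: at_us, duration_us: None, children: Vec::new() };
                spans.insert(span_id, node);
                if let Some(parent_id) = parent {
                    if let Some(parent_node) = spans.get_mut(&parent_id) {
                        parent_node.children.push(span_id);
                    }
                }
            }
            Event::Exit { trace_id, span_id, at_us } if trace_id == wanted => {
                if let Some(span) = spans.get_mut(&span_id) {
                    span.duration_us = Some(at_us.saturating_sub(span.start_us));
                }
            }
            // Events that belong to other traces are ignored.
            _ => {}
        }
    }
    spans
}

/// The completed span without children that took the longest, if any.
fn slowest_leaf(spans: &BTreeMap<u32, SpanNode>) -> Option<u32> {
    spans
        .iter()
        .filter(|(_, span)| span.children.is_empty())
        .filter_map(|(id, span)| span.duration_us.map(|d| (d, *id)))
        .max()
        .map(|(_, id)| id)
}

/// A made-up request: span 1 awaits span 2 (fast) and then span 3 (slow).
fn sample_events() -> Vec<Event> {
    vec![
        Event::Enter { trace_id: 7, span_id: 1, parent: None, at_us: 0 },
        Event::Enter { trace_id: 8, span_id: 9, parent: None, at_us: 5 },
        Event::Enter { trace_id: 7, span_id: 2, parent: Some(1), at_us: 10 },
        Event::Exit { trace_id: 7, span_id: 2, at_us: 30 },
        Event::Enter { trace_id: 7, span_id: 3, parent: Some(1), at_us: 30 },
        Event::Exit { trace_id: 7, span_id: 3, at_us: 530 },
        Event::Exit { trace_id: 7, span_id: 1, at_us: 540 },
    ]
}
```

On the sample data, the graph for trace 7 has three vertices, span 1 has children 2 and 3, and `slowest_leaf` points at span 3. A real design would still have to settle how the trace ID reaches each async expression, which is the hard part noted in the list.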

✨ Shiny future stories: Grace debugs a crash dump again

🚧 Warning: Draft status 🚧

This is a draft "shiny future" story submitted as part of the brainstorming period. It is derived from what actual Rust users wish async Rust should be, and is meant to deal with some of the challenges that Async Rust programmers face today.

If you would like to expand on this story, or adjust the answers to the FAQ, feel free to open a PR making edits (but keep in mind that, as people's needs and desires for async Rust may differ greatly, shiny future stories cannot be wrong. At worst they are only useful for a small set of people or their problems might be better solved with alternative solutions). Alternatively, you may wish to add your own shiny vision story!

The story

It's been a few years since the new DistriData database has shipped. For the most part things have gone smoothly. The whole team is confident in trusting the compiler, and they have far fewer bugs in production than they had in the old system. The downside is that now when a bug does make it to production, it tends to be really subtle and take a lot of time to get right.

Today when Grace opens her e-mail, she discovers she's been assigned to investigate a dump from a crash that has been occurring in production lately. The crash happens rarely, so it's important to glean as much information as possible. They need to get this fixed soon!

Even though there's a lot of pressure around this situation, Grace is grateful that she won't have to fight her tools to make progress. A lot has changed in Async Rust over the years. The async community got together and defined the Async Debugging Protocol, which provides a standard way for tools to inspect the state of an asynchronous Rust program. Many of the most popular runtimes like Tokio and async-std follow this protocol, and a number of tools have been written to use the protocol as well. Even though Grace's team has opted to build a custom runtime to address their own unique needs, it was not too much work to implement the Async Debugging Protocol and it was well worth it due to the increase in developer productivity. This has truly revolutionized async debugging in much the same way the Language Server Protocol did for IDEs.

Upon opening the crash dump, her favorite debugger immediately gives an overview of the state of the program at the point it crashed. It shows what executors are running, how many OS-level threads each executor is using, what tasks are there, and what the state of each task is. For each thread, Grace can see a stack trace and the debugger provides a logical stack trace for each task as well. Many of the resources that the blocked tasks are waiting on are visible too, particularly those provided by the runtime like timers, mutexes, and I/O.

This high level, generic view provides a good start, but the team's custom executor provides additional functionality that the Async Debugging Protocol does not support. Still, using the features already provided as a starting point, Grace was able to write some additional debugging macros to recover the additional state. These macros are used by the whole team and are now a standard part of their debugging toolkit.

Grace has seen a few instances of this crash now and she notices a constellation of tasks that look a little funny. This gives her an idea for what might be going wrong. She uses that to add a new test case that ends up crashing the service in a way that looks very similar. It seems like she's found the bug! Even better, it looks like it should be a simple fix and the team will be able to put this issue behind them once and for all.

🤔 Frequently Asked Questions

What status quo stories are you retelling?

Grace debugs a crash dump.

What are the key attributes of this shiny future?

  • Most of the abilities to inspect executor and task state while debugging a live process also work on crash dumps.
  • Debugging async programs is both runtime- and tooling-agnostic.
    • People should be able to get a good experience using whatever tools they are comfortable with, whether that's gdb, lldb, VS Code, IntelliJ, or a specialized Rust async debugger.
    • Debugging tools should be able to work with different runtimes. Not all projects in an organization will use the same runtime, and some may be custom.
  • It's possible to see the following things while debugging:
    • What tasks are running, along with logical stack traces.
    • Some idea of what the task is waiting on if it is blocked.
    • If there are multiple executors, we can inspect each one.
    • Raw stack traces for the OS-level threads that the executors use to schedule tasks.
    • Which futures have been passed into a select!, their current state, and which one is being polled.
  • Additional tooling may be necessary for custom or exotic executors. The hypothetical Async Debugging Protocol is one size fits all, but one size won't fit all. We don't want to constrain what an executor can do just so we can debug it.
  • An async runtime should not be required to support these common debugging features. For example, perhaps it requires more space to support and therefore is not appropriate for an extremely constrained embedded environment.

I envisioned providing this with some kind of "Async Debugging Protocol" that is analogous to the Language Server Protocol. It's not really clear what this would be exactly, and there may be a better approach to solving these problems. For live debugging, it may be as simple as a few traits the executor can implement that provide introspection capabilities. For crash dumps, maybe there's a convention around including a couple of debugging symbols. It might require some kind of rich metadata format that tells the debugger how to inspect and interpret the core data structures for the executor.
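For the live-debugging case, "a few traits the executor can implement" might look something like the sketch below. This is purely illustrative: the `Introspect` trait, `TaskInfo`, `TaskStatus`, and the toy executor are invented for this example and do not correspond to any existing or proposed protocol.

```rust
/// What a task is doing, as reported to a debugging tool. Hypothetical.
#[derive(Clone, Debug, PartialEq)]
enum TaskStatus {
    Running,
    Ready,
    Blocked { waiting_on: String },
}

#[derive(Clone, Debug, PartialEq)]
struct TaskInfo {
    id: u64,
    status: TaskStatus,
    /// Logical (async) stack, outermost frame first.
    logical_stack: Vec<String>,
}

/// Hypothetical introspection trait that an executor could implement.
trait Introspect {
    fn executor_name(&self) -> &str;
    fn worker_threads(&self) -> usize;
    fn tasks(&self) -> Vec<TaskInfo>;
}

/// Tool-side code written only against the trait: one line per blocked task.
fn blocked_report(exec: &dyn Introspect) -> Vec<String> {
    exec.tasks()
        .into_iter()
        .filter_map(|task| match task.status {
            TaskStatus::Blocked { waiting_on } => Some(format!(
                "{}: task {} blocked on {} at {}",
                exec.executor_name(),
                task.id,
                waiting_on,
                task.logical_stack.last().map(String::as_str).unwrap_or("<unknown>"),
            )),
            _ => None,
        })
        .collect()
}

/// A stand-in executor that simply stores a snapshot of its tasks.
struct ToyExecutor {
    tasks: Vec<TaskInfo>,
}

impl Introspect for ToyExecutor {
    fn executor_name(&self) -> &str {
        "toy"
    }
    fn worker_threads(&self) -> usize {
        1
    }
    fn tasks(&self) -> Vec<TaskInfo> {
        self.tasks.clone()
    }
}

fn toy_executor() -> ToyExecutor {
    ToyExecutor {
        tasks: vec![
            TaskInfo {
                id: 1,
                status: TaskStatus::Running,
                logical_stack: vec!["main".into(), "accept_loop".into()],
            },
            TaskInfo {
                id: 2,
                status: TaskStatus::Blocked { waiting_on: "timer".into() },
                logical_stack: vec!["handle_request".into(), "sleep".into()],
            },
        ],
    }
}
```

Because `blocked_report` only sees `&dyn Introspect`, the same tool code would work for any executor that implements the trait, including a custom one like the runtime Grace's team built. The crash dump case is harder, since no code from the crashed process can run to answer these queries.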

What is the "most shiny" about this future?

The biggest aspect of this shiny future is the increased developer productivity, particularly in debugging. Many of the status quo stories called out the difficulty of debugging async code. In this shiny future, there are really good tools for live debugging, and many of these work offline in the crash dump case as well.

As a follow-on, the enhanced developer productivity will support writing more correct and safer programs, and probably allow developers to diagnose performance problems as well. These are not a direct consequence of better debugging, but rather an indirect consequence of giving the developer better tools.

What are some of the potential pitfalls about this future?

Depending on how the "Async Debugging Protocol" works, there may be some overhead in following it. Hopefully this would be minimal, and not require any additional code during normal execution scenarios. But, it might make the debugging symbols or other metadata larger. Following the protocol may constrain some of the choices an async runtime can make.

At the very least, choosing to follow the protocol will require additional work on the part of the runtime implementor.

Did anything surprise you when writing this story? Did the story go any place unexpected?

Doing this in a way that is runtime and tooling agnostic will be challenging, so the details of how that could be done are not included in this story.

In some ways, doing this for a live process seems easier, since you can write code that inspects or reports on its own state. This seems to be the approach that tokio-console is taking.

There seems to be a lot of overlap between live debugging scenarios and post-mortem scenarios. With a little care, it might be possible to support both using many of the same underlying capabilities.

What are some variations of this story that you considered, or that you think might be fun to write? Have any variations of this story already been written?

It would be worth removing the runtime agnostic aspect of this story and looking at how things would look if we just focused on Tokio or async-std. Perhaps each runtime would include a set of debugger macros to help find the runtime's state.

What are some of the things we'll have to figure out to realize this future? What projects besides Rust itself are involved, if any? (Optional)

A lot of the work here probably will not be done by the core Rust team, other than perhaps to coordinate and guide it. Most of the work will require coordination among projects like Tokio and async-std, as well as the debugging tool authors.

There does not seem to be an obvious way to implement everything in this story. It would probably be good to focus on a particular runtime at least to get a proof of concept and better sketch out the requirements.

🔍 Triage meetings

When, where

The weekly triage meeting is held on Zulip at 11:30 US Eastern time on Mondays.

So you want to fix a bug?

If you're interested in fixing bugs, there is no need to wait for the triage meeting. Take a look at the mentored async-await bugs that have no assignee. Every mentored bug should have a few comments. If you see one you like, you can add the @rustbot claim comment into the bug and start working on it! Feel free to reach out to the mentor on Zulip to ask questions.

Project board

The project board tracks various bugs and other work items for the async foundation group. It is used to drive the triage process.

Triage process

In our weekly triage meetings, we take new issues assigned A-async-await and categorize them. The process is:

  • Review the project board, from right to left:
    • Look at what got Done, and celebrate! :tada:
    • Review In progress issues to check we are making progress and there is a clear path to finishing (otherwise, move to the appropriate column)
    • Review Blocked issues to see if there is anything we can do to unblock
    • Review Claimed issues to see if they are in progress, and if the assigned person still intends to work on it
    • Review To do issues and assign to anyone who wants to work on something
  • Review uncategorized issues
    • Mark P-low, P-medium, or P-high
    • Add P-high and assigned E-needs-mentor issues to the project board
    • Mark AsyncAwait-triaged
  • If there's still a shortage of To do issues, review the list of P-medium or P-low issues for candidates

Mentoring

If an issue is a good candidate for mentoring, mark E-needs-mentor and try to find a mentor.

Mentors assigned to issues should write up mentoring instructions. Often, this is just a couple of lines pointing to the relevant code. Mentorship doesn't require intimate knowledge of the compiler, just some familiarity and a willingness to look around for the right code.

After writing instructions, mentors should un-assign themselves, add E-mentor, and remove E-needs-mentor. On the project board, if a mentor is assigned to an issue, it should go to the Claimed column until mentoring instructions are provided. After that, it should go to To do until someone has volunteered to work on it.

🔬 Design documents

The design documents (or "design docs", more commonly) describe potential designs. These docs vary greatly in terms of their readiness to be implemented:

  • Early on, they describe a vague idea for a future. Often this takes the shape of capturing constraints on the solution, rather than the solution itself.
  • When a feature is getting ready to ship, they can evolve into a full-blown RFC, with links to tracking issues or other notes.

Early stage design docs

In the early stages, design docs are meant to capture interesting bits of "async design space". They are often updated to capture the results of a fruitful conversation or thread that uncovered constraints or challenges in solving a particular problem. They will capture a combination of the following:

  • use cases;
  • interesting aspects to the design;
  • alternatives;
  • interactions with other features.

Late stage design docs

As a design progresses, the doc should get more and more complete, until it becomes something akin to an RFC. (Often, at that point, we will expand the design document into a directory, adding an actual RFC draft and other contents; those things can live in this repo or elsewhere, depending.) Once we decide to put a design doc onto the roadmap, it will also contain links to tracking issues or other places to track the status.

⚠️ Yield-safe lint

☔ Stream trait

⚡ Generator syntax

📝 AsyncRead, AsyncWrite traits

🧬 Async fn in traits

🔒 Mutex (future-aware)

📺 Async aware channels

📦 Async closures

🤝 Join combinator

🤷‍♀️ Select combinator

🚰 Sink trait

🎇 Async main

🗑️ Async drop

♻️ Async lifecycle

⏳ Completion-based futures

💬 Conversations

This section contains notes and summaries from conversations that we have had with people who are using Rust and async and who described their experiences. These conversations and links are used as "evidence" when building the "status quo" section.

Not exhaustive nor mandatory

This section is not meant to be an "exhaustive list" of all sources. That would be impossible. Many conversations are short, not recorded, and hard to summarize. Others are subject to NDA. We certainly don't require that all claims in the status quo section are backed by evidence found here. Still, it's useful to have a place to dump notes and things for future reference.

🐦 2021-02-12 Twitter thread

Notes taken from the thread in response to Niko's tweet.

  • Enzo
    • A default event loop. "choosing your own event loop" takes time, then you have to understand the differences between each event loop etc.
    • Standard way of doing for await (variable of iterable) would be nice.
    • Standard promise combinators.
  • creepy_owlet
    • https://github.com/dtantsur/rust-osauth/blob/master/src/sync.rs
  • async trait --
    • https://twitter.com/jcsp_tweets/status/1359820431151267843
    • "I thought async was built-in"?
    • nasty compiler errors
    • ownership puzzle blog post
  • rubdos
    • blog post describes integrating two event loops
    • mentions desire for runtime independent libraries
    • qt provides a mechanism to integrate one's own event loop
    • llvm bug generates invalid arm7 assembly
  • alexmiberry
    • kotlin/scala code, blocked by absence of async trait
  • helpful blog post
    • jamesmcm
      • note that join and Result play poorly together
    • the post mentions rayon but this isn't really a case where one ought to use rayon -- still, Rayon's APIs here are SO much nicer :)
    • rust aws and lambda
  • issue requiring async drop
  • fasterthanlime --
    • this post is amazing
    • the discussion on Send bounds and the ways to debug it is great
  • bridging different runtimes using GATs
  • first server app, great thread with problems
    • "I wasn't expecting that it will be easy but after Go and Node.js development it felt extremely hard to start off anything with Rust."
    • "felt like I have to re-learn everything from scratch: structuring project and modules, dependency injection, managing the DB and of course dealing with concurrency"
    • common thread: poor docs, though only somewhat in async libraries
      • I had enums in the DB and it was a bit more complex to map them to my custom Rust enums but I succeeded with the help of a couple of blog posts – and not with Diesel documentation
      • I used Rusoto for dealing with AWS services. It's also pretty straightforward and high quality package – but again the documentation was sooo poor.
  • implaustin wrote a very nice post but it felt more like a "look how well this worked" post than one with actionable feedback
    • "Async has worked well so far. My top wishlist items are Sink and Stream traits in std. It's quite difficult to abstract over types that asynchronously produce or consume values."
    • "AsyncRead/AsyncWrite work fine for files, tcp streams, etc. But once you are past I/O and want to pass around structs, Sink and Stream are needed. One example of fragmentation is that Tokio channels used to implement the futures Sink/Stream traits, but no longer do in 1.0."
    • "I usually use Sink/Stream to abstract over different async channel types. Sometimes to hide the details of external dependencies from a task (e.g. where is this data going?). And sometimes to write common utility methods."
    • "One thing I can think of: there are still a lot of popular libraries that don't have async support (or are just getting there). Rocket, Criterion, Crossterm's execute, etc."
  • EchoRior:
    • "I've written a bit of rust before, but rust is my first introduction to Async. My main gripes are that it's hard to figure our what the "blessed" way of doing async is. I'd love to see async included in the book, but I understand that async is still evolving too much for that."
    • "Adding to the confusion: theres multiple executors, and they have a bit of lock in. Async libraries being dependent on which executor version I use is also confusing for newcomers. In other langs, it seems like one just uses everything from the stdlib and everything is compatible"
    • "That kind of gave me a lot of hesitation/fomo in the beginning, because it felt like I had to make some big choices around my tech stack that I felt I would be stuck with later. I ended up chatting about this in the discord & researching for multiple days before getting started."
    • "Also, due to there not being a "blessed" approach, I don't know if I'm working with some misconceptions around async in rust, and will end up discovering I will need to redo large parts of what I wrote."

❤️ Acknowledgments

Thanks to everyone who helped shape the future of Rust async.

✍️ Participating in a writing session

Thanks to everyone who helped write stories by participating in one of the Async Rust writing sessions.

💬 Discussing stories

Thanks to everyone who discussed stories, shiny futures, and new features.