Generic Associated Types initiative

initiative status: active

This page tracks the work of the Generic Associated Types (GATs) initiative! To learn more about what we are trying to do, and to find out who is doing it, take a look at the charter.

⚡ Current focus: MVP ⚡

We are currently focused on stabilizing a Minimum Viable Product form of GATs in rust-lang/rust#96709. Learn more here!

How Can I Get Involved?

  • Check for "help wanted" issues on this repository!
  • If you would like to help with development, please contact the owner to find out if there are things that need doing.
  • If you would like to help with the design, check the list of active design questions first.
  • If you have questions about the design, you can file an issue, but be sure to check the FAQ or the design-questions first to see if there is already something that covers your topic.
  • If you are using the feature and would like to provide feedback about your experiences, please open an "experience report" issue.
  • If you are using the feature and would like to report a bug, please open a regular issue.

We also participate on Zulip; feel free to introduce yourself over there and ask us any questions you have.

Building Documentation

This repository is also an mdbook project. You can view and build it using the following command.

mdbook serve

"Generic Associated Types" Charter

Goals

Extend traits by permitting associated types to have generic parameters (types, lifetimes).

This enables writing traits that capture more complex patterns:

  • an [Iterable] trait for collections that support an iter method, yielding references into themselves
  • a [Mode] trait

It also provides the technical foundation of async functions in traits and return-position impl Trait in traits (although this dependence is internal to the compiler and not exposed to users).

  • Concretely, extend associated types with generic parameters.
  • This makes it possible to write traits describing more complex patterns, e.g.
    • methods with arguments or return values that borrow from other arguments
  • Extend Rust traits to describe patterns that currently cannot be described...
    • associated types that reference
  • Extend associated types on traits to have type and lifetime parameters.

Membership

MVP Stabilization

We are currently focused on stabilizing a Minimum Viable Product form of GATs in rust-lang/rust#96709. That's a long github thread, so this page summarizes the key points to understand.

Places where conversation has been happening?

Want to read the firehose? Check out these places where conversation has been happening:

GATs permit complex patterns that will, on net, make Rust harder to use

Summary

There is no question that GATs enable new design patterns. For the most part, these design patterns take the form of enabling a new kind of abstraction. For example, the many modes pattern allows a trait that encapsulates a "mode" in which some other code will be executed; lacking GATs, this can only be done in simple cases. These new design patterns may locally improve the code, but, if it becomes commonplace to use more complex abstractions in Rust code bases, Rust code overall will become harder for people to read and understand. As burntsushi wrote:

The complexity increase comes precisely from the abstraction power given to folks. By increasing the accessibility to that power, that power will in turn be used more frequently and users of the language will need to contend with those abstractions.

burntsushi continues (emphasis ours):

Here's where we come to the contentious part of this perspective: a lot of people have real difficulty understanding abstractions.

Nick Cameron also wrote a blog post covering this theme. One of Nick's comments on Zulip points out that duplicated code in an API surface is annoying to maintain, but could be easier for consumers:

Nick Cameron: I think the risk of features like GATs is that encourages smaller, more abstract APIs which are more satisfying for library authors but which are harder for the median user to comprehend. For a non-GAT example, consider Result and Option compared to a more abstract monadic type. The latter reduces code duplication in the library and perhaps permits more interoperability, but we prefer having separate Result and Option in Rust because for most users, it's easier to understand an API with concrete values and functions like Ok and unwrap, rather than abstract concepts like unit and bind.

What are the alternatives?

Nobody is contending that the design patterns aren't addressing real problems, but they are contending that solving those problems with a trait-based abstraction may not be the right approach. So what are these alternatives?

  • In some cases, it's a "slightly less convenient API" (as in CSV reader).
  • In others, the answer is some combination of macros and code generation. For example, the many modes pattern was used to make parser combinators work better, but that could also be addressed via procedural macros and code generation.
  • In yet others, it's HRTB; the standard workaround for the Iterable pattern is to write for<'c> Iterate<'c>, for example.
  • Finally, it is sometimes possible to work around the lack of GATs by introducing runtime overhead, e.g., forgoing the optimizations that many modes enabled, or using an enum or a trait object.
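
To make the HRTB alternative from the list above concrete, here is a minimal sketch of how the Iterable pattern can be approximated without GATs. The names `IterableRef` and `iter_ref` are hypothetical and chosen for this illustration; the idea is only that the lifetime moves from the associated types onto the trait itself, and callers then ask for the trait "for every lifetime".

```rust
// Hypothetical stand-in for a GAT-based `Iterable`: the lifetime parameter
// lives on the trait instead of on the associated types.
trait IterableRef<'c> {
    type Item;
    type Iter: Iterator<Item = Self::Item>;

    fn iter_ref(&'c self) -> Self::Iter;
}

impl<'c, T: 'c> IterableRef<'c> for Vec<T> {
    type Item = &'c T;
    type Iter = std::slice::Iter<'c, T>;

    fn iter_ref(&'c self) -> Self::Iter {
        self.as_slice().iter()
    }
}

// The higher-ranked bound reads "C implements IterableRef<'c> for every 'c",
// which is what lets us borrow the collection twice, independently.
fn count_twice<C>(collection: &C) -> (usize, usize)
where
    C: for<'c> IterableRef<'c>,
{
    let first_pass = collection.iter_ref().count();
    let second_pass = collection.iter_ref().count();
    (first_pass, second_pass)
}
```

This works, but every user of the trait has to carry the `for<'c>` bound around, which is the kind of API complexity the counterpoints below refer to.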

Counterpoint: GATs often allow library authors to hide complexity

Somewhat counter to the above, the goal with GATs is frequently to make the interface of the library simpler:

oli: Yes, and in user libraries right now you do stumble across those very complex APIs that work around the lack of GATs

vultix: I've personally seen two use-cases where GATs would have been useful, both times while writing libraries. We found an ugly workaround for the first use-case that wasn't horrible, but the second workaround made the user-facing api severely worse.

almann: Currently, delegation via our current APIs are difficult because we cannot have an associated type that can be constructed with a per invocation lifetime. This example illustrates this, and shows how we can work around this by the judicious use of unsafe code, by essentially emulating the reference with pointers we know to be scoped appropriately. We also work around the lack of GAT by simplifying the value writer used in the implementation of IonCAnnotationsFieldWriter for IonCWriterHandle by refering to Self which works, but makes the API potentially easier to misuse.

The many modes pattern shows how the Chumsky parser generator used GATs internally. They don't appear in the public facing methods.

Jake Goulding points out that we hear very few stories of people who tried GATs but backed off:

Jake Goulding: It would be wonderful if there were experience reports of people saying "I thought I needed GATs but after writing them I realized I could do X instead and that was much clearer". My one experience was that I could use one of the stable GAT workarounds, but it felt wrong and I was happy to trial-run the nightly GAT version instead.

Counterpoint: Macros are not easier

Many folks chimed in to express their feeling that proc macros, or duplicated code, are not easier to understand or maintain. Some examples:

Alice Cecile: Macros are so much worse to read / write / maintain than pretty much any type magic I've ever seen. Even very simple stuff is rough

Ralf Jung: I would not call this kind of nonsense very clear. macros do tend to lead to state explosion and huge amounts of redundancy in the docs, making it very hard to see the actual pattern. that's unsurprising since the compiler is never told about the pattern. (that particular example is not solved by GATs, it requires other new type system features, including variadics. it just demonstrates well the perils of macro-generated code.)

Counterpoint: Arguing against a feature because it could be misused would block many parts of Rust

Ralf Jung: yes, that I think is the main argument to me. not having a feature because it could be used to write unnecessarily complicated APIs sounds like a bad argument to me -- it effectively means preventing some people from writing good APIs (assuming we accept there are APIs where the GAT version is clearly superior) for fear of other people writing bad APIs. one can already write unnecessarily complicated APIs in Rust in a lot of ways -- e.g. we didnt block proc-macros just because they can be used for bad APIs, though anyone who had to debug other people's sea of macros knows it can easily lead to terribly opaque API surfaces. heck, we have unsafe code, where the consequences of using the feature in the wrong way are (IMO) a lot worse than GATs. I dont understand why GATs are suddenly considered a problem when they are (IMO) a feature much less in danger of being misused than unsafe code or proc macros. Rust has empowerment literally in its one-sentence slurb: we give people powerful tools and all the help we can to use them, and we accept that this means some people will misuse them, and we do what we can (technically and socially) to mitigate the consequences of that.

Concern: GATs are useful but the current state is too rough and should not be stabilized

Examples

nrc wrote:

There are numerous cases of small bugs or small gaps in expressivity which have prevented people using GATs for the use cases they want to use them for (see e.g., the blog post linked from the OP, or this reddit thread). These are the sort of things which must be addressed before stabilisation so that we can be sure that they are in fact small and not hiding insurmountable issues.

and burntsushi wrote:

I understand the idea of making incremental progress, but as a total outsider to lang development, this feature looks like it has way too many papercuts and limitations to stabilize right now. Not being able to write a basic filter implementation on a LendingIterator trait in safe code is a huge red flag to me. Namely, if I can't write a filter adaptor in safe straight-forward code, then... what else can't I do?

clintfred chimes in with an "intermediate Rust user" perspective similar to this:

As a solidly intermediate Rust user I can say that I have tried wrap my mind around GATs several times over the past 3-4 years, and I don't really "get" it. To be completely fair, I have never tried using them on nightly, so maybe this something best groked hands-on.

Sabrina also wrote a blog post showing some of the problems with lifetime GATs in their current state.

Details

GATs in their current form have some pretty severe limitations. These limitations block one of the most compelling use cases, lending iterators, and fixing that will require polonius (cf. #92985). Worse, these limitations are hard to convey to users: things just kind of randomly don't work, and it's not obvious that this is the compiler's fault (this is kind of a "worst case" for learnability, since it can trick you into thinking you don't understand the model, when in fact you do).
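
For readers unfamiliar with the term, a lending iterator is one whose items borrow from the iterator itself, so each item must be released before the next one is requested. The sketch below shows the commonly used shape of such a trait with one simple implementation; `WindowsMut` and its fields are hypothetical names for this illustration. It covers only the part that works today and does not attempt the `filter` adaptor mentioned in the quotes above.

```rust
trait LendingIterator {
    type Item<'a>
    where
        Self: 'a;

    // The returned item borrows from `self` for the duration of 'a.
    fn next<'a>(&'a mut self) -> Option<Self::Item<'a>>;
}

// Hypothetical example type: overlapping mutable windows over a slice.
struct WindowsMut<'t, T> {
    slice: &'t mut [T],
    start: usize,
    window_size: usize,
}

impl<'t, T> LendingIterator for WindowsMut<'t, T> {
    type Item<'a> = &'a mut [T] where Self: 'a;

    fn next<'a>(&'a mut self) -> Option<Self::Item<'a>> {
        // Both lookups return None once we run past the end of the slice.
        let window = self
            .slice
            .get_mut(self.start..)?
            .get_mut(..self.window_size)?;
        self.start += 1;
        Some(window)
    }
}

// Adds each element's right-hand neighbour to it, window by window.
fn demo() -> [i32; 4] {
    let mut data = [1, 2, 3, 4];
    {
        let mut windows = WindowsMut { slice: &mut data[..], start: 0, window_size: 2 };
        while let Some(window) = windows.next() {
            let neighbour = window[1];
            window[0] += neighbour;
        }
    }
    data
}
```

The overlapping mutable windows are exactly what a plain `Iterator` cannot express, since its `Item` type cannot mention the borrow of `self` taken by each call to `next`.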

Responses

Aren't GATs too hard to learn in their current state (or maybe in general)?

There is no doubt that "papercuts" can have a major impact on learnability. However, the lack of GATs, even in rudimentary form, causes its share of papercuts as well. Things that feel like they ought to be expressible (e.g., an Iterable trait) are not. For example, CLEckhardt writes:

The complexity is either in the language or the code. There are a lot of complex things that Rust lets me express cleanly, which is why I love it. But when I run into things that Rust can't express, I have to write complex code. So the complexity doesn't leave, it just moves.

mijamo wrote something similar:

For what it's worth I am a rust beginner coming from mostly Python / Typescript. There are MANY things in Rust that make my head hurt but GAT just seem natural after some introduction and I actually get stuck trying to go around the lack of GAT nearly every time I try to write Rust code (I have given up in every single of my rust projects so far and I would say lack of GAT and annoying lifetime handling were my 2 main problems).

audunhalland left a similar comment:

I'm no "type theorist" at all, I have no deep insight into how the Rust typesystem works internally, or what is the deeper cause of some random compile error. With GATs I just tend to keep flicking on the code until the compiler is somehow happy. Just like classic non-GAT Rust code :) I have stumbled across some of the HRTB limitations, but have managed to work around them quite easily. My feeling is that for my use-cases GATs now feel like a natural part of the language.

Pzixel describes how they wind up using GATs in most projects:

It all starts similarly: I have bunch of traits that return Vecs or something. Then I profile and see allocs in tight loops I cannot afford. [..] Every time you want to have zero-cost in traits GATs come into play. Iterators and futures are just some examples, there are more. And I bet that every program written in rust are using these concepts and people there either don't care and put boxes in traits or write their own super niche fragile implementations to fit their needs. [..] I'd like to put some thought on the matter "No one using GATs" and "GATs are just too complicated" - it reminds me of generics discussions in Go. While GATs may be complicated making it work without them is even harder. It worth always keeping it in mind

Will we really do the follow-up required?

One specific danger is that we would stabilize GATs and then "declare victory", leaving the feature unchanged for years. There are a few reasons to think that won't happen.

For one, the new types team is ramping up and focused on precisely these kinds of improvements.

Second, having features on master leads to more attention and volunteers, not less (although that alone is not enough when the problems are deeper). jam1garner writes:

And yeah the UX could be better :( I can't speak for everyone but as someone who tries to make diagnostics PRs when I hit issues and have the time/energy, I personally can't really kick the tires if I can't use the feature in my libraries, and thus don't have a very natural path to find pain points to.

More generally, Rust's entire history has been one of taking complex features, exposing them, and sanding down the edges over time -- and sometimes stalling out for an extended period of time. And yet, it's hard to come up with an example where it would've been better to hold off on stabilization. Take async functions: when stabilized, they had many diagnostic rough edges, and still do. And yet, it's clearly good that they're available.

Concern: we should stabilize lifetimes only

Summary

This suggestion stems from the concern that GATs give rise to abstractions that are too complex, combined with a recognition that the lack of lifetime GATs means that common patterns like Iterable cannot be expressed, which leads to its own form of complexity.

The argument then is that we should stabilize lifetime GATs only and avoid types. This limits the kinds of patterns that GATs can be used for. Patterns that abstract over types, like many modes, pointer types, or generic scopes, should instead use alternatives, like code generation or macros.

Counterpoint: Irregular design is its own learnability hazard

Limiting GATs to lifetimes is actually its own kind of learnability hazard. Having to learn "exceptions" (e.g., this only works for lifetimes) rather than a uniform set of rules (everything can take type parameters) can be quite challenging, particularly when the motivation for those exceptions is simply that we don't want people to be able to express certain abstractions (versus, say, the limitations that motivate dyn-safety, which are that it is simply not possible to generate compiled code).

Counterpoint: Without types, we can't express RPITIT or async fn

To desugar -> impl Trait, one must have the ability to write type-level GATs. For example:

#![allow(unused)]
fn main() {
trait Convert {
    fn convert(&self, c: impl Converter) -> impl Display;
}
}

would be translated to something like this:

#![allow(unused)]
fn main() {
trait Convert {
    type convert<T: Converter>: Display;
    fn convert<T: Converter>(&self, c: T) -> Self::convert<T>;
}
}

If we lose the ability to write type-only GATs, then we can no longer fully explain or desugar these more complex features, which is also its own learnability hazard (and gives some indication as to the expressiveness that is being lost).
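
The desugared form above can be written out by hand with a type-parameter GAT. The following is a sketch under assumptions: the `Converter` trait body, the `Output` name (used instead of the lowercase `convert` type in the pseudo-desugaring), and the `Applied`, `Meters`, and `Double` types are all hypothetical and exist only to make the example compile.

```rust
use std::fmt::{self, Display};

// Hypothetical definition; the text above only names this trait.
trait Converter {
    fn apply(&self, input: u32) -> u32;
}

trait Convert {
    // Plays the role of the hidden `impl Display` return type. It needs the
    // type parameter `T` because the returned value may contain the converter.
    type Output<T: Converter>: Display;

    fn convert<T: Converter>(&self, c: T) -> Self::Output<T>;
}

// A value that applies its converter lazily, when displayed.
struct Applied<T> {
    value: u32,
    converter: T,
}

impl<T: Converter> Display for Applied<T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.converter.apply(self.value))
    }
}

struct Meters(u32);

impl Convert for Meters {
    type Output<T: Converter> = Applied<T>;

    fn convert<T: Converter>(&self, c: T) -> Self::Output<T> {
        Applied { value: self.0, converter: c }
    }
}

struct Double;

impl Converter for Double {
    fn apply(&self, input: u32) -> u32 {
        input * 2
    }
}

fn demo() -> String {
    Meters(21).convert(Double).to_string()
}
```

With lifetime-only GATs, `Output` could not mention `T`, so a return type that stores the converter could not be named this way.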

How do we know that the current MVP is forwards compatible with the fixes?

Summary

GATs in their current form have some pretty severe limitations. As we address those limitations, we may find we wish to make backwards incompatible changes. Keeping things unstable ensures we have room to change the details.

Counterpoints

The GAT syntax being stabilized is leveraging existing syntactic constructs, like associated types and for<'a> bounds. The problems we are running into exist, for the most part, with those constructs as well. The fixes discussed in the shiny future section are not typically specific to GATs.

One area where we've specifically considered backwards compatibility is the question of required bounds, and in that area we deliberately chose the most forwards compatible option (requiring users to write things out explicitly, thus ensuring they can be made optional or defaulted later on). We will always have to permit users to write where-clauses, so there is no real danger there.

Concern: Do we have the right rules for required bounds?

Summary

QuineDot writes:

Are the rules actually how we want them to be?

QuineDot also asked about the current rules. The rules are documented on the required bounds page, and they've been brought up to date with nightly.

QuineDot pointed out various grey cases. The rules are reviewed in the outlives-defaults design discussion, which also discussed (in the FAQ) the possibility that we are missing some rules.

QuineDot also pointed out that the implementation seems to have diverged from the rules as discussed. That needs to be evaluated and reconciled.

📚 Explainer

available on nightly stabilization

🚨 Warning: Rough edges ahead. This section describes the GATs Minimum Viable Product, which contains a number of known rough edges. The shiny future page describes GATs as we would like them to be. Note that things which are not stabilized (as of this writing, all of GATs) are subject to change. The design pattern page captures and describes uses of GATs in the wild.

Read on!

Why GATs?

available on nightly stabilization

As the name suggests, generic associated types are an extension that permits associated types to have generic parameters. This allows associated types to capture types that may include generic parameters that came from a trait method. These can be lifetime or type parameters.

To see why they are useful, we'll start by looking at the Iterator trait, which defines a single iterator over some items. This trait works great with plain associated types. We'll then consider how we could create an Iterable trait, that defines a collection that can be iterated many times -- this trait requires generic associated types.

Example: Iterator

Associated types in a trait are used to capture types that appear in methods but are determined based on the impl that is chosen (i.e., by the type implementing the trait) rather than being specified from the outside. The simplest example is the Iterator trait:

#![allow(unused)]
fn main() {
trait Iterator {
    type Item;
    fn next(&mut self) -> Option<Self::Item>;
}
}

Because the Iterator trait defines the Item type as an associated type, that implies that every type which implements Iterator will also specify what sort of Item it generates.

As an example, imagine I have an iterator for items in a slice &'c [T]...

#![allow(unused)]
fn main() {
struct Iter<'c, T> {
    data: &'c [T],
}

impl<'c, T> Iterator for Iter<'c, T> {
    type Item = &'c T;

    fn next(&mut self) -> Option<Self::Item> {
        if let Some((prefix_elem, suffix)) = self.data.split_first() {
            self.data = suffix;
            Some(prefix_elem)
        } else {
            None
        }
    }
}
}

...the impl specifies that the Item type of this iterator is &'c T.

Extending to Iterable

Associated types are useful, but on their own they are often insufficient to capture the patterns we would like to write as a trait. Often, this occurs when the associated type wants to include data borrowed from self or some other parameter.

The Iterator trait is great, but if you write a generic function that takes an Iterator, that iterator can only be iterated over a single time. Imagine a function that wanted to iterate once to find a total count and then again to process each item, this time knowing the full count:

#![allow(unused)]
fn main() {
fn count_twice<I: Iterator>(iterator: I) {
    let mut count = 0;
    for _ in iterator {
        count += 1;
    }

    // Error: iterator already used
    for elem in iterator {
        process(elem, count);
    }
}
}

Of course, most collection types in the standard library have a method called iter that is exactly what we want (e.g., [T]::iter). Given an &'i self, these functions return a fresh iterator that yields up &'i T references into the collection. Because they take a &self, they can be called as many times as we want. So, if we knew which kind of collection we had, we could easily write count_twice to take a [T] or a HashMap<K, V> or whatever. But what if we want to write it generically, so it works across any collection? That turns out to be impossible to do nicely with associated types.

To see why, let's try to write the code and see where we get stuck. We'll start by defining an Iterable trait:

#![allow(unused)]
fn main() {
trait Iterable {
    // Type of item yielded up; will be a reference into `Self`.
    type Item;

    // Type of iterator we return. Will return `Self::Item` elements.
    type Iter: Iterator<Item = Self::Item>;


    fn iter<'c>(&'c self) -> Self::Iter;
}
}

But when we try to write the impl, we run into a problem:

#![allow(unused)]
fn main() {
impl<T> Iterable for [T] {
    type Item = &'hmm T;
    //           ^^^^ what lifetime to use here?

    type Iter = Iter<'hmm, T>;
    //               ^^^^ what lifetime to use here?

    fn iter<'c>(&'c self) -> Self::Iter {
        //       ^^ THIS is the lifetime we want, but it's not in scope!
        Iter { data: self }
    }
}
}

You see the problem? In the case of Iterator, the Self type was Iter<'c, T> and the Item type was &'c T. Since both 'c and T appeared at the impl header level, both of those generic parameters were in scope and usable in the associated type. For Iterable, we still want to yield references like &'c T, but 'c is not declared on the impl -- it's specific to the call to iter.

The fact that the lifetime parameter 'c is declared on the method is not just a minor detail. It is exactly what allows something to be iterated many times; that is, the thing we are trying to capture with the Iterable trait. It means that the borrow you get when you call iter() only has to last as long as this specific call to iter. If you call iter multiple times, each call can be instantiated with a distinct borrow. (In contrast, if 'c were declared at the impl scope, the borrow would last across all calls to any method in Iterable.)

Defining and implementing the Iterable trait with GATs

available on nightly stabilization

Play with this example on the Rust playground.

To express traits like Iterable, we can make use of generic associated types -- that is, associated types with generic parameters. Here is the complete Iterable trait:

#![allow(unused)]
fn main() {
trait Iterable {
    // Type of item yielded up; will be a reference into `Self`.
    type Item<'collection>
    where
        Self: 'collection;

    // Type of iterator we return. Will return `Self::Item` elements.
    type Iterator<'collection>: Iterator<Item = Self::Item<'collection>>
    where
        Self: 'collection;

    fn iter<'c>(&'c self) -> Self::Iterator<'c>;
    //           ^^                         ^^
    //
    // Returns a `Self::Iterator` derived from `self`.
}
}

Let's walk through it piece by piece...

  • We added a 'collection parameter to Item. This represents "the specific collection that the Item is borrowed from" (or, if you prefer, the lifetime for which that collection is borrowed).
  • The same 'collection parameter is added to Iterator, indicating the collection that the iterator borrows its items from.
  • In the iter method, the value of 'collection comes from self, indicating that iter returns an Iterator linked to self.
  • Each associated type also has a where Self: 'collection bound. These bounds are required by the compiler -- if you don't add them, you will get a compilation error. As explained here, this is a compromise that is part of the GATs MVP to give us time to work out the best long-term solution.
    • The bound where Self: 'collection is called an outlives bound -- it indicates that the data in Self must outlive the 'collection lifetime.

Implementing the trait

Let's write an implementation of this trait. We'll implement it for the Vec<T> type; a &Vec<T> can be coerced into a &[T] slice, so we can re-use the slice Iter that we defined before (the playground link includes an impl of Iterable for [T] as well, but we'll use Vec here because it's more convenient).

#![allow(unused)]
fn main() {
// from before
struct Iter<'c, T> {
    data: &'c [T],
}

impl<T> Iterable for Vec<T> {
    type Item<'c> = &'c T
    where
        T: 'c;
    
    type Iterator<'c> = Iter<'c, T>
    where
        T: 'c;

    fn iter<'c>(&'c self) -> Self::Iterator<'c> {
        Iter { data: self }
    }
}
}

Invoking it

Now that we have the Iterable trait, we can reference it in our "count twice" function.

#![allow(unused)]
fn main() {
fn count_twice<I: Iterable>(collection: &I) {
    let mut count = 0;
    for _ in collection.iter() {
        count += 1;
    }

    for elem in collection.iter() {
        process(elem, count);
    }
}
}

and we can invoke that by writing code like count_twice(&vec![1, 2, 3, 4, 5, 6]).
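
Putting the pieces of this section together, here is a self-contained sketch that compiles on its own. It makes a few assumptions beyond the text: it reuses the standard library's `std::slice::Iter` instead of the hand-written `Iter`, it adds a second impl for `BTreeSet` to show the function working across collections, and `process` is a hypothetical helper that takes a position rather than the element.

```rust
use std::collections::BTreeSet;

trait Iterable {
    type Item<'collection>
    where
        Self: 'collection;

    type Iterator<'collection>: Iterator<Item = Self::Item<'collection>>
    where
        Self: 'collection;

    fn iter<'c>(&'c self) -> Self::Iterator<'c>;
}

impl<T> Iterable for Vec<T> {
    type Item<'c> = &'c T where Self: 'c;
    type Iterator<'c> = std::slice::Iter<'c, T> where Self: 'c;

    fn iter<'c>(&'c self) -> Self::Iterator<'c> {
        // Go through the slice so we call the inherent method,
        // not `Iterable::iter` again.
        self.as_slice().iter()
    }
}

impl<T> Iterable for BTreeSet<T> {
    type Item<'c> = &'c T where Self: 'c;
    type Iterator<'c> = std::collections::btree_set::Iter<'c, T> where Self: 'c;

    fn iter<'c>(&'c self) -> Self::Iterator<'c> {
        // The path form selects the inherent `BTreeSet::iter`.
        BTreeSet::iter(self)
    }
}

// Hypothetical helper: labels an element by its position and the total count.
fn process(index: usize, count: usize) -> String {
    format!("{}/{}", index + 1, count)
}

fn count_twice<I: Iterable>(collection: &I) -> Vec<String> {
    let mut count = 0;
    for _ in collection.iter() {
        count += 1;
    }

    // Second pass over the same collection, now knowing the full count.
    let mut labels = Vec::new();
    for (index, _elem) in collection.iter().enumerate() {
        labels.push(process(index, count));
    }
    labels
}
```

Calling `count_twice(&vec![10, 20, 30])` labels the three elements as `1/3`, `2/3`, and `3/3`; the same function accepts a `&BTreeSet<T>` without changes.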

Play with the code from this section on the Rust playground.

GATs in where clauses

available on nightly stabilization

Now that we have defined an Iterable trait, we can explore different ways to reference it in where clauses.

Specifying the value of a GAT

Given some type T: Clone, this function takes any Iterable that yields &T references, clones them, and returns a vector of the resulting T values:

#![allow(unused)]
fn main() {
fn into_vec<T>(
    iterable: &impl for<'a> Iterable<Item<'a> = &'a T>,
) -> Vec<T>
where
    T: Clone
{
    let mut out = vec![];
    for elem in iterable.iter() {
        out.push(elem.clone());
    }
    out
}
}

Let's look at this function more closely. The most interesting part is the type of the iterable parameter:

#![allow(unused)]
fn main() {
iterable: &impl for<'a> Iterable<Item<'a> = &'a T>,
//              ^^^^^^^          ^^^^^^^^
}

This admittedly verbose syntax is a way of saying:

  • iterable is some kind of Iterable that, when iterated with some lifetime 'a, yields up values of type &'a T.

The for<'a> binder is a way of making this bound apply for any lifetime, rather than talking about some specific lifetime.

Applying GATs to a specific lifetime

The previous example showed an iterable applied to any lifetime. It is also possible to give bounds for some specific lifetime. This function, for example, takes an iterable with lifetime 'i and yields up the first element:

#![allow(unused)]
fn main() {
fn first<'i, T>(
    iterable: &'i impl Iterable<Item<'i> = &'i T>,
) -> Option<&'i T>
{
    iterable.iter().next()
}
}

The bound impl Iterable<Item<'i> = &'i T> says "when iterated with lifetime 'i, the resulting reference is &'i T".
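
To see both styles of bound in use, the sketch below repeats the `Iterable` trait, implements it for `Vec<T>` using the standard library's `std::slice::Iter` (an assumption made to keep the example short), and calls `into_vec` and `first` on the same vector. The `demo` function is a hypothetical driver added for illustration.

```rust
trait Iterable {
    type Item<'collection>
    where
        Self: 'collection;

    type Iterator<'collection>: Iterator<Item = Self::Item<'collection>>
    where
        Self: 'collection;

    fn iter<'c>(&'c self) -> Self::Iterator<'c>;
}

impl<T> Iterable for Vec<T> {
    type Item<'c> = &'c T where Self: 'c;
    type Iterator<'c> = std::slice::Iter<'c, T> where Self: 'c;

    fn iter<'c>(&'c self) -> Self::Iterator<'c> {
        self.as_slice().iter()
    }
}

// Bound that must hold for every lifetime 'a.
fn into_vec<T>(iterable: &impl for<'a> Iterable<Item<'a> = &'a T>) -> Vec<T>
where
    T: Clone,
{
    let mut out = vec![];
    for elem in iterable.iter() {
        out.push(elem.clone());
    }
    out
}

// Bound that only needs to hold for the one lifetime 'i of the borrow.
fn first<'i, T>(iterable: &'i impl Iterable<Item<'i> = &'i T>) -> Option<&'i T> {
    iterable.iter().next()
}

fn demo() -> (Vec<i32>, Option<i32>) {
    let numbers = vec![1, 2, 3];
    (into_vec(&numbers), first(&numbers).copied())
}
```

In both calls the compiler infers `T = i32` from the equality between the `Item` GAT and `&T`.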

Bounding a GAT

Sometimes we want to specify that the value of a GAT meets some additional trait bound. For example, maybe we wish to accept any Iterable, so long as its Item values implement Send. We can do that like so...

#![allow(unused)]
fn main() {
fn sendable_items<I>(iterable: &I)
where
    I: Iterable,
    for<'a> I::Item<'a>: Send,
{
}
}

Rough edges

available on nightly, stabilization in rust-lang/rust#96709 (https://github.com/rust-lang/rust/pull/96709), seeking feedback

The current MVP state includes a number of known rough edges, explained here.

| Edge | Brief explanation |
| --- | --- |
| Clumsy HRTB syntax | The syntax for<'a> T: Iterable<Item<'a> = ...> is clumsy |
| Required bounds | Compiler requires you to write where Self: 'me a lot |
| No implied bounds on HRTB | The for<'a> syntax should really mean "for any suitable 'a" to avoid false errors |
| HRTB limited to lifetimes | You cannot write for<T> or for<const C> to quantify over types or constants |
| Polonius interaction | Some GAT patterns require polonius to pass the borrow checker |
| Not dyn safe | Traits with GATs are not dyn safe, even if those GATs are limited to lifetimes |

Required bounds

available on nightly, stabilization in rust-lang/rust#96709 (https://github.com/rust-lang/rust/pull/96709), seeking feedback

We are actively soliciting feedback on the design of this aspect of GATs. This section explains the current nightly behavior, but at the end there is a note about the behavior we expect to adopt in the future.

A common use for GATs is to represent the lifetime of data that is borrowed from a value of type Self, or some other parameter. Consider the iter method of the Iterable trait:

#![allow(unused)]
fn main() {
trait Iterable {
    ...

    fn iter<'a>(&'a self) -> Self::Iterator<'a>;
}
}

Here, the argument 'a that is given to Self::Iterator represents the lifetime of the self reference. However, by default, there is nothing in the definition of the Iterator associated type that links its lifetime argument and the Self type:

#![allow(unused)]
fn main() {
// Warning: For reasons we are in the midst of explaining,
// this version of the trait will not compile.
trait Iterable {
    type Item<'me>;

    type Iterator<'me>: Iterator<Item = Self::Item<'me>>;

    fn iter<'a>(&'a self) -> Self::Iterator<'a>;
}
}

If you try to compile this trait, you will find that you get an error. To make it compile, you have to indicate the link between 'me and Self by adding where Self: 'me. This outlives bound indicates the GATs can only be used with a lifetime 'me that could legally be used to borrow Self. This version compiles:

#![allow(unused)]
fn main() {
trait Iterable {
    type Item<'me>
    where
        Self: 'me;

    type Iterator<'me>: Iterator<Item = Self::Item<'me>>
    where
        Self: 'me;

    fn iter<'a>(&'a self) -> Self::Iterator<'a>;
}
}

Why are these bounds required?

Without these bounds, users of the trait would almost certainly not be able to write the impls that they need to write. Consider an implementation of Iterable for Vec<T>, assuming that there are no where Self: 'me bounds on the GATs:

#![allow(unused)]
fn main() {
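// Warning: this impl is missing its where clauses and will not compile.
impl<T> Iterable for Vec<T> {
    type Item<'c> = &'c T;
    type Iterator<'c> = std::slice::Iter<'c, T>;
    // Go through the slice so this does not call `Iterable::iter` recursively.
    fn iter(&self) -> Self::Iterator<'_> { self.as_slice().iter() }
}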
}

The problem comes from the associated types. Consider the type Item<'c> = &'c T declaration, for example: for the type &'c T to be legal, we must know that T: 'c. Otherwise, nothing stops us from using Self::Item<'static> to construct a reference with a lifetime 'static that may outlive its referent T, and that can lead to unsoundness in the type system. In the case of the iter method, the fact that it takes a parameter self of type &'c Vec<T> already implies that T: 'c (otherwise that parameter would have an invalid type). However, that doesn't apply to the GAT Item. This is why the associated types need a where clause:

#![allow(unused)]
fn main() {
impl<T> Iterable for Vec<T> {
    type Item<'c> = &'c T where Self: 'c;
    type Iterator<'c> = std::slice::Iter<'c, T> where Self: 'c;
    fn iter(&self) -> Self::Iterator<'_> { self.as_slice().iter() }
}
}

However, this impl is not legal unless the trait also has a where Self: 'c requirement. Otherwise, the impl has more where clauses than the trait, and that causes a problem for generic users that don't know which impl they are using.
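Putting the two halves together, here is a minimal sketch of the trait and the Vec<T> impl with matching bounds on both sides. The count_items helper is hypothetical and only there to show a generic caller; the impl goes through as_slice so that the call resolves to the slice's own iter method rather than back to Iterable::iter.

```rust
// Trait with the required `where Self: 'me` bounds on both GATs.
trait Iterable {
    type Item<'me>
    where
        Self: 'me;

    type Iterator<'me>: Iterator<Item = Self::Item<'me>>
    where
        Self: 'me;

    fn iter<'a>(&'a self) -> Self::Iterator<'a>;
}

// The impl repeats the same bounds. `Self: 'c` here means `Vec<T>: 'c`,
// which gives `T: 'c`, so `&'c T` is a well-formed type.
impl<T> Iterable for Vec<T> {
    type Item<'c> = &'c T where Self: 'c;
    type Iterator<'c> = std::slice::Iter<'c, T> where Self: 'c;

    fn iter<'a>(&'a self) -> Self::Iterator<'a> {
        // Call the slice method explicitly instead of recursing into
        // `Iterable::iter`.
        self.as_slice().iter()
    }
}

// Hypothetical helper: a generic user that does not know which impl it has.
fn count_items<I: Iterable>(collection: &I) -> usize {
    collection.iter().count()
}

fn main() {
    let v = vec![10, 20, 30];
    println!("{} items", count_items(&v));
}
```

Because the bound appears in the trait, a generic function like count_items can rely on it without knowing anything about the concrete impl.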

Where can I learn more?

You can learn more about the precise rules of this decision, as well as the motivations, by visiting the detailed design page.

Feedback requested!

The current compiler enforces the required bounds (the possible future defaults) as a hard error precisely so that we can get the attention of early users and find out if these where clauses pose any kind of problem. We are not sure yet what long term path is best:

  • Remove the required bounds altogether.
  • Remove the required bounds and replace them with a warn-by-default lint, allowing users to more easily opt out.
  • Add the required bounds by default so you don't have to write them explicitly.

If you are finding that you have a trait and impls that you believe would compile fine, but don't because of these where clauses, then we would like to hear from you! Please file an issue on this repository, and use the "Feedback on required bounds" template. In the meantime, there is a workaround described in the next section that should allow any trait to work.

Workaround

If you find that this requirement is causing you a problem, there is a workaround. You can refactor your trait into two traits. For example, to write a version of Iterable that doesn't require where Self: 'me, you might do the following:

#![allow(unused)]
fn main() {
trait IterableTypes {
    type Item<'me>;
    type Iterator<'me>: Iterator<Item = Self::Item<'me>>;
}

trait Iterable: IterableTypes {
    fn iter<'a>(&'a self) -> Self::Iterator<'a>;
}
}

This is a bit heavy-handed, but there's a logic to it: the rules are geared around ensuring that the associated types and methods that appear together in a single trait will work well together. By separating the associated types from the function into distinct traits, you are effectively asserting that the associated types are meant to be used independently from the function and hence it doesn't necessarily make sense to have the where clauses derived from the method signature.
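As a rough illustration of when the split is useful, here is a sketch with an impl whose associated types never borrow from Self, so no outlives bound is needed on them. The UpTo type is hypothetical and exists only for this example.

```rust
// Associated types live in their own trait, so no `where Self: 'me`
// bounds are required on them.
trait IterableTypes {
    type Item<'me>;
    type Iterator<'me>: Iterator<Item = Self::Item<'me>>;
}

// The method lives in a second trait that builds on the first.
trait Iterable: IterableTypes {
    fn iter<'a>(&'a self) -> Self::Iterator<'a>;
}

// Hypothetical collection that yields the numbers 0..limit by value.
struct UpTo {
    limit: u32,
}

impl IterableTypes for UpTo {
    // Neither type mentions `'me`: nothing is borrowed from `Self`.
    type Item<'me> = u32;
    type Iterator<'me> = std::ops::Range<u32>;
}

impl Iterable for UpTo {
    fn iter<'a>(&'a self) -> Self::Iterator<'a> {
        0..self.limit
    }
}

fn main() {
    let numbers: Vec<u32> = UpTo { limit: 3 }.iter().collect();
    println!("{numbers:?}");
}
```

An impl that does hand out references into Self would still have to justify those references some other way, which is the situation the required bounds were designed for.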

No implied bounds on HRTB

🔮 Shiny future

status: speculative, no RFC

🚨 Warning: Speculation ahead. This section is a rewrite of the explainer that includes various improvements. These improvements are still in proposal form and will require RFCs and design work before they are stabilized.

You can also see a list of the specific proposals being used in this section.

Defining and implementing the Iterable trait with GATs

status: speculative, no RFC

🚨 Warning: Speculation ahead. This is the "shiny future" page that integrates various speculative features. To see how things work today, see the corresponding page on the explainer.

To express traits like Iterable, we can make use of generic associated types -- that is, associated types with generic parameters. Here is the complete Iterable trait:

#![allow(unused)]
fn main() {
trait Iterable {
    // Type of item yielded up; will be a reference into `Self`.
    type Item<'collection>;

    // Type of iterator we return. Will return `Self::Item` elements.
    type Iterator<'collection>: Iterator<Item = Self::Item<'collection>>;

    fn iter(&self) -> Self::Iterator<'_>;
    //      ^                        ^^
    //
    // Returns a `Self::Iterator` derived from `self`.
}
}

Let's walk through it piece by piece...

  • We added a 'collection parameter to Item. This represents "the specific collection that the Item is borrowed from" (or, if you prefer, the lifetime for which that collection is borrowed).
  • The same 'collection parameter is added to Iterator, indicating the collection that the iterator borrows its items from.
  • In the iter method, the value of 'collection comes from self, indicating that iter returns an Iterator linked to self.

Implementing the trait

Let's write an implementation of this trait. We'll implement it for the Vec<T> type; a &Vec<T> can be coerced into a &[T] slice, so we can re-use the slice Iter that we defined before (the playground link includes an impl of Iterable for [T] as well, but we'll use Vec here because it's more convenient).

#![allow(unused)]
fn main() {
// from before
struct Iter<'c, T> {
    data: &'c [T],
}

impl<T> Iterable for Vec<T> {
    type Item<'c> = &'c T;
    
    type Iterator<'c> = Iter<'c, T>;

    fn iter(&self) -> Self::Iterator<'_> {
        Iter { data: self }
    }
}
}

Invoking it

Now that we have the Iterable trait, we can reference it in our "count twice" function.

#![allow(unused)]
fn main() {
fn count_twice<I: Iterable>(collection: &I) {
    let mut count = 0;
    for _ in collection.iter() {
        count += 1;
    }

    for elem in collection.iter() {
        process(elem, count);
    }
}
}

and we can invoke that by writing code like count_twice(&vec![1, 2, 3, 4, 5, 6]).

Play with the code from this section on the Rust playground.

GATs in where-clauses

status: speculative, no RFC

🚨 Warning: Speculation ahead. This is the "shiny future" page that integrates various speculative features. To see how things work today, see the corresponding page on the explainer.

Now that we have defined an Iterable trait, we can explore different ways to reference it.

Specifying the value of a GAT

Given some type T: Clone, this function takes any Iterable that yields &T references, clones them, and returns a vector of the resulting T values:

#![allow(unused)]
fn main() {
fn into_vec<T>(
    iterable: &impl Iterable<Item<'_> = &T>,
) -> Vec<T>
where
    T: Clone
{
    let mut out = vec![];
    for elem in iterable.iter() {
        out.push(elem.clone());
    }
    out
}
}

Let's look at this function more closely. The most interesting part is the type of the iterable parameter:

#![allow(unused)]
fn main() {
fn into_vec<T>(
    iterable: &impl Iterable<Item<'_> = &T>,
//                                ^^    ^
//                                |     |
//                                |    Lifetime elided in output position
//                               Lifetime elided in input position          
) -> Vec<T>
}

Both '_ and &T-with-no-explicit-lifetime are examples of Rust's "lifetime elision" syntax. You're probably familiar with elision from functions like this one:

#![allow(unused)]
fn main() {
fn pick(c: &[T]) -> &T
//         ^        ^
//         |        |
//         |       Lifetime elided in output position
//        Lifetime elided in input position          
}

Whenever lifetimes are elided in input position, it means "pick any lifetime, I don't care". When they are elided in output position, it means "pick a lifetime from the inputs, or error if that's ambiguous". For functions, you can use a named lifetime to make the connection more explicit:

#![allow(unused)]
fn main() {
fn pick<'c>(c: &'c [T]) -> &'c T
}

In the same way, with GATs, we can use a named lifetime, bound with for, to make things more explicit:

#![allow(unused)]
fn main() {
fn into_vec<T>(
    iterable: &impl for<'c> Iterable<Item<'c> = &'c T>,
    //              -------               --     --
) -> Vec<T>
}

The for notation here is meant to read like "for any lifetime 'c, Item<'c> will be &'c T".

Applying GATs to a specific lifetime

The previous example showed an iterable applied to any lifetime. It is also possible to give bounds for some specific lifetime. This function, for example, takes an iterable with lifetime 'i and yields up the first element:

#![allow(unused)]
fn main() {
fn first<'i, T>(
    iterable: &'i impl Iterable<Item<'i> = &'i T>,
) -> Option<&'i T>
{
    iterable.iter().next()
}
}

The bound impl Iterable<Item<'i> = &'i T> says "when iterated with lifetime 'i, the resulting reference is &'i T".
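For comparison, here is a sketch of first that should work on a compiler that still requires the where Self: 'me bounds. It assumes the explainer's version of the trait and a Vec<T> impl; the extra T: 'i clause is spelled out here for clarity and is not in the signature above.

```rust
trait Iterable {
    type Item<'me>
    where
        Self: 'me;

    type Iterator<'me>: Iterator<Item = Self::Item<'me>>
    where
        Self: 'me;

    fn iter<'a>(&'a self) -> Self::Iterator<'a>;
}

impl<T> Iterable for Vec<T> {
    type Item<'c> = &'c T where Self: 'c;
    type Iterator<'c> = std::slice::Iter<'c, T> where Self: 'c;

    fn iter<'a>(&'a self) -> Self::Iterator<'a> {
        self.as_slice().iter()
    }
}

// The bound is stated for the one lifetime `'i`, the lifetime of the
// borrow passed in, rather than for every possible lifetime.
fn first<'i, T>(
    iterable: &'i impl Iterable<Item<'i> = &'i T>,
) -> Option<&'i T>
where
    T: 'i, // `&'i T` is only well-formed when `T: 'i`
{
    iterable.iter().next()
}

fn main() {
    let v = vec![7, 8, 9];
    println!("{:?}", first(&v));
}
```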

Bounding a GAT

Sometimes we want to specify that the value of a GAT meets some additional trait bound. For example, maybe we wish to accept any Iterable, so long as its Item values implement Send. The '_ notation we saw earlier can be used to do that quite easily:

#![allow(unused)]
fn main() {
fn sendable_items<I>(iterable: &I)
where
    I: Iterable,
    I::Item<'_>: Send, // 👈
{
}
}

Using another nightly feature (associated_type_bounds, tracked in #52662), you can also write the above more compactly:

#![allow(unused)]
fn main() {
fn sendable_items<I>(iterable: &I)
where
    I: Iterable<Item<'_>: Send>, // 👈
}
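The '_ shorthand in bounds is part of the speculative design. As a point of comparison, here is a sketch of how the same requirement can be written with an explicit for<'a> binder. It assumes the explainer's version of the trait with its where clauses; the function body and the usize return value are illustrative additions.

```rust
trait Iterable {
    type Item<'me>
    where
        Self: 'me;

    type Iterator<'me>: Iterator<Item = Self::Item<'me>>
    where
        Self: 'me;

    fn iter<'a>(&'a self) -> Self::Iterator<'a>;
}

impl<T> Iterable for Vec<T> {
    type Item<'c> = &'c T where Self: 'c;
    type Iterator<'c> = std::slice::Iter<'c, T> where Self: 'c;

    fn iter<'a>(&'a self) -> Self::Iterator<'a> {
        self.as_slice().iter()
    }
}

// Explicit higher-ranked form of `I::Item<'_>: Send`.
fn sendable_items<I>(iterable: &I) -> usize
where
    I: Iterable,
    for<'a> I::Item<'a>: Send,
{
    // Illustrative body: just count the items.
    iterable.iter().count()
}

fn main() {
    let v = vec![1i32, 2, 3];
    println!("{}", sendable_items(&v));
}
```

A Vec<std::rc::Rc<i32>> would be rejected by this bound, because its items are &Rc<i32> references and Rc is not Sync.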

Proposals

status: speculative, no RFC

Specific improvements being used in the shiny future section include:

Improvement | Summary
Concise HRTB syntax | permit T: Iterable<Item<'_> = &u32> or T::Item<'_>: Send instead of for<'a> T: Iterable<Item<'a> = &'a u32>
HRTB implied bounds | The for<'a> syntax in HRTB means "any suitable 'a" and not "any 'a at all"
Polonius | Polonius-style borrow checking
Default outlives bounds | Add default bounds for where Self: 'a when appropriate rather than requiring users to write them explicitly; add those same defaults to the impl

Design patterns

A natural question with GATs is to ask "what are they used for?" Because GATs represent a kind of "fundamental capability" of traits, though, that question can be difficult to answer in a short summary -- they can be used for all kinds of things! Therefore, this section attempts to answer by summarizing "design patterns" that we have seen in the wild that are enabled by GATs. These patterns are described through a "deep dive" into a particular example, often of a crate in the wild; but they represent patterns that could be extracted and applied in other cases.

List of projects using GATs

Over the years, many people have posted examples of how they would like to use GATs. compiler-errors compiled a mostly complete list which was posted to the stabilization issue. We reproduce that list here:

Projects using GATs

  • connector-x - uses GATs to provide zero-copy interfaces to load data from DBs.
  • kas - uses Generic Associated Types to avoid the unsafety around draw_handle and size_handle, fixing a possible memory-safety violation in the process.
    • "generic associated types remove a usage of unsafe (revealing a bug in the process), and are almost certainly the way forward (once the compiler properly supports this)"
  • raft-engine - uses GATs in a generic builder pattern

Blocked (at least in part) by GATs:

  • Rust in the linux kernel - https://github.com/Rust-for-Linux/linux/issues/2
  • udoprog/audio - "My goal is to author a set of traits and data structures that can be used across audio libraries in the ecosystem. I'm currently holding off on GATs landing, since that's needed to provide proper abstractions without incurring a runtime overhead."
  • graphene - "This would allow the result types of most Graph methods to be impl Iterator, such that implementors can use whatever implementation they want. To achieve this currently we are using Box<Iterator> as return types which is inefficient."
  • ion-rust - "Currently, delegation via our current APIs are difficult because we cannot have an associated type that can be constructed with a per invocation lifetime. This example illustrates this, and shows how we can work around this by the judicious use of unsafe code[...]"
  • proptest - GATs could be used to represent non-owned values
  • ink - GATs could be used to simplify macro codegen around trait impl
  • objc2 - Could use GATs to abstract over a generic Reference type, simplifying two methods into one
  • mpris-rs - GATs could be used to abstract over an iterator type
  • dioxus - "It allows some property of a node to depend on both the state of it's children and it's parent. Specifically this would make text wrapping, and overflow possible"
  • sophia_rs - An other way to go would be to have an "rdf-api" crate similar to what RDF/JS is doing for the RDF models and its commons extensions. And have Oxigraph and Sophia and hopefully the other RDF related libraries in Rust use it. But it might be hard to build a nice and efficient API without GAT.

Other miscellaneous mentions of GATs, or GATs blocked a rewrite but workarounds were found

  • veracruz - The workaround here is to require the associated-type implementations to all be "lifetime-less", which probably requires unsafe code in the implementations.
  • embedded-nal - Using a typestate pattern to represent UDP trait's states
  • linfa - For now though, there are several limitations to the implementation due to a lack of type-system features. For instance, it has to return a Vec of references to points because the lifetime to &self has to show up somewhere in the return type, and we don't have access to GATs yet. Basically, I get around these issues by performing more allocation than we should [...]
  • heed - Initial rewrite of a trait relied on GATs, was eventually worked around but has its own limitations?
  • ockam - https://github.com/build-trust/ockam/issues/1564
  • rust-imap - could benefit with a GATified Index trait
  • anchors - "GAT will let us skip cloning during map"
  • capnproto-rust - The main obstacle is that getting this to work (particularly with capnp::traits::Owned) probably requires generic associated types, and it's unclear when those will land on stable rust.
  • gamma - Could use GATs to return iterators instead of having to box them, possibly providing a more general API without the slowdown of boxing
  • dicom-rs - GAT could be used to representing lifetime of borrowed data
  • rust-multihash - Apparently could use GATs to get around const-generics limitations
  • libprio-rs - Doing better will require a feature called "generic associated types" (per the SO above). Unfortunately, GATs are not stabilized yet; it looks like they are set to be stabilized in a few months.
  • gluesql - Could use GATs to turn a trait into an associated type(i think)
  • pushgen - mentioned that things could be simplified by GATs (or RPITIT)
  • plexus - "Until GATs land in stable Rust, this change requires boxing iterators to put them behind StorageProxy."
  • tensorflow/rust - "It would be most natural to provide an iterator over the records, but no efficient implementation is possible without GATS which I think I was hoping would already be in the language by now."

General themes for why folks want GATs

The general themes for why folks want GATs include...

  • GATs to avoid boxing/cloning (achieving a more performant, zero-copy API)
  • GATs to represent lifetime relationships that can't be expressed using regular associated types (e.g. fn(&self) -> Self::Gat<'_>)
    • Some of these get around it by using unsafe code which could be removed with GATs
  • GATs as a manual desugaring of RPITIT
  • GATs to offer a more abstract/pluggable API
  • GATs to provide a cleaner, DRY-er codebase

Iterable, lending iterators, etc

available on nightly stabilization

Summary

Traits often contain methods that return data borrowed from self or some other argument. When the type of data is an associated type, it needs to include a lifetime that links with self from a calling method. For example, in the Iterable trait...

#![allow(unused)]
fn main() {
trait Iterable {
    type Item<'me>
    where
        Self: 'me;

    type Iter<'me>: Iterator<Item = Self::Item<'me>>
    where
        Self: 'me;

    fn iter(&self) -> Self::Iter<'_>;
}
}

...the Item and Iter associated types take a 'me parameter, which is linked to the self variable given when iter is called.

Details

There are many variants on this pattern:

  • Iterable, as shown above;
  • LendingIterator (and other LendingFoo) traits, which permit one to iterate over items but where the data may be stored within the iterator itself;
  • etc.

The where Self: 'me shown in the summary is (hopefully) a temporary limitation imposed by the current MVP. It indicates that the 'me lifetime can be used to borrow data from Self. Currently these where clauses are mandatory; they may be defaulted or made optional in the future. For a deeper explanation, see the required bounds page in the explainer.
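To make the LendingIterator variant concrete, here is a minimal sketch. The WindowsMut type and the prefix_sums helper are hypothetical names chosen for illustration; the point is that each item borrows from the iterator itself, which a plain Iterator cannot express.

```rust
trait LendingIterator {
    type Item<'a>
    where
        Self: 'a;

    // Each item may borrow from `self`, so only one can be alive at a time.
    fn next<'a>(&'a mut self) -> Option<Self::Item<'a>>;
}

// Hypothetical iterator over overlapping, mutable windows of a slice.
struct WindowsMut<'t, T> {
    slice: &'t mut [T],
    start: usize,
    window_size: usize, // assumed to be at least 1
}

impl<'t, T> LendingIterator for WindowsMut<'t, T> {
    type Item<'a> = &'a mut [T] where Self: 'a;

    fn next<'a>(&'a mut self) -> Option<Self::Item<'a>> {
        // `get_mut` returns `None` once fewer than `window_size` items remain.
        let window = self.slice[self.start..].get_mut(..self.window_size)?;
        self.start += 1;
        Some(window)
    }
}

// Turns `data` into its running totals by walking windows of size 2.
fn prefix_sums(data: &mut [i32]) {
    let mut windows = WindowsMut { slice: data, start: 0, window_size: 2 };
    while let Some(w) = windows.next() {
        w[1] += w[0];
    }
}

fn main() {
    let mut data = [1, 2, 3, 4];
    prefix_sums(&mut data);
    println!("{data:?}");
}
```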

Workarounds

Lacking this pattern, there are a number of common workarounds, each with downsides:

  • Use Box<dyn> values, as in graphene, though this adds dynamic dispatch overhead, inhibits inlining, and makes interactions with Send and Sync more complex;
  • Return a collection, like a Vec, as in metamolecular, though this results in unnecessary memory allocation;
  • Use HRTB, as rustc does, which is complex and leaks into your caller's signatures.

The "many modes" pattern

available on nightly stabilization

Summary

The "many modes" pattern is being able to take a single function and have it operate in multiple "modes". In the specific case examined here, the chumsky parsing library, GATs were used to make the parsing combinators generic over a mode (produce a result vs do not produce a result). This results in significant speedups, because producing a result when you don't need one is expensive at runtime.

To implement the "many modes" pattern, you often have a type representing each mode:

#![allow(unused)]
fn main() {
struct ParseMode;
struct CheckMode;
}

and then a trait that defines the effect of that type. This trait often has associated types that can, e.g., transform the result of a function executing in that mode. This associated type takes a generic parameter T representing the return type of the function in question:

#![allow(unused)]
fn main() {
trait Mode {
    /// Represents the *actual* output when a function that produces 
    /// `T` is processed in this mode.
    type Output<T>;
}
}

Details

This was originally posted as a comment on the issue thread.

The first example I looked at closely was the chumsky parsing library. This is leveraging a pattern that I would call the "many modes" pattern. The idea is that you have some "core function" but you want to execute this function in many different modes. Ideally, you'd like to define the modes independently from the function, and you'd like to be able to add more modes later without having to change the function at all. (If you're familiar with Haskell, monads are an example of this pattern; the monad specifies the "mode" in which some simple sequential function is executed.)

chumsky is a parser combinator library, so the "core function" is a parse function, defined in the Parser trait. Each Parser trait impl contains a function that indicates how to parse some particular construct in the grammar. Normally, this parser function builds up a data structure representing the parsed data. But sometimes you don't need the full results of the parse: sometimes you might just like to know if the parse succeeds or fails, without building the parsed version. Thus, the "many modes" pattern: we'd like to be able to define our parser and then execute it against one of two modes, emit or check. The emit mode will build the data structure, but check will just check if the parse succeeds.

In the past, chumsky only had one mode, so they always built the data structure. This could take significant time and memory. Adding the "check" mode lets them skip that, which is a significant performance win. Moreover, the modes are encapsulated within the library traits, and aren't visible to end-users. Nice!

How did chumsky model modes with GATs?

Chumsky added a Mode trait, encapsulated as part of their internals module. Instead of directly constructing the results from parsing, the Parser impls invoke methods on Mode with closures. This allows the mode to decide which parts of the parsing to execute and which to skip. So, in check mode, the Mode would decide not to execute the closure that builds the output data structure, for example.

Using this approach, the Parser trait does indeed have several 'entrypoint' methods, but they are all defaulted and just invoke a common implementation method called go:

#![allow(unused)]
fn main() {
pub trait Parser<'a, I: Input + ?Sized, E: Error<I::Token> = (), S: 'a = ()> {
    type Output;
    
    fn parse(&self, input: &'a I) -> Result<Self::Output, E> ... {
        self.go::<Emit>(...)
    }

    fn check(&self, input: &'a I) -> Result<(), E> ... {
        self.go::<Check>(...)
    }
    
    #[doc(hidden)]
    fn go<M: Mode>(&self, inp: &mut InputRef<'a, '_, I, E, S>) -> PResult<M, Self::Output, E>
    where
        Self: Sized;
}
}

Implementations of Parser just specify the go method. Note that the impls are, presumably, either contained within chumsky or generated by chumsky proc-macros, so the go method doesn't need to be documented. However, even if go were documented, the trait bounds certainly look quite reasonable. (The type of inp is a bit...imposing, admittedly.)

So how is the Mode trait defined? Just to focus on the GAT, the trait looks like this:

#![allow(unused)]
fn main() {
pub trait Mode {
    type Output<T>;
    ...
}
}

Here, the T represents the result type of "some parser parsed in this mode". GATs thus allow us to define a Mode that is independent from any particular Parser. There are two impls of Mode (also internal to chumsky):

  • Check, defined like struct Check; impl Mode for Check { type Output<T> = (); ... }. In other words, no matter what parser you use, Check just builds a () result (success or failure is propagated independently of the mode).
  • Emit, defined like struct Emit; impl Mode for Emit { type Output<T> = T; ... }. In Emit mode, the output is exactly what the parser generated.

Note that you could, in theory, produce other modes. For example, a Count mode that not only computes success/failure but counts the number of nodes parsed, or perhaps a mode that computes hashes of the resulting parsed value. Moreover, you could add these modes (and the defaulted methods in Parser) without breaking any clients.
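To see the shape of the pattern end to end, here is a heavily simplified sketch. It is not chumsky's actual code: the bind helper and the parse_digits function are stand-ins invented for this example, and only the Output<T> GAT and the Emit/Check split follow the description above.

```rust
trait Mode {
    /// The actual output when a function producing `T` runs in this mode.
    type Output<T>;

    /// Illustrative helper: run `f` only if this mode needs its result.
    fn bind<T, F: FnOnce() -> T>(f: F) -> Self::Output<T>;
}

struct Emit;
struct Check;

impl Mode for Emit {
    type Output<T> = T;

    fn bind<T, F: FnOnce() -> T>(f: F) -> Self::Output<T> {
        f() // build the value
    }
}

impl Mode for Check {
    type Output<T> = ();

    fn bind<T, F: FnOnce() -> T>(_f: F) -> Self::Output<T> {
        // The closure is never called, so nothing is built.
    }
}

// One "core function", written once and usable in every mode.
fn parse_digits<M: Mode>(input: &str) -> Option<M::Output<Vec<u8>>> {
    if input.is_empty() || !input.bytes().all(|b| b.is_ascii_digit()) {
        return None; // failure is reported the same way in all modes
    }
    Some(M::bind::<Vec<u8>, _>(|| {
        input.bytes().map(|b| b - b'0').collect()
    }))
}

fn main() {
    let emitted: Option<Vec<u8>> = parse_digits::<Emit>("2024");
    let checked: Option<()> = parse_digits::<Check>("2024");
    println!("{emitted:?} {checked:?}");
}
```

Adding a new mode in this sketch means adding one more impl of Mode; parse_digits does not change.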

How could you model this today?

I was trying to think how one might model this problem with traits today. All the options I came up with had significant downsides.

Multiple functions on the trait, or multiple traits. One obvious option would be to use multiple functions in the parse trait, or multiple traits:

#![allow(unused)]
fn main() {
// Multiple functions
trait Parser { fn parse(); fn check(); }

// Multiple traits
trait Parser: Checker { fn parse(); }
trait Checker { fn check(); }
}

Both of these approaches mean that defining a new combinator requires writing the same logic twice, once for parse and once for check, but with small variations, which is both annoying and a great opportunity for bugs. It also means that if chumsky ever wanted to define a new mode, they would have to modify every implementation of Parser (a breaking change, to boot).

Mode with a type parameter. You could try defining the mode trait with a type parameter, like so...

#![allow(unused)]
fn main() {
trait ModeFor<T> {
    type Output;
    ...
}
}

The go function would then look like

#![allow(unused)]
fn main() {
fn go<M: ModeFor<Self::Output>>(&self, inp: &mut InputRef<'a, '_, I, E, S>) -> PResult<M, Self::Output, E>
where
    Self: Sized;
}

In practice, though, this doesn't really work, for a number of reasons. One of them is that the Mode trait includes methods like combine, which take the output of many parsers, not just one, and combine them together. Good luck writing that constraint with ModeFor. But even ignoring that, lacking HRTB over types, the signature of go itself is incomplete. The problem is that, given some impl of Parser for some parser type MyParser, MyParser only knows that M is a valid mode for its particular output. But maybe MyParser plans to (internally) use some other parser combinators that produce different kinds of results. Will the mode M still apply to those? We don't know. We'd have to be able to write a HRTB like for<O> ModeFor<O>, which Rust doesn't support yet:

#![allow(unused)]
fn main() {
fn go<M: for<O> ModeFor<O>>(&self, inp: &mut InputRef<'a, '_, I, E, S>) -> PResult<M, Self::Output, E>
where
    Self: Sized;
}

But even if Rust did support it, you can see that the ModeFor<T> trait doesn't capture the user's intent as closely as the Mode trait from Chumsky did. The Mode trait was defined independently from all parsers, which is what we wanted. The ModeFor<T> trait is defined relative to some specific parser, and then it falls to the go function to say "oh, I want this to be a mode for all parsers" using a HRTB.

Using just HRTB (which, again, Rust doesn't have), you could define another trait...

#![allow(unused)]
fn main() {
trait Mode: for<O> ModeFor<O> {}

trait ModeFor<O> {}
}

...which would allow us to write M: Mode on go again, but it's hard to argue this is simpler than the original GAT variety. This extra ModeFor trait has a "code smell" to it; it's hard to understand why it is there. Whereas before, you implemented the Mode trait in just the way you think about it, with a single impl that applies to all parsers...

#![allow(unused)]
fn main() {
impl Mode for Check {
    type Output<T> = ();
    ...
}
}

...you now write an impl of ModeFor, where one "instance" of the impl applies to only one parser (which has output type O). It feels indirect:

#![allow(unused)]
fn main() {
impl<O> ModeFor<O> for Check {
    type Output = ();
    ...
}
}

How could you model this with RPITIT?

It's also been proposed that we should keep GATs, but only as an implementation detail for things like return position impl Trait in traits (RPITIT) or async functions. This implies that we could model the "many modes" pattern with RPITIT. If you look at the Mode trait, though, you'll see that this simply doesn't work. Consider the combine method, which takes the results from two parsers and combines them to form a new result:

#![allow(unused)]
fn main() {
fn combine<T, U, V, F: FnOnce(T, U) -> V>(
    x: Self::Output<T>,
    y: Self::Output<U>,
    f: F,
) -> Self::Output<V>;
}

How could we write this in terms of a function that returns impl Trait?

Other patterns

In this post, I went through the chumsky pattern in detail. I've not had time to dive quite as deep into other examples, but I've been reading through them and trying to extract out patterns. Here are a few patterns I extracted so far:

Did I miss something?

Maybe you see a way to express the "many modes" pattern (or one of the other patterns I cited) in Rust today that works well? Let me know by commenting on the thread.

(Since posting this, it occurs to me that one could probably use procedural macros to achieve some similar goals, though I think this approach would also have significant downsides.)

Generic scopes

available on nightly stabilization

Summary

APIs like std::thread::scope follow a "scope" pattern:

#![allow(unused)]
fn main() {
in_scope(|scope| {
    ... /* can use `scope` in here */ ...
})
}

In this pattern, the closure takes a scope argument whose type is somehow limited to the closure body, often by including a fresh generic lifetime ('scope, in the case of std::thread::scope). The closure is then able to invoke methods on scope. This pattern makes sense when there is some setup and teardown required before/after the scope (e.g., blocking until all the spawned threads terminate).

The "generic scopes" pattern encapsulates this "scoped closure" concept:

#![allow(unused)]
fn main() {
owner.with_scope(|scope| {
    ...
})
}

Here, the type of the scope that the closure receives will depend on the with_scope call, but it needs to include some fresh lifetime that is tied to the with_scope call itself.

Details

The generic scopes pattern arose from smithay and was demonstrated by this playground snippet.

In this case, the "owner" object is a renderer, and the "scope" call is called render. Clients invoke...

#![allow(unused)]
fn main() {
r.render(|my_frame| { ... })
}

...where the precise type of my_frame depends on the renderer. Frames often include thread-local information which should only be accessible during that callback.

Pointer types

available on nightly stabilization

Summary

GATs allow code to be generic over "references to data". This can include Rc vs Arc, but also other more abstract situations.
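As one possible reading of the Rc vs Arc case, here is a sketch of a "pointer family" trait. All of the names (PointerFamily, RcFamily, ArcFamily, Config) are hypothetical and chosen for this example; the idea is that a data structure picks its pointer type through a single type parameter.

```rust
use std::ops::Deref;
use std::rc::Rc;
use std::sync::Arc;

// A "family" of pointers: given any `T`, names a pointer type holding `T`.
trait PointerFamily {
    type Pointer<T>: Deref<Target = T> + Clone;

    fn new<T>(value: T) -> Self::Pointer<T>;
}

struct RcFamily;
struct ArcFamily;

impl PointerFamily for RcFamily {
    type Pointer<T> = Rc<T>;

    fn new<T>(value: T) -> Self::Pointer<T> {
        Rc::new(value)
    }
}

impl PointerFamily for ArcFamily {
    type Pointer<T> = Arc<T>;

    fn new<T>(value: T) -> Self::Pointer<T> {
        Arc::new(value)
    }
}

// Written once; callers choose `Rc` or `Arc` by choosing `P`.
struct Config<P: PointerFamily> {
    name: P::Pointer<String>,
}

impl<P: PointerFamily> Config<P> {
    fn new(name: &str) -> Self {
        Config { name: P::new(name.to_string()) }
    }

    fn name_len(&self) -> usize {
        self.name.len() // reaches `String` through the `Deref` bound
    }

    // Cloning the pointer shares the underlying string.
    fn share(&self) -> Self {
        Config { name: self.name.clone() }
    }
}

fn main() {
    let local = Config::<RcFamily>::new("local");
    let shared = Config::<ArcFamily>::new("shared");
    println!("{} {}", local.name_len(), shared.share().name_len());
}
```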

Details

(To be written.)

References

  • Pythonesque's comment covered one case where they wanted something like a pointer types pattern (I think) but had to work around it, as well as commits from Veloren that may be this pattern (but could also be "many modes").
  • evenyag writes about type-exercise-in-rust

Design discussions

This directory hosts notes on important design discussions along with their resolutions. In the table of contents, you will find the overall status:

  • ✅ -- Settled! Input only needed if you have identified a fresh consideration that is not covered by the write-up.
  • 💬 -- Under active discussion. Check the write-up, which may contain a list of questions or places where feedback is desired.
  • 💤 -- Paused. Not under active discussion, but we may be updating the write-up from time to time with details.

Outlives defaults

Summary

GATs as naively implemented have a major footgun. Given a trait like this...

#![allow(unused)]
fn main() {
trait Iterable {
    type Iter<'c>;

    fn iter<'s>(&'s self) -> Self::Iter<'s>;
}
}

...users would not be able to write a typical impl, e.g....

#![allow(unused)]
fn main() {
impl<T> Iterable for Vec<T> {
    type Iter<'c> = std::slice::Iter<'c, T>;

    fn iter(&self) -> Self::Iter<'_> {
        self.iter()
    }
}
}

This would not work because the type Iter<'c, T> is only well-formed if T: 'c, and that is not known on the associated type. How should we manage this?

Conclusion: Require "probably needed" where-clauses on GATs to be written explicitly

The original write-up and details of the discussion can be found here. The conclusion was to adopt the most conservative route and require users to explicitly write a set of where clauses on associated types. These where-clauses are deduced by examining the method signatures of methods that appear in the same trait, looking for relationships that hold within the methods and requiring those same relationships to be reproduced on the associated type.

In our example trait Iterable...

#![allow(unused)]
fn main() {
trait Iterable {
    type Iter<'c>;

    fn iter<'s>(&'s self) -> Self::Iter<'s>;
}
}

...the method iter returns <Self as Iterable>::Iter<'s> (written in fully qualified form), and we have that self: &'s Self. The parameter type implies that Self: 's, and therefore we require the bound where Self: 'c to be placed on the associated type:

#![allow(unused)]
fn main() {
trait Iterable {
    type Iter<'c>
    where
        Self: 'c; // required for trait to compile

    fn iter<'s>(&'s self) -> Self::Iter<'s>;
}
}

Rationale

The rationale for this decision is that it is the most forwards compatible one: we can opt to remove the required bounds later, and all code still works. We can also opt to add the required bounds by default later, and all existing code still works, it is merely more explicit than required.

Further reading

You can read more about this decision here:

Reference rules

The precise rules are as follows:

  • For every GAT G in a trait definition with generic parameters X0...Xn from the trait and Xn..Xm on the GAT... (e.g., Item or Iterator, in the case of Iterable, with generic parameters [Self] from the trait and ['me] from the GAT)
    • If, for every method in the trait... (e.g., iter, in the case of Iterable)
      • When the method signature (argument types, return type, where clauses) references G like <P0 as Trait<P1..Pn>>::G<Pn..Pm> (e.g., <Self as Iterable>::Iterator<'a>, in the iter method, where P0 = Self and P1 = 'a)...
        • we can show that Pi: Pj for two parameters on the reference to G, and Pi is not 'static (e.g., Self: 'a, in our example)
          • then the GAT must have Xi: Xj in its where clause list in the trait (e.g., Self: 'me).

Frequently asked questions

Can you work through the Iterable example in more detail?

You mean the reference example from this page? Sure! This trait requires a where Self: 'c clause on the associated type Iter...

#![allow(unused)]
fn main() {
trait Iterable {
    type Iter<'c>;

    fn iter<'s>(&'s self) -> Self::Iter<'s>;
}
}

...this occurs because:

  • the iter function references <Self as Iterable>::Iter<'s> in its return type
  • we can show that Self: 's in the method environment
  • and Self is not 'static (in fact, it's a type, not a lifetime)
  • when we translate Self: 's into the namespace of Iter, we wind up with Self: 'c, which is the required bound

Why do the rules ignore parameters equal to 'static?

Consider this example:

#![allow(unused)]
#![feature(generic_associated_types)]

fn main() {
trait X<'a> {
    type Y<'b>;
    fn foo(&self) -> Self::Y<'static>;
}
}

Without the special case for 'static, we would see that the return type includes

#![allow(unused)]
fn main() {
<Self as X<'a>>::Y<'static>
}

and then check that 'static: 'a (it does, unsurprisingly), and hence conclude that we need to preserve that relationship by adding a where 'b: 'a clause to the associated type. But that where clause isn't likely to help any impls type check. In fact, the fact that Self::Y<'static> can be hard-coded into the trait signature suggests that, for all impls, the value of Y must either (a) not reference 'b or else (b) only use 'b as part of some ref &'b T where T: 'static. So really there isn't much point to adding where-clauses relating 'b. You could imagine that an impl might want to have &'a &'b u32, and to rely on the fact that 'b: 'a in every case where it appears in the interface -- but right now, the only usage in the interface is 'static, and so that same type could just be &'a &'static u32, which would work fine.
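As a rough illustration of the carve-out, the sketch below implements the trait X with no where clauses on Y at all. Greeter and greeting are hypothetical names, and Rust 1.65 or later is assumed. Because the only use of Y in the trait is Self::Y<'static>, the impl is free to pick a type that mentions 'b without any relationship to 'a.

```rust
trait X<'a> {
    type Y<'b>; // no where clause is demanded here

    fn foo(&self) -> Self::Y<'static>;
}

// Hypothetical implementor, used only for illustration.
struct Greeter;

impl<'a> X<'a> for Greeter {
    // Case (b) from the text: `'b` only appears in `&'b T` with `T: 'static`.
    type Y<'b> = &'b str;

    fn foo(&self) -> Self::Y<'static> {
        "hello"
    }
}

fn greeting() -> &'static str {
    X::foo(&Greeter)
}

fn main() {
    println!("{}", greeting());
}
```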

How do you know you've gotten the exact rules for required bounds correct (for backcompat)?

Well, we are pretty sure, because our algorithm is quite general. It essentially looks for any patterns or relationships between parameters found in the method signatures of the trait, modulo the carve-out for 'static described in answer to the previous question. It's possible that we could find a source of relationships we haven't considered, or we could find that the carve-out masks something more common, but those seem unlikely, and regardless they would likely be quite obscure cases (and hence it may be possible to tweak the rules without affecting existing code, or tweak the rules in an edition).

Bounds against other parameters

The required bounds sometimes relate to parameters on the trait itself, and not the GAT parameters:

#![allow(unused)]
fn main() {
pub trait Get<'a> {
    type Item<'c>
    where
        Self: 'a; // <-- Required

    fn get(&'a self) -> Self::Item<'static>;
}
}

The reason for this is that the value of the Item type likely incorporates 'a and Self and relies on the relationships of those types:

#![allow(unused)]
fn main() {
pub trait Get<'a> {
    type Item<'c>
    where
        Self: 'a; // <-- Required

    fn get(&'a self) -> Self::Item<'static>;
}

impl<'a, 'b> Get<'a> for &'b [String] {
    type Item<'c> = &'a str;

    fn get(&'a self) -> Self::Item<'static> {
        &self[0]
    }
}
}

Why not issue default bounds against other associated types?

Hmm, good question! It turns out that the idea of default bounds is applicable beyond GATs. For example, you might have a trait

#![allow(unused)]
fn main() {
trait Iterator<'i> {
    type Item;

    fn foo(&'i self) -> Self::Item;
}
}

and the code could suggest that type Item wants a where-clause like where Self: 'i. After all, it will only be used in cases where &'i self is a valid type.

We actually tried to enable default bounds but found that it caused the compiler to fail to bootstrap. Interestingly the trait in question was found in gimli, and it turned out to be a case where the default bounds weren't wrong. They were expressing a lifetime relationship that the trait did require, but that relationship was being encoded on the trait in a different, arguably more roundabout way. The trait in question is the Object trait:

#![allow(unused)]
fn main() {
/// An object file.
pub trait Object<'data: 'file, 'file>: read::private::Sealed {
    ...
    type SectionIterator: Iterator<Item = Self::Section>;
    ...
    fn sections(&'file self) -> Self::SectionIterator;
}
}

The error here suggested adding where Self: 'file to the type SectionIterator. Interestingly, if you look closely at the trait header, you can see that it is 'data: 'file. This 'data lifetime turns out to be the lifetime of data that appears in Self. So e.g. an example impl looks like this:

#![allow(unused)]
fn main() {
impl<'data, 'file, R> Object<'data, 'file> for CoffFile<'data, R>
where
    'data: 'file,
    R: 'file + ReadRef<'data>,
{
    type Segment = CoffSegment<'data, 'file, R>;
    type SegmentIterator = CoffSegmentIterator<'data, 'file, R>;
    ...
}
}

In other words, the default bound of where Self: 'file was correct, but was being managed in a more complex way by the trait -- i.e., by adding a special lifetime ('data) into the trait signature that reflects "the lifetime of borrowed data in Self", and then relating that lifetime to 'file. In fact, the entire Object trait in gimli looks like it probably wanted to be a GAT, roughly like so:

#![allow(unused)]
fn main() {
/// An object file.
pub trait Object: read::private::Sealed {
    ...
    type SectionIterator<'file>: Iterator<Item = Self::Section>
    where
        Self: 'file;
    ...
    fn sections(&self) -> Self::SectionIterator<'_>;
}
}

To my eyes, this is unquestionably a simpler trait (and it fits what will likely become a fairly standard pattern).

Outlives defaults

Summary

We are moving towards stabilizing GATs (tracking issue: #44265) but there is one major ergonomic hurdle that we should decide how to manage before we go forward. In particular, a great many GAT use cases require a surprising where clause to be well-typed; this typically has the form where Self: 'a. In "English", this states that the GAT can only be used with some lifetime 'a that could've been used to borrow the Self type. This is because GATs are frequently used to return data owned by the Self type. It might be useful if we were to create some rules to add this where clause by default. Once we stabilize, changing defaults will be more difficult, and could require an edition; therefore it's better to evaluate the rules now.

I have an opinion! What should I do?

To make this decision in an informed way, what we need most are real-world examples and experience reports. If you are experimenting with GATs, for example, how often do you use where Self: 'a and how did you find out that it is necessary? Would the default proposals described below work for you? If not, can you describe the trait so we can understand why they would not work?

Of great use would be example usages that do NOT require where Self: 'a. It'd be good to be able to evaluate the various defaulting schemes and see whether they would interfere with the trait. Knowing the trait and a rough sketch of the impls would be helpful.

Background: what where clause now?

Consider the typical "lending iterator" example. The idea here is to have an iterator that produces values that may have references into the iterator itself (as opposed to references into the collection being iterated over). In other words, given a next method like fn next<'a>(&'a mut self), the returned items have to be able to reference 'a. The typical Iterator trait cannot express that, but GATs can:

#![allow(unused)]
fn main() {
trait LendingIterator {
    type Item<'a>;

    fn next<'b>(&'b mut self) -> Self::Item<'b>;
}
}

Unfortunately, this trait definition turns out to be not quite right in practice. Consider an example like this, an iterator that yields a reference to the same item over and over again (note that it owns the item it is referencing):

#![allow(unused)]
fn main() {
struct RefOnce<T> {
    my_data: T    
}

impl<T> LendingIterator for RefOnce<T> {
    type Item<'a> = &'a T; // rejected by the compiler, as explained below

    fn next<'b>(&'b mut self) -> Self::Item<'b> {
        &self.my_data
    }
}
}

Here, the type Item<'a> = &'a T declaration is actually illegal. Why is that? The assumption when authoring the trait was that 'a would always be the lifetime of the self reference in the next function, of course, but that is not in fact required. People can reference Item with any lifetime they want. For example, what if somebody wrote the type <RefOnce<T> as LendingIterator>::Item<'static>? In this case, T: 'static would have to be true, but T may in fact contain borrowed references. This is why the compiler gives you a "T may not outlive 'a" error (playground).

We can encode the constraint that "'a is meant to be the lifetime of the self reference" by adding a where Self: 'a clause to the type Item declaration. This is saying "you can only use a 'a that could be a reference to Self". If you make this change, you'll find that the code compiles (playground):

#![allow(unused)]
fn main() {
trait LendingIterator {
    type Item<'a> where Self: 'a;

    fn next<'b>(&'b mut self) -> Self::Item<'b>;
}
}
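Putting the corrected trait together with the RefOnce impl gives the following sketch, assuming Rust 1.65 or later and the stabilized placement of the where clause after the type in the impl. The helper first_two_lengths is a hypothetical name; it instantiates T with a borrowed &str to show that a non-'static T is accepted once Self: 'a is in place.

```rust
trait LendingIterator {
    type Item<'a>
    where
        Self: 'a;

    fn next<'b>(&'b mut self) -> Self::Item<'b>;
}

struct RefOnce<T> {
    my_data: T,
}

impl<T> LendingIterator for RefOnce<T> {
    // `Self: 'a` implies `T: 'a`, so `&'a T` is well-formed.
    type Item<'a> = &'a T
    where
        Self: 'a;

    fn next<'b>(&'b mut self) -> Self::Item<'b> {
        &self.my_data
    }
}

/// Hypothetical helper: lends the same item twice and measures it each time.
fn first_two_lengths(text: &str) -> (usize, usize) {
    // Here `T` is `&str`, a borrowed (non-'static) type.
    let mut iter = RefOnce { my_data: text };
    let a = iter.next().len();
    let b = iter.next().len();
    (a, b)
}

fn main() {
    println!("{:?}", first_two_lengths("gats"));
}
```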

When would you NOT want the where clause Self: 'a?

If the associated type cannot refer to data that comes from the Self type, then the where Self: 'a is unnecessary, and is in fact somewhat constraining.

Example: Output doesn't borrow from Self

In the Parser trait, the Output does not ultimately contain data borrowed from self:

#![allow(unused)]
fn main() {
trait Parser {
    type Output<'a>;
    fn parse<'a>(&mut self, data: &'a [u8]) -> Self::Output<'a>;
}
}

If you were to implement Parser for some reference type (in this case, &'b Dummy) you can now set Output to something that has no relation to 'b:

#![allow(unused)]
fn main() {
impl<'b> Parser for &'b Dummy {
    type Output<'a> = &'a [u8];

    fn parse<'a>(&mut self, data: &'a [u8]) -> Self::Output<'a> {
        data 
    }
}
}
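The sketch below makes the "no relation to 'b" point concrete, assuming Rust 1.65 or later. Dummy is given a trivial definition and parse_static is a hypothetical helper: the parser borrows a local Dummy, yet the output is 'static because it only borrows from the input data. A where Self: 'a clause on Output would rule this out.

```rust
trait Parser {
    type Output<'a>;
    fn parse<'a>(&mut self, data: &'a [u8]) -> Self::Output<'a>;
}

// Trivial stand-in for the `Dummy` type mentioned in the text.
struct Dummy;

impl<'b> Parser for &'b Dummy {
    type Output<'a> = &'a [u8];

    fn parse<'a>(&mut self, data: &'a [u8]) -> Self::Output<'a> {
        data
    }
}

/// Hypothetical helper: the parser is a short-lived borrow of a local,
/// but the output lives for `'static` because it only borrows `BYTES`.
fn parse_static() -> &'static [u8] {
    static BYTES: [u8; 3] = [1, 2, 3];
    let dummy = Dummy;
    let mut parser: &Dummy = &dummy;
    parser.parse(&BYTES)
}

fn main() {
    println!("{:?}", parse_static());
}
```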

Note that you would need a similar where clause if you were going to have a setup like:

#![allow(unused)]
fn main() {
trait Transform<Input> {
    type Output<'i>
    where
        Input: 'i;

    fn transform<'i>(&mut self, input: &'i Input) -> Self::Output<'i>;
}
}

Example: Iter static

In the previous example, the lifetime parameter for Output was not related to the self parameter. Are there (realistic) examples where the associated type is applied to the lifetime parameter from self but the where Self: 'a is not desired?

There are some, but they rely on having "special knowledge" of the types that will be used in the impl, and they don't seem especially realistic. The reason is that, if you have a GAT with a lifetime parameter, it is likely that the GAT contains some data borrowed for that lifetime! But if you use the lifetime of self, that implies we are borrowing some data from self -- however, it doesn't necessarily imply that we are borrowing data of any particular type. Consider this example:

#![allow(unused)]
fn main() {
use std::fmt::Display;

trait Message {
    type Data<'a>: Display;

    fn data<'b>(&'b mut self) -> Self::Data<'b>;

    fn default() -> Self::Data<'static>;
}

struct MyMessage<T> {
    text: String,
    payload: T,
}

impl<T> Message for MyMessage<T> {
    type Data<'a> = &'a str;
    // No requirement that `T: 'a`!

    fn data<'b>(&'b mut self) -> Self::Data<'b> {
        // In here, we know that `T: 'b`
        &self.text
    }

    fn default() -> Self::Data<'static> {
        "Hello, world"        
    }
}
}

Here the where T: 'a requirement is not necessary, and may in fact be annoying when invoking <MyMessage<T> as Message>::default() (as it would then require that T: 'static).

Another possibility is that the usage of <MyMessage<T> as Message>::Data<'static> doesn't appear inside the trait definition, although it is hard to imagine exactly how one writes a useful function like that in practice.
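For a concrete feel of why the extra bound would be annoying, here is a self-contained sketch of the Message example with a borrowed payload, assuming Rust 1.65 or later. The helper describe is a hypothetical name. It calls default() for a MyMessage<&str> whose payload is not 'static, which is only possible because Data carries no where Self: 'a clause.

```rust
use std::fmt::Display;

trait Message {
    type Data<'a>: Display;

    fn data<'b>(&'b mut self) -> Self::Data<'b>;

    fn default() -> Self::Data<'static>;
}

struct MyMessage<T> {
    text: String,
    payload: T,
}

impl<T> Message for MyMessage<T> {
    type Data<'a> = &'a str;

    fn data<'b>(&'b mut self) -> Self::Data<'b> {
        &self.text
    }

    fn default() -> Self::Data<'static> {
        "Hello, world"
    }
}

/// Hypothetical helper: `borrowed` is not `'static`, yet `default()` is usable.
fn describe(borrowed: &str) -> String {
    let mut msg = MyMessage { text: String::from("hi"), payload: borrowed };
    let own = msg.data().to_string();
    let fallback = <MyMessage<&str> as Message>::default();
    format!("{} / {} / {}", own, fallback, msg.payload.len())
}

fn main() {
    let local = String::from("abc");
    println!("{}", describe(&local));
}
```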

Alternatives

Status quo

We ship with no default. This kind of locks us into a box, because adding a default later would be a breaking change to existing impls that are affected by the default, since some of them may be using the associated types with a lifetime unrelated to Self. Note though that a sufficiently tailored default would only break code that was going to -- or perhaps very likely to -- not compile anyhow.

Smart default: add where Self: 'a if the GAT is used with the lifetime from &self (and extend to other type parameters)

Analyze the types of methods within the trait definition. If a GAT is applied to a lifetime 'x, examine the implied bounds of the method for bounds of the form T: 'x, where T is an input parameter to the trait. If we find such bounds on all methods for every use of the GAT, then add the corresponding default.

Consider the LendingIterator trait:

#![allow(unused)]
fn main() {
trait LendingIterator {
    type Item<'a>;

    fn next<'b>(&'b mut self) -> Self::Item<'b>;
}
}

Analyzing the signature of next, we see that it contains Self::Item<'b> where 'b is the lifetime of the self reference (e.g., self: &'b Self or self: &'b mut Self). The implied bounds of this method contain Self: 'b. Since there is only one use of Self::Item<'b>, and the implied bound Self: 'b applies in that case, we add the default where Self: 'a to the GAT.

This check is a fairly simple syntactic check, though not necessarily easy to explain. It would accept all the examples that appear in this document, including the example with fn default() -> Self::Data<'static> (in that case, the default is not triggered, because we found a use of Data that is applied to a lifetime for which no implied bound applies). The only case where this default behaves incorrectly is the case where all uses of Self::Data that appear within the trait need the default, but there are uses outside the trait that do not (I couldn't come up with a realistic example of how to do this usefully).

Extending to other type parameters

The inference can be extended naturally beyond self to other type parameters. Therefore this example:

#![allow(unused)]
fn main() {
trait Parser<Input> {
    type Output<'i>;

    fn get<'input>(&mut self, i: &'input Input) -> Self::Output<'input>;
}
}

would infer a where Input: 'i bound on type Output<'i>.

Similarly:

#![allow(unused)]
fn main() {
trait Parser<Input> {
    type Output<'i>;

    fn get(&mut self, i: &'input Input) -> Self::Output<'input>;
}
}

would infer a where Input: 'i bound on type Output<'i>.
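Written out by hand, the bound that this inference would produce looks like the sketch below, assuming Rust 1.65 or later. Identity and get_len are hypothetical names for illustration. The impl needs T: 'i so that &'i T is well-formed, which is exactly what the Input: 'i clause on the trait makes available.

```rust
trait Parser<Input> {
    type Output<'i>
    where
        Input: 'i; // the bound the "smart default" would have inferred

    fn get<'input>(&mut self, i: &'input Input) -> Self::Output<'input>;
}

// Hypothetical parser that simply hands the input back.
struct Identity;

impl<T> Parser<T> for Identity {
    type Output<'i> = &'i T
    where
        T: 'i;

    fn get<'input>(&mut self, i: &'input T) -> Self::Output<'input> {
        i
    }
}

/// Hypothetical helper exercising the impl with `Input = Vec<String>`.
fn get_len(words: &Vec<String>) -> usize {
    let mut parser = Identity;
    let out: &Vec<String> = parser.get(words);
    out.len()
}

fn main() {
    println!("{}", get_len(&vec![String::from("a"), String::from("b")]));
}
```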

Avoiding the default

If this default is truly not desired, there is a workaround: one can declare a supertrait that contains just the associated type. For example:

#![allow(unused)]
fn main() {
trait IterType {
    type Iter<'b>;
}

trait LendingIterator: IterType {
    fn next(&mut self) -> Self::Iter<'_>;
}
}

This workaround is not especially obvious, however.
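A sketch of the workaround in use, assuming Rust 1.65 or later: Countdown and next_static are hypothetical names. The items never borrow from Self, so the impl has no use for a Self: 'b bound, and Iter<'static> can be named even for a Countdown<'n> that is not 'static.

```rust
trait IterType {
    type Iter<'b>; // deliberately no `where Self: 'b`
}

trait LendingIterator: IterType {
    fn next(&mut self) -> Self::Iter<'_>;
}

// Hypothetical iterator whose items never borrow from `Self`.
struct Countdown<'n> {
    name: &'n str,
    remaining: u32,
}

impl<'n> IterType for Countdown<'n> {
    type Iter<'b> = Option<u32>;
}

impl<'n> LendingIterator for Countdown<'n> {
    fn next(&mut self) -> Self::Iter<'_> {
        if self.remaining == 0 {
            return None;
        }
        let current = self.remaining;
        self.remaining -= 1;
        Some(current)
    }
}

/// `Iter<'static>` is nameable although `Countdown<'n>` is not `'static`.
fn next_static<'n>(c: &mut Countdown<'n>) -> <Countdown<'n> as IterType>::Iter<'static> {
    c.next()
}

fn main() {
    let name = String::from("launch");
    let mut c = Countdown { name: &name, remaining: 2 };
    let first = next_static(&mut c);
    println!("{}: {:?}", c.name, first);
}
```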

We used to require T: 'a bounds in structs:

#![allow(unused)]
fn main() {
struct Foo<'a, T> {
    x: &'a T
}
}

but as of RFC 2093 we infer such bounds from the fields in the struct body. In this case, if we do come up with a default rule, we are essentially inferring the presence of such bounds by usages of the associated type within the trait definition.
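For comparison, the struct case compiles today without any explicit bound. The helper read below is a hypothetical name used only to exercise the struct.

```rust
struct Foo<'a, T> {
    x: &'a T, // `T: 'a` is inferred from this field; no where clause needed
}

/// Hypothetical helper: copies the referenced value out of `Foo`.
fn read<'a, T: Copy>(foo: &Foo<'a, T>) -> T {
    *foo.x
}

fn main() {
    let n = 7;
    println!("{}", read(&Foo { x: &n }));
}
```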

Recommendation

Niko's recommendation is to use the "smart defaults". Why? They basically always do the right thing, thus making Rust more supportive, at the cost of (theoretical) versatility. This seems like the right trade-off to me.

The counterargument would be: the rules are sufficiently complex, we can potentially add this later, and people are going to be surprised by this default when it "goes wrong" for them. It would be hard, but not impossible, to add a tailored error message for cases where the where T: 'b check fails.

Not sure about Jack's opinion. =)

Appendix A: Ruled out alternatives

Special syntax

We could use the 'self "keyword", permitted only in GATs, to indicate "a lifetime with the where clause where Self: 'self". The LendingIterator trait would therefore be written

#![allow(unused)]
fn main() {
trait LendingIterator {
    type Item<'self>;

    fn next(&mut self) -> Self::Item<'_>;
}
}

Forwards compatibility note: This option could be added later; note also that 'self is not currently valid.

Why not? 'self is an awfully suggestive syntax. It may be useful for things like self-referential structs. This just doesn't seem important enough.

Force people to write where Self: 'a

To buy time, we could force people to write where Self: 'a, so that we can later allow it to be elided. This unfortunately would eliminate a number of valid use cases for GATs (though they would later be supported).

Why not? Rules out a number of useful cases.

Dumb default: Always default to where Self: 'a

The most obvious default is to add where Self: 'a to the where clause list for any GAT with a lifetime parameter 'a, but that seems too crude. It will rule out all existing cases unless we add some form of "opt-out" syntax, for which we have no real precedent.

Why not? Rules out a number of useful cases.

Appendix B: Considerations

Appendix C: Other examples

Example: Ruma

#![allow(unused)]
fn main() {
// Every endpoint in the Matrix REST API has two request and response types in Ruma, one Incoming
// (used for deserialization) and one Outgoing (used for serialization). To avoid annoying clones when
// sending a request, most non-copy fields in the outgoing structs are references.
//
// The request traits have an associated type for the corresponding response type so things can be
// matched up properly.
pub trait IncomingRequest: Sized {
    // This is the current definition of the associated type I'd like to make generic.
    type OutgoingResponse: OutgoingResponse;
    // AFAICT adding a lifetime parameter is all I need.
    type OutgoingResponse<'a>: OutgoingResponse;

    // Other trait members... (not using Self::OutgoingResponse)
}
}

full definition

Appendix D: Examples

We go through several examples and document whether and why bounds are required.

Default bounds in return position

#![feature(generic_associated_types)]

trait Foo {
    type Item<'me>
    where
        Self: 'me; // <-- Required

    fn push_back<'a>(&'a mut self) -> Self::Item<'a>;
}

fn main() { }

The bound here is required because:

  • push_back returns a value of type Self::Item<'a>:
    • the &'a mut self parameter implies that Self: 'a
    • therefore, we require this bound be written on the trait

No default bounds in argument position

#![feature(generic_associated_types)]

trait Foo {
    type Item<'me>; 
    fn push_back1<'a>(&'a mut self, arg: Self::Item<'a>);
    fn push_back2<'a>(&'a mut self, arg: &mut Self::Item<'a>);
}

fn main() { }

No bounds here are required because Self::Item only appears in argument position.

Bounds against other parameters

The required bounds sometimes relate to parameters on the trait itself, and not the GAT parameters:

#![allow(unused)]
fn main() {
pub trait Get<'a> {
    type Item<'c>
    where
        Self: 'a; // <-- Required

    fn get(&'a self) -> Self::Item<'static>;
}
}

The reason for this is that the value of the Item type likely incorporates 'a and Self and relies on the relationships of those types:

#![allow(unused)]
fn main() {
pub trait Get<'a> {
    type Item<'c>
    where
        Self: 'a; // <-- Required

    fn get(&'a self) -> Self::Item<'static>;
}

impl<'a, 'b> Get<'a> for &'b [String] {
    type Item<'c> = &'a str;

    fn get(&'a self) -> Self::Item<'static> {
        &self[0]
    }
}
}

You may have noticed that Item didn't need to be a GAT at all here -- in fact, the same logic would apply to a trait with no lifetime parameters. However, adding a rule after the fact that requires users to write these bounds explicitly is not backwards compatible. We found crates that would stop compiling, such as gimli:

https://github.com/gimli-rs/object/blob/0a38064531fef4ddbaf93770a3551d333338980e/src/read/traits.rs#L24

#![allow(unused)]
fn main() {
/// An object file.
pub trait Object<'data: 'file, 'file>: read::private::Sealed {
    ...
    type SectionIterator: Iterator<Item = Self::Section>;
    // needs `Self: 'file`
    ...
    fn sections(&'file self) -> Self::SectionIterator;
}
}

Interestingly, if you look closely at the trait header, you can see that it is 'data: 'file. This 'data lifetime turns out to be the lifetime of data that appears in Self. So e.g. an example impl looks like this:

#![allow(unused)]
fn main() {
impl<'data, 'file, R> Object<'data, 'file> for CoffFile<'data, R>
where
    'data: 'file,
    R: 'file + ReadRef<'data>,
{
    type Segment = CoffSegment<'data, 'file, R>;
    type SegmentIterator = CoffSegmentIterator<'data, 'file, R>;
    ...
}
}

In other words, the default bound of where Self: 'file was correct, but was being managed in a more complex way by the trait -- i.e., by adding a special lifetime ('data) into the trait signature that reflects "the lifetime of borrowed data in Self", and then relating that lifetime to 'file. In fact, the entire Object trait in gimli looks like it probably wanted to be a GAT, roughly like so:

#![allow(unused)]
fn main() {
/// An object file.
pub trait Object: read::private::Sealed {
    ...
    type SectionIterator<'file>: Iterator<Item = Self::Section>
    where
        Self: 'file;
    ...
    fn sections(&self) -> Self::SectionIterator<'_>;
}
}

To my eyes, this is unquestionably a simpler trait (and it fits what will likely become a fairly standard pattern).

💬 Where does the where clause go?

This is a write-up of the conclusion to the [where does the where clause go?] question. To read more background, see the links section below.

Conclusion

Where clauses in generic associated types come after the value/binding

Example:

#![allow(unused)]
fn main() {
trait MyTrait {
    type MyType<T>: Iterator
    where
        T: Ord;
}
    
impl MyTrait for MyOtherType {
    type MyType<T> = MyIterator
    where
        T: Ord;
}
}

Effectively the = type in the impl replaces the : Bound from the declaration with the value that has to meet those bounds.
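A runnable variant of this example, assuming Rust 1.65 or later, is sketched below. The choice of btree_set::IntoIter<T> as the concrete iterator and the helper sorted_unique are assumptions made for illustration; the point is only the position of the where clause after the value in the impl.

```rust
use std::collections::{btree_set, BTreeSet};

trait MyTrait {
    type MyType<T>: Iterator
    where
        T: Ord;
}

struct MyOtherType;

impl MyTrait for MyOtherType {
    // The where clause follows the `= Type` part.
    type MyType<T> = btree_set::IntoIter<T>
    where
        T: Ord;
}

/// Hypothetical helper returning the associated type for some `T: Ord`.
fn sorted_unique<T: Ord>(items: Vec<T>) -> <MyOtherType as MyTrait>::MyType<T> {
    items.into_iter().collect::<BTreeSet<T>>().into_iter()
}

fn main() {
    let out: Vec<i32> = sorted_unique(vec![3, 1, 2, 1]).collect();
    println!("{:?}", out);
}
```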

Later phase: type aliases

Type aliases will eventually be aligned to this syntax as well:

#![allow(unused)]
fn main() {
type MyType<T> = Vec<T> where T: Ord;
}

Currently, however, where clauses on type aliases are ignored, so we will not stabilize this new syntax until they have the meaning we want.

Suggestions for users who put the where clause in the wrong place

We will parse where clauses in both positions and suggest to users that they be moved:

#![allow(unused)]
fn main() {
impl MyTrait for MyOtherType {
    type MyType<T>
    where
        T: Ord
    = MyIterator;
}
}

This gets an error with a suggested rewrite. The compiler proceeds "as if" the where clauses had been written after the = Type.

Where clause syntax for trait aliases will have to be revisited

As described in the FAQ below, we currently support trait alias syntax like

#![allow(unused)]
fn main() {
trait ReverseEq<T> = where T: PartialEq<Self>;
}

This syntax will be removed. Although its capabilities could be useful, it is also quite confusing (the placement of the where is a subtle distinction), and not clearly needed. If we find that we do want it, we can add in a similar syntax later, but hopefully in a way that is more broadly consistent with the language.

Discussion and FAQ

But isn't it inconsistent with other trait items to put the where clauses after the =?

From one perspective, yes. One can view the value of an associated type as its "body", and the where clauses typically come before the "body" of an item. Put another way, typically you can "copy and paste" the impl and then add some text to the end of each item to specify its value: but with this syntax, you have to edit the "middle" of an associated type to specify its value.

The analogy of an associated type value to a function body, however, is somewhat flawed. The value of an associated type needs to be considered part of the "signature", or public facing, part of the impl. Consider: you can change the body of a function and be certain that your callers will still compile, but you cannot do the same for the value of an associated type.

Given this perspective, when you copy the associated type from the trait to the impl, you are "completing" the signature that was left incomplete by the trait. Moreover, to do so, you replace the : Bound1 + Bound2 list (which constrains what kinds of types the impl might use) with a specific type, thus making it more specific.

What about a more purely syntactic point-of-view? What is more consistent?

There is precedent that the placement of the where clause has less to do with the logical role that it plays and more to do with other factors, like whether it is followed by a braced list of items:

  • With struct Foo<T> where T: Ord { t: T }, the "body" of the struct is its fields, and the where clause comes first.
  • But we write struct Foo<T>(T) where T: Ord, thus placing the "body" (the fields (T)) first and the where clause second. Moreover, we initially implemented the grammar struct Foo<T> where T: Ord (T) but this was deemed so obviously confusing that it was changed with little discussion.
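Both struct forms from the list above are accepted by the compiler today, as this small sketch shows. Named, Tuple, and larger are hypothetical names for illustration.

```rust
// Braced struct: the where clause comes before the "body".
struct Named<T>
where
    T: Ord,
{
    t: T,
}

// Tuple struct: the "body" `(T)` comes first, then the where clause.
struct Tuple<T>(T)
where
    T: Ord;

/// Hypothetical helper that uses both structs.
fn larger(a: Named<u32>, b: Tuple<u32>) -> u32 {
    a.t.max(b.0)
}

fn main() {
    println!("{}", larger(Named { t: 3 }, Tuple(5)));
}
```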

As further evidence that this syntax is inconsistent with Rust's traditions, placing the where clauses before the = ty makes it objectively hard to determine how to run rustfmt in a way that feels natural. rustfmt handles where by putting the where onto its own line, with one line per where clause. This structure works for existing Rust items because where clauses are always either followed by nothing (tuple structs) or by a braced ({}) list of items (e.g., struct fields, fn body, etc). That opening { can therefore go on its own line. This where clause formatting does not work well with =.

The idea of having where clauses come at the "end" of the signature is also supported by the original RFC, which motivated where clauses in part by describing how they allow you to treat the precise bounds as "secondary" to the "important stuff":

If I may step aside from the "impersonal voice" of the RFC for a moment, I personally find that when writing generic code it is helpful to focus on the types and signatures, and come to the bounds later. Where clauses help to separate these distinctions. Naturally, your mileage may vary. - nmatsakis

In the case of an impl specifying the value for an associated type, the "important stuff" is the value of the associated type.

What about trait aliases, don't they distinguish where clause placement?

As currently implemented, trait aliases have two distinct possible placements for where clauses, which effectively distinguishes between a where clause (which must be proven true in order to use the alias) and an implied bound (which is part of what the alias expands to). One can write:

#![allow(unused)]
fn main() {
trait Foo<T: Debug> = Bar<T> + Baz<T>;
}

in which case where X: Foo<Y> is only legal if Y: Debug is known from some other place. This is roughly equivalent to a trait like so:

#![allow(unused)]
fn main() {
trait Foo1<T: Debug>: Bar<T> + Baz<T> { }
}

The clause where X: Foo1<Y> is also only valid when Y: Debug is known. This is in contrast to the "supertraits" Bar<Y> and Baz<Y>, which are implied by X: Foo1<Y> ("supertraits" are also sometimes called "implied bounds").
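Since trait aliases are unstable, the distinction is easiest to try out with the ordinary trait Foo1. In the sketch below, the default methods on Bar and Baz, the Unit type, and the describe function are all hypothetical additions for illustration. The Y: Debug bound has to be restated by the caller, while the supertrait methods come for free from X: Foo1<Y>.

```rust
use std::fmt::Debug;

// Hypothetical supertraits with default methods so they can be called.
trait Bar<T> {
    fn bar(&self) -> &'static str {
        "bar"
    }
}

trait Baz<T> {
    fn baz(&self) -> &'static str {
        "baz"
    }
}

trait Foo1<T: Debug>: Bar<T> + Baz<T> {}

struct Unit;
impl Bar<u8> for Unit {}
impl Baz<u8> for Unit {}
impl Foo1<u8> for Unit {}

fn describe<X, Y>(x: &X) -> String
where
    X: Foo1<Y>,
    Y: Debug, // must be restated; without it `X: Foo1<Y>` is rejected
{
    // `Bar<Y>` and `Baz<Y>` are implied by `X: Foo1<Y>`.
    format!("{}{}", x.bar(), x.baz())
}

fn main() {
    println!("{}", describe::<Unit, u8>(&Unit));
}
```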

Alternatively, one can include the where clause in the "value" of the trait alias like so:

#![allow(unused)]
fn main() {
trait ReverseEq<T> = where T: PartialEq<Self>;
}

In this case, where X: ReverseEq<Y> is equivalent to Y: PartialEq<X>. There is no "equivalent trait" for usage like this; the T: PartialEq<Self> effectively acts like a supertrait or implied bound.

Our decision was that this is a subtle distinction and that using the placement of the where clause was not a great way to make it.

Is that trait alias syntax consistent with the rest of the language?

Not really. There are other places in the language that could benefit from a similar flexibility around implied bounds. For example, one could imagine wanting to have an associated type T::MyType<Y> where it is known that Y: PartialEq<T::MyType<Y>>, but this cannot be readily written with today's syntax:

#![allow(unused)]
fn main() {
trait MyTrait {
    type MyType<T>: PartialEq<T>;
    //              ^^^^^^^^^ not what we wanted
}
}

We decided that if we were going to offer that capability, we should find a way to offer it more generally, and hopefully with more clarity than putting the where clause before or after the =. As we have seen, where clauses for different kinds of items can be rather variable in their placement, so it is not clear that all users will recognize that distinction and understand it (many present in the meeting were surprised by the distinction as well).

Alternatively, the implied bounds proposal goes another way, turning most where clauses into implied bounds by default!

Why do you even need where clauses in the impl anyway?

Given that the where clauses appear in the trait, you might wonder why they are needed in the impl anyway. After all, the impl could just assume that the trait bounds are met when checking the value of the associated type for validity, making the whole issue moot.

This would however be inconsistent with other sorts of items, which do require users to copy over the bounds from the trait. Furthermore, we have discussed the idea of allowing impls to relax the bounds from those described in the trait if they are not needed in the impl -- this came up most recently in the context of allowing impls to forego the unsafe keyword for unsafe fn declared in the trait if the fn in the impl body is completely safe. This could then even be relied upon by people invoking the method who know the precise impl they will be using.

In short, this might be a reasonable choice to make, but we should make it uniformly, and it shuts down the direction of using the lack of bounds in the impl as a kind of signal.

Why not change type alias notation too?

Top-level type aliases currently parse with the where clause before the =:

#![allow(unused)]
fn main() {
type Foo<T> where T: Ord = T;
}

If you try the above example, however, you will find that you get a warning: this is because the where T: Ord is completely ignored! This is an implementation limitation in the way the current compiler eagerly expands type aliases. Moving the placement of where clauses actually gives us an opportunity to change this behavior without breaking any existing code, which is nice. It will however require some kind of opt-in (such as a cargo fix run) to migrate existing code that uses where clauses in the "ignored place" to the new format.

Where does the where clause go?

UPDATE

This document is retained for historical purposes. See the Where the Where conclusion for the most up-to-date conversation.

Summary

Proposed: to alter the syntax of where clauses on type aliases so that they appear after the value:

type StringMap<K> = BTreeMap<K, String>
where
    K: PartialOrd

This applies both in top-level modules and in traits (associated types, generic or otherwise).

Background

The current syntax for where to place the "where clause" of a generic associated type is awkward. Consider this example (playground):

#![allow(unused)]
fn main() {
trait Iterable {
    type Iter<'a> where Self: 'a;

    fn iter(&self) -> Self::Iter<'_>;
}

impl<T> Iterable for Vec<T> {
    type Iter<'a>
    where 
        Self: 'a = <&'a [T] as IntoIterator>::IntoIter;

    fn iter(&self) -> Self::Iter<'_> {
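        // Call the slice method; a bare `self.iter()` resolves to this trait method and recurses.
        self.as_slice().iter()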
    }
}
}

Note the impl. Most people expect the impl to be written as follows (indeed, the author wrote it this way in the first draft):

#![allow(unused)]
fn main() {
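impl<T> Iterable for Vec<T> {
    type Iter<'a> = <&'a [T] as IntoIterator>::IntoIter
    where
        Self: 'a;

    fn iter(&self) -> Self::Iter<'_> {
        // Call the slice method; a bare `self.iter()` resolves to this trait method and recurses.
        self.as_slice().iter()
    }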
}
}

However, this placement of the where clause is in fact rather inconsistent, since the = <&'a [T] as IntoIterator>::IntoIter is in some sense the "body" of the item.

The same current syntax is used for where clauses on type aliases (playground):

type Foo<T> where T: Eq = Vec<T>;

fn main() { }

Top-level type aliases

Currently, we accept where clauses in top-level type aliases, but they are deprecated (warning) and semi-ignored:

type StringMap<K> where
    K: PartialOrd
= BTreeMap<K, String>

Under this proposal, this syntax remains, but is deprecated. The newer syntax for type aliases (with where coming after the type) would remain feature gated until such time as we enforce the expected semantics.

Alternatives

Keep the current syntax.

In this case, we must settle the question of how we expect it to be formatted (surely not as I have shown it above).

#![allow(unused)]
fn main() {
impl<T> Iterable for Vec<T> {
    type Iter<'a> where Self: 'a 
        = <&'a [T] as IntoIterator>::IntoIter;

    fn iter(&self) -> Self::Iter<'_> {
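        // Call the slice method; a bare `self.iter()` resolves to this trait method and recurses.
        self.as_slice().iter()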
    }
}
}

Accept either

What do we do if both are supplied?

Are GATs too complex?

Question

Conclusion

Should GATs only support lifetime parameters?

Question

One possibility is to only stabilize GATs with lifetime parameters:

#![allow(unused)]
fn main() {
trait Iterable {
    type Item<'a>;
}
}

Conclusion

😕 FAQ