2020-03-12 shared library to represent Rust types

Shared library to represent Rust types

Notes from the meeting

The later sections of this document were largely prepared before the discussion. This section contains notes from the discussion itself. You can also read the complete discussion in our zulip-archive.

Conclusions

  • consensus from folks in the meeting:
    • are “cautiously optimistic” about this approach, and open to us trying to create PRs that move rustc in this direction
    • are ok with having a rusc-ty library that is published to crates.io
  • but there remains disagreement about the big picture question of how stable API boundaries can/should be between libraries, whether they should live in their own repos, and whether they should use unstable features (but that was not the direct topic of this meeting, so that’s ok)
  • sketch of a plan for how to do the transition:
    1. ty.kind -> ty.kind()
    2. ty.kind() -> ty.kind(tcx)
    3. Add a type alias and make kind work on I: Interner
    4. Start using Ty more (This is a lot of PRs)

Key or interesting points

Some key or unique points that were raised during the discussion:

Deferred questions

Some bikeshed or other detailed questions were raised but largely deferred:

Context

As part of library-ification, we need a library to represent Rust types. The intent is that this library can be used by chalk, rustc, rust-analyzer, and any other projects which need to represent Rust types.

Goals:

  • Do not hard-code the use of interning. Interning is a good fit for batch compilation but not such a good fit for IDEs, which need to be able to gradually release memory on an ongoing basis.
  • Ergonomic library that is pleasant to work with.
  • Simplify Rust’s types to “their essence”, while retaining full information

Goals for this meeting

  • Discuss the “high-level plan” for the design
  • Discuss the roadmap and initial steps to work towards this goal

Non-goals

  • We do not need to decide the fine details of the type (e.g., what variants there are, what to name the methods, etc) at this time. We will work those out through PRs and refactorings.
  • Figure out the mono-repo vs poly-repo question and where the shared crate should ultimately live (I sketch some interim solutions below)

Proposed end-goal: crate structure

This idea is part of a broader ‘library-ification’ effort. The goal is to build up independent crates for the various parts of rustc, like the trait system, type checker, name resolution, etc, that can be shared by both rustc and rust-analyzer.

This particular meeting is talking about what might be part of a crate that describes Rust types, which we’ll just call ty for now. This crate would be the foundation for a lot of other crates:

                          "rust-analyzer"
                               ^
                               |
ty --> "trait system" -----+---+--> "rustc"
 |          ^              |
 |          |              |
 +---> "type checker" -----+

In other words, we are talking about ultimately replacing the Ty<'tcx> type that rustc uses today with a type defined in this ty crate, modeled roughly after chalk’s Ty type.

Proposed end-goal: the basic pattern for representing types

The proposed design is described in some detail in the chalk book, so you may wish to read there. However, I do want to emphasize that the fine details (e.g., exactly which variants we should have) should not be considered normative.

In the proposed design, Rust types will ultimately be represented by a struct that looks like this, based on chalk_ir::Ty:

pub struct Ty<I: Interner> {
    interned: I::InternedType
}

impl<I: Interner> Ty<I> { ... }

Clearly, Ty<I> itself says very little. It is just a “shell” that wraps around the interned representation, which is defined via the Interner trait:

trait Interner {
    type InternedType;
    
    fn intern_ty(&self, data: TyData<Self>, flags: TyFlags) -> Self::InternedType;
    fn lookup_ty_data<'ty>(&self, ty: &'ty Self::InternedType) -> &'ty TyData<Self>;
    fn lookup_ty_flags<'ty>(&self, ty: &'ty Self::InternedType) -> TyFlags;
}

To actually use a type, you wish to invoke the data() method (you have to supply a reference to the interner):

impl<I: Interner> Ty<I> {
    pub fn data(&self, interner: &I) -> &TyData<I> {
        interner.lookup_ty_data(&self.interned)
    }

    pub fn flags(&self, interner: &I) -> &TyFlags<I> {
        interner.lookup_ty_flags(&self.interned)
    }
}

This method returns a reference to a TyData, which is the actual enum that defines the type variants. The definition that follows is taken from Chalk:

enum TyData<I: Interner> {
    Apply(ApplicationTy<I>),
    Placeholder(PlaceholderIndex),
    Dyn(DynTy<I>),
    Alias(AliasTy<I>),
    Function(Fn<I>),
    BoundVar(usize),
    InferenceVar(InferenceVar),
}

As noted before, don’t focus too closely on the precise set of variants. Chalk aimed for a very minimal set, and the idea was to justify each variant by explaining clearly why some bit of code might want to look for types of that variant, but I expect that we’ll ultimately want to find some “sweet spot” in between rustc’s approach (lots of variants) and chalk’s approach (minimal set).

The key thing to discuss here:

  • A Ty<I> type with an I: Interner parameter that defines its representation

Interner in Rustc

The interner in Rustc would be TyCtxt<'tcx>, and hence we’d have something like

impl<'tcx> Interner for TyCtxt<'tcx> {
  type InternedType = &'tcx TyData<Self>;
  
  fn intern_ty(&self, data: TyData<Self>) -> &'tcx TyData<Self> {
      self.interners.allocate(data)
  }
}

How to get there from here

Presently, rustc, chalk, and rust-analyzer each have their own type definitions. The goal is eventually that all three projects should share the same crate that defines the type definitions, using the above generic style. But how do we get to that goal? And where should that crate live? This ultimately plays into the mono- vs poly-repo discussion and I would rather not go there in this meeting. It seems like we can address the questions a bit later when we have made more progress.

Until then, I propose that we gradually refactor rustc so that the types in rustc approach the goal described above and ultimately factor out a rustc-ty crate that contains just the type definitions (and some supporting types). This should be a crate that can be published on crates.io (as well as living in tree). chalk can then depend on this crate. Once that is done, we could consider whether the crate should move to chalk (which would be moving towards polyrepo) or whether it should remain in tree, or whether we adopt some other, hybrid solution (as discussed here).

In more detail:

  • Rework the TyS API so that you access the variants by ty.data(tcx) and not ty.kind (similarly flags)
  • Introduce an Interner trait into rustc and refactor from type Ty<'tcx> = &'tcx TyS<'tcx> to type Ty<'tcx> = ty::Ty<TyCtxt<'tcx>>'
  • Extract the type stuff into a rustc-ty crate
    • this should have no dependencies on other parts of rustc and be something we can publish to crates.io
  • Make chalk depend on that crate

chalk integration

Note that while the above work is proceeding, we may also want to experiment with “chalk integration”. This would be done by having -Zchalk cause rustc to translate its types (however they may be represented) into the minimal chalk-ir types on demand. As rustc’s types come to match chalk-ir types (and/or chalk-ir types change to match rustc types), this would be a simpler and simpler translation and would eventually become a no-op once rustc + chalk share the same type definitions.

Where should this crate live?

This crate will start to re-raise the question of poly- vs mon-repo. I would like to dodge that particular discussion for now. What I’ve proposed above has a rustc-ty crate living in tree. This is obviously the right place to start, but ultimately I think we should consider the idea of having “type related code” living in a separate rust-lang repo (like rust-lang/chalk), which might ultimately include the representation of types, the trait solving code, and perhaps the type checker (and depend on some other repo which provides the HIR that gets type-checked). But obviously this would carry coordination costs and all of this seems far enough way I’d rather not discuss it in this meeting.