Summary

This RFC describes the design of a JSON output for the tool rustdoc, to allow tools to lean on its data collection and refinement but provide a different front-end.

Motivation

The current HTML output of rustdoc is often lauded as a key selling point of Rust. It’s a ubiquitous tool that you can use to easily find nearly anything you need to know about a crate. However, despite its versatility, its output format has some drawbacks:

  • Viewing this output requires a web browser, with (for some features of the output) a JavaScript interpreter.
  • The HTML output of rustdoc is explicitly not stabilized, to allow rustdoc developers the option to tweak the display of information, add new information, etc. In addition, it’s not generated with the intent of being scraped by users, which makes converting this HTML into a different format impractical. People are still able to build cool stuff on top of it, but parsing the HTML like that is unwieldy and limiting. For use cases like this, a stable, well-documented, easily parsable format with accessible semantic information would be far more useful.
  • As the HTML is the only available output of rustdoc, its integration into centralized, multi-language, documentation browsers is difficult.

In addition, rustdoc had JSON output in the past, but it failed to keep up with the changing language and was taken out in 2016. With rustdoc in a more stable position, it’s possible to re-introduce this feature and ensure its stability. This was brought up in 2018 with a positive response and there are several recent discussions indicating that it would be a useful feature.

In the draft RFC from 2018 there was some discussion of utilizing save-analysis to provide this information, but with RLS being replaced by rust-analyzer it’s possible that the feature will eventually be removed from the compiler. In addition, save-analysis output is just as unstable as the current HTML output of rustdoc, so a separate format is preferable.

Guide-level explanation

(Upon successful implementation/stabilization, this documentation should live in The Rustdoc Book.)

In addition to generating the regular HTML, rustdoc can create a JSON file based on your crate. This file can be used by other tools to take information about your crate and convert it into other output formats, insert it into centralized documentation systems, create language bindings, etc.

To get this output, pass the --output-format json flag to rustdoc:

$ rustdoc lib.rs --output-format json

This will output a JSON file in the current directory (by default). For example, say you have the following crate:

//! Here are some crate-level docs!

/// Here are some docs for `some_fn`!
pub fn some_fn() {}

/// Here are some docs for `SomeStruct`!
pub struct SomeStruct;

After running the above command, you should get a lib.json file like the following:

{
  "root": "0:0",
  "version": null,
  "includes_private": false,
  "index": {
    "0:3": {
      "crate_id": 0,
      "name": "some_fn",
      "source": {
        "filename": "lib.rs",
        "begin": [4, 0],
        "end": [4, 19]
      },
      "visibility": "public",
      "docs": "Here are some docs for `some_fn`!",
      "attrs": [],
      "kind": "function",
      "inner": {
        "decl": {
          "inputs": [],
          "output": null,
          "c_variadic": false
        },
        "generics": {...},
        "header": "",
        "abi": "\"Rust\""
      }
    },
    "0:4": {
      "crate_id": 0,
      "name": "SomeStruct",
      "source": {
        "filename": "lib.rs",
        "begin": [7, 0],
        "end": [7, 22]
      },
      "visibility": "public",
      "docs": "Here are some docs for `SomeStruct`!",
      "attrs": [],
      "kind": "struct",
      "inner": {
        "struct_type": "unit",
        "generics": {...},
        "fields_stripped": false,
        "fields": [],
        "impls": [...]
      }
    },
    "0:0": {
      "crate_id": 0,
      "name": "lib",
      "source": {
        "filename": "lib.rs",
        "begin": [1, 0],
        "end": [7, 22]
      },
      "visibility": "public",
      "docs": "Here are some crate-level docs!",
      "attrs": [],
      "kind": "module",
      "inner": {
        "is_crate": true,
        "items": [
          "0:4",
          "0:3"
        ]
      }
    }
  },
  "paths": {
    "0:3": {
      "crate_id": 0,
      "path": ["lib", "some_fn"],
      "kind": "function"
    },
    "0:4": {
      "crate_id": 0,
      "path": ["lib", "SomeStruct"],
      "kind": "struct"
    },
    ...
  },
  "extern_crates": {
    "9": {
      "name": "backtrace",
      "html_root_url": "https://docs.rs/backtrace/"
    },
    "2": {
      "name": "core",
      "html_root_url": "https://doc.rust-lang.org/nightly/"
    },
    "1": {
      "name": "std",
      "html_root_url": "https://doc.rust-lang.org/nightly/"
    },
    ...
  }
}

Reference-level explanation

(Upon successful implementation/stabilization, this documentation should live in The Rustdoc Book and/or an external crate’s Rustdoc.)

(Given that the JSON output will be implemented as a set of Rust types with serde serialization, the most useful docs for them would be on the 40 or so types themselves. By writing docs on those types, the Rustdoc page for that module would become a good reference. It may be helpful to provide some sort of schema for use with other languages.)

When you request JSON output from rustdoc, you’re getting a version of the Rust abstract syntax tree (AST), so you could see anything that you could export from a valid Rust crate. The following types can appear in the output:

ID

To provide various maps/references to items, the JSON output uses unique strings as IDs for each item. They happen to be the compiler internal DefId for that item, but in the JSON blob they should be treated as opaque as they aren’t guaranteed to be stable across compiler invocations. IDs are only valid/consistent within a single JSON blob. They cannot be used to resolve references between the JSON output of different crates (see the Resolving IDs section).

Crate

A Crate is the root of the outputted JSON blob. It contains all doc-relevant information about the local crate, as well as some information about external items that are referred to locally.

| Name | Type | Description |
|------|------|-------------|
| name | String | The name of the crate. If --crate-name is not given, the filename is used. |
| version | String | (Optional) The version string given to --crate-version, if any. |
| includes_private | bool | Whether or not the output includes private items. |
| root | ID | The ID of the root module Item. |
| index | Map<ID, Item> | A collection of all Items in the crate (see Resolving IDs). |
| paths | Map<ID, ItemSummary> | Maps all IDs (even external ones; see Resolving IDs) to a brief description including their name, crate of origin, and kind. |
| extern_crates | Map<int, ExternalCrate> | A map of “crate numbers” to metadata about that crate. |
| format_version | int | The version of the structure of this blob. The structure described by this RFC will be version 1, and it will be changed if incompatible changes are ever made. |
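
Since format_version only changes on incompatible changes, a consumer would likely want to check it before reading anything else. A minimal sketch of such a check follows; the constant and function names are made up for this illustration, and how the integer is read out of the JSON is left to whichever parser the tool uses.

```rust
/// The format version this hypothetical consumer was written against.
/// The RFC states that the structure it describes will be version 1.
pub const SUPPORTED_FORMAT_VERSION: u32 = 1;

/// Rejects blobs whose `format_version` differs from the supported one.
/// `found` is the integer read from the top-level `format_version` key.
pub fn check_format_version(found: u32) -> Result<(), String> {
    if found == SUPPORTED_FORMAT_VERSION {
        Ok(())
    } else {
        Err(format!(
            "unsupported format_version {} (this tool understands {})",
            found, SUPPORTED_FORMAT_VERSION
        ))
    }
}
```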

Resolving IDs

The crate’s index contains mostly local items, which includes impls of external traits on local types or local traits on external types. The exception to this is that external trait definitions and their associated items are also included in the index because this information is useful when generating the comprehensive list of methods for a type.

This means that many IDs aren’t included in the index (any reference to a struct, macro, etc. from a different crate). In these cases the fallback is to look up the ID in the crate’s paths. That gives enough information about the item to create cross references or simply provide a name without copying all of the information about external items into the local crate’s JSON output.
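
The lookup order described above can be sketched as follows. This is only an illustration, not rustdoc’s implementation: the Crate, Item, and ItemSummary structs are cut-down stand-ins that hold just the fields needed for the lookup, and resolve and Resolved are names invented for this example.

```rust
use std::collections::HashMap;

/// Cut-down stand-in for an entry in `index` (illustration only).
#[derive(Debug, PartialEq)]
pub struct Item {
    pub name: Option<String>,
    pub kind: String,
}

/// Cut-down stand-in for an entry in `paths`.
#[derive(Debug, PartialEq)]
pub struct ItemSummary {
    pub crate_id: u32,
    pub path: Vec<String>,
    pub kind: String,
}

/// Cut-down stand-in for the root object; IDs are kept as opaque strings.
pub struct Crate {
    pub index: HashMap<String, Item>,
    pub paths: HashMap<String, ItemSummary>,
}

/// What a consumer learns about an ID.
#[derive(Debug, PartialEq)]
pub enum Resolved<'a> {
    /// The full Item is available in `index`.
    Full(&'a Item),
    /// Only the brief description from `paths` is available
    /// (typically an item from another crate).
    Summary(&'a ItemSummary),
    /// The ID appears in neither map.
    Unknown,
}

/// Looks an ID up in `index` first and falls back to `paths`.
pub fn resolve<'a>(krate: &'a Crate, id: &str) -> Resolved<'a> {
    if let Some(item) = krate.index.get(id) {
        Resolved::Full(item)
    } else if let Some(summary) = krate.paths.get(id) {
        Resolved::Summary(summary)
    } else {
        Resolved::Unknown
    }
}
```

A summary hit still carries crate_id, which can be used as a key into extern_crates to build a cross reference.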

ExternalCrate

| Name | Type | Description |
|------|------|-------------|
| name | String | The name of the crate. |
| html_root_url | String | (Optional) The html_root_url for that crate, if it specifies one. |

ItemSummary

| Name | Type | Description |
|------|------|-------------|
| crate_id | int | A number corresponding to the crate this Item is from. Used as a key to the extern_crates map in Crate. A value of zero represents an Item from the local crate; any other number means that this Item is external. |
| path | [String] | The fully qualified path (e.g. ["std", "io", "lazy", "Lazy"] for std::io::lazy::Lazy) of this Item. |
| kind | String | What type of Item this is (see Item). |

Item

An Item represents anything that can hold documentation - modules, structs, enums, functions, traits, type aliases, and more. The Item data type holds fields that can apply to any of these, and leaves kind-specific details (like function args or enum variants) to the inner field.

| Name | Type | Description |
|------|------|-------------|
| crate_id | int | A number corresponding to the crate this Item is from. Used as a key to the extern_crates map in Crate. A value of zero represents an Item from the local crate; any other number means that this Item is external. |
| name | String | The name of the Item, if present. Some Items, like impl blocks, do not have names. |
| source | Span | (Optional) The source location of this Item. |
| visibility | String | "default", "public", or "crate" (but see Restricted visibility). |
| docs | String | The extracted documentation text from the Item. |
| links | Map<String, ID> | A map of intra-doc link names to the IDs of the items they resolve to. For example, if the docs string contained "see [HashMap][std::collections::HashMap] for more details" then links would have "std::collections::HashMap": "<some id>". |
| attrs | [String] | The unstable stringified attributes (other than doc comments) on the Item (e.g. ["#[inline]", "#[test]"]). |
| deprecation | Deprecation | (Optional) Information about the Item’s deprecation, if present. |
| kind | String | The kind of Item this is. Determines what fields are in inner. |
| inner | Object | The type-specific fields describing this Item. Check the kind field to determine what’s available. |

Restricted visibility

When using --document-private-items, pub(in path) items can appear in the output, in which case the visibility field will be an Object instead of a string. It will contain the single key "restricted", whose value is an Object with the following fields:

| Name | Type | Description |
|------|------|-------------|
| parent | ID | The ID of the module that this item’s visibility is restricted to. |
| path | String | How that module path was referenced in the code (like "super::super", or "crate::foo"). |
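
Because visibility is either a plain string or an object, a consumer may find it convenient to model it as a single enum. The sketch below is one possible modelling, not the types rustdoc will ship; the enum, its method names, and the choice to render "default" as an empty string are all assumptions made for this example.

```rust
/// One possible model of the `visibility` field (illustration only).
#[derive(Debug, PartialEq)]
pub enum Visibility {
    Default,
    Public,
    Crate,
    /// The object form: {"restricted": {"parent": ID, "path": String}}.
    Restricted { parent: String, path: String },
}

impl Visibility {
    /// Maps the plain-string form to a variant. The restricted form is an
    /// object rather than a string, so it is not handled here.
    pub fn from_plain(s: &str) -> Option<Visibility> {
        match s {
            "default" => Some(Visibility::Default),
            "public" => Some(Visibility::Public),
            "crate" => Some(Visibility::Crate),
            _ => None,
        }
    }

    /// Renders the visibility roughly as it would be written in source.
    /// "default" is shown as no qualifier at all (an assumption).
    pub fn as_source(&self) -> String {
        match self {
            Visibility::Default => String::new(),
            Visibility::Public => "pub".to_string(),
            Visibility::Crate => "pub(crate)".to_string(),
            Visibility::Restricted { path, .. } => format!("pub(in {})", path),
        }
    }
}
```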

kind == "module"

| Name | Type | Description |
|------|------|-------------|
| items | [ID] | The list of Items contained within this module. The order of definitions is preserved. |

kind == "function"

| Name | Type | Description |
|------|------|-------------|
| decl | FnDecl | Information about the function signature, or declaration. |
| generics | Generics | Information about the function’s type parameters and where clauses. |
| header | String | "const", "async", "unsafe", or a space separated combination of those modifiers. |
| abi | String | The ABI string on the function. Non-extern functions have a "Rust" ABI, whereas extern functions without an explicit ABI are "C". See the reference for more details. |
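
The header field packs several modifiers into one space-separated string, so a consumer will probably want to split it into flags. A small sketch, assuming only the three modifiers listed above can occur; FnHeader and parse_header are names invented for this example.

```rust
/// Flags extracted from the `header` string (illustration only).
#[derive(Debug, Default, PartialEq)]
pub struct FnHeader {
    pub is_const: bool,
    pub is_async: bool,
    pub is_unsafe: bool,
}

/// Splits a header such as "const unsafe" into individual flags.
/// An empty string (as in the `some_fn` example) yields all-false flags.
/// Any word other than the three documented modifiers is reported as an error.
pub fn parse_header(header: &str) -> Result<FnHeader, String> {
    let mut out = FnHeader::default();
    for word in header.split_whitespace() {
        match word {
            "const" => out.is_const = true,
            "async" => out.is_async = true,
            "unsafe" => out.is_unsafe = true,
            other => return Err(format!("unknown modifier: {}", other)),
        }
    }
    Ok(out)
}
```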

kind == "struct" || "union"

| Name | Type | Description |
|------|------|-------------|
| struct_type | String | Either "plain" for braced structs, "tuple" for tuple structs, or "unit" for unit structs. |
| generics | Generics | Information about the struct’s type parameters and where clauses. |
| fields_stripped | bool | Whether any fields have been removed from the result, due to being private or hidden. |
| fields | [ID] | The list of fields in the struct. All of the corresponding Items have kind == "struct_field". |
| impls | [ID] | All impls (both trait and inherent) for this type. All of the corresponding Items have kind == "impl". |

kind == "struct_field"

| Name | Type | Description |
|------|------|-------------|
| type | Type | The type of this field. |

kind == "enum"

| Name | Type | Description |
|------|------|-------------|
| generics | Generics | Information about the enum’s type parameters and where clauses. |
| fields | [ID] | The list of variants in the enum. All of the corresponding Items have kind == "variant". |
| fields_stripped | bool | Whether any variants have been removed from the result, due to being private or hidden. |
| impls | [ID] | All impls (both trait and inherent) for this type. All of the corresponding Items have kind == "impl". |

kind == "variant"

Has a variant_kind field with 3 possible values and a variant_inner field with more info if necessary:

  • "plain" (e.g. Enum::Variant) with no variant_inner value.
  • "tuple" (e.g. Enum::Variant(u32, String)) with "variant_inner": [Type]
  • "struct" (e.g. Enum::Variant{foo: u32, bar: String}) with "variant_inner": [ID] which is a list of this variant’s “struct_field” items.

kind == "trait"

| Name | Type | Description |
|------|------|-------------|
| is_auto | bool | Whether this trait is an auto trait like Sync. |
| is_unsafe | bool | Whether this is an unsafe trait such as GlobalAlloc. |
| items | [ID] | The list of associated items contained in this trait definition. |
| generics | Generics | Information about the trait’s type parameters and where clauses. |
| bounds | [GenericBound] | Trait bounds for this trait definition (e.g. trait Foo: Bar<T> + Clone). |

kind == "trait_alias"

An unstable feature which allows writing aliases like trait Foo = std::fmt::Debug + Send and then using Foo in bounds rather than writing out the individual traits.

| Name | Type | Description |
|------|------|-------------|
| generics | Generics | Any type parameters that the trait alias takes. |
| bounds | [GenericBound] | The list of traits after the equals. |

kind == "method"

| Name | Type | Description |
|------|------|-------------|
| decl | FnDecl | Information about the method signature, or declaration. |
| generics | Generics | Information about the method’s type parameters and where clauses. |
| header | String | "const", "async", "unsafe", or a space separated combination of those modifiers. |
| has_body | bool | Whether this is just a method signature (in a trait definition) or a method with an actual body. |

kind == "assoc_const"

These items only show up in trait definitions. When looking at a trait impl item, the item where the associated constant is defined is a "constant" item.

| Name | Type | Description |
|------|------|-------------|
| type | Type | The type of this associated const. |
| default | String | (Optional) The stringified expression for the default value, if provided. |

kind == "assoc_type"

These items only show up in trait definitions. When looking at a trait impl item, the item where the associated type is defined is a "typedef" item.

| Name | Type | Description |
|------|------|-------------|
| bounds | [GenericBound] | The bounds for this associated type. |
| default | Type | (Optional) The default for this type, if provided. |

kind == "impl"

| Name | Type | Description |
|------|------|-------------|
| is_unsafe | bool | Whether this impl is for an unsafe trait. |
| generics | Generics | Information about the impl’s type parameters and where clauses. |
| provided_trait_methods | [String] | The list of names for all provided methods in this impl block. This is provided for ease of access if you don’t need more information from the items field. |
| trait | Type | (Optional) The trait being implemented or null if the impl is “inherent”, which means impl Struct {} as opposed to impl Trait for Struct {}. |
| for | Type | The type that the impl block is for. |
| items | [ID] | The list of associated items contained in this impl block. |
| negative | bool | Whether this is a negative impl (e.g. !Sized or !Send). |
| synthetic | bool | Whether this is an impl that’s implied by the compiler (for auto traits, e.g. Send or Sync). |
| blanket_impl | String | (Optional) The name of the generic parameter used for the blanket impl, if this impl was produced by one. For example impl<T, U> Into<U> for T would result in blanket_impl == "T". |
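
A front-end that groups impls the way the HTML output does (inherent, trait, auto trait, blanket) could derive the group from the fields above. The sketch below shows one way to do that; ImplInfo, ImplKind, and classify are invented names, the trait is reduced to an optional name, and the order in which the flags are checked is an assumption rather than something this RFC specifies.

```rust
/// The subset of an impl's `inner` fields needed for grouping
/// (illustration only; `trait_name` stands in for the `trait` Type).
pub struct ImplInfo<'a> {
    pub trait_name: Option<&'a str>,
    pub negative: bool,
    pub synthetic: bool,
    pub blanket_impl: Option<&'a str>,
}

#[derive(Debug, PartialEq)]
pub enum ImplKind {
    Inherent,
    Synthetic,
    Blanket,
    Negative,
    Trait,
}

/// Picks a single group for an impl. The precedence used here
/// (synthetic, then blanket, then negative) is an arbitrary choice.
pub fn classify(info: &ImplInfo<'_>) -> ImplKind {
    match info.trait_name {
        // `trait` is null for `impl Struct {}`.
        None => ImplKind::Inherent,
        Some(_) if info.synthetic => ImplKind::Synthetic,
        Some(_) if info.blanket_impl.is_some() => ImplKind::Blanket,
        Some(_) if info.negative => ImplKind::Negative,
        Some(_) => ImplKind::Trait,
    }
}
```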

kind == "constant"

| Name | Type | Description |
|------|------|-------------|
| type | Type | The type of this constant. |
| expr | String | The unstable stringified expression of this constant. |
| value | String | (Optional) The value of the evaluated expression for this constant, which is only computed for numeric types. |
| is_literal | bool | Whether this constant is a bool, numeric, string, or char literal. |

kind == "static"

| Name | Type | Description |
|------|------|-------------|
| type | Type | The type of this static. |
| expr | String | The unstable stringified expression that this static is assigned to. |
| mutable | bool | Whether this static is mutable. |

kind == "typedef"

| Name | Type | Description |
|------|------|-------------|
| type | Type | The type on the right hand side of this definition. |
| generics | Generics | Any generic parameters on the left hand side of this definition. |

kind == "opaque_ty"

Represents trait aliases of the form:

type Foo<T> = Clone + std::fmt::Debug + Into<T>;

| Name | Type | Description |
|------|------|-------------|
| bounds | [GenericBound] | The trait bounds on the right hand side. |
| generics | Generics | Any generic parameters on the type itself. |

kind == "foreign_type"

inner contains no fields. This item represents a type declaration in an extern block (see here for more details):

extern {
    type Foo;
}

kind == "extern_crate"

| Name | Type | Description |
|------|------|-------------|
| name | String | The name of the extern crate. |
| rename | String | (Optional) The renaming of this crate with extern crate foo as bar. |

kind == "import"

| Name | Type | Description |
|------|------|-------------|
| source | String | The full path being imported (e.g. "super::some_mod::other_mod::Struct"). |
| name | String | The name of the imported item (may be different from the last segment of source due to import renaming: use source as name). |
| id | ID | (Optional) The ID of the item being imported. |
| glob | bool | Whether this import ends in a glob: use source::*. |

kind == "macro"

A macro_rules! declarative macro. Contains a single string with the source representation of the macro with the patterns stripped, for example:

macro_rules! vec {
    () => { ... };
    ($elem:expr; $n:expr) => { ... };
    ($($x:expr),+ $(,)?) => { ... };
}

TODO: proc macros

Span

| Name | Type | Description |
|------|------|-------------|
| filename | String | The path to the source file for this span relative to the crate root. |
| begin | (int, int) | The zero indexed line and column of the first character in this span. |
| end | (int, int) | The zero indexed line and column of the last character in this span. |
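
Since begin and end are both (line, column) pairs and end names the last character of the span, checking whether a source position falls inside a span reduces to a lexicographic tuple comparison. A minimal sketch, with a hypothetical contains helper that is not part of the proposed format:

```rust
/// Mirror of the Span fields described above (illustration only).
pub struct Span {
    pub filename: String,
    pub begin: (usize, usize),
    pub end: (usize, usize),
}

impl Span {
    /// Returns true if `pos` (line, column) lies within this span.
    /// Tuples compare lexicographically, so the line is compared first and
    /// the column only breaks ties. `end` is treated as inclusive because
    /// it is documented as the last character in the span.
    pub fn contains(&self, pos: (usize, usize)) -> bool {
        self.begin <= pos && pos <= self.end
    }
}
```

This can be used, for example, to find which Item a cursor position in an editor belongs to, provided the position uses the same indexing as the span.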

Deprecation

| Name | Type | Description |
|------|------|-------------|
| since | String | (Optional) Usually a version number when this Item first became deprecated. |
| note | String | (Optional) The reason for deprecation and/or what alternatives to use. |

FnDecl

| Name | Type | Description |
|------|------|-------------|
| inputs | [(String, Type)] | A list of parameter names and their types. The names are unstable because arbitrary patterns can be used as parameters, in which case the name is a pretty printed version of it. For example fn foo((_, x): (u32, u32)){…} would have a parameter with the name "(_, x)" and fn foo(MyStruct {some_field: u32, ..}: MyStruct){…} would have one called "MyStruct {some_field, ..}". |
| output | Type | (Optional) Output type. |
| c_variadic | bool | Whether this function uses an unstable feature for variadic FFI functions. |
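
To show how these three fields combine, here is a sketch that turns a FnDecl back into a signature string. To keep it short, types are assumed to have been rendered to strings already, so inputs holds (name, rendered type) pairs instead of (String, Type); the struct and render_signature are hypothetical names for this example.

```rust
/// Simplified FnDecl with types pre-rendered as strings (illustration only).
pub struct FnDecl {
    pub inputs: Vec<(String, String)>,
    pub output: Option<String>,
    pub c_variadic: bool,
}

/// Renders `fn name(pat: Type, ...) -> Ret` from the declaration.
/// Parameter names may be arbitrary pretty-printed patterns, so they are
/// emitted verbatim rather than parsed.
pub fn render_signature(name: &str, decl: &FnDecl) -> String {
    let mut params: Vec<String> = decl
        .inputs
        .iter()
        .map(|(pat, ty)| format!("{}: {}", pat, ty))
        .collect();
    if decl.c_variadic {
        // Variadic FFI functions end their parameter list with `...`.
        params.push("...".to_string());
    }
    let mut out = format!("fn {}({})", name, params.join(", "));
    // A null `output` means there is no written return type.
    if let Some(ret) = &decl.output {
        out.push_str(" -> ");
        out.push_str(ret);
    }
    out
}
```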

Generics

| Name | Type | Description |
|------|------|-------------|
| params | [GenericParamDef] | A list of generic parameter definitions (e.g. <T: Clone + Hash, U: Copy>). |
| where_predicates | [WherePredicate] | A list of where predicates (e.g. where T: Iterator, T::Item: Copy). |

Examples

Here are a few full examples of the Generics fields for different rust code:

Lifetime bounds

pub fn foo<'a, 'b, 'c>(a: &'a str, b: &'b str, c: &'c str)
where
    'a: 'b + 'c, {…}

"generics": {
  "params": [
    {
      "name": "'a",
      "kind": "lifetime"
    },
    {
      "name": "'b",
      "kind": "lifetime"
    },
    {
      "name": "'c",
      "kind": "lifetime"
    }
  ],
  "where_predicates": [
    {
      "region_predicate": {
        "lifetime": "'a",
        "bounds": [
          {
            "outlives": "'b"
          },
          {
            "outlives": "'c"
          }
        ]
      }
    }
  ]
}

Trait bounds

pub fn bar<T, U: Clone>(a: T, b: U)
where
    T: Iterator,
    T::Item: Copy,
    U: Iterator<Item=u32>, {…}

"generics": {
  "params": [
    {
      "name": "T",
      "kind": {
        "type": {
          "bounds": [],
          "synthetic": false
        }
      }
    },
    {
      "name": "U",
      "kind": {
        "type": {
          "bounds": [
            {
              "trait_bound": {
                "trait": {/* `Type` representation for `Clone`*/},
                "generic_params": [],
                "modifier": "none"
              }
            }
          ],
          "synthetic": false
        }
      }
    }
  ],
  "where_predicates": [
    {
      "bound_predicate": {
        "ty": {
          "generic": "T"
        },
        "bounds": [
          {
            "trait_bound": {
              "trait": {/* `Type` representation for `Iterator`*/},
              "generic_params": [],
              "modifier": "none"
            }
          }
        ]
      }
    },
    {
      "bound_predicate": {
        "ty": {/* `Type` representation for `Iterator::Item`*/},
        "bounds": [
          {
            "trait_bound": {
              "trait": {/* `Type` representation for `Copy`*/},
              "generic_params": [],
              "modifier": "none"
            }
          }
        ]
      }
    },
    {
      "bound_predicate": {
        "ty": {
          "generic": "U"
        },
        "bounds": [
          {
            "trait_bound": {
              "trait": {/* `Type` representation for `Iterator<Item=u32>`*/},
              "generic_params": [],
              "modifier": "none"
            }
          }
        ]
      }
    }
  ]
}

GenericParamDef

| Name | Type | Description |
|------|------|-------------|
| name | String | The name of the type variable of a generic parameter (e.g. T or 'static). |
| kind | Object | Either "lifetime", "const": Type, or "type": Object with the following fields: |

Fields of the "type" Object:

| Name | Type | Description |
|------|------|-------------|
| bounds | [GenericBound] | The bounds on this parameter. |
| default | Type | (Optional) The default type for this parameter (e.g. PartialEq<Rhs = Self>). |

WherePredicate

Can be one of the following 3 objects:

  • "bound_predicate": {"ty": Type, "bounds": [GenericBound]} for T::Item: Copy + Clone
  • "region_predicate": {"lifetime": String, "bounds": [GenericBound]} for 'a: 'b
  • "eq_predicate": {"lhs": Type, "rhs": Type}
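
The three predicate shapes map naturally onto a Rust enum. The sketch below models them and renders each back to where-clause text. It is heavily simplified: types and traits are plain strings instead of Type objects, bound modifiers and for<> parameters are ignored, all names are invented for this example, and the == used for equality predicates is only illustrative notation.

```rust
/// Simplified stand-in for GenericBound; the trait is kept as a plain name.
pub enum GenericBound {
    TraitBound(String),
    Outlives(String),
}

/// Simplified stand-in for WherePredicate; types are pre-rendered strings.
pub enum WherePredicate {
    BoundPredicate { ty: String, bounds: Vec<GenericBound> },
    RegionPredicate { lifetime: String, bounds: Vec<GenericBound> },
    EqPredicate { lhs: String, rhs: String },
}

/// Joins bounds with " + ", as they are written in source.
fn render_bounds(bounds: &[GenericBound]) -> String {
    bounds
        .iter()
        .map(|b| match b {
            GenericBound::TraitBound(name) => name.as_str(),
            GenericBound::Outlives(lifetime) => lifetime.as_str(),
        })
        .collect::<Vec<&str>>()
        .join(" + ")
}

/// Renders one predicate as it might appear in a where clause.
pub fn render_predicate(pred: &WherePredicate) -> String {
    match pred {
        WherePredicate::BoundPredicate { ty, bounds } => {
            format!("{}: {}", ty, render_bounds(bounds))
        }
        WherePredicate::RegionPredicate { lifetime, bounds } => {
            format!("{}: {}", lifetime, render_bounds(bounds))
        }
        // Illustrative notation only for equality predicates.
        WherePredicate::EqPredicate { lhs, rhs } => format!("{} == {}", lhs, rhs),
    }
}
```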

GenericBound

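Can be either "outlives": String (a lifetime, as in the lifetime bounds example above) or "trait_bound" with the following fields:

| Name | Type | Description |
|------|------|-------------|
| trait | Type | The trait for this bound. |
| modifier | String | Either "none", "maybe", or "maybe_const". |
| generic_params | [GenericParamDef] | for<> parameters used for HRTBs. |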

Type

Rustdoc’s representation of types is fairly involved. Like Items, they are represented by a "kind" field and an "inner" field with the related information. Here are the possible contents of that inner Object:

kind = "resolved_path"

This is the main kind that represents all user defined types.

| Name | Type | Description |
|------|------|-------------|
| name | String | The path of this type as written in the code ("std::iter::Iterator", "::module::Struct", etc.). |
| args | GenericArgs | (Optional) Any arguments on this type such as Vec<i32> or SomeStruct<'a, 5, u8, B: Copy, C = 'static str>. |
| id | ID | The ID of the trait/struct/enum/etc. that this type refers to. |
| param_names | GenericBound | If this type is of the form dyn Foo + Bar + ... then this field contains those trait bounds. |

GenericArgs

Can be either "angle_bracketed" with the following fields:

| Name | Type | Description |
|------|------|-------------|
| args | [GenericArg] | The list of each argument on this type. |
| bindings | [TypeBinding] | Associated type or constant bindings (e.g. Item=i32 or Item: Clone) for this type. |

or "parenthesized" (for Fn(A, B) -> C arg syntax) with the following fields:

| Name | Type | Description |
|------|------|-------------|
| inputs | [Type] | The Fn’s parameter types for this argument. |
| output | Type | (Optional) The return type of this argument. |

GenericArg

Can be one of the following 3 objects:

  • "lifetime": String
  • "type": Type
  • "const": Object where the object has a single key "constant" with value that’s the same object as the inner field of Item when kind == "constant"

TypeBinding

| Name | Type | Description |
|------|------|-------------|
| name | String | The name of the associated type or constant being bound (e.g. "Item"). |
| binding | Object | Either "equality": Type or "constraint": [GenericBound]. |

kind = "generic"

"inner" is a String which is simply the name of a type parameter.

kind = "tuple"

"inner" is a single list with the Types of each tuple item.

kind = "slice"

"inner" is the Type of the elements in the slice.

kind = "array"

| Name | Type | Description |
|------|------|-------------|
| type | Type | The Type of the elements in the array. |
| len | String | The length of the array as an unstable stringified expression. |

kind = "impl_trait"

"inner" is a single list of the GenericBounds for this type.

kind = "never"

Used to represent the ! type, has no fields.

kind = "infer"

Used to represent _ in type parameters, has no fields.

kind = "function_pointer"

| Name | Type | Description |
|------|------|-------------|
| is_unsafe | bool | Whether this is an unsafe fn. |
| decl | FnDecl | Information about the function signature, or declaration. |
| params | [GenericParamDef] | A list of generic parameter definitions (e.g. <T: Clone + Hash, U: Copy>). |
| abi | String | The ABI string on the function. |

kind = "raw_pointer"

| Name | Type | Description |
|------|------|-------------|
| mutable | bool | Whether this is a *mut or a *const pointer. |
| type | Type | The Type that this pointer points at. |

kind = "borrowed_ref"

| Name | Type | Description |
|------|------|-------------|
| lifetime | String | (Optional) The name of the lifetime parameter on this reference, if any. |
| mutable | bool | Whether this is a &mut or just a &. |
| type | Type | The Type that this reference references. |

kind = "qualified_path"

Used when a type is qualified by a trait (<Type as Trait>::Name) or associated type (T::Item where T: Iterator).

| Name | Type | Description |
|------|------|-------------|
| name | String | The name at the end of the path ("Name" and "Item" in the examples above). |
| self_type | Type | The type being used as a trait (Type and T in the examples above). |
| trait | Type | The trait that the path is on (Trait and Iterator in the examples above). |
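
Because a Type can contain other Types, consumers will typically walk it recursively. The sketch below models a subset of the kinds above as a recursive Rust enum and renders a value back to Rust syntax. It is an illustration under several assumptions: the enum and render are invented names, resolved_path keeps only its name (generic arguments are dropped), function_pointer and impl_trait are left out, and a Primitive variant is included because "primitive" is the kind used for built-in types in the JSON examples in this document.

```rust
/// A subset of the Type kinds as a recursive enum (illustration only).
#[derive(Debug, Clone, PartialEq)]
pub enum Type {
    ResolvedPath { name: String },
    Generic(String),
    Primitive(String),
    Tuple(Vec<Type>),
    Slice(Box<Type>),
    Array { ty: Box<Type>, len: String },
    Never,
    Infer,
    RawPointer { mutable: bool, ty: Box<Type> },
    BorrowedRef { lifetime: Option<String>, mutable: bool, ty: Box<Type> },
    QualifiedPath { name: String, self_type: Box<Type>, trait_: Box<Type> },
}

/// Renders a Type roughly as it would be written in Rust source.
pub fn render(ty: &Type) -> String {
    match ty {
        Type::ResolvedPath { name } => name.clone(),
        Type::Generic(name) | Type::Primitive(name) => name.clone(),
        Type::Tuple(items) => {
            let parts: Vec<String> = items.iter().map(render).collect();
            if parts.len() == 1 {
                // A one-element tuple needs a trailing comma.
                format!("({},)", parts[0])
            } else {
                format!("({})", parts.join(", "))
            }
        }
        Type::Slice(inner) => format!("[{}]", render(inner)),
        Type::Array { ty, len } => format!("[{}; {}]", render(ty), len),
        Type::Never => "!".to_string(),
        Type::Infer => "_".to_string(),
        Type::RawPointer { mutable, ty } => {
            let qualifier = if *mutable { "mut" } else { "const" };
            format!("*{} {}", qualifier, render(ty))
        }
        Type::BorrowedRef { lifetime, mutable, ty } => {
            let mut out = String::from("&");
            if let Some(lt) = lifetime {
                out.push_str(lt);
                out.push(' ');
            }
            if *mutable {
                out.push_str("mut ");
            }
            out.push_str(&render(ty));
            out
        }
        Type::QualifiedPath { name, self_type, trait_ } => {
            format!("<{} as {}>::{}", render(self_type), render(trait_), name)
        }
    }
}
```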

Examples

Here are some function signatures with various types and their respective JSON representations:

Primitives

pub fn primitives(a: u32, b: (u32, u32), c: [u32], d: [u32; 5]) -> *mut u32 {}

"decl": {
  "inputs": [
    [
      "a",
      {
        "kind": "primitive",
        "inner": "u32"
      }
    ],
    [
      "b",
      {
        "kind": "tuple",
        "inner": [
          {
            "kind": "primitive",
            "inner": "u32"
          },
          {
            "kind": "primitive",
            "inner": "u32"
          }
        ]
      }
    ],
    [
      "c",
      {
        "kind": "slice",
        "inner": {
          "kind": "primitive",
          "inner": "u32"
        }
      }
    ],
    [
      "d",
      {
        "kind": "array",
        "inner": {
          "type": {
            "kind": "primitive",
            "inner": "u32"
          },
          "len": "5"
        }
      }
    ]
  ],
  "output": {
    "kind": "raw_pointer",
    "inner": {
      "mutable": true,
      "type": {
        "kind": "primitive",
        "inner": "u32"
      }
    }
  }
}

References

pub fn references<'a>(a: &'a mut str) -> &'static MyType {}

"decl": {
  "inputs": [
    [
      "a",
      {
        "kind": "borrowed_ref",
        "inner": {
          "lifetime": "'a",
          "mutable": true,
          "type": {
            "kind": "primitive",
            "inner": "str"
          }
        }
      }
    ]
  ],
  "output": {
    "kind": "borrowed_ref",
    "inner": {
      "lifetime": "'static",
      "mutable": false,
      "type": {
        "kind": "resolved_path",
        "inner": {
          "name": "MyType",
          "id": "5:4936",
          "args": {
            "angle_bracketed": {
              "args": [],
              "bindings": []
            }
          },
          "param_names": null
        }
      }
    }
  }
}

Generics

pub fn generics<T>(a: T, b: impl Iterator<Item = bool>) -> ! {}

"decl": {
  "inputs": [
    [
      "a",
      {
        "kind": "generic",
        "inner": "T"
      }
    ],
    [
      "b",
      {
        "kind": "impl_trait",
        "inner": [
          {
            "trait_bound": {
              "trait": {
                "kind": "resolved_path",
                "inner": {
                  "name": "Iterator",
                  "id": "2:5000",
                  "args": {
                    "angle_bracketed": {
                      "args": [],
                      "bindings": [
                        {
                          "name": "Item",
                          "binding": {
                            "equality": {
                              "kind": "primitive",
                              "inner": "bool"
                            }
                          }
                        }
                      ]
                    }
                  },
                  "param_names": null
                }
              },
              "generic_params": [],
              "modifier": "none"
            }
          }
        ]
      }
    ]
  ],
  "output": {
    "kind": "never"
  }
}

Generic Args

pub trait MyTrait<'a, T> {
    type Item;
    type Other;
}

pub fn generic_args<'a>(x: impl MyTrait<'a, i32, Item = u8, Other = f32>) {
    unimplemented!()
}

"decl": {
  "inputs": [
    [
      "x",
      {
        "kind": "impl_trait",
        "inner": [
          {
            "trait_bound": {
              "trait": {
                "kind": "resolved_path",
                "inner": {
                  "name": "MyTrait",
                  "id": "0:11",
                  "args": {
                    "angle_bracketed": {
                      "args": [
                        {
                          "lifetime": "'a"
                        },
                        {
                          "type": {
                            "kind": "primitive",
                            "inner": "i32"
                          }
                        }
                      ],
                      "bindings": [
                        {
                          "name": "Item",
                          "binding": {
                            "equality": {
                              "kind": "primitive",
                              "inner": "u8"
                            }
                          }
                        },
                        {
                          "name": "Other",
                          "binding": {
                            "equality": {
                              "kind": "primitive",
                              "inner": "f32"
                            }
                          }
                        }
                      ]
                    }
                  },
                  "param_names": null
                }
              },
              "generic_params": [],
              "modifier": "none"
            }
          }
        ]
      }
    ]
  ],
  "output": null
}

Unstable

Fields marked as unstable have contents that are subject to change. They can be displayed to users, but tools shouldn’t rely on being able to parse their output or they will be broken by internal compiler changes.

Drawbacks

  • In supporting JSON output for rustdoc, we need to consider how much it should mirror the internal structures used in rustdoc and in the compiler. Depending on how much we want to stabilize, we could accidentally stabilize the internal structures of rustdoc. We have tried to avoid this by introducing a mirror of rustdoc’s AST types which exposes as few compiler internals as possible by stringifying or not including certain fields.
  • Adding JSON output adds another thing that must be kept up to date with language changes, and another thing for compiler contributors to potentially break with their changes. Hopefully this friction will be kept to a minimum because the JSON output doesn’t need any complex rendering logic like the HTML one. All that is required for a new language item is adding an additional field to a struct.

Alternatives

  • Status quo. Keep the HTML the way it is, and make users who want a machine-readable version of a crate parse it themselves. In the absence of an accepted JSON output, the --output-format flag in rustdoc remains deprecated and unused.
  • Alternate data format (XML, Bincode, CapnProto, etc). JSON was selected for its ubiquity in available parsers, but selecting a different data format may provide benefits for file size, compressibility, speed of conversion, etc. Since the implementation will lean on serde, this may be a non-issue, as it would be trivial to switch serialization formats.
  • Alternate data structure. The proposed output very closely mirrors the internal clean AST types in rustdoc. This simplifies the implementation but may not be the optimal structure for users. If there are significant improvements then a future RFC could provide the necessary refinements, potentially as another alternative output format if necessary.

Prior art

A handful of other languages and systems have documentation tools that output an intermediate representation separate from the human-readable outputs:

  • ClangDoc can output either rendered HTML or tool-consumable YAML.
  • PureScript uses an intermediate JSON representation when publishing package information to their Pursuit directory. It’s primarily used to generate documentation, but can also be used to generate etags files.
  • DartDoc is in the process of implementing a JSON output.
  • Doxygen has an option to generate an XML file with the code’s information.
  • Haskell’s documentation tool, Haddock, can generate an intermediate representation used by the type search engine Hoogle to integrate documentation of several packages.
  • Kythe is a “(mostly) language-agnostic” system for integrating documentation across several languages. It features its own schema into which code information can be translated, and which services can use to aggregate information about projects that span multiple languages.
  • GObject Introspection has an intermediate XML representation called GIR that’s used to create language bindings for GObject-based C libraries. While (at the time of this writing) it’s not used to create documentation, it is a stated goal to use this information to document these libraries.

Unresolved questions

  • What is the stabilization story? As language features are added, this representation will need to be extended to accommodate them. As this will change the structure of the data, what does that mean for its consumers?
  • How will users be able to manipulate the data? Is it a good idea to host a crate outside the compiler that contains the struct definitions for all the types that get serialized so that people could easily hack on the data without the compiler? Should that crate be the source of truth for those types and be depended on by librustdoc, or should it be a mirror that gets updated externally to reflect the changes to the copy in the compiler?
  • How will intra-doc links be handled?
    • Supporting struct.SomeStruct.html style links seems infeasible since it would tie alternative front-ends to rustdoc’s file/folder format.
    • With the nightly intra-rustdoc link syntax, it’s debatable whether we should resolve those to HTML links or leave that up to whatever consumes the JSON. Leaving them unresolved seems preferable, but it would mean that consumers have to parse the markdown to replace them with actual links.
    • Should the behavior differ between items from the local crate and items from external crates?
    • If there’s an html_root_url attribute/argument for an external crate, should the behavior be different?
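
To give a rough sense of the work that leaving links unresolved would push onto consumers, here is a minimal sketch in Rust. It is not part of the proposal: the function name `resolve_shortcut_links` and the `targets` map (item path to URL) are hypothetical, and it only handles the shortcut form [`path`], using plain string scanning rather than a real markdown parser.

```rust
use std::collections::HashMap;

/// Hypothetical consumer-side helper: rewrites shortcut links such as
/// [`Vec`] into [`Vec`](url) when `targets` knows the path.
/// Unknown paths and links that already have a target are left untouched.
fn resolve_shortcut_links(docs: &str, targets: &HashMap<&str, &str>) -> String {
    let mut out = String::with_capacity(docs.len());
    let mut rest = docs;
    while let Some(start) = rest.find("[`") {
        // Copy everything before the candidate link unchanged.
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        match after.find("`]") {
            Some(end) => {
                let path = &after[..end];
                let tail = &after[end + 2..];
                out.push_str("[`");
                out.push_str(path);
                out.push_str("`]");
                // Only append a URL if the link has no explicit target yet.
                let has_target = tail.starts_with('(') || tail.starts_with('[');
                if let (Some(url), false) = (targets.get(path), has_target) {
                    out.push('(');
                    out.push_str(url);
                    out.push(')');
                }
                rest = tail;
            }
            None => {
                // No closing delimiter: keep the remainder verbatim and stop.
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}
```

A real consumer would more likely run the docs through a markdown parser and would also need to handle the other link forms, which is part of why this question remains open.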

Output structure questions

These aren’t essential and could be deferred to a later RFC. The current implementation does include spans, but doesn’t do any of the other things mentioned here.

  • Should we store Spans in the output even though we’re not exporting the source itself like the HTML output does? If so, is there a simple way to sanitize relative links to the files, to avoid inconsistent output based on where rustdoc is invoked from? For example, rustdoc --output-format json /home/user/Downloads/project/crate/src/lib.rs would include that absolute path in the spans, but it’s probably preferable to have it just list the filename for single files or the path from the crate root for cargo projects.
  • The proposed implementation exposes a strict subset of the information available to the HTML backend: the clean types for Items and some mappings from the Cache. Are there other mappings/info from elsewhere that would be helpful to expose to users?
  • There are some items such as attributes that defer to compiler internal symbols in their clean representations which would make them problematic to represent faithfully. Is it OK to simply stringify these and leave their handling up to the user?
  • Should we specially handle Deref trait impls to make it easier to find the methods a struct can access through its deref target?
  • Should we specially handle auto-traits? They can be included in the normal set of trait impls for each type, but they clutter the output. Every time a user goes through the impls for a type, they need to filter out those synthetic impls.
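
As an illustration of the span question above, one possible way to sanitize paths can be sketched as follows. This is only a sketch under stated assumptions, not the current implementation: the function name `sanitize_span_path` is hypothetical, and it assumes rustdoc would know the crate root for cargo projects (passed as `Some(root)`) and have no root for single files (`None`).

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: makes a span's file path independent of where
/// rustdoc was invoked from.
/// - With a known crate root, returns the path relative to that root.
/// - Otherwise, falls back to the bare file name.
fn sanitize_span_path(file: &Path, crate_root: Option<&Path>) -> PathBuf {
    if let Some(root) = crate_root {
        // `strip_prefix` fails if `file` is not located under `root`.
        if let Ok(relative) = file.strip_prefix(root) {
            return relative.to_path_buf();
        }
    }
    match file.file_name() {
        Some(name) => PathBuf::from(name),
        // Paths without a final component (e.g. "/") are returned as-is.
        None => file.to_path_buf(),
    }
}
```

With the path from the example, a crate root of /home/user/Downloads/project/crate would yield src/lib.rs, while no crate root would yield lib.rs.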