Finish the libtest json output experiment

Metadata
Point of contact	Ed Page
Teams	cargo, libs-api, testing-devex
Task owners	Ed Page
Status	Proposed
Zulip channel	N/A
Tracking issue	rust-lang/rust-project-goals#255

libtest is the test harness used by default for tests in cargo projects. It provides the CLI that cargo calls into, and it enumerates and runs the tests discovered in each test binary. It ships with rustup and has the same compatibility guarantees as the standard library.

Before 1.70, anyone could pass --format json even though it was unstable. When this was fixed to require nightly, the change showed how much people had come to rely on programmatic output.

Cargo could also benefit from programmatic test output to improve user interactions, including:

Wanting to run test binaries in parallel, like cargo nextest
Lack of summary across all binaries
Noisy test output (see also [#5089](https://github.com/rust-lang/cargo/issues/5089))
Confusing command-line interactions (see also [#8903](https://github.com/rust-lang/cargo/issues/8903), [#10392](https://github.com/rust-lang/cargo/issues/10392))
Poor messaging when a filter doesn't match
Smarter test execution order (see also [#8685](https://github.com/rust-lang/cargo/issues/8685), [#10673](https://github.com/rust-lang/cargo/issues/10673))
JUnit output is incorrect when running multiple test binaries
Lack of failure when test binaries exit unexpectedly

Most of that involves shifting responsibilities from the test harness to the test runner, which has the side effects of:

Allowing more powerful experiments with custom test runners (e.g. cargo nextest) as they'll have more information to operate on
Lowering the barrier for custom test harnesses (like libtest-mimic) as UI responsibilities are shifted to the test runner (cargo test)

The status quo

The next 6 months

Experiment with potential test harness features
Experiment with test reporting moving to Cargo
Put forward a proposal for approval

The "shiny future" we are working towards

Reporting shifts from test harnesses to Cargo
We run test harnesses in parallel
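
To make the idea of runner-side reporting concrete, here is a minimal sketch of how a test runner might fold results from several test binaries into one summary. Everything in it is an assumption made for illustration: `TestEvent`, `Outcome`, `Summary`, `summarize`, and `sample_events` are hypothetical names, and they do not reflect libtest's unstable JSON format or any Cargo API.

```rust
use std::collections::BTreeMap;

/// Hypothetical outcome of a single test, as seen by the runner.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Outcome {
    Passed,
    Failed,
    Ignored,
}

/// Hypothetical event that a harness could report to the runner.
#[derive(Debug, Clone)]
struct TestEvent {
    binary: String,
    name: String,
    outcome: Outcome,
}

/// Counts kept per binary and for the whole run.
#[derive(Debug, Default, PartialEq, Eq)]
struct Summary {
    passed: usize,
    failed: usize,
    ignored: usize,
}

impl Summary {
    fn record(&mut self, outcome: Outcome) {
        match outcome {
            Outcome::Passed => self.passed += 1,
            Outcome::Failed => self.failed += 1,
            Outcome::Ignored => self.ignored += 1,
        }
    }
}

/// Folds events from any number of binaries into per-binary and overall totals.
fn summarize(events: &[TestEvent]) -> (BTreeMap<String, Summary>, Summary) {
    let mut per_binary: BTreeMap<String, Summary> = BTreeMap::new();
    let mut total = Summary::default();
    for event in events {
        per_binary
            .entry(event.binary.clone())
            .or_default()
            .record(event.outcome);
        total.record(event.outcome);
    }
    (per_binary, total)
}

/// Made-up events standing in for the output of two test binaries.
fn sample_events() -> Vec<TestEvent> {
    let event = |binary: &str, name: &str, outcome| TestEvent {
        binary: binary.to_string(),
        name: name.to_string(),
        outcome,
    };
    vec![
        event("unit", "parses_empty_input", Outcome::Passed),
        event("unit", "rejects_bad_header", Outcome::Failed),
        event("integration", "round_trip", Outcome::Passed),
        event("integration", "needs_network", Outcome::Ignored),
    ]
}

fn main() {
    let events = sample_events();
    let (per_binary, total) = summarize(&events);
    for (binary, s) in &per_binary {
        println!("{binary}: {} passed, {} failed, {} ignored", s.passed, s.failed, s.ignored);
    }
    // List failures once, across every binary, instead of once per binary.
    for event in events.iter().filter(|e| e.outcome == Outcome::Failed) {
        println!("FAILED {}::{}", event.binary, event.name);
    }
    println!("total: {} passed, {} failed, {} ignored", total.passed, total.failed, total.ignored);
}
```

Because the aggregation only consumes events, it does not depend on the order in which binaries finish, which is what would let a runner execute them in parallel and still print a single report.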

Design axioms

Low complexity for third-party test harnesses so it's feasible to implement them
Low compile-time overhead for third-party test harnesses so users are willing to take the compile-time hit to use them
Format can meet expected future needs
- Expected is determined by looking at what other test harnesses can do (e.g. fixtures, parameterized tests)
Format can evolve with unexpected needs
Cargo performs all reporting for tests and benches

Ownership and team asks

This section defines the specific work items that are planned and who is expected to do them. It should also include what will be needed from Rust teams. The table below shows some common sets of asks and work, but feel free to adjust it as needed. Every row in the table should either correspond to something done by a contributor or something asked of a team. For items done by a contributor, list the contributor, or ![Help wanted][] if you don't yet know who will do it. For things asked of teams, list the name of the team. The things typically asked of teams are defined in the Definitions section below.

Task	Owner(s) or team(s)	Notes
Discussion and moral support	testing-devex, cargo, libs-api
Prototype harness	Ed Page
Prototype Cargo reporting support	Ed Page
Write stabilization report	Ed Page

Definitions

Definitions for terms used above:

Discussion and moral support is the lowest level offering, basically committing the team to nothing but good vibes and general support for this endeavor.
Author RFC and Implementation means actually writing the code, document, whatever.
Design meeting means holding a synchronous meeting to review a proposal and provide feedback (no decision expected).
RFC decisions means reviewing an RFC and deciding whether to accept.
Org decisions means reaching a decision on an organizational or policy matter.
Secondary review of an RFC means that the team is "tangentially" involved in the RFC and should be expected to briefly review.
Stabilizations means reviewing a stabilization report and deciding whether to stabilize.
Standard reviews refers to reviews for PRs against the repository; these PRs are not expected to be unduly large or complicated.
Prioritized nominations refers to prioritized lang-team response to nominated issues, with the expectation that there will be some response from the next weekly triage meeting.
Dedicated review means identifying an individual (or group of individuals) who will review the changes, as they're expected to require significant context.
Other kinds of decisions:
- Lang team experiments are used to add nightly features that do not yet have an RFC. They are limited to trusted contributors and are used to resolve design details such that an RFC can be written.
- Compiler Major Change Proposal (MCP) is used to propose a 'larger than average' change and get feedback from the compiler team.
- Library API Change Proposal (ACP) describes a change to the standard library.
