- Feature Name: n/a
- Start Date: 2015-03-15
- RFC PR: rust-lang/rfcs#979
- Rust Issue: rust-lang/rust#23911
Summary
Make the count parameter of SliceExt::splitn, StrExt::splitn and the
corresponding reverse variants mean the maximum number of items
returned, instead of the maximum number of times to match the
separator.
Motivation
The majority of other languages (see examples below) treat the count
parameter as the maximum number of items to return. Rust already has
many things newcomers need to learn, so making splitn behave as it does
in most other languages can help adoption.
Detailed design
Currently, splitn uses the count parameter to decide how many times
the separator should be matched:
let v: Vec<_> = "a,b,c".splitn(2, ',').collect();
assert_eq!(v, ["a", "b", "c"]);
The simplest change we can make is to decrement the count in the
constructor functions. If the count is zero to begin with, there is
nothing to decrement, and we mark the returned iterator as finished.
See Unresolved questions for nicer transition paths.
Example usage
Strings
let input = "a,b,c";
let v: Vec<_> = input.splitn(2, ',').collect();
assert_eq!(v, ["a", "b,c"]);
let v: Vec<_> = input.splitn(1, ',').collect();
assert_eq!(v, ["a,b,c"]);
let v: Vec<_> = input.splitn(0, ',').collect();
assert!(v.is_empty());
Slices
let input = [1, 0, 2, 0, 3];
let v: Vec<_> = input.splitn(2, |&x| x == 0).collect();
assert_eq!(v, [&[1][..], &[2, 0, 3][..]]);
let v: Vec<_> = input.splitn(1, |&x| x == 0).collect();
assert_eq!(v, [[1, 0, 2, 0, 3]]);
let v: Vec<_> = input.splitn(0, |&x| x == 0).collect();
assert!(v.is_empty());
Languages where count is the maximum number of items returned
C#
"a,b,c".Split(new char[] {','}, 2)
// ["a", "b,c"]
Clojure
(clojure.string/split "a,b,c" #"," 2)
;; ["a" "b,c"]
Go
strings.SplitN("a,b,c", ",", 2)
// [a b,c]
Java
"a,b,c".split(",", 2);
// ["a", "b,c"]
Ruby
"a,b,c".split(',', 2)
# ["a", "b,c"]
Perl
split(",", "a,b,c", 2)
# ['a', 'b,c']
Languages where count is the maximum number of times the separator will be matched
Python
"a,b,c".split(',', 2)
# ['a', 'b', 'c']
Swift
split("a,b,c", { $0 == "," }, maxSplit: 2)
// ["a", "b", "c"]
Drawbacks
Changing the meaning of the count parameter without changing the
type is sure to cause subtle issues. See Unresolved questions.
The iterator can only return 2^64 - 1 values (with a 64-bit usize count); previously we could return 2^64. This could also be considered an upside, as we can now return an empty iterator.
Alternatives
- Keep the status quo. People migrating from many other languages will continue to be surprised.
- Add a parallel set of functions that clearly indicate that count is the maximum number of items that can be returned.
Unresolved questions
Is there a nicer way to change the behavior of count such that users
of splitn get compile-time errors when migrating?
- Add a dummy parameter, and mark the methods unstable. Remove the parameter and re-mark as stable near the end of the beta period.
- Move the methods from SliceExt and StrExt to a new trait that needs to be manually imported. After the transition, move the methods back and deprecate the trait. This would not break user code that migrated to the new semantics.
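The second option, an opt-in trait, could look roughly like the sketch below. Every name here is an assumption made for illustration: Text is a hypothetical stand-in for a type whose splitn has been moved out of the default method set, and transition::SplitnItems is a hypothetical trait name. The RFC does not specify either.

```rust
/// Hypothetical stand-in for a string type that no longer exposes `splitn`
/// by default during the transition.
pub struct Text(pub String);

pub mod transition {
    use super::Text;

    /// Hypothetical opt-in trait: `splitn` is only callable on `Text` where
    /// this trait has been imported, so call sites that have not migrated
    /// fail to compile instead of silently changing behavior.
    pub trait SplitnItems {
        /// `count` is the maximum number of items returned.
        fn splitn(&self, count: usize, sep: char) -> Vec<&str>;
    }

    impl SplitnItems for Text {
        fn splitn(&self, count: usize, sep: char) -> Vec<&str> {
            // Delegates to the standard library's `str::splitn`, which
            // returns at most `count` items.
            self.0.splitn(count, sep).collect()
        }
    }
}
```

A migrated call site would add `use transition::SplitnItems;` and then call `text.splitn(2, ',')`; without that import, the method call on Text does not resolve.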