Intro
I’m not on top of traits or generics but found myself looking some of them up anyhow, and came across the Sum
trait.
Here is the Std Lib documentation on Sum
(I believe).
And I guess all of the generics and/or type logic and how they interoperate has thrown me for a bit of a spin … so I thought I’d put my thoughts here. Maybe I’ll work things out in writing it or maybe someone here can help me/us out?
A bit long … sorry
Trait Definition
From the docs and source, here is the trait’s signature:
// core::iter::Sum
pub trait Sum<A = Self>: Sized {
// Required method
fn sum<I: Iterator<Item = A>>(iter: I) -> Self;
}
First thoughts: Defined on elements not iterators?
- The part that confused me at first was what
Self
is actually. Naively, I imagined it was referring to the iterator (or type that’d implementedIterator
) … but that clearly can’t be true because the return type isSelf
. - So …
Sum
is implemented not on any collection but on the element type?! - If so, why not rely on the
Add
Trait at the element level, which is responsible for the addition operator (see docs here)?
Kinda seems so?
- So, in trying to understand this, I thought I’d look at the source of
Iterator::sum()
first figuring that it’d be the main implementation.- See docs on sum() here and source code here
- This is the
sum
you’d be calling in something likevec![1, 2, 3].into_iter().sum()
to get6
.
core::iter::Iterator::sum
fn sum<S>(self) -> S
where
Self: Sized,
S: Sum<Self::Item>,
{
Sum::sum(self)
}
- Ok, so the call of
Sum::sum(self)
clearly indicates that this is not whereSum
is defined (instead it must be inSum::sum()
somehow). - Moreover,
self
is being passed intoSum::sum()
, withself
being theIterator
here … which means there’s no method being called onIterator
itself but something from another module. - Additionally, this method is bound by the generic
<S>
which is defined in thewhere
clause asSum<Self::Item>
… which … wait WTF is going on?- So this method (
Iterator::sum()
) must return a type that has implemented the traitSum
?? - If that’s correct, then that confirms my suspicion that
Sum
is implemented on the elements of an iterator (where I’m sure those comfortable with the generics syntax of the definition above are yelling YES!! OF course!!) - That’s because the return type of
sum()
would generally have to be the same type as the summed elements, soS
is both the type of the elements in the iterator and the return type ofsum
. All good. - And indeed, in the definition of the
type
aliasS
we’ve gotSum<Self::Item>
which binds the return type ofIterator::sum()
to the type of the iterator’s elements (ieSelf::Item
)Self::Item
is technically theItem
type of theIterator
which can, AFAIU, be defined as distinct from the type of the elements of the collection from which the iterator is derived but that’s another story.
- So this method (
Back to the beginning
- So back to trying to understand the definition of
core::iter::Sum
(which I believe is the definition of the trait):
// core::iter::Sum
pub trait Sum<A = Self>: Sized {
// Required method
fn sum<I: Iterator<Item = A>>(iter: I) -> Self;
}
- The trait itself is bound to
Sized
. I don’t know the details aroundSized
(see docs here and The book, ch 19.4 here) but it seems fundamental likely that it applies to vectors and the like. - The generic
A = Self
and its occurrences in the generics for thesum()
function and its return type … are a lot:- AFAIU,
Self
, ie the type onSum
is implemented for, must be theItem
type for theIterator
that will be passed into thesum
method. - But it must also be the return type of
sum()
… which makes sense.
- AFAIU,
- So the confusing part here then is the generic type of the
sum()
method:<I: Iterator<Item = A>>
.- Remember,
A = Self
, so it’s really<I: Iterator<Item = Self>>
(right?) - This generic type is any
Iterator
whoseItem
(ie, the type that is returned each iteration) is the same type asSelf
.
- Remember,
- Which means that if I want to sum a vector if
i32
numbers, I’d have to make sure I’ve implementedSum
not onVec
but oni32
and defined it as a method that takes any iterator ofi32
(ieSelf
) elements to then return ani32
element. - Ok …
Confirmation
- We can look at the implementors of
core::iter::Sum
( see docs here) and check the source for thei32
implementation … - Which gives us this source code:
integer_sum_product! { i8 i16 i32 i64 i128 isize u8 u16 u32 u64 u128 usize }
- Which is using this macro defined in the same file:
macro_rules! integer_sum_product {
(@impls $zero:expr, $one:expr, #[$attr:meta], $($a:ty)*) => ($(
#[$attr]
impl Sum for $a {
fn sum<I: Iterator<Item=Self>>(iter: I) -> Self {
iter.fold(
$zero,
#[rustc_inherit_overflow_checks]
|a, b| a + b,
)
}
}
- which … uses
fold()
(basicallyreduce
but with an initial value) and plain addition in the anonymous/closure function|a, b| a + b
. What!?
Why? How?
-
Ok that was a long way to go to find the addition operator at the bottom of the heap of traits!
-
Hopefully I’ve grasped the mechanics?!
-
I’m not quite clear on why it’s build this way. I’m guessing there’s some flexibility baked into the way that the relevant implementation of
Sum
depends on the element type, which can be flexibly defined as theItem
type of anIterator
independently of the type of the collection’s elements. That is, an iterator can utilise a type different from the actual elements of a collection and then rely on its particular implementation of sum. And then this can be independent fromAdd
. -
But that feels like a lot of obscure flexibility for a pretty basic operation, no?
-
For example, this code doesn’t compile because a type needs to be specified, presumably type inference gets lost amongst all the generics?
// doesn't compile
let x = vec![1i32, 2, 3].into_iter().sum();
// These do compile
let x2 = vec![1i32, 2, 3].into_iter().sum::<i32>(); // turbofish!!
let x3: i32 = vec![1i32, 2, 3].into_iter().sum();
- Design choices aside …
- I’m still unclear as to how
Iterator::sum()
works
fn sum<S>(self) -> S
where
Self: Sized,
S: Sum<Self::Item>,
{
Sum::sum(self)
}
- How does
Sum::sum(self)
work!? self
is theIterator
(sovec![1i32, 2, 3].iter()
).- And
Sum::sum()
is the essential trait addressed above. - How does rust go from a call of a trait’s method to using the actual implementation on the specific type? I guess I hadn’t really thought about it, but it makes sense and it’s what all those
Self
s are for. - In this case though, it’s rather confusing that the relevant implementation isn’t on the type of
self
, but because of the definition ofSum
, the implementation is on the type of the elements (orItem
specifically) ofself
. Sighs
Thoughts??
You made me curious and I found some discussion on the subject: https://github.com/rust-lang/rust/issues/27739. Short version is that you could do that if you had some other trait that would tell you what the zero value of the type is, so you know what is the sum of
vec![]
. Originally the standard library did just that, the trait was literally calledZero
. But there were some issues with it and it has been removed in favor of the current design.Unfortunately with this design of the
Sum
trait it is impossible to guess the result type from the iterator type. For example, see https://godbolt.org/z/c8M7eshaM.Cheers! And thanks for digging up the github issue!
I’d also caught mentions of the whole zero thing being behind the design. Which is funny because once you get down to the implementation for the numeric types, zero seems (I’m not on top of macro syntax) to be just a parameter of the macro, which then gets undefined in the call of the macro, so I have to presume it defaults to
0
somehow??. In short, the zero has to be provided in the implementation ofsum
for a specific type. Which I suppose is flexible. Though in this case I can’t discern what the zero is for the integer types (it’s explicitly0.0
for floats).Guessing Result Types
This bit, I don’t understand!
// core::iter::iterator fn sum<S>(self) -> S where Self: Sized, S: Sum<Self::Item>, { Sum::sum(self) }
The definition is pretty clear here right? The generic here is
Sum<Self::Item>
, abbreviated toS
… which AFAIU … means that the element type of the iterator — hereSelf::Item
— is the type that has implementedSum
… and the type that will be returned.IE, whatever the type of the iterator’s elements … that is the type of the result of
sum
… right?What’s more, rust knows when you’ve provided the wrong type.
EG … this doesn’t compile:
let x4 = vec![1i32, 2, 3].into_iter().sum::<i64>();
The error message here is: “the trait
Sum<i32>
is not implemented fori64
”.So the compiler knows the relevant types and can check whether the appropriate trait is implemented … but doesn’t do so for type inference, which AFAICT, would be fine if there is only one valid trait defined (as is the case for all of the basic numerics).
So it seems that it isn’t impossible but rather it’s just not done for some reason of efficiency?
Otherwise, it is interesting to consider the flexibility provided here (at least to me) … that one can apparently implement
Sum<i32>
for another different numeric type.However, I don’t actually understand how that could be done (I probably don’t understand traits and generics well enough here).
As the definition of the
Sum
trait is:pub trait Sum<A = Self>: Sized { fn sum<I: Iterator<Item = A>>(iter: I) -> Self; }
So I’d presume the
A = Self
followed byI: Iterator<Item = A>
for the iterator binds the implementation pretty clearly to the type of the iterator’s elements.So how could a
Sum<i32>
possibly be implemented fori64
??In the end though I think it’s informative to see that this is a pretty valid alternative to using
sum
and disambiguating the result type:vec![1i32, 2, 3].into_iter().reduce(|a,x| a+x); // or if you want to provide the 0 vec![1i32, 2, 3].into_iter().fold(0, |a,x| a+x);
It uses hardly any more characters too!
Thanks again for the response!!
Quite confusingly, the two
=
s have very different meaning here. TheItem = A
syntax just says that the iterator’s item type, which is set as the trait’s associated type, should beA
. So, you could read this as “I
should implement theIterator
trait, and theItem
associated type of this implementation should beA
”.However,
A = Self
does not actually mean any requirement ofA
. Instead, it means thatSelf
is the default value ofA
: that is, you can doimpl Sum<i64> for i32
and then you will haveSelf
equal toi32
andA
equal toi64
, but you can also doimpl Sum for i32
and it will essentially be a shorthand forimpl Sum<i32> for i32
, giving you bothSelf
andA
equal toi32
.In the end, we have the relationship that the iterator item should be the same as
A
, but we do not have the relationship thatSelf
should be the same asA
. So, given this trait, the iterator item can actually be different toA
.Note that the standard library does actually have implementations where these two differ. For instance, it has
impl<'a> Sum<&'a i32> for i32
, giving you a possibility to sum the iterator of&i32
intoi32
. This is useful when you think about this: you might want to sum such an iterator without.copied()
for some extra ergonomics, but you can’t just return&i32
, there is nowhere to store the referencedi32
. So, you need to return thei32
itself.In
Sum<Self::Item>
,Self::Item
is theA
parameter, andSum<Self::Item>
, orS
, is the type that implements the trait (which is calledSelf
in the definition of theSum
trait, but is different to theSelf
in thesum
method definition). As above,A
andS
can be different.It might be helpful to contrast this definition with a more usual one, where the trait does not have parameters:
fn some_function<S>(…) -> … where S: SomeTrait, {…} fn sum<S>(…) -> … where S: Sum<Self::Item>, {…}
Note that you might have an intuition from some other languages that in case of polymorphism, the chosen function either depends on the type of one special parameter (like in many OOP languages, where everything is decided by the class of the called object), or of the parameter list as a whole (like in C++, where the compiler won’t let you define
int f()
andfloat f()
at the same time, but will be fine withint f(int)
andfloat f(float)
). As you can see, in Rust, the return type also matters. A simpler example of this is theDefault
trait.Regarding inference, some examples (Compiler Explorer link):
vec![1i32].into_iter().sum(); // or: <_ as Sum<_>>::sum(vec![1i32].into_iter()); // error[E0283]: type annotations needed // note: cannot satisfy `_: Sum<i32>`
Compiler knows that the iterator contains
i32
s, so it looks for something that implementsSum<i32>
. But we don’t tell the compiler what to choose, and the compiler does not want to guess by itself.vec![1i32].into_iter().sum::<i32>(); // or: <i32 as Sum<_>>::sum(vec![1i32].into_iter());
As above the compiler knows that it wants to call something that implements
Sum<i32>
, but now it only has to check thati32
is such type. It is, so the code compiles.vec![1i32].iter().sum::<i32>(); // or: <i32 as Sum<_>>::sum(vec![1i32].iter());
Now we actually have a iterator of references, as we used
.iter()
instead of.into_iter()
. But the code still compiles, sincei32
also implementsSum<&i32>
.vec![1i64].into_iter().sum::<i32>(); // or: <i32 as Sum<_>>::sum(vec![1i64].into_iter()); // error[E0277]: a value of type `i32` cannot be made by summing an iterator over elements of type `i64` // help: the trait `Sum<i64>` is not implemented for `i32`
Now the compiler can calculate itself that it want to call something that implements
Sum<i64>
. However,i32
does not actually implement it, hence the error. If it did, the code would compile correctly.vec![].into_iter().sum::<i32>(); // or: <i32 as Sum<_>>::sum(vec![].into_iter()); // error[E0283]: type annotations needed // (in the second case) note: multiple `impl`s satisfying `i32: Sum<_>` found in the `core` crate: impl Sum for i32; impl<'a> Sum<&'a i32> for i32;
Now the situation is reversed. The compiler knows the return type, so it knows that
i32
should implement someSum<_>
. But it doesn’t know the iterator element type, and so it doesn’t know if it should choose the owned value, or the reference version. Note that the wording is different, the compiler wants to guess, but it can’t, as there are multiple possible choices. But if there is only one choice, the compiler does guess it:struct X {} impl Sum for X { fn sum<I: Iterator<Item = X>>(_: I) -> Self { Self{} } } vec![].into_iter().sum::<X>(); // or: <X as Sum<_>>::sum(vec![].into_iter());
builds correctly. I am not sure about the reason for the difference (I feel like it’s related to forward compatibility and the fact that outside the standard library I can do
impl Sum<i32> for MyType
but notimpl Sum<MyType> for i32
, but I don’t really know).Hope that helps :3
EDIT:
Ah, I read this, thought about this, and forgot about this almost immediately. I know almost nothing about macros, but if I understand correctly, the zero is in line 92, here:
($($a:ty)*) => ( integer_sum_product!(@impls 0, 1, #[stable(feature = "iter_arith_traits", since = "1.12.0")], $($a)*); integer_sum_product!(@impls Wrapping(0), Wrapping(1), #[stable(feature = "wrapping_iter_arith", since = "1.14.0")], $(Wrapping<$a>)*); );
The intention seems to be to take a list of types (
i8 i16 i32 i64 i128 isize u8 u16 u32 u64 u128 usize
), and then for each type to generate both the regular andWrapping
version, each time calling into the path you have seen before. For floats there is noWrapping
version, so this time0.0
is really the only kind of zero that can appear.So, just to riff on this a bit for fun …
It seems then that we could have a
Sum
trait that didn’t require providing the result type??If the trait were defined something like:
trait Sum2: Sized { fn sum2<I: Iterator<Item = Self>>(iter: I) -> Self; }
… that is, without the a generic and with the
Item
type of theIterator
bound toSelf
.I wonder if rust could then do with
sum
being basic like this and not requiring a result type and thensum_gen
for the generic case??I had a shot at sort of quickly prototyping this and came up with the following:
fn main() { // so that there's no need to provide the result type trait Sum2: Sized { fn sum2<I: Iterator<Item = Self>>(iter: I) -> Self; } impl Sum2 for i32 { fn sum2<I: Iterator<Item = Self>>(iter: I) -> Self { iter.fold(0, |a,x| a + x) } } struct MyVec<T>(Vec<T>); impl<T> MyVec<T> { fn sum2(self) -> T where Self: Sized, T: Sum2, { Sum2::sum2(self.0.into_iter()) } } let z = MyVec(vec![1i32, 2, 3]).sum2(); println!("My own custom sum trait?: {z}"); // doesn't compile as `i64: Sum2` not satisfied let z2 = MyVec(vec![1i64, 2, 3]).sum2(); }
I don’t know the best way of implementing a new
sum
method forvecs
(let alone allIterators
) and so the best I could come up with was to wrapVec
and then create anIterator
directly in thesum2()
method. It seemed to work well enough though!Once I learn more about traits etc it might be a fun exercise to see how close you can get to implementing one’s own convenience sum method!
Thanks again!!
You are a fucking hero my friend!! Thanks so much!!
Thanks for clarifying the macro … I didn’t understand what was going on there, but it makes sense that that’s where the zero comes from.
And yea, as for the generics … I didn’t know about the default value (I was clearly reaching beyond what I could grasp!) … thanks again!!