Just a quick riff/hack on whether it’d be hard to make a collect() method that “collected” into a Vec without needing any turbofish (see, if you’re interested, my prior post on the turbofish).

Some grasp of traits and iteration is required to comfortably get this … though it might be a fun dive even if you’re not quite comfortable with them yet.

Background on collect

The implementation of collect is:

fn collect<B: FromIterator<Self::Item>>(self) -> B
where
    Self: Sized,
{
    FromIterator::from_iter(self)
}

The generic type B is bound by FromIterator, which basically enables a type to be constructed from an Iterator. In other words, collect() returns any type that can be built from an iterator. EG, Vec.

The reason the turbofish comes about is that, as I said above, it returns “any type” that can be built from an iterator. So when we run something like:

let z = [1i32, 2, 3].into_iter().collect();

… we have a problem … rust, or rather the collect() method, has no idea what type we’re building/constructing.

More specifically, looking at the code for collect, in the call of FromIterator::from_iter(self), which is calling the method on the trait directly, rust has no way to determine which implementation of the trait to use. The one on Vec or HashMap or String etc??

Thus, the turbofish syntax specifies the generic type B. Since from_iter returns Self (IE, the implementing type) and collect returns B, fixing B is what lets type inference work out which implementation to use.

let z = [1i32, 2, 3].into_iter().collect::<Vec<_>>();

IE: Use the implementation on Vec!
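
To make the “any type” point a bit more concrete, here’s a small sketch (the function names are just made up for illustration) that collects similar iterators into three different types. The only thing that changes is the requested target type, and the last function names the trait implementation directly with fully qualified syntax instead of going through collect and the turbofish.

```rust
use std::collections::HashSet;

// Same source data each time; only the requested target type `B` differs.
fn into_vec() -> Vec<i32> {
    [1i32, 2, 2, 3].iter().copied().collect::<Vec<_>>()
}

fn into_set() -> HashSet<i32> {
    // The duplicate `2` is dropped because HashSet's FromIterator impl is used.
    [1i32, 2, 2, 3].iter().copied().collect::<HashSet<_>>()
}

fn into_string() -> String {
    // String implements FromIterator<char>.
    "abc".chars().collect::<String>()
}

fn fully_qualified() -> Vec<i32> {
    // Spelling out "use the FromIterator implementation on Vec<i32>".
    <Vec<i32> as std::iter::FromIterator<i32>>::from_iter(vec![1i32, 2, 3])
}
```

In each case the compiler picks the from_iter implementation belonging to whatever type was asked for.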

Why not just use Vec?

I figure Vec is used so often as the type for collecting an Iterator that it could be nice to have a convenient method.

The docs even hint at this by suggesting that calling the FromIterator::from_iter() method directly from the desired type (eg Vec) can be more readable (see FromIterator docs).

EG … using collect:

let d = [1i32, 2, 3];
let x = d.iter().map(|x| x + 100).collect::<Vec<_>>();

Using Vec::from_iter()

let y = Vec::from_iter(d.iter().map(|x| x + 100));

As Vec is always in the prelude (IE, it’s always available), and FromIterator has been too since the 2021 edition, using from_iter seems like a nicer option here.

But you lose method chaining! So … how about a method on Iterator, like collect but for Vec specifically? How would you make that and is it hard??

Making collect_vec()

It’s not hard actually

  • Define a trait, CollectVec that defines a method collect_vec which returns Vec<Self::Item>
  • Make this a “sub-trait” of Iterator (or, make Iterator the “supertrait”) so that the Iterator::collect() method is always available
  • Implement CollectVec for all types that implement Iterator by just calling self.collect(). The type inference will take care of the rest, because it’s clear that a Vec will be used.

trait CollectVec: Iterator {
    fn collect_vec(self) -> Vec<Self::Item>;
}

impl<I: Iterator> CollectVec for I {
    fn collect_vec(self) -> Vec<Self::Item> {
        self.collect()
    }
}

With this you can then do the following:

let d = [1i32, 2, 3];
let d2 = d.iter().map(|x| x + 1).collect_vec();

Don’t know about you, but implementing such methods for the common collection types would suit me just fine … that turbofish is a pain to write … and AFAICT this isn’t inconsistent with rust’s style/design. And it’s super easy to implement … the type system handles this issue very well.
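
For what it’s worth, here’s a rough sketch of how the same trick might stretch to a couple of other collections. Everything named here (CollectCommon, CollectMap, collect_set, collect_map, demo) is hypothetical and just for illustration, and the block stands on its own, so it re-declares collect_vec rather than reusing the CollectVec trait from above.

```rust
use std::collections::{HashMap, HashSet};
use std::hash::Hash;

// Hypothetical extension trait; the names are made up for this sketch.
trait CollectCommon: Iterator + Sized {
    fn collect_vec(self) -> Vec<Self::Item> {
        self.collect()
    }

    // Only callable when the items can actually live in a HashSet.
    fn collect_set(self) -> HashSet<Self::Item>
    where
        Self::Item: Eq + Hash,
    {
        self.collect()
    }
}

// Blanket implementation: every iterator gets the methods above.
impl<I: Iterator> CollectCommon for I {}

// Maps need key/value pairs, so this trait only covers iterators of tuples.
trait CollectMap<K, V>: Iterator<Item = (K, V)> + Sized {
    fn collect_map(self) -> HashMap<K, V>
    where
        K: Eq + Hash,
    {
        self.collect()
    }
}

impl<K, V, I: Iterator<Item = (K, V)>> CollectMap<K, V> for I {}

// Small usage example returning one of each collection.
fn demo() -> (Vec<i32>, HashSet<i32>, HashMap<&'static str, i32>) {
    let d = [1i32, 2, 2, 3];
    let v = d.iter().map(|x| x + 1).collect_vec();
    let s = d.iter().copied().collect_set();
    let m = vec![("a", 1), ("b", 2)].into_iter().collect_map();
    (v, s, m)
}
```

The where clauses do the gatekeeping here: collect_set simply isn’t callable on an iterator whose items aren’t Eq + Hash, and collect_map only exists for iterators of pairs.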

  • Jayjader@jlai.lu · 4 months ago

    The idea & execution are great, I just don’t know that I would ever do this for collecting into Vecs myself.

    When I’m tired of writing turbofish, I usually just annotate the type for the binding of the “result”:

    let d = [1i32, 2, 3];
    let y: Vec<_> = d.iter().map(|x| x + 100).collect();
    

    So often have I collected into a vector then later realized that I really wanted a map or set instead, that I prefer keeping the code “flat” and duplicated (i.e. we don’t “go into” a specific function) so that I can just swap out the Vec for a HashMap or BTreeSet when & where the need arises.

    • maegul (he/they) (OP) · 4 months ago

      So often have I collected into a vector then later realized that I really wanted a map or set instead, that I prefer keeping the code “flat” and duplicated (i.e. we don’t “go into” a specific function) so that I can just swap out …

      Yea good point. And yea, annotating the binding rather than using the turbofish seems more natural to me too.

      In the end though, my motivation here was to see if I could, not whether I should! 😜

      Though to be fair, I can see myself adding collect_vec() to a codebase if I know I will be collecting into a bunch of vecs. Just because it’s my little monster and I’m biased! And adding other methods for the other common collections probably wouldn’t be too hard??!!