• Avid Amoeba@lemmy.ca
    English
    arrow-up
    25
    arrow-down
    2
    ·
    10 months ago

    Truly. I wonder if ActivityPub could be used to create a resilient search engine that shares the cost among federated instances. We already have something like that in Lemmy and Mastodon, where federated data can be searched from any instance. If the data is pages crawled by some automatic crawler, which are then federated across instances that in turn let you search through them, it might resemble a search engine. Page ranking beyond text matching could even be done by people's up/down votes instead of some arbitrary algorithm, similar to how voting works on StackExchange or Lemmy. 🤔 I’m sure someone is thinking about this.

    • deur@feddit.nl
      English
      arrow-up
      35
      ·
      10 months ago

      The answer to your question is no, federation is not an appropriate model for internet scale search.

      • Sigh_Bafanada@lemmy.world
        English
        arrow-up
        5
        ·
        10 months ago

        Yeah I think you need a centralized system with decentralized ownership, so that no single party can fuck it up by themselves

      • Avid Amoeba@lemmy.ca
        English
        arrow-up
        1
        ·
        edit-2
        10 months ago

        Just to be clear, what I’m referring to here is that a search would occur on a single instance. E.g. searches on lemmy.world occur on the lemmy.world instance and load lemmy.world’s servers. The federated part is in building the database on lemmy.world. E.g. a crawler or a user on lemmy.ca adds a new web site and that record is federated to lemmy.world to add to its database. Another user on feddit.de upvotes a search result and that upvote is federated to lemmy.world, so that the search result ranks higher for users searching on lemmy.world. In this kind of model individual search instances could in fact be very large, depending on their usage. If there’s no limit to what’s federated, that would put a lower bound on the size of instances. If there’s a limit (something dumb like federating only search records for *.fr domains), that would allow for smaller instances that don’t have the compute and storage for the complete index.
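A rough sketch of that model, just to make it concrete. Everything here is hypothetical: `SearchInstance`, `federate_page`, and `federate_vote` are made-up names, federation is faked as a plain loop over peers rather than real ActivityPub delivery, and "search" is naive substring matching. It only illustrates the idea that page records and votes are pushed to every instance, each instance may filter what it accepts (e.g. only `.fr` hosts), and queries are answered locally.

```python
from dataclasses import dataclass, field
from typing import Optional
from urllib.parse import urlparse


@dataclass
class SearchInstance:
    """Hypothetical search instance; not a real Lemmy or ActivityPub API."""
    name: str
    # Host suffix this instance is willing to index, e.g. ".fr".
    # None means "accept everything" (a full-index instance).
    domain_suffix: Optional[str] = None
    pages: dict = field(default_factory=dict)  # url -> page text
    votes: dict = field(default_factory=dict)  # url -> net vote score

    def accepts(self, url: str) -> bool:
        host = urlparse(url).hostname or ""
        return self.domain_suffix is None or host.endswith(self.domain_suffix)

    def receive_page(self, url: str, text: str) -> None:
        # Store a federated page record if it passes the local filter.
        if self.accepts(url):
            self.pages[url] = text
            self.votes.setdefault(url, 0)

    def receive_vote(self, url: str, delta: int) -> None:
        # Votes for pages this instance does not index are ignored.
        if url in self.pages:
            self.votes[url] += delta

    def search(self, term: str) -> list:
        # The query runs only against this instance's local index.
        hits = [u for u, t in self.pages.items() if term.lower() in t.lower()]
        return sorted(hits, key=lambda u: self.votes[u], reverse=True)


def federate_page(instances, url, text):
    """Stand-in for federation: push a new page record to every peer."""
    for inst in instances:
        inst.receive_page(url, text)


def federate_vote(instances, url, delta):
    """Stand-in for federation: push an up/down vote to every peer."""
    for inst in instances:
        inst.receive_vote(url, delta)
```

With one unfiltered instance and one `.fr`-only instance as peers, the unfiltered one ends up holding every page while the small one holds only the `.fr` pages, and an upvote federated from anywhere changes the ordering on whichever instances index that page.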

    • umbrella
      English
      arrow-up
      2
      ·
      10 months ago

      the biggest question would be how to defend it from spammers and corporations with potentially much more money.

      • Avid Amoeba@lemmy.ca
        English
        arrow-up
        2
        ·
        edit-2
        10 months ago

        One answer that’s proven to work is involving a lot of people’s labor in the editorial/curation process, similar to how posting/commenting/voting/moderation work on Lemmy, and how it’s worked on Reddit and other human-driven platforms. Corporations have shown on multiple occasions that paying for this labor is not feasible, so a system that depends on it should be corpo-resistant or capital-resistant.

        • umbrella
          English
          arrow-up
          2
          ·
          edit-2
          10 months ago

          well, reddit did that and was full of shills, bots, vote manipulation, and more. this approach completely failed for them.

          and they do put a lot of money into it.