Isn’t it possible to “find” the most valuable websites on the web with the help of a well-mixed community? I am thinking of a small browser add-on that shares the base URL of each visited website with a website scraper. The scraper can then index the whole website, including its subpages. The add-on could be installed independently by users who would like to strengthen the network.
Besides setting up its own search index, one could try to export search results from #Google and Bing as a ramp-up aid, similar to what #startpage and #duckduckgo do. I mean, I am no search engine expert, but is there so much more magic?
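To make the add-on idea a bit more concrete, here is a minimal sketch of the part that would receive visited URLs and reduce them to a base URL before queueing them for the scraper. The names `base_url` and `CrawlQueue` are made up for illustration, and the assumption that only scheme and host are shared (never the path or query) is mine, not something specified above.

```python
from collections import deque
from typing import Optional
from urllib.parse import urlsplit


def base_url(visited_url: str) -> Optional[str]:
    """Reduce a visited URL to scheme + host, dropping path, query and fragment."""
    parts = urlsplit(visited_url)
    # Only ordinary web pages are worth reporting; skip about:, file:, etc.
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return None
    host = parts.hostname  # urlsplit already lowercases the hostname
    if parts.port:
        host = f"{host}:{parts.port}"
    return f"{parts.scheme}://{host}/"


class CrawlQueue:
    """Hypothetical server-side queue: each site is handed to the scraper once."""

    def __init__(self) -> None:
        self._seen = set()
        self._pending = deque()

    def submit(self, visited_url: str) -> bool:
        """Accept a URL reported by an add-on; return True if the site is new."""
        site = base_url(visited_url)
        if site is None or site in self._seen:
            return False
        self._seen.add(site)
        self._pending.append(site)
        return True

    def next_site(self) -> Optional[str]:
        """Return the next site for the scraper to index, or None if idle."""
        return self._pending.popleft() if self._pending else None
```

Stripping everything after the host means the add-on never sends the specific page or query a volunteer looked at, which seems in keeping with the privacy motivation, but the scraper then has to discover subpages on its own.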
If you want a competitive engine, you would need more content than just the websites visited by the volunteers. What if none of the volunteers were Lemmy users or used sites that link to Lemmy instances?
How do you plan to discover tiny low-traffic sites that are still expected results? How do you plan to rank results? How do you plan to detect SEO-hacking and other spammy manipulation of your engine? How do you plan to make your service more appealing than Google/Bing/DDG/Yandex/etc.? How do you plan to host an engine that is able to crawl, process, store and search all the results, and how do you plan on paying for all of that?
As for magic, there are extra convenience features that the biggest search engines offer, yes. Search “1 USD to EUR” for an example, or “define lemon”. Then there are also filters and settings, all kinds of things.
Why do you ask? Is there a plan you had?
Yeah, you are right on many points. I do not have any special plans. I am just annoyed that with Google and Bing, and now also DDG, you are always being spied on across different services on the web. While we have open and distributed networks such as Mastodon, Pixelfed and many more, we do not have a usable federated search engine. I know YaCy from a long time ago, but I think it’s still not competitive. As Searx is just a metasearch engine built on top of the known engines, it is not an alternative either.
Sepia Search [link],[wikipedia] might be interesting to you, it’s for PeerTube videos across many instances.
I’m glad you noticed, I’m annoyed by all the people who kept recommending it as an alternative to DDG when a recent complaint came out about them censoring more results!
There might be some open-source search engines that you could look into to answer some of these questions, I’m a bit too tired now to research that. I know Sepia Search is, but maybe some more general purpose ones are too.
I did a little SEO work when I was starting in the industry (not my proudest moment). Google really was playing a cat-and-mouse game with the SEO folks. Take the approach of counting inbound links to a given page from pages with lots of text: the SEO response is to submit “articles” to a bunch of sites and put a link to a client’s site in the author’s bio. Any search engine that gained significant popularity would have to retrace Google’s anti-spam steps from the past 20+ years, likely on a shoestring budget, because it would be competing against Google ad revenue that isn’t constrained by the same stringent privacy concerns.
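For anyone wondering why that heuristic is so easy to game, here is a toy sketch of it. This is not Google’s actual algorithm, just the naive "count inbound links from text-heavy pages" idea described above; the function name, the page format and the 300-word threshold are all assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, List, TypedDict


class Page(TypedDict):
    text: str
    links: List[str]


def inbound_link_scores(pages: Dict[str, Page], min_words: int = 300) -> Dict[str, int]:
    """Score each URL by the number of distinct text-heavy pages linking to it."""
    scores: Dict[str, int] = defaultdict(int)
    for url, page in pages.items():
        # Ignore thin pages: only pages with "lots of text" get to vote.
        if len(page["text"].split()) < min_words:
            continue
        # A page counts once per target, however often it repeats the link.
        for target in set(page["links"]):
            if target != url:  # self-links do not count
                scores[target] += 1
    return dict(scores)


# Two long "articles" with the client's link in the author bio, plus one thin page.
pages: Dict[str, Page] = {
    "https://blog-a.example/post": {"text": "word " * 400, "links": ["https://client.example/"]},
    "https://blog-b.example/post": {"text": "word " * 400, "links": ["https://client.example/"] * 2},
    "https://thin.example/": {"text": "short page", "links": ["https://client.example/"]},
}
scores = inbound_link_scores(pages)
```

Here the client’s site ends up with a score of 2: every padded article placed on another site is one more vote, and nothing in the heuristic can tell a genuine recommendation from a planted bio link. That gap is exactly what the later anti-spam work has to close.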