In terms of privacy, this is how Searxes (a meta-search of meta-searches) compares to DDG, Startpage, and Mojeek:

| privacy factor | DDG | Startpage | Mojeek | Searxes |
|---|---|---|---|---|
| caught violating privacy policy | yes | no | no | no |
| bad track record (history of privacy abuse) | yes (CEO founded Names DB) | owned by targeted ad agency | no | |
| feeds other privacy abusers | yes (Verizon-Yahoo, Microsoft, Amazon, CloudFlare) | yes (Google, CloudFlare) | no | no |
| privacy-hostile sites in search results | yes | yes | yes (but appear less frequently than on DDG) | no (CloudFlare sites filtered out) |
| server code is open source | no | no | no | yes |
| has an onion site | yes (but Tor-hostile results still given) | no | no | yes |
| gives users a proxy or cache | no | yes (using Anonymous View feature) | no | yes (via the favicons) |

Superficially, Metager is privacy-respecting, and there’s even an .onion host for it, so I’ll have to add it to the table in the future.

For the moment, I’ll say that Metager shares the following with advertisers:

  • first 2 blocks of your IP address
  • user-agent string
  • your search query

They say it’s for non-personalised advertising.
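As a rough illustration of what the list above amounts to, here is a minimal Python sketch. It assumes "first 2 blocks" means the first two octets of an IPv4 address; the function and field names are hypothetical and are not taken from Metager's code.

```python
import ipaddress


def truncate_ipv4(ip: str, keep_octets: int = 2) -> str:
    """Zero out every octet after the first `keep_octets` of an IPv4 address.

    Assumption: "first 2 blocks" in the post means the first two octets.
    """
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    # A /16 network keeps the first two octets and zeroes the remaining ones.
    network = ipaddress.ip_network(f"{addr}/{keep_octets * 8}", strict=False)
    return str(network.network_address)


def shared_with_advertiser(ip: str, user_agent: str, query: str) -> dict:
    """Hypothetical record holding the three items listed in the post."""
    return {
        "ip_prefix": truncate_ipv4(ip),
        "user_agent": user_agent,
        "query": query,
    }


record = shared_with_advertiser("203.0.113.42", "Mozilla/5.0", "private search engines")
print(record["ip_prefix"])  # 203.0.0.0
```

Even with the last two octets zeroed, the prefix still narrows a user down to a block of 65,536 addresses, and the user-agent string and query are passed along unchanged.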
  • AgreeableLandscape · 5 years ago

    Does anyone know where Searxes gets its search data? IIRC DuckDuckGo uses a combination of Bing and its own web crawler, and StartPage uses a variety of providers, including Google.

    • cipherpunk (OP) · 5 years ago

      DDG is a meta-search just as every searx instance is. DDG pays MS Bing and Yahoo (Verizon) for API access. Hence the “feeds other privacy abusers” row.

      Startpage pays Google for API access to search results.

      Each searx instance does something different. Many scrape DDG/Google/Startpage/Bing etc. for results. Scraping avoids financially supporting those bad actors, but scrapers often hit a CAPTCHA, which causes an error that is passed on to the user. Some searx instances run their own YaCy crawler. It’s also feasible that a searx instance could buy API access, but I suspect this is uncommon.

      Searxes surveys the searx nodes daily to find those that are functioning well, and dispatches each query to a randomly chosen instance from that set. So it’s a meta-search of the decentralized searx network, and it has quite good quality control.
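The survey-then-dispatch idea described above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions, not Searxes' actual code: the instance URLs are placeholders, the health check is supplied by the caller, and the `/search?q=` URL shape is assumed.

```python
import random
from typing import Callable, Iterable, List, Optional
from urllib.parse import urlencode


def survey(instances: Iterable[str], is_healthy: Callable[[str], bool]) -> List[str]:
    """Periodic survey: keep only the instances that pass the health check."""
    return [url for url in instances if is_healthy(url)]


def dispatch(query: str, healthy: List[str], rng: Optional[random.Random] = None) -> str:
    """Pick one healthy instance at random and build the search URL for it."""
    if not healthy:
        raise RuntimeError("no healthy searx instances available")
    instance = (rng or random.Random()).choice(healthy)
    # Assumption: the instance serves searches at /search?q=<query>.
    return f"{instance.rstrip('/')}/search?{urlencode({'q': query})}"


# Placeholder instance list and a stand-in health check (no network access).
known = ["https://searx-a.example.org", "https://searx-b.example.org", "https://down.example.org"]
healthy = survey(known, is_healthy=lambda url: "down" not in url)
print(dispatch("onion search engines", healthy))
```

In a real deployment the health check would presumably send a test query and reject instances that return CAPTCHA errors or empty results; the sketch leaves that part to the caller.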