DDG vs. Startpage vs. Searxes

In terms of privacy, this is how Searxes (a meta-search of meta-searches) compares to DDG, Startpage, and Mojeek:

privacy factor | DDG | Startpage | Mojeek | Searxes
caught violating privacy policy | yes | no | no | no
bad track record (history of privacy abuse) | yes (CEO founded Names DB) | owned by targeted ad agency | | no
feeds other privacy abusers | yes (Verizon-Yahoo, Microsoft, Amazon, CloudFlare) | yes (Google, CloudFlare) | no | no
privacy-hostile sites in search results | yes | yes | yes (but appears less frequent than DDG) | no (CloudFlare sites filtered out)
server code is open source | no | no | no | yes
has an onion site | yes (but Tor-hostile results still given) | no | no | yes
gives users a proxy or cache | no | yes (using Anonymous View feature) | no | yes (via the favicons)

Superficially, MetaGer is privacy-respecting, and there’s even a .onion host for it. So I’ll have to add it to the table in the future.

For the moment, I’ll say that MetaGer shares the following with advertisers:

  • first 2 blocks of your IP address
  • user-agent string
  • your search query

They say it’s for non-personalised advertising.
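To make "first 2 blocks" concrete, here is a minimal sketch of that kind of truncation. It assumes "blocks" means IPv4 octets and that the remaining octets are simply zeroed; I don't know how MetaGer actually implements it, and the function name `first_two_blocks` is made up for illustration.

```python
import ipaddress


def first_two_blocks(ip: str) -> str:
    """Keep the first two octets of an IPv4 address and zero the rest.

    Assumption: "first 2 blocks" means a /16 truncation of an IPv4 address.
    """
    addr = ipaddress.IPv4Address(ip)  # raises ValueError for non-IPv4 input
    # strict=False lets us build the /16 network from a host address.
    network = ipaddress.ip_network(f"{addr}/16", strict=False)
    return str(network.network_address)


print(first_two_blocks("203.0.113.42"))  # 203.0.0.0
```

Under that assumption a truncated address is shared by up to 65,536 hosts, but combined with the user-agent string and the query it can still narrow things down considerably.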
@ajz · 2Y

Thanks for sharing. It would be nice to have some sources for the claim that DDG is not privacy friendly. Also, I recently found this web page about privacy-friendly search engines: https://restoreprivacy.com/private-search-engine/ (a website that itself uses CloudFlare). What is your opinion of Mojeek and MetaGer? (MetaGer is open source and can be self-hosted.)

@cipherpunk (creator) · 2Y

Mojeek does their own crawling. That’s quite impressive, because unlike DDG it means they don’t have to choose from a pool of privacy abusers to buy search results. I did a test search on “petition sites” and was impressed that the first page did not contain the typical privacy abusing cloudflare results (change.org, moveon.org, etc). Mojeek also does not buy hosting from Amazon/MS/Google. I should perhaps add them to the table.

I didn’t know MetaGer was free software. I occasionally use metager.com but my first port of call is Searxes b/c searxes filters out cloudflare sites.

@ajz · 2Y

Nice to hear this. Thanks for sharing.

@AgreeableLandscape (admin) · 2Y

Does anyone know where Searxes gets its search data? IIRC DuckDuckGo uses a combination of Bing and its own web crawler, and StartPage uses a variety of providers, including Google.

@cipherpunk (creator) · 2Y

DDG is a meta-search just as every searx instance is. DDG pays MS Bing and Yahoo (Verizon) for API access. Hence the “feeds other privacy abusers” row.

Startpage pays Google for API access to search results.

Each searx instance does something different. Many scrape DDG/Google/Startpage/Bing etc. for results. Scraping avoids financially supporting those bad actors. But they often get a CAPTCHA, which causes an error that is sent to the user. Some searx instances run their own YaCy crawler. It’s also feasible that a searx instance could buy API access but I don’t suspect this is common.

Searxes does a daily survey of searx nodes that are functioning well, and randomly dispatches queries to an instance. So it’s a meta-search of the decentralized searx network and it has quite good quality control.
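Roughly, the survey-then-dispatch idea looks like the sketch below. This is only an illustration of the general technique, not Searxes’ actual code: the `Instance` type, the `survey` and `pick_instance` names, the `probe` callback, and the example URLs are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Instance:
    url: str
    healthy: bool  # outcome of the most recent survey


def survey(urls, probe):
    """Run the periodic health check; probe(url) returns True if the node works."""
    return [Instance(url, probe(url)) for url in urls]


def pick_instance(instances, rng=random):
    """Dispatch to a uniformly random instance among those that passed the survey."""
    healthy = [inst for inst in instances if inst.healthy]
    if not healthy:
        raise RuntimeError("no healthy searx instance available")
    return rng.choice(healthy)


# Hypothetical example: a stand-in probe marks one of two nodes as working.
nodes = survey(
    ["https://searx-a.example", "https://searx-b.example"],
    probe=lambda url: url.endswith("a.example"),
)
print(pick_instance(nodes).url)  # https://searx-a.example
```

Picking at random among the nodes that passed spreads queries across the network, so no single instance sees all of a user’s searches.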

@wazowski · 2Y

deleted by creator

@cipherpunk (creator) · 2Y

> StartPage recently was acquired by an ad agency

Good find… I’ll add it.

> Also not really sure why you would include “tor-hostile” and “privacy-abusers” websites in the web search section… censorship, any kind of censorship, is bad regardless…

Can you clarify what you mean? The note about Tor-hostile results is on the row labeled “has onion site”. Generally, only Tor users visit onion sites. So if you give a Tor user links that they cannot interact with (e.g. b/c they’ll get a 403 access denied error or broken CAPTCHA), the content is already censored by the webmaster. Polluting the search results with such broken/unusable links actually suppresses links that would be more useful by pushing them down.

Note as well that Searxes doesn’t censor CloudFlare links. I’m not sure if that’s your concern. Search engines with a linear output have to decide on the sequence in which to list links. The topmost links are the first seen by the user, so it’s important to rank the sites by usefulness. You cannot add a link to the top of the page without pushing another link off the screen. CloudFlare links are the poorest-quality results, with lower usability because of blockades, CAPTCHAs, and other shenanigans. They don’t get censored by Searxes but they do get low positions to ensure that the more useful links appear first. That’s not censorship; it’s simply smart ranking.
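To illustrate the difference between demoting and removing, here is a minimal sketch of a re-ranking step. It is an assumption about how such a step could work, not Searxes’ actual code; `rank_usable_first`, `is_hostile`, and the example hostnames are hypothetical.

```python
def rank_usable_first(results, is_hostile):
    """Move flagged results below the rest without dropping any of them.

    sorted() is stable and False sorts before True, so the original
    relevance order is preserved within each group.
    """
    return sorted(results, key=is_hostile)


# Hypothetical example: one result is flagged as sitting behind CloudFlare.
flagged = {"cf.example"}
results = ["a.example", "cf.example", "b.example"]
print(rank_usable_first(results, lambda host: host in flagged))
# ['a.example', 'b.example', 'cf.example']
```

Every link is still in the output; only the order changes.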

Regarding privacy abuse, I can only guess that you are talking about the row “feeds other privacy abusers”. That means the search engine benefits a privacy abuser, so I see no connection to censorship in that context. E.g. DDG partners with Yahoo to deliver ads to users. That generates ad revenue for Yahoo. Yahoo has a long history of privacy abuse and DDG is helping a corporation that should be boycotted.

Note as well this is the c/privacy community, so the focus here is not censorship but rather privacy.

@wazowski · 2Y

deleted by creator
