The flavour of summer – Findx updates and our bot’s behaviour

We hope everyone had a fantastic summer! Even though we haven’t posted often here, we’ve been working hard in the background. We also took the opportunity to take a little time off and enjoy the delightful taste of all kinds of seasonal berries and the classic Danish “Rød grød med fløde”.

Findx has been updated!

We’ve listened to the feedback you gave us in the Reddit AMA – and implemented a number of the features in Findx that Reddit users requested most often:

  • There is now a “Did you mean…” feature to help you find what you are searching for, even when the spelling is tricky.
  • The front page now loads much faster.
  • Results are displayed faster too.
  • We got rid of those animations that so many of you didn’t like.
  • More relevant results on some searches, e.g. “The New York Times”.

Index improvements

You’ll be seeing more relevant results with much less spam now. Millions of spam pages have been removed and our spider that crawls around the web is much faster: Up to 9.4 million new pages are added to Findx each day, and each day up to 7.7 million pages are revisited and updated if they have changed.

There are now over 2 billion pages in the Findx index – it’s a hard-working spider!

Have you seen your search results improve or did you find something you didn’t expect? If you have some feedback, leave a comment for us in our online community.

Findxbot – who decides if it’s a good or bad bot?

We must also mention the reports we received about our (previously) poorly behaved bot. We apologise for the bad behaviour, and we have done our best to fix these problems. Nevertheless, we ended up on a blacklist, which had some serious side effects on our index. Findxbot is well behaved now – but who decides whether it’s a good or a bad bot? Read more details about our crawler, Findxbot.

We’ve had questions from several webmasters about how our bot works, and how they can make sure their webpages are included in the Findx index. We’ve written a detailed help topic about adding your website to the indexing queue and how to allow Findxbot in your robots.txt file – we hope you find it useful.
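As a minimal sketch of what that robots.txt guidance looks like in practice (the “/private/” path is just a hypothetical example – check our help topic for the exact directives we recommend), a site could allow Findxbot while keeping a private directory off-limits like this:

```
# Let Findxbot crawl the whole site except a private area
User-agent: Findxbot
Disallow: /private/
Allow: /

# Default rule for all other crawlers
User-agent: *
Disallow: /private/
```

Crawlers match the most specific User-agent group that names them, so Findxbot follows its own group above and ignores the wildcard rules.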

We were interviewed on DataEthics

DataEthics is an organisation that promotes data ethical products and services, and consults with industry leaders, educational institutions and other organisations about data privacy, ethical data practices and enabling a person to have control over their own data.

Of course, we jumped at the chance to be interviewed! We explain how Findx puts people front and center: we listen to and implement the feedback you give us, you get to rate the quality of the search results Findx provides, the search engine code is open source for anyone who wants to dive a little deeper, and we don’t store your information, track you across the web, or leave cookies on your device (unless you explicitly tell us to).

Read our interview on the DataEthics site

European Conference on Data Ethics

We’ll be at the first European Conference on Data Ethics on Friday, September 29th 2017 at Klub.io in Copenhagen. There’ll be a range of presentations on data privacy issues, privacy by design, advertising and tracking, data ethical products and services, and more. You’ll have a chance to join open discussions and working groups, and to network throughout the day. We hope to see you there!

See the full conference program

Canada is pushing the boundaries of censorship and the responsibilities of search engines

At Findx, we’ve started working on GDPR compliance, in order to respect the so-called “right to be forgotten”. This means people can request that a result be removed from a search engine’s index for a variety of reasons. Several laws are also in place to protect trademarks and handle other infringements, which we must take care of as well.

Of course, we need to comply with the law and handle data in a careful and ethical way, but we also see some issues with making search engines the ones that control and censor the internet. Who, or which organisations, should determine what can and cannot be found in a search engine? It’s a difficult question to answer. After all, a search engine only lists links to public information that, in many cases, can still be found at the original source on the web, and may well surface in another search engine.

An interesting case to follow is in Canada: Google vs. Equustek. The Canadian Supreme Court ruled that Google must delete certain search results worldwide, and Google is now taking legal action in the US to stop Canada’s Supreme Court from controlling its search results globally. Read more about the case on Ars Technica.