Why are some sites missing from findx?

Because we are an independent search engine and not a metasearch engine, our own spider continuously crawls the web and adds pages to the findx index. And because we are a European search engine, we prioritise European and English-language sites as we grow and improve that index.

But not all websites allow every spider to trawl through their web pages – some sites allow only a few search engines to index their site.

How is access restricted to a website?

This is an issue faced by all search engines, not just findx. Every website can allow or disallow specific spiders and other ‘user-agents’ (the names that crawlers and other bots use to identify themselves) by listing them in a file called robots.txt.

Some sites allow only Google to see their pages, others allow only Yandex. Some sites disallow all user-agents entirely.

It’s a common practice to limit indexing of pages by using a detailed robots.txt.
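
To make this concrete, here is a minimal sketch in Python of how a crawler might check a robots.txt before fetching a page. The robots.txt content, the example.com URL and the can_crawl helper are all made up for illustration, and using "findxbot" as the user-agent token is an assumption; this is not how the findx spider is actually implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: Googlebot may crawl everything,
# every other user-agent is blocked from the whole site.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules in robots_txt let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


url = "https://example.com/some/page"
print(can_crawl(ROBOTS_TXT, "Googlebot", url))  # True
print(can_crawl(ROBOTS_TXT, "findxbot", url))   # False
```

An empty "Disallow:" line means nothing is off limits for that user-agent, while "Disallow: /" blocks the entire site for everyone covered by the "*" group.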

Not allowed to index

Unfortunately, findxbot is not allowed to index some of the larger, more popular English-language websites.

We have specifically asked for permission for our spider to index pages on some of these sites, but a couple have said no (Yelp and Quora), and we are waiting to hear back from some others (GitHub, Facebook and LinkedIn).

You can see the full list of websites that findx is not allowed to index in the help, and we have included a link to each site’s robots.txt so you can see what other user-agents these sites have also blocked.

What if I want to search on those sites?

Findx is a true search engine and aims to comply with the established robots.txt standard. That’s why we can’t show you any results from sites that don’t allow us to index them.

But you can use a search exit to quickly and easily repeat your search on those specific sites. Simply type !q in the search box to jump to Quora, or !fb for Facebook.

Search Quora from findx using !q
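
For the curious, a search exit can be thought of as a simple lookup from a shortcut to a site's own search page. The sketch below is only an illustration in Python: the SEARCH_EXITS table, the URL templates and the resolve_search_exit helper are assumptions, not findx's actual code, and it assumes the shortcut may appear anywhere in the query.

```python
from typing import Optional
from urllib.parse import quote_plus

# Hypothetical shortcut table; the URL templates are assumptions.
SEARCH_EXITS = {
    "!q": "https://www.quora.com/search?q={query}",
    "!fb": "https://www.facebook.com/search/top/?q={query}",
}


def resolve_search_exit(text: str) -> Optional[str]:
    """Return the external search URL if text contains a known shortcut."""
    words = text.split()
    for i, word in enumerate(words):
        template = SEARCH_EXITS.get(word.lower())
        if template is not None:
            # Everything except the shortcut itself becomes the query.
            query = " ".join(words[:i] + words[i + 1:])
            return template.format(query=quote_plus(query))
    return None  # no shortcut found: handle as a normal findx search


print(resolve_search_exit("!q independent search engines"))
```

A query without a recognised shortcut returns None, so it would simply be answered from the findx index as usual.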

Or, you can write to the site and ask them to give permission to findx to index their pages! 😉

