Search engine basics

Dec 24, 2007

Over at we have developed one of The Netherlands’ biggest search engines (we handle a large number of searches per day, and making it easy for buyers to find the things they are looking for is considered “core business”). A post by Tim Bray nicely summarizes the basics of a search engine.

The post is already 4 years old (posted in late 2003) but is still quite relevant. It focuses primarily on filtering down to a relevant result set. It doesn’t ignore sorting, but there is so much more one can do in that area nowadays.

For those who are absolutely new to this area, or who find themselves playing with this type of technology and want to read up on it, I especially recommend the following articles:

And, before you think about breaking into the market and building an intelligent search engine, please read this.

Excellent write up, even 4 years later. Thanks Tim!

Another post, more related to relevance and the order in which your results appear, covers the relevance you can attach to users’ click behavior. In short, users are generally inclined to click on items at the top more frequently, so if you base your relevance metrics on clicks you need to “un-bias” your data.

Secondly, as users get further down the result set they will, in the aggregate, switch to a different mode of selecting the items to click on: they actually start reading the excerpts and decide, based on the information shown in the result list, which items to click. In other words, here you should actually not try to “un-bias” your data!
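
To make the two observations above a bit more concrete, here is a minimal sketch in Python. It is not taken from either post. The discount values, the rank at which users are assumed to start reading excerpts, and the names `ASSUMED_TOP_DISCOUNT`, `click_weight` and `relevance_scores` are all made up for illustration. The idea is simply that a click near the top counts for less (part of it is attributed to position), while a click further down counts in full.

```python
from collections import defaultdict

# Hypothetical discount per rank (1 = top result). These numbers are
# illustrative assumptions, not measured values: a click at the top is
# assumed to be partly caused by position rather than relevance.
ASSUMED_TOP_DISCOUNT = {1: 0.5, 2: 0.7, 3: 0.85}


def click_weight(rank):
    """Weight of a single click at the given rank.

    Ranks without an entry (here: 4 and deeper) get the full weight 1.0,
    reflecting the assumption that users there read the excerpts.
    """
    return ASSUMED_TOP_DISCOUNT.get(rank, 1.0)


def relevance_scores(impressions):
    """Compute a position-adjusted click rate per item.

    impressions: iterable of (item_id, rank, clicked) tuples, one per time
    an item was shown in a result list.
    """
    weighted_clicks = defaultdict(float)
    shown = defaultdict(int)
    for item_id, rank, clicked in impressions:
        shown[item_id] += 1
        if clicked:
            weighted_clicks[item_id] += click_weight(rank)
    return {item_id: weighted_clicks[item_id] / shown[item_id] for item_id in shown}


log = [
    ("a", 1, True), ("a", 1, False),   # shown at the top, clicked once
    ("b", 8, True), ("b", 8, False),   # shown far down, clicked once
]
scores = relevance_scores(log)
```

With this toy log both items have the same raw click rate, but item "b" ends up with the higher adjusted score because its click happened where position gives no help. In practice the discounts would have to be estimated from your own traffic rather than hard-coded.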