Monday, April 10, 2006
Web search: the user interface
When new technologies arrive, metaphors help explain the behavior and functionality they bring by associating technical jargon with concepts people already know. The Internet and PCs are the pillars of the Information Age, but it was only with the World Wide Web, and links as its primary user interface metaphor, that users realized there is a window - the browser - onto information in the rest of the world. Nowadays, the knowledge society concept is not generally understood, but thanks to web search - as the user interface mechanism - it is becoming clearer: the results of a query provide resources relevant to any given interest, which is how knowledge societies come together.
The first step in knowledge building is an adequate organization of information. Ten years ago, the Netcraft Web Server Survey reported more than 150K active sites, and two approaches were used to organize the information available: directories, a human-powered effort to classify web sites, and search engines, which gathered the text of the web pages within reach. The first approach could not handle the exponential growth of the web, and the second was in danger because its ranking mechanism was being tricked by term repetition, when a popularity-based approach came to the rescue.
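To see why term repetition was a problem, here is a minimal sketch of a purely text-based ranker. The scoring function, the page names and the page texts are all made up for illustration; no real search engine of the time is claimed to have worked exactly this way.

```python
def term_count_score(query, text):
    """Toy relevance score: how many times the query terms occur in the text."""
    words = text.lower().split()
    return sum(words.count(term) for term in query.lower().split())


# Two hypothetical pages: one written for readers, one stuffed with keywords.
pages = {
    "honest": "a short guide to cheap flights and how to compare fares",
    "stuffed": "cheap flights cheap flights cheap flights cheap flights buy now",
}

query = "cheap flights"
scores = {name: term_count_score(query, text) for name, text in pages.items()}

# Rank pages from highest to lowest score.
ranking = sorted(pages, key=scores.get, reverse=True)
print(scores)
print(ranking)
```

Because the score depends only on the page's own text, the page author fully controls it, which is what made this kind of ranking easy to trick.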
Back in 1998, links were created to lead the reader to a resource worth checking out if more information was needed. This human-powered action is a vote for the linked site, and algorithms like HITS by Jon Kleinberg and PageRank by Larry Page and Sergey Brin use this concept to provide a better ranking of results. This approach is one of the factors that made Google the preferred search engine.
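The idea of links as votes can be sketched with a simplified PageRank computed by repeated iteration. This is only an illustration under stated assumptions: the damping factor of 0.85 is the commonly cited value, the four-page toy web is invented, and the function is a textbook-style simplification, not Google's actual implementation.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank.

    links maps each page to the list of pages it links to (its "votes").
    """
    # Collect every page that appears as a source or as a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with the same rank for every page.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy web: a, b and d all link to c; c links back to a.
toy_web = {
    "a": ["c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: item[1], reverse=True):
    print(page, round(score, 3))
```

In this toy web, c collects the most votes and ends up on top, and a ranks above b and d because its single vote comes from the highly ranked c. Note that the query never appears in the computation: the ranks depend only on the link structure.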
In The Search, John Battelle explains how Google and its rivals have been transforming our culture. When you need information about any topic, it is almost a reflex to open your favorite search engine in the browser and ask for resources on it. Sometimes you may not find exactly what you are looking for, but it is a given that you will at least find something related. According to Pew, search engine use is the second most popular Internet activity on a typical day, and it is probably the top activity performed in the browser.
This popularity makes commercial sites want better visibility on search engines. To cover this need, an industry called search engine marketing has been established, which focuses on two areas: paid search (or search advertising) and search engine optimization (SEO). SEO practitioners analyze how search engines work and suggest strategies to get better ranking positions on particular queries. Sometimes this knowledge is used to deliberately manipulate results, so search engines are constantly adjusting their algorithms to counter these attempts.
Since PageRank is query independent, it was implemented faster than other link-based ranking algorithms, but it can also be compromised by link spamming - the automatic generation of votes. Google, as the industry leader, has been using other refinements to maintain relevancy: the Hilltop algorithm, which enhances result accuracy by using a subset index to identify expert documents relevant to the query; historical data from web sites and pages to determine trust and freshness; analysis of search user behavior to include personalized factors; and tools for publishers, like sitemaps, to coordinate how pages are included in the Google index.
The search box has become a user interface metaphor in web sites, and it is moving from HTML code to the browser toolbar and ultimately to the desktop, making it the de facto mechanism for asking computers for information. Although search engines provide a set of operators for query refinement, the average search user is not aware of them - a query usually contains only 2.7 words - and this is the current challenge for search engines: delivering the highest user satisfaction when users provide so little information about what they want. There are some UI experiments in place trying conversational approaches - related search suggestions - and contextual ones - information supply - that are going to influence users' expectations of the next generation of web search applications.
Web Search 1.0 established the way we request knowledge resources from the Internet using common knowledge. My take is that the next wave of innovation will focus on specific knowledge by improving relevancy in vertical markets, communities and private repositories - enterprise or personal. You will read more about this in upcoming posts; in the meantime, if you want to keep an eye on search engine industry news, the Search Engine Watch Blog is the place to visit.
Labels: web-search
