Saturday, April 15, 2006

Summer of Code 2006 is here, calling open source developers

Last year, Google showed its commitment to the open source community by funding 419 projects - assigned to 40 organizations - to introduce students around the world to open source software development. I was one of the lucky ones who got accepted, and I completed my project with a great organization: XWiki. Today Chris DiBona announced that Summer of Code 2006 is a reality, and XWiki is participating this year too.

So if you develop open source, this is your time to act. Just read the Mentor FAQ and see how you can get your organization and favorite open source applications involved. You have until May 1st to submit your interest in participating, and you can even get 500 USD for your organization by mentoring a student, while getting someone to work on that pending feature you wanted so much to get done.

I haven’t seen any restriction on the number of projects, so if you have a project idea, propose it to your organization and submit the application; open source development will win. And if you are a student, get ready to apply on May 1st. I will write about some of my experiences from last year’s program in an upcoming post; in the meantime you can read the Student FAQ.

Last year’s participation from the Hispanic open source community was low, so to increase it this year I have made an unofficial translation of the Mentor FAQ for the Spanish readers of this blog.


Monday, April 10, 2006

Web search: the user interface

When new technologies arrive, metaphors are used to explain the new behavior and functionality they bring by associating technical jargon with concepts that are already known. The Internet and PCs are the pillars of the Information Age, but it was only with the World Wide Web, and links as its primary user interface metaphor, that users realized there was a window - the browser - to access information in the rest of the world. Nowadays the knowledge society concept is not generally understood, but thanks to web search - as the user interface mechanism - it is becoming clear: the results of a query provide resources relevant to any given interest, which is how knowledge societies get together.

The first step in knowledge building is an adequate organization of information. Ten years ago, the Netcraft Web Server Survey reported more than 150K active sites, and two approaches were used to organize the information available: directories, a human-powered effort to classify web sites, and search engines, which gathered the text available in the web pages within reach. The first approach could not handle the exponential growth of the web, and the second was in danger because its ranking mechanism was being tricked by term repetition, when a popularity approach came to the rescue.
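To make the term repetition trick concrete, here is a minimal sketch in Python. The scoring function, the two sample documents and the query are all hypothetical, invented only for illustration; no real search engine ranked pages exactly this way. It simply counts how often the query terms appear in a page, which is enough to show why repeating keywords pushed a page to the top.

```python
def term_frequency_score(query, document):
    """Naive relevance score: count every occurrence of each query term.

    Hypothetical scoring function for illustration only.
    """
    words = document.lower().split()
    return sum(words.count(term) for term in query.lower().split())


# Two made-up pages: one written for readers, one stuffed with keywords.
docs = {
    "honest": "a short guide to cheap flights and travel tips",
    "stuffed": "cheap flights cheap flights cheap flights buy now cheap flights",
}

query = "cheap flights"
scores = {name: term_frequency_score(query, text) for name, text in docs.items()}

# Sort page names from highest to lowest score.
ranking = sorted(docs, key=lambda name: scores[name], reverse=True)
print(scores)
print(ranking)
```

The stuffed page wins only because it repeats the query terms, not because it is more useful, which is the weakness that link-based popularity signals were meant to address.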

Back in 1998, links were created to lead the reader to a resource worth checking out if more information was needed. This human-powered action is a vote for the linked site, and algorithms like HITS by Jon Kleinberg and PageRank by Larry Page and Sergey Brin use this concept to provide a better ranking of the results. This approach is one of the factors that made Google the preferred search engine.
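The idea of a link as a vote can be sketched with a simplified PageRank computation. This is a textbook-style power iteration under stated assumptions, not Google's actual implementation: the damping factor of 0.85, the fixed iteration count, the handling of pages without outgoing links and the four-page sample web are all choices made for this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to.
    damping and iterations are assumed values for this sketch.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with the same rank for every page.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small base amount regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its vote to all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: a, b and d all link to c, and c links to a.
web = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(web)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

In this sample web, page c ends up with the highest score because three pages vote for it, and page a comes second because its single vote comes from the highly ranked c. Pages b and d, which nobody links to, share the lowest score.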

In The Search, John Battelle explains how Google and its rivals have been transforming our culture. When you need information about any topic, it is almost a reflex to open your favorite search engine in the browser and ask for resources about that interest. Sometimes you may not find exactly what you are looking for, but it is a given that you will at least find something related. According to Pew, search engine use is the second most popular Internet activity on a typical day, and it is probably the top activity performed in the browser.

This popularity makes commercial sites want better visibility on search engines. To cover this need, an industry called search engine marketing has been established, which focuses on two areas: paid search (or search advertising) and search engine optimization (SEO). SEO practitioners analyze how search engines work and suggest strategies to get better ranking positions for specific queries. Sometimes this knowledge is used to deliberately manipulate results, so search engines are constantly adjusting their algorithms to counter these attempts.

Since PageRank is query independent, it was implemented faster than other link-based ranking algorithms, but it can also be compromised by link spamming - the automatic generation of votes. Google, as the industry leader, has been using other refinements to maintain relevancy: the Hilltop algorithm, which enhances result accuracy by using a subset index to identify expert documents relevant to the query; historical data from web sites and pages to determine trust and freshness; search user behavior analysis to include personalized factors; and tools for publishers, like sitemaps, to coordinate how pages are included in Google’s index.

The search box has become a user interface metaphor in web sites, and it is moving from HTML code to the browser toolbar and ultimately to the desktop, making it the de facto mechanism for asking computers for information. Although search engines provide a set of operators for query refinement, the average search user is not aware of them - a query usually contains only 2.7 words - and this is the current challenge for search engines: delivering the highest user satisfaction when users provide so little information about what they need. There are some UI experiments in place trying conversational approaches - suggesting related searches - and contextual ones - supplying information - that are going to influence users’ expectations of the next generation of web search applications.

Web Search 1.0 established the way we request knowledge resources from the Internet using common knowledge. My take is that the next wave of innovation will focus on specific knowledge by improving relevancy in vertical markets, communities and private repositories - enterprise or personal - and you will read more about this in upcoming posts. In the meantime, if you want to keep an eye on search engine industry news, the Search Engine Watch Blog is the place to visit.


Tuesday, April 04, 2006

Making it simple and useful

The move from Blogger to WordPress was motivated by a need for more flexibility (tags) and identity (my own domain name). Since I was working with JavaScript libraries like script.aculo.us, I also wanted Ajax functionality, so I was testing BloxPress integration with various plugins when my one-month vacation began.

When I got back to work on the design of this blog, I found various interesting plugins that I want to use, but I don’t want the blog to load slowly for every visitor. So, thinking from a micromedia perspective, I chose to organize the plugins following an edition approach:

With this approach a reader can select the edition to read and I can plan the integration of plugins using an iteration model. This first iteration focuses on simplicity - or what Scoble calls anti-marketing design - and usefulness - the added value for a visitor using a browser when reading this blog. Here are the plugins/features that got selected for the Simple Edition.

Publishing Essentials

WordPress 2.0 provides a light core publishing platform. The following are the must-have features I need to start this micromedia adventure:

Search Extensions

This blog is about web search so the search box is available in the header and it has a couple more features than what the WordPress basic install provides:

Ads Extensions

This blog also talks about online advertising, so it should have some ads. I believe that ads should be useful for the reader, so if they aren’t, you have the option to hide them. The advertising features are based on the Problogger Clean theme.

Data Organization Extensions

This blog embraces knowledge sharing, so additional information about each post has to be included to identify relations. Here are the extensions added:

JavaScript-generated metrics, microformats integration, post rating, sitemap generation, link tracking and OPML integration are planned for the Simple Edition and will come in the next iterations, but first let’s finish the introductory posts.
