Saturday, April 15, 2006
Summer of Code 2006 is here, calling open source developers
Last year, Google showed its commitment to the open source community by funding 419 projects - assigned to 40 organizations - to introduce students around the world to open source software development. I was one of the lucky ones who got accepted, and I completed my project with a great organization: XWiki. Today Chris DiBona announced that Summer of Code 2006 is a reality, and XWiki is participating this year too.
So if you develop open source software, this is your time to act. Just read the Mentor FAQ and see how you can get your organization and favorite open source applications involved. You have until May 1st to submit your interest in participating, and you can even get 500 USD for your organization by mentoring a student while getting someone to work on that pending feature you wanted so much to get done.
I haven’t seen any restriction on the number of projects, so if you have a project idea, propose it to your organization and submit the application; open source development will win. And if you are a student, get ready to apply on May 1st. I will write about some of my experiences from last year’s program in an upcoming post; in the meantime you can read the Student FAQ.
Last year’s participation from the Hispanic open source community was low, so to increase it this year I have done an unofficial translation of the Mentor FAQ for Spanish readers of this blog.
Labels: internet
Monday, April 10, 2006
Web search: the user interface
When new technologies arrive, metaphors are used to explain the new behavior and functionality they bring by associating technical jargon with concepts already known. The Internet and PCs are the pillars of the Information Age, but it was only with the World Wide Web, and links as the primary user interface metaphor, that users realized there is a window - the browser - to access information in the rest of the world. Nowadays the knowledge society concept is not generally understood, but thanks to web search - as the user interface mechanism - it is becoming clear: the results of a query provide resources relevant to any given interest, which is how knowledge societies get together.
The first step to knowledge building is an adequate organization of information. Ten years ago, the Netcraft Web Server Survey reported more than 150K active sites, and two approaches were used to organize the available information: directories, a human-powered effort to classify web sites, and search engines, which gathered the text available in the web pages within reach. The first approach could not handle the exponential growth of the web, and the second was in danger because its ranking mechanism was being tricked by term repetition, when a popularity approach came to the rescue.
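To make the term repetition problem concrete, here is a minimal sketch of a ranking that only counts how often the query terms appear in a page. The scoring function and the two pages are assumptions invented for this example; no real search engine is claimed to have worked exactly like this.

```python
import re
from collections import Counter


def term_frequency_score(query, text):
    """Score a page by how often the query terms occur in its text."""
    words = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    return sum(words[term] for term in query.lower().split())


def rank(query, pages):
    """Return page names ordered from highest to lowest score."""
    return sorted(
        pages,
        key=lambda name: term_frequency_score(query, pages[name]),
        reverse=True,
    )


# Hypothetical pages, made up for illustration only.
pages = {
    "honest-review": "A hands-on review of a cheap laptop with battery tests.",
    "stuffed-page": "cheap laptop cheap laptop cheap laptop buy cheap laptop now",
}

ranking = rank("cheap laptop", pages)
print(ranking)
```

The page that simply repeats the query terms comes out ahead of the page with real content, which is the weakness a link-based popularity signal was meant to address.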
Back in 1998, links were created to lead the reader to a resource worth checking out if more information was needed. This human-powered action is a vote for the linked site, and algorithms like HITS by Jon Kleinberg and PageRank by Larry Page and Sergey Brin use this concept to provide a better ranking of the results. This approach is one of the factors that made Google the preferred search engine.
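The idea of a link as a vote can be sketched with a simplified power-iteration version of PageRank. This is an illustration under stated assumptions, not Google's implementation: the damping factor of 0.85, the number of iterations, the handling of pages without outgoing links, and the four-page graph are all choices made for this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration over a dict of outgoing links."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote equally among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links votes for everyone.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: three pages link to "hub".
links = {
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub", "a"],
    "hub": ["a"],
}
ranks = pagerank(links)
best = max(ranks, key=ranks.get)
print(best)
```

The scores depend only on the link graph, not on any query, so they can be computed ahead of time and combined later with text relevance.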
In The Search, John Battelle explains how Google and its rivals have been transforming our culture. When you need information about any topic, it is almost a reflex to open your favorite search engine in the browser and ask for resources about that interest. Sometimes you may not find exactly what you are looking for, but it is a given that you will at least find something related. According to Pew, search engine use is the second most common Internet activity on a typical day, and it is probably the top activity performed in the browser.
This popularity makes commercial sites want better visibility on search engines. To cover this need, an industry called search engine marketing has been established, which focuses on two areas: paid search (or search advertising) and search engine optimization. SEO practitioners analyze how search engines work and suggest strategies to get better ranking positions for specific queries. Sometimes this knowledge is used to deliberately manipulate results, so search engines are constantly adjusting their algorithms to counter these attempts.
Since PageRank is query independent, it was implemented faster than other link-based ranking algorithms, but it can also be compromised by link spamming - the automatic generation of votes. Google, as the industry leader, has been using other refinements to maintain relevancy: the Hilltop algorithm, which enhances result accuracy by using a subset index to identify expert documents relevant to the query; historical data from web sites/pages to determine trust and freshness; search user behavior analysis to include personalized factors; and tools for publishers, like sitemaps, to coordinate how pages are included in the Google index.
The search box has become a user interface metaphor in web sites, and it is moving from HTML code to the browser toolbar and ultimately to the desktop, making it the de facto mechanism used to ask computers for information. Although search engines provide a set of operators to allow query refinement, the average search user is not aware of them - only 2.7 words are usually present in a query - and this is the current challenge for search engines: deliver the highest user satisfaction when users provide so little information about what they want. There are some UI experiments in place trying conversational approaches - related search suggestions - and contextual ones - information supply - that are going to influence users’ expectations when using the next generation of web search applications.
Web Search 1.0 established the way we request knowledge resources from the Internet using common knowledge. My take is that the next wave of innovation will focus on specific knowledge by improving relevancy in vertical markets, communities and private repositories - enterprise or personal - and you will read more about this in upcoming posts. In the meantime, if you want to keep an eye on search engine industry news, the Search Engine Watch Blog is the place to visit.
Labels: web-search
Tuesday, April 04, 2006
Making it simple and useful
The move from Blogger to WordPress was motivated by a need for more flexibility (tags) and identity (own domain name). Since I was working with JavaScript libraries like script.aculo.us, I also wanted to have Ajax functionality, so I was testing BloxPress integration with various plugins when my one-month vacation began.
When I got back to work on the design of this blog, I found various interesting plugins that I wanted to use, but I didn’t want the blog to load slowly for every visitor. So, thinking from a micromedia perspective, I chose to organize the plugins following an edition approach:
- Feed Edition. If you are a heavy information consumer just grab the full feed and add it to your favorite aggregator. You can also subscribe to a category feed or to the comments of a post.
- Mobile Edition. Mobile devices have limited display capabilities so this blog uses Alex King’s Mobile plugin to adjust the content for mobile users.
- Simple Edition. The essential plugins, the detail is below.
- Extended Edition. Some nice-to-have plugins that integrate other services like image/photo hosting, automatic translation, map visualization and others are incorporated in this edition.
- Experimental Edition. Is the web the ultimate API? Hive 7, an Ajaxian virtual world, raises the bar of what can be done in a browser. Ajax design plugins/themes go to this edition.
With this approach a reader can select the edition to read and I can plan the integration of plugins using an iteration model. This first iteration focuses on simplicity - or what Scoble calls anti-marketing design - and usefulness - the added value for a visitor using a browser when reading this blog. Here are the plugins/features that got selected for the Simple Edition.
Publishing Essentials
WordPress 2.0 provides a light core publishing platform. The following are the must-have features I need to start this micromedia adventure:
- Bilingual. I want to write in English and Spanish, so I chose the Gengo multi-language plugin.
- Spam Protection. I like the Akismet web service, but I decided to protect this blog with the stable Spam Karma 2 plugin.
- Syndication. This blog expresses points of view but doesn’t cover breaking news. To fill the void I get headlines from web sites like memeorandum using the inlineRSS plugin.
- Edition Selection. To change between editions, I am using the Theme Switcher plugin.
Search Extensions
This blog is about web search so the search box is available in the header and it has a couple more features than what the WordPress basic install provides:
- Extended Search. To allow search in comments and other pages with an Ajax interface, I use the Ajaxified Search Everything plugin.
- Highlighting. When you are searching for something, you skim the content instead of reading it. The Search Terms Highlighter plugin detects terms from internal/external queries to help you skim faster.
- Metrics. To store what readers are searching for on the blog, the Search Meter plugin is used.
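The highlighting idea can be sketched in a few lines. This is not the Search Terms Highlighter plugin's code; it is a hypothetical Python illustration that assumes the external search engine passes the query in a `q` URL parameter, that the post content is plain text without markup, and that matches are wrapped in `<strong>` tags.

```python
import re
from urllib.parse import parse_qs, urlparse


def terms_from_referrer(referrer, param="q"):
    """Extract the search terms from the query string of a referrer URL."""
    query = parse_qs(urlparse(referrer).query)
    # Assumption: the search terms arrive in a single parameter named `param`.
    return query.get(param, [""])[0].split()


def highlight(text, terms):
    """Wrap whole-word, case-insensitive matches of any term in <strong> tags."""
    if not terms:
        return text
    pattern = "|".join(re.escape(term) for term in terms)
    return re.sub(
        rf"\b({pattern})\b", r"<strong>\1</strong>", text, flags=re.IGNORECASE
    )


# Hypothetical referrer URL and post text, made up for illustration.
terms = terms_from_referrer("http://www.google.com/search?q=open+source+search")
result = highlight("Search engines index open source code.", terms)
print(result)
```

Matching whole words keeps a term like "search" from lighting up inside "searching"; a real plugin would also have to avoid touching HTML tags and attributes.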
Ads Extensions
This blog also talks about online advertising, so it should have some ads. I believe that ads should be useful for the reader, so if they aren’t, you have the option to hide them. The advertising features are based on the Problogger Clean theme.
Data Organization Extensions
This blog embraces knowledge sharing, so additional information about each post has to be included to identify relations. Here are the extensions added:
- Tagging. A post usually covers various topics, and tags make it possible to describe these semantic associations. I chose the powerful Ultimate Tag Warrior plugin for this functionality.
- Location. A post can mention different places, but you need geographic coordinates to identify them precisely. The Geo plugin is used for this.
- Comment Reading. When a post has many comments, it’s not so easy to follow the conversation. Brian’s Threaded Comments plugin enhances the reading experience.
- Metrics. Statistical attention metadata is useful to identify trends. The most basic piece of information is how many times a post has been viewed, and the WP-PostViews plugin is used for this.
JavaScript-generated metrics, microformats integration, post rating, sitemap generation, link tracking and OPML integration are planned for the Simple Edition and will come in the next iterations, but first let’s finish the introductory posts.
Labels: internet
