Monday, April 30, 2007

We will live in a mixed source world

Microsoft, Apple and Amazon reported great first quarters this week, in sync with last week’s earnings results from Google, IBM and eBay, confirming a healthy IT/Internet industry (Yahoo aside). So the question in the air is: who rules this sector? Some say Google does, but comparing current revenue figures it is not that clear yet, so let’s take a look at some IT history:

IBM sold its PC division and nowadays focuses on enterprise software-based services and corporate middleware that are almost unknown to general consumers. Microsoft’s newer versions of Windows and Office are showing strong sales, as are Apple’s iPods and Macs, but the preference among general consumers is for a company that provides free, advertising-supported services: Google.

Also this week, Millward Brown crowned Google as the most powerful brand, up from 7th position last year. That fact, combined with comScore’s yet-to-be-released figures for March that have Google sites as the most visited, its dominant share of the international and US search markets, and the estimated 80% share of online ad serving that the DoubleClick buy brings, are indications that we are living in Google’s consumer web world, even though some traditional media companies still do not understand the business model.

Google is a mixed-source company: it uses open source in its infrastructure and proprietary code for its business secrets. With a model different from Novell’s controversial approach - free web apps/services - it supports and encourages OSS development with programs like the Summer of Code. The well-established server stack AMP and the rising client stack OFT have confirmed that OSS is ready for corporate environments, so mixed-source infrastructures are a reality and companies are paying more attention to this trend.

Open source web servers still lead the market, PHP, Rails and MySQL continue to power Web 2.0 applications with scalable architectures, and Ruby and Python are in the top 10 most popular programming languages; these facts confirm Nat Torkington’s OSS trend - Web 2.0 is FL/OSS - presented earlier this month at the China OSS Summit. In the corporate world Microsoft platforms still lead, and although Enterprise 2.0 adoption is growing, current infrastructures are not going to change that fast; but since OSS is leading innovation in the social web, closed source companies have no option but to start embracing OSS - one of 2006’s trends - and mixed-source infrastructures are inevitable.

One of the things you probably didn’t know about open source is that it is older than proprietary software. It is a development model that promotes open access to the processes that generate products, resources or content, embracing collaboration and freedom of choice. This disruptive model is a bit difficult to understand, mainly because the current perception of success in the software business keeps us looking for the open source billionaires or the survival of the strongest. But the value of OSS is not only about cost reduction; there are other strong reasons to use OSS, like access to faster innovation: as the study on the economic impact of OSS on innovation shows, every year code valued at 800 million euros becomes available for anyone to use, so, in theory, everyone is a potential millionaire - all we need is a business plan.

Red Hat has a working business model, resistant to competitive attacks from Oracle and from Microsoft-endorsed Novell, as this analysis of open source business models presented at EclipseCon 2007 shows. Brent Williams provides evidence for two interesting statements that invalidate some myths about OSS business models: software does not compete on price at the market level, and the software market does not behave like a commodity market. Other successful companies are MySQL and SugarCRM, who built their business strategies around an architecture of participation.

The research community is also paying more attention to FL/OSS; six EU-funded projects are actively working on it, and one of them - FLOSSMETRICS - has identified the available data sources on OSS projects. Its study of the business models adopted by 80 FLOSS-based companies identified six categories: twin licensing, split OSS/proprietary plugins, badgeware, pure FLOSS, platform providers and consulting companies; this gives us some guidance on the different ways companies are trying to monetize open source.

Another interesting piece of research explores the changes OSS created in the economic interaction among players in the software ecosystem. It identifies the economic motivation for system integrators (more potential customers), software vendors (becoming services companies) and developers (free agency) to embrace open source. Of the three groups, the one being disrupted is software vendors; as we move to a user-centric world where some people think there is no customer value in proprietary software, closed source companies need to adjust to maintain their leadership, and it looks like Microsoft is realizing that open source is in its future.

Transparency is driving OSS to become a natural part of the IT landscape: we may never look at the source code of the applications we install, but if we want or need to at some point, we can. The first thing anyone needs to know about open source is how it differs from the free software movement, which considers proprietary software the enemy and whose evangelists live up to its principles; OSS leaves the door open for capitalist funding and co-existence with closed source infrastructures. Other interesting tips to consider are the rules for starting to work with developer communities provided by Tobias Schlitt and the rules for running a project provided by Greg Beaver.

There are interesting predictions about the future of the web: Mozilla’s Web OS, Web 4.0, Gartner’s Goog-Azon pretailer and life in virtual worlds. The next big innovation in the browser market is offline/desktop support for web applications, and Adobe’s donated Tamarin VM looks like a key component for this expected feature; other options that embrace OSS include the Dojo Offline Toolkit for AJAX applications and Joyent’s Slingshot for Rails applications. Even Adobe has partially accepted Joyent’s invitation to open source Apollo by open-sourcing the Flex SDK, and this move may force Microsoft to open source some of Silverlight at Mix07; if that happens it will be a validation of OSS’s growing impact, and breakthrough applications like the web VM g.ho.st will be easier to develop.

The next big applications on the Internet are virtual worlds; they bring a more realistic, interactive and social experience than any social network interaction through the web browser today. This virtual 3D web has the potential to establish a virtual economy, bringing proven traditional business models into virtual worlds. Second Life is the most prominent virtual world, interesting newcomers to this market include Entropia Universe and Red Light Center, and other alternatives are listed by Onder Skall. While this market keeps maturing, Second Life is taking the OSS path by opening the code for its client software and planning the release of the server code too, in an effort to use the community’s knowledge to establish a fully distributed 3D network.

Open approaches have changed industries, as Sean Ammirati shows in his Open Ethos examples from the publishing and software industries, but there are still some things to figure out. The Free Software Foundation is pushing changes in the GPLv3 license to be prepared for Microsoft/Novell-like deals, sending a clear signal about what the free software movement wants to protect. In a similar vein, the moves by activeCollab’s founder to try a different business model face strong resistance from the community, especially if he expects to keep the brand, raising the question of the value of open source to open services. With Microsoft now showing more interest than ever in open source - probably not yet with the motivation to start a Microsoft Summer of Code - it is important to start finding the best practices applicable to mixed source environments; nothing is going to stop this shift now.



Saturday, April 21, 2007

Web 2.0 Expo: The Real Live Web is among us

Those of us who are not yet Web 2.0 entrepreneurs who could justify a trip to the event where the second generation of Internet technology is presented to the mainstream - Web 2.0 Expo - and who did not win the Web 3.0 definition contest to get tickets, were preparing to follow the conference in a time-shifted fashion, as we did for previous Web 2.0 events: reading Web 2.0 Summit coverage, finding live-blogging posts, or listening to and watching podcasts of selected presentations. But this year, thanks to live streaming, the virtual experience was enhanced.

Jeremiah Owyang’s business casting and Robert Scoble’s live streaming were two experiments that showed the behind-the-scenes experience of this event, and together with Chris Pirillo they made Web 2.0 history by streaming the Social Media Revolution panel, as this 15-minute video mashup can attest.

The World Live Web is a term coined by Alex Searls that has been used to differentiate static content production from content that stimulates conversation and participation - blogs, wikis, podcasts, vlogs. But services like Twitter/Jaiku (status microblogging), Talkshoe (live online talk radio shows) and Ustream/Stickam (live online television broadcasts) are going to redefine the term as they mature.

And here are some of the resources found around the web with coverage of this event; presentation files and slide shows are being added as they become available, and the talks are ordered by popularity using the session stats.

Here are also some selected highlights from the conference.

There was also a parallel unconference open to anyone: Web2.Open, and some of its slideshows have been published. Michael Wesch, creator of the famous video “The Machine is us/ing us” that was used to introduce the first keynote of Web 2.0 Expo, gave a talk at Web2.Open; a video clip explaining how the well-known video was conceived is available.



Saturday, April 14, 2007

Google’s Advertising Distribution System grows

Last Sunday I was reading Eric Schmidt’s interview and it was the first time I did not see an explicit mention of search when he was describing Google. As John Battelle precisely understood from Google’s CEO’s answer, the company’s business goal is to be the foundation for advertising and commerce on the web, and the DoubleClick acquisition price paid in cash ($3.1 billion) is a strong signal of what Google is going to do to maintain leadership in the online advertising market.

Two years ago Microsoft decided to build its own advertising network to provide display, search and content ads. adCenter has been launching its official operations progressively - France & Singapore (September 2005), US (May 2006), UK (August 2006) and Canada (February 2007) - along with a contextual advertising pilot, ContentAds, and some breakthroughs in its adCenter Labs. While the system matures, Microsoft needs to gain market presence; Google’s dominance of US search traffic with a 64.13% market share played some role in the decision to create the Search and Ad Platform group, and could also have influenced Microsoft’s interest in acquiring DoubleClick.

DoubleClick is one of the leaders in display ad serving. Estimates of last year’s revenues range from $100 million - according to the Wall Street Journal - to $300 million - according to the New York Times - which is in line with the $383 million reported by ValueClick’s Media division and the $122.4 million reported by aQuantive’s Digital Marketing unit. DoubleClick just launched an advertising exchange service to connect top-tier advertisers and agencies with leading online publishers, which raises the company’s value, but $3.1 billion is still a lot.

Google has its own display advertising network, but it is not as resistant to competitive attacks as the auction-based AdWords/AdSense, so when Microsoft’s interest in DoubleClick as a way to gain market position became known, Google had to bid to protect its momentum. The initial asking price of $2 billion was already high, so it looks like both giants really fought over this one; the firm Hellman & Friedman is the big winner of this acquisition.

I am not sure Google can declare this a win: it secured its leadership position for now and gained access to the data of an interesting marketplace plus some patents, but it may never have acquired DoubleClick if not for Microsoft’s intentions, at least not at the final agreed price. For Microsoft it was a loss, but that would also have been the case had it outbid Google, so it chose the smaller loss.

This is just the beginning. Google is the czar of a $16.8 billion online advertising market, but Microsoft, Google, Yahoo and others are getting ready to compete for the worldwide $500 billion advertising market. Text content search and advertising is where the competition has been happening, but for audio/video content the market is still wide open. Companies like Nexidia with its phonetic-based search for video advertising and search, image recognition technologies like Google’s Neven Vision, Microsoft’s gender detection and Riya’s visual search, and video recognition technologies like Audible Magic will generate new ways to accurately target advertising, providing added value - instead of annoyance - when displayed.



Wednesday, April 04, 2007

Search engine ranking factors analysis

Everyone involved in the SEO industry wants to know what factors search engines take into consideration when ranking pages. We will never know all the factors or how they are combined - that is the search engines’ secret sauce - but that does not stop us from trying to figure out which ones are the most relevant.

We have some hints about the components Google considers thanks to its related patent applications: information retrieval based on historical data, ranking blog documents, ranking businesses at a location in local searches, personalization, and agents that analyze the reputation of content aggregation sources; but these are too many factors to explain to customers who just need better positioning on search engines.

So when the SEO A-list shares its opinions on the best-known ranking factors, you have a must-read article: Search Engine Ranking Factors V2 is the document that represents the collective wisdom of 34 SEO leaders, and it is digg-worthy.

Danny Sullivan takes a look at how difficult it is to control the most important factors ranked by the experts - really useful for introducing SEO to beginners. The factors, ranked by importance and categorized as presented in the document, are the following (a toy scoring sketch follows the lists below):

Keyword Use Factors

  1. Keyword use in title tag
  2. Keyword use in the body text
  3. Relationship of body text content to keywords
  4. Keyword use in H1 tag
  5. Keyword use in domain name
  6. Keyword use in URL
  7. Keyword use in H2, H3, H(x) tags
  8. Keyword use in alt tags and image titles
  9. Keyword use in bold/strong tags
  10. Keyword use in meta description tag
  11. Keyword use in meta keywords tag

Page Attributes

  1. Link popularity within the site’s internal link structure
  2. Quality/Relevance of links to external Sites/Pages
  3. Age of document
  4. Amount of indexable text content
  5. Quality of the document content
  6. Organization/Hierarchy of document flow
  7. Frequency of updates to page
  8. Number of trailing slashes (/) in URL
  9. Accuracy of spelling & grammar
  10. HTML validation of document (to W3C Standards)

Site/Domain Attributes

  1. Global link popularity of Site
  2. Age of Site
  3. Topical relevance of inbound links to Site
  4. Link popularity of Site in topical community
  5. Rate of new inbound links to Site
  6. Relevance of Site’s primary subject matter to query
  7. Historical performance of Site as measured by time spent on page, clickthroughs from SERPs, direct visits, bookmarks, etc.
  8. Manual Authority/Weight given to Site by Google
  9. Top Level Domain extension of Site
  10. Rate of new pages added to Site
  11. Number of queries for Site/Domain over time
  12. Verification of Site with Google Webmaster Central

Inbound Link Attributes

  1. Anchor text of inbound link
  2. Global link popularity of linking Site
  3. Topical relationship of linking Page
  4. Link popularity of Site in topical Community
  5. Topical relationship of linking Site
  6. Age of link
  7. Text surrounding the link
  8. Internal link popularity of linking page within Host Site/Domain
  9. PageRank (as measured by the Google Toolbar) of linking page
  10. Domain Extension of linking Site
  11. Temporal link attributes

Negative Crawling/Ranking Attributes

  1. Server is Often Inaccessible to Bots
  2. Content Very Similar or Duplicate of Existing Content in the Index
  3. External Links to Low Quality/Spam Sites
  4. Participation in Link Schemes or Actively Selling Links
  5. Duplicate Title/Meta Tags on Many Pages
  6. Overuse of Targeted Keywords
  7. Very Slow Server Response Times
  8. Inbound Links from Spam Sites
  9. Low Levels of Visitors to the Site



Monday, March 19, 2007

Twittervision and Talkshoe

This blog has been quiet for some time, mainly because I have been building a new site that will launch soon, and this URL will be used for more general stuff. But today I break the silence to mention two cool services. I had not heard of TalkShoe before, but it came up in my river of tweets today with Leo Laporte’s announcement that this week’s TWiT was going to be recorded live; the experiment, called TWiT Live, had Robert Scoble, Doc Searls, Wil Harris and Alex Lindsay.


The cool thing about TalkShoe is that you can call in and join the conversation. Jason Calacanis - who is moving from blogging to microblogging - called in and shared some of his thoughts on monetizing Twitter, and there were also other interesting discussions proposed by callers; I had not heard Doc since the Gillmor Gang’s last episode and it was really good to listen to him again. The interaction was enhanced by another experiment launched today: Twittervision, a Google Maps mashup using the Twitter API that is going to make it easier to explain the potential value of Twitter.


So today’s TWiT was an interesting experience, with comments popping up in Twittervision as subjects were mentioned in the show. My favorite podcast is the Daily SearchCast, and they have a chat room to receive feedback when recording the show live; since Twitter is the evolution of IRC, I think live shows will need this kind of interaction so we can send our comments from IM. The next step is Twitter meme tracking and personalized views - I am sure someone is working on it right now.



Friday, May 26, 2006

Search is the unifying solution

This is the phrase Eric Schmidt used to explain Google’s portal-like moves when new products to improve web search were announced, and I am beginning to use it to explain my interest in search. I definitely don’t have a decade in the search industry like Danny, and I am not writing an afterword to a book that is the best reference for introducing search to people unfamiliar with the industry, but I have been following search engine technologies for some time; in this post I will share my little journey with web search so far.


I was looking for a topic for my engineering thesis back in 1998, and XML had just become a W3C recommendation, so it caught my attention. While looking for a place to apply it on the web I got frustrated with search engines, so I figured I could do something with XML to help improve the search experience. I thought XML was going to change everything, so I decided to build a search engine that indexed XML documents and had a natural language interface.


The topic was relatively new, and my allies in this journey were books like the XML Black Book, Natural Language Understanding and Managing Gigabytes (to understand document indexing), plus subscriptions to Search Engine Watch and the Sociedad Española para el Procesamiento del Lenguaje Natural to stay informed. After two years of dealing with slowly maturing XML parsers, I had a small set of news pages in Spanish that I manually marked up in NewsML format, and I built a thesaurus in XML format to get the synonyms of query terms.


The idea was simple: the user formulates a question, the question is analyzed and related terms are added to the query, the expanded query is sent to the XML indexing engine, and the results are ranked using term frequency. Every data structure was stored in XML, so performance was slow, but the experiments proved the concept: with XML markup and natural language query analysis, relevant documents that used to be ignored became visible to the user.
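
A minimal sketch of that pipeline, assuming a toy thesaurus and corpus (the data, tokenizer and scoring below are illustrative and are not the original thesis code, which worked over NewsML documents in Spanish):

```python
# Minimal sketch of the thesis pipeline: thesaurus-based query expansion
# followed by term-frequency ranking. All data here is illustrative.
from collections import Counter

# Toy thesaurus; the real one was an XML file of synonyms.
THESAURUS = {
    "car": ["automobile", "vehicle"],
    "buy": ["purchase"],
}

# Toy corpus; the real one was a set of news pages marked up in NewsML.
DOCUMENTS = {
    "doc1": "new vehicle models announced by the automobile industry",
    "doc2": "how to buy a used car without overpaying",
    "doc3": "stock markets close higher on tech earnings",
}

def expand_query(question):
    """Tokenize the question and add related terms from the thesaurus."""
    terms = question.lower().split()
    expanded = set(terms)
    for term in terms:
        expanded.update(THESAURUS.get(term, []))
    return expanded

def rank(question):
    """Score each document by how often the expanded query terms appear in it."""
    terms = expand_query(question)
    scores = {}
    for doc_id, text in DOCUMENTS.items():
        counts = Counter(text.lower().split())
        scores[doc_id] = sum(counts[t] for t in terms)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

if __name__ == "__main__":
    # "car" matches doc2 directly, while doc1 is reached only through the expansion.
    print(rank("buy car"))
```

In the thesis every data structure - documents, thesaurus and index - was stored as XML, which is what made it slow; the logic, though, was essentially this.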


In my thesis I concluded that the search engine of the future would be a meta-search engine with natural language capabilities that queries various vertical/specialized XML indexes and ranks the results according to the question formulated, plus some link analysis using XLink. My assumption was that XML would become the dominant format on the web, which hasn’t happened yet. With feed (RSS/Atom) search engines becoming more popular, meta-search engines like gada.be getting more attention, XLink’s new Candidate Recommendation and a new NewsML 2 Architecture, this scenario can still happen.


With Google delivering good enough results, I was just a search power user until last year. Back at school and looking for a topic for my MSc thesis, I got interested in concept-based search and joined the Personal Digital Library Project at my school. My current thesis is about concept-based ranking using user-contributed tags/labels and attention metadata in personal repositories.


And to stay up to date I listen to the Daily SearchCast and other WebmasterRadio shows, read the excellent coverage of the SES conferences available at the Search Engine Roundtable - read my own coverage of SEW Live Seattle in the next post - and subscribe to various search-related blogs; if you don’t want to wait for Danny’s OPML, this blogroll is a good starting point. Why is search so important now? After all, a lot of the information being generated is searchable.



Monday, May 01, 2006

Online advertising, the monetization strategy

The World Wide Web transformed the media industry; before its arrival, you had to use one of the traditional mass media services to get exposure, but with the Web any individual can get a comparable audience with his or her website. The low barrier to entry led to a proliferation of websites, and in this scenario the number of visits (traffic) became the success metric.

Advertising has been paying the bills for traditional media, and banner advertising did the same for the web before the dot-com bust: it was the tool businesses used to promote and brand their sites, and companies like DoubleClick - through ad serving using the eyeballs model (CPM) - were cited by web entrepreneurs as the way to generate cash flow, much as happens nowadays with Google AdSense.

On the web, advertisers have the metrics to evaluate their spending, and banner ads were not effective at generating traffic, so ad providers tried to force exposure - following the tactics of the traditional advertising model - using pop-ups. The move backfired because the most-used web browsers incorporated pop-up blocking functionality, either in the application or via toolbars, responding to the requests of users who were annoyed by the ads. This example of consumers’ rising power on the web is also impacting traditional media: advertisers feel that TV ads have become less effective, and technologies like DVRs and iTV are beginning to change the way TV advertising works.

What has proven effective on the web is the pay-per-click model introduced by Bill Gross, which is the success factor for the top public company in the computer services industry. Google reported revenues of $2.23 billion generated by online advertising in the first quarter of 2006, with 41% of total revenues coming from the AdSense partner network and 58% generated by users who click ads on Google-owned sites. Yahoo, second in market cap, reported revenues of $1.38 billion generated by its marketing services division in its first-quarter financial results.
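
To see why pay-per-click aligns better with advertisers’ metrics than the eyeballs model, here is a back-of-the-envelope comparison - with invented numbers, purely to illustrate the arithmetic - of the cost per visitor under CPM versus CPC pricing.

```python
# Back-of-the-envelope CPM vs CPC comparison; all figures are illustrative.

def cost_per_visitor_cpm(cpm, clickthrough_rate):
    """CPM: you pay per 1,000 impressions whether or not anyone clicks."""
    impressions_per_click = 1 / clickthrough_rate
    return (cpm / 1000) * impressions_per_click

def cost_per_visitor_cpc(cpc):
    """CPC: you pay only when someone actually clicks through to your site."""
    return cpc

if __name__ == "__main__":
    # Assume a $5 CPM banner with a 0.2% clickthrough rate
    # versus a $0.50 cost-per-click search ad.
    print(round(cost_per_visitor_cpm(cpm=5.0, clickthrough_rate=0.002), 2))  # 2.5 dollars per visitor
    print(cost_per_visitor_cpc(cpc=0.50))                                    # 0.5 dollars per visitor
```

Under CPC the risk of paying for impressions that never turn into visits shifts from the advertiser to the ad network, which helps explain why auction-based paid search became the dominant model.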

The IAB reports that Internet advertising revenues grew 30% in 2005 in the US, with search spending growing 33.5% compared to 2004. With these healthy results from the leaders in paid search advertising, the projected increase in search marketing spending may fall short of reality, but these results still do not show that the optimistic prediction made by Mark Kvamme in a keynote at Ad:Tech San Francisco this week - that Internet advertising will reach $35 billion in 2008 - is going to happen.

There is enough market for more players, so the unusual spending Microsoft projects for 2007 to support its software+services strategy, the upcoming announcements at next week’s MSN Summit related to its adCenter advertising program, and the primetime exposure Ask is getting to gain attention could push online advertising’s share of US advertising, currently at 4.7 percent, even higher.

Web globalization is also validated by the reported figures: Google’s revenues from outside the US represent 42% of its total, and Yahoo’s international revenues are 30%. Given that US Internet users are only 18.3% of more than 1 billion users worldwide, and given the growth of online advertising spending in Western Europe - with a 3.1% share of media ad spending - and other regions, the future of the global online advertising industry looks promising, especially if you consider that local online advertising is only just gaining awareness.

Media economics were based on linear replication, but with current technologies making the duplication of material easy, there is no control over the generation of copies. While other business models mature, the fact that Chris Dobson told Mark Evans at the Sympatico/MSN Digital Ad Summit that online will be an ad world, not a subscription one, is an indication that online advertising is now recognized as the preferred monetization strategy on the Web.


