Entries in 'The mainstream problem' ↓
Posted by Morten Blaabjerg, October 7th, 2008 in Copyfight, Information filtering, The mainstream problem
A few months ago, on July 8th 2008, I shot this video at Odense harbour. I didn’t manage to edit it until the 25th. Then, from one week to the next, my dog died, and I didn’t do any real work on the Kaplak Blog for a whole month. I didn’t feel at all like presenting myself on video on the web as if nothing had happened. I was ripped to pieces.
I am coming back, though, and what I say in this video is of core importance to what we do in Kaplak. It’s what makes sense of what we do, even when the outside world can’t make sense of it and even when we sometimes ourselves lose focus, when we discuss or dive into technicalities of niche products, long tail distribution, web filtering methods, free software, bittorrent seedboxes and twitter tools.
Here’s the full quote :
The question is, if the tools we have right now are sufficient for us to find relevant information, which we need for our lives, for our businesses, for our children’s educations – and everything in our lives. If these tools are sufficient to survive this onslaught of material which is added to the internet every month. There are millions of new websites created every month, and search engines can only show a limited amount of results on a results page. So there’s a lot of things which are lost in the filters we use right now to filter the internet. Luckily, there are a lot of new filters and new tools, which are being developed all over the world. So some of these new tools will help us find the information that we need. But the question is, who is it going to be, and what are those tools going to be like, and who is going to control those tools? Those are the really big questions, as we see it.
What’s at stake, in other words, is how we filter the web and find information. That’s one thing, and we’re working on it – and so are a lot of very talented people, all over the world.
The other thing is who is going to control these architectures of information. This part is a lot more tricky. This is where free software, the copyfight, DRM activism and ‘cloud computing ideology’ come into the picture. This is also why we don’t really like social networks, but love RSS feeds.
To get at the second thing, however, we need to create a sustainable business on the first. But these things are connected, and each day we walk the delicate path between falling into the trap of entrusting our information to proprietary designs, on the one hand – and on the other hand, our vision of a future where each peer in a global peer-to-peer network of every one of us is capable of reaching out to whoever he or she wants to connect to. Where even marginal products can be sold, and unpopular messages get out to the people who want them, without being filtered by the centralized algorithms of corporate monopolies or the crude filters of nasty regimes, or, what is at least as bad, being buried in mountains of spam or mainstream crap.
Tags : free software, rss, video
Posted by Morten Blaabjerg, July 17th, 2008 in Kaplak on the web, The mainstream problem, What is kaplak?
This early sketch illustrates how a product/widget from a niche producer is made visible in a niche context somewhere else on the web :
A web user and niche producer (A) encounters a Kaplak widget on a website he knows and trusts (B). The producer finds Kaplak can be used to distribute a product of his own. He decides to sign up, and subsequently uploads a product and submits basic product information.
The Kaplak interface (C) spits out a widget a.k.a. a “kaplaklink” for the product. The widget is also published to the Kaplak market network, from where it may be fed via RSS or other means, to subscribers within particular channels or categories.
A website-owner (D) runs what we may term a “filtersite” (E). D feeds or filters widgets from the Kaplak network from a range of categories or tags, in order to capitalize on sales, i.e. earn a share of kaplak from each sale made on E. His motive is primarily commercial. Among the widgets filtered is the widget for A’s new product.
In order to avoid what we term the mainstream problem, i.e. that just a handful of “hits” are prominently displayed and amplified, Kaplak depends on filtering sites of all kinds, i.e. index websites which seek to filter Kaplak’s feeds according to particular specialized interests or criteria. The online landscape today has a lot of websites of this kind, many of which are financed by advertising. Kaplak will offer one more type of income for index-type sites, and one which may allow a sharper edge in filtering, because the size of income streams may not always be proportional to the amount of traffic generated by a site. A large site may have greater problems making the “slim end of the long tail” presentable than a smaller, more well-defined niche-friendly site will. Both may be filtering sites, though, basically performing the same task of feeding and filtering.
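The basic operation of such a filtersite can be sketched in a few lines of Python. This is a purely hypothetical illustration – the widget fields and function names are invented for the sketch, not part of any real Kaplak API :

```python
# Hypothetical sketch of a "filtersite": it subscribes to the Kaplak
# network feed and keeps only the widgets whose tags overlap the site's
# own specialized interests. All field names are invented illustrations.

def filter_widgets(feed, site_tags):
    """Keep widgets whose tags overlap the site's niche tags."""
    site_tags = set(site_tags)
    return [w for w in feed if site_tags & set(w["tags"])]

feed = [
    {"title": "Handmade sextant", "tags": {"sailing", "navigation"}},
    {"title": "Pop single", "tags": {"music", "mainstream"}},
    {"title": "Marstal maritime history", "tags": {"sailing", "history"}},
]

print([w["title"] for w in filter_widgets(feed, ["sailing"])])
# → ['Handmade sextant', 'Marstal maritime history']
```

A niche site would subscribe with a narrow tag set like this, while a bulk filtersite would simply use a much broader one.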
The widget from A on D’s site is now discovered by (F), who puts the link into her blog, because she finds the product interesting and relevant to the article she’s about to publish. F’s blog is visited by a much more select crowd than D’s site, which relies mainly on search as a source of traffic. F gains a lot of attention through a social networking site popular within her field of expertise (G). Motives here weigh more heavily towards the professional, contextual, idealist side than the money side. F earns a fair share from her Kaplak widgets though, as her choice of widgets is much more fine-tuned to her readers than the bulk filtering of D, which earns from a few sales of a lot of products (the “pure” Chris Anderson model).
Finally, a friend from G alerts another friend, who happens to own a niche site (H) which deals particularly with A’s subject, and who finds the new product intensely interesting. The regulars of H know the deal and can instantly see the value of A’s product. A’s product finds a potential market here which he otherwise wouldn’t have found.
None of H’s users would have discovered A’s product without Kaplak, even if it was accessible via Google or filesharing networks. First, none of them would know about the product. Had one of them actively searched for it, she would have had to pick very delicate keywords, endure the time-consuming process of browsing search results to page 7 or 8, only to discover a dead link to a torrent which may once have been alive and kicking, but which now has no seeders.
The owner of website H publishes A’s widget from both professional and financial motives. The professional, interested motives weigh in heaviest, but since the site engages A’s target group, the collective sales pay off decently in kaplak, which contributes to financing the site. H’s traffic may be slight – if the group of “regulars” is sufficiently interested and the price right, then H need not care greatly about the amount of traffic.
The producer A expands his market with H’s users and anyone who made a transaction along the widget’s “route”, who wouldn’t otherwise know about the product. The process repeats itself, this time with one of H’s users in the role of producer A, who discovers she may use Kaplak to distribute one of her own products. This process happens across Kaplak’s entire global network, with the intensity dependent on the demand for the products offered by users, and on the ease or difficulty by which a product/widget can gain entry into the niche environments and markets “in the other end”.
The sketch illustrates what Kaplak’s primary product is. As we’re on the web, all sites and actors in the above diagram are accessible to everyone all the time, from anywhere in the world. The problem is knowing that the product exists and, next, finding where it is. Search engines such as Google and others offer one model, filesharing index sites such as The Pirate Bay and others offer another. Both, however, are primarily based on active search for information from the buyer’s end.
Kaplak offers a third model, which brings the product to the target group through the web services and communities the target group uses every day. When Kaplak works, web users will find interesting links/widgets on sites and services they regularly visit and trust, before they even know they want the particular product – and long before anyone would even think of using Google or something else to go look for it. Finally, the Kaplak model can be fully financed by the market which is opened up, rather than relying on upfront payments from our niche producer before he or she knows if there is a market.
Tags : blog, filesharing, Google, long tail, Pirate Bay, search, visibility
Posted by Morten Blaabjerg, June 12th, 2008 in Identify challenges, The mainstream problem
This just in from Chris Anderson :
Bootstrapping the Long Tail in Peer to Peer
Bernardo Huberman and Fang Wu from HP labs have just released a paper describing a way to help P2P networks deal well with niche content. “It is difficult to satisfy the diversity of demand without having to resort to client server architectures and specialized network protocols… We solve this by creating an incentive mechanism that ensures the existence of a diverse set of offerings regardless of content and size. While the system delivers favorite mainstream content, it can also provide files that constitute small niche markets which only in the aggregate can generate large revenues.”
Going to dive into the research of Huberman and Wu during the following days, as their work seems to complement the thinking about p2p incentives we’re doing in Kaplak. This is what I call important stuff.
Tags : Bernardo Huberman, bittorrent, Chris Anderson, Fang Wu, long tail, p2p, research
Posted by Morten Blaabjerg, June 9th, 2008 in Copyfight, Kaplak on the web, The mainstream problem, What is kaplak?
If you’re reading this, you belong to a select group of people who have managed to find their way along intricate paths into this new home for the Kaplak Blog. Kaplak’s first site was, from its inception last summer, a temporary website. Its primary purpose was to host the blog and the mailing list until we had developed our first online strategy. Now, we’re in the process of implementing this strategy for our online presence. This mindmap roughly illustrates what this entails :
Kaplak is not just one website – we’re building a presence on a number of different platforms, from Twitter and del.icio.us to YouTube and Facebook, and on countless others. Many of these platforms are tied together by RSS, which makes it comparatively easy and convenient (which is the goal) to travel – i.e. follow links – between these different platforms and communities.
One important step in the process has been to move the blog to its own domain, with new powerful software (WordPress) and plugins, so that we could ‘free up’ the main domain for a complete revamp. The purpose of Kaplak.com changes to become a key entry point on the web for the “signup and upload” process for new customers. This will be closely connected to the Kaplak Marketplace, which will be Kaplak’s main original contribution to the web. We have some clever ideas in Kaplak about how to avoid what we have termed the mainstream problem, and very much look forward to showing this part of our activities off to the world.
The next step in the implementation of our strategy will be setting up a decent skin for and opening up our public Kaplak Wiki.
Tags : blog, delicious, Facebook, kaplak.com, rss, strategy, Twitter, wiki, wordpress, YouTube
Posted by Morten Blaabjerg, March 10th, 2008 in Information filtering, The mainstream problem
Or does algorithmic search really scale better, work faster and ensure better quality than ‘socially produced’ services? A few days ago, I had an interesting exchange of POVs with Danish SEO Mikkel deMib Svendsen, known among other things from the SEO radio show Strikepoint.
I replied to Mikkel’s blog post on ‘Search – before, now and in the future’, where I tried to make the point that search as a communications solution rests on key preconditions which are far from optimal. Among these is the fact that in order to search for something, you need to know what to look for before you search, and you need to deliberately and consciously use a search engine to look for it.
In other words, search is a deliberate and conscious affair. This makes it difficult, for instance, to use search to market products which are not well known, such as niche products, or to address problems or needs which are not yet consciously thought of or expressed.
Add to this the present growth rates of information on the web. With each new website added, the risk increases that your information will never reach the queries seeking it. We’re talking exponential growth, with several million new websites added to the web each month. As search faces these ever greater amounts of information, this problem, which we’ve so far dubbed the mainstream problem, will only become more apparent.
Mikkel, however, firmly believes in the future of algorithmic search, so these claims didn’t go without comment. First, he argues that machines will always work faster and scale better than social services, which have great filtering and quality challenges :
I am completely in line with Louis Monier [founder of Altavista], and am 100% certain that algorithmic search will remain dominant. Manual data processing, like in the [online] social services, simply suffers from too many scaling and quality assessment-issues to compete in the long run. Only machines are scalable on the necessary scale and with a continued central quality assessment. … [my own translation, MB]
I have a few problems with these arguments for the quality of search as a communications method. I wanted to analyze them a bit here, in order to make them part of our process of finding out more about the effectiveness of online communication and its niche/long tail effects. Over the past months, I’ve come to question the widespread naturalization of search as ‘the best’ and ‘natural’ method of making information available and visible online. True, right now search is the dominant method for obtaining information online. It is also a billion dollar business, although this is mostly due to the success of Google AdWords/AdSense, which is quite a different product. However, this may not be so in the future.
The part of Mikkel’s argument which makes a distinction between social services on one hand, which are manually created and processed (by humans), and search on the other, which uses machines and algorithms and therefore scales better, etc., is fundamentally flawed.
First, search results are ‘peer production’ almost as much as any online social bookmarking service is, i.e. they are socially produced. Peer production is a term coined by Harvard professor Yochai Benkler (in The Wealth of Networks (2006) – which can be downloaded freely here). That search results are peer production means that they create value by putting together websites from different peers (i.e. companies, organizations or individuals) in order to respond to a search query. Google does this without even asking the peers first (it’s an opt-out, not an opt-in system), and so the peers used to create value may not even know that they contribute value to this system. However, this doesn’t detract from the fact that it is the sourcing and pooling together of the work of different peers, as a response to a human search query, which creates value in a search result. A search result is socially produced, even though the work of filtering and presenting it in a few seconds is done by advanced programming, software, hardware – and cables.
These advanced architectures, however, are also created by humans, which means there is someone using their human capabilities to decide which categories and which variables should factor in, and with how much weight, in the algorithms which control the process of finding and delivering information when somebody searches for something particular.
The problem with this is not that the process is just as human-influenced as any online social bookmarking service for instance, but that the someone deciding what variables should factor in, is (most likely) not an expert on what the someone at the other end typing in a search query is looking for. The other someone is. It’s one size fits all. One architecture (in principle) for all queries in the world.
I tried asking Mikkel how he could be sure that a query actually met with a usable result. Even if a query is answered by a number of search results, this doesn’t mean that the search results are actually usable and deliver the answer to the query. If this user experience is bad, search fails to deliver an answer, even if there are a million hits on the query.
Let’s take a look at something completely different, i.e. this page at Wikipedia. Notice the edits happening to the article “Tsunami” in December 2004? A page which before December 2004 had minimal contributions and edits made to it literally exploded with new information when a tsunami that month devastated the coasts of Sri Lanka and Thailand. Everything was frequently updated as events rolled along and people in different parts of the world found out new things about what had happened, complete with a small animation to go along with it.
Wikipedia aims to make knowledge freely accessible to anyone on the planet. Like providers of algorithmic search, Wikipedia uses lots of machinery to deliver its information, as well as an advanced complex of software architectures. Wikipedia’s articles are peer produced, but much more directly and consciously so than the algorithmically created search result we saw earlier. Even the software is peer produced. MediaWiki is free software, which can be copied and worked upon by anyone who wishes to do so, and any changes may be adopted by the main package.
A second point is the difference in value created. With an example from his own work, Mikkel illustrates how Search Engine Optimization (SEO) done right directly creates great surplus of value for the companies he and other SEO’s work for. Regular SEO maneuvres help direct lots of relevant traffic to the corporate websites.
That SEO helps create value by more directly targeting traffic at corporate websites, however, can’t be said to be an argument for the quality of search as a communications solution in and of itself, but rather for the quality of Mikkel’s and his colleagues’ work. There’s a lot of money in SEO, and that’s not because search is a brilliant solution to a communications problem. It is rather because search is inherently insufficient as a solution to the problem of connecting a query/demand with an answer/product, especially for a company which wants to stay alive and gain a competitive edge. And this problem will grow a lot bigger. I predict that Mikkel and his SEO colleagues will be paid even better in years to come.
It is first and foremost a problem of visibility, not particularly of search. We need to create better ways to make information accessible to the people who need it, without swamping those who don’t. Second, it is a problem of speed, because we need information fast, to better meet the challenges we face, as individuals, organizations and societies.
As a non-profit, Wikipedia doesn’t make any money on the processes involved in creating and building a quality article, but the value that an improved Wikipedia article (such as the tsunami article) provides for millions of journalists, for instance, and the newspapers and media companies which employ them, is indispensable. I know for a fact that reporters use Wikipedia a lot, and with good reason. It is the fastest and most scalable source of information online, beyond any doubt. And when as many contributors as in the tsunami article come together, it also proves a highly reliable and credible source. It beats the crap out of trawling search results pages without finding what you’re looking for. But it is only a small example of what peer production is capable of, given the right architectures and tools.
Tags : peer production, search, seo, speed, time, tsunami, video, visibility, wikipedia
Posted by Morten Blaabjerg, February 25th, 2008 in Identify challenges, The mainstream problem
I’ve had a few days these past weeks where I’ve been knocked out by a fever and a sore throat. When you’re sick you’re not up to much. And when your 8-month-old daughter is sick too, as if that wasn’t enough, it’s really no fun at all.
On the bright side, this gave me some deserved time to finally get into Carsten Jensen‘s epic Vi, de druknede (in English, We, the Drowned, appearing later this year). I’ve been looking forward to reading this novel ever since it was first published in 2006, and I am thoroughly enjoying it.
It is an epic covering a hundred years of history, from 1848 to 1945, not through the eyes of kings or generals, but from the perspective of the sailor, the adventurer, the flogged, the fugitive, the runaway, the outcast, the drowned (in all meanings of that word), and the wives and children who were left behind without being asked – all native to Marstal, a Danish port town on the island of Ærø, with the obvious exception of the shrunken head of James Cook, which figures prominently in the book. The novel leaves one with a few interesting perspectives on things global and local, which is inspiring, not least in the context of the global internet, and in the context of Kaplak.
In a Danish context the novel is not exactly marginal. It received rave reviews, has been extensively marketed by its publisher and by booksellers, and has sold well. In this sense, it is an industrial product, mass produced and sold via traditional bookselling channels. The book’s IP (i.e. translation and distribution rights etc.) has been sold to more than a handful of other countries.
On the global web, however, the novel is a marginal niche product. It exists at the mercy of search and an exponential growth of information on the web. In this sense it faces precisely the same challenges as a completely unknown novel by a completely unknown author, if it wants to move beyond the local context and use the internet as a marketing or distribution channel. Like many similar products, Vi, de druknede has its own website, but to find it one almost has to search for the book’s exact title. At least, one doesn’t find it by searching for the author’s name, which was my first choice, because apparently Carsten Jensen doesn’t have his own website! The first hit is an architect of the same name, and another a LinkedIn profile for a CEO with the same name. And Carsten Jensen is supposedly even the most prominent “Carsten Jensen” in a Danish context, which would lead one to think that he had a greater number of links pointing to information about him, and thus a higher Google PageRank. The most authoritative (international) source on Carsten Jensen remains a stub on Wikipedia.
Even if one does manage to find the book’s website, one will find that it is only available in Danish. Apparently the publisher has thought only about using the global internet to target the book at a Danish audience, even though the book rights have long since been sold to a number of other countries. This of course just underscores the status of the novel as an industrial product, which seeks to appeal to a national, mainstream audience. An English reader will learn more from this article, which appears as the second hit when one searches for keywords such as marstal + sailors.
As a niche product, Carsten Jensen’s novel doesn’t fare much better on the web than most niche products, despite local rave reviews and traditional marketing campaigns via conventional channels. It is seen only as much as it has customers who search for it. Making this easier for potential readers has apparently been of very little concern to the publisher, if one takes this superficial analysis at face value.
We, the Drowned is an obvious metaphor for all the unanswered queries of the web. When writing this article I had to find out what a “shrunken head” was in English. It is easy when you know it, but how do you show a search engine what you mean? I knew the Danish word, “skrumpehoved”, but finding the English term was pretty tricky. Kaplak doesn’t have any ambitions of creating new or more intelligent ways to search, but we do think the activity of our network will help generate more relevant and context-rich web results, which will likely cover a much longer tail of niche interests and pursuits than is the case today.
Tags : books, Carsten Jensen, Google, industrial schisma, search, Vi de druknede
Posted by Morten Blaabjerg, January 14th, 2008 in The mainstream problem
I’ve previously referred to a phenomenon which I’ve chosen to term the mainstream problem. The mainstream problem describes the effect that distribution of information and cultural expressions acquires “hitlist” characteristics when subjected to limited space, time or attention.
Chris Anderson, spokesperson for the advantages of the online niche economy in his book The Long Tail, describes ‘mainstream’ as that which many people are moderately interested in, while ‘niche’ describes that which passionately interests few people.
In industrial mass media such as the publishing, newspaper or television industries, the scarcity of resources means that one produces the product which sells well enough to finance its production. Since most people collectively demand the mainstream product, this product sells best and is therefore the one produced. This does not imply, however, that the mainstream product is the best. But it is the best possible product given a specific set of economic conditions, borne by specific means of production, which are too expensive to fulfill the needs of the niches.
A limited space (such as a webpage, the front page of a newspaper, television air time or the size of a screen) leaves room for some information in place of other information. Given the economic constraints discussed above, this space will be distributed according to ‘most popular’ hitlist criteria, meaning that the mainstream information, i.e. the information which hits the most people moderately, but none passionately, takes up the space.
The effect of displaying information this way is often amplified, since more people will take a closer look at the contents of the front page and further strengthen the visibility of the mainstream information. On the web, social recommendations strengthen this hit economy, in what has been termed the Justin Timberlake effect. On websites such as YouTube this has the effect that a few videos have millions of views, while millions of videos count fewer than one hundred views.
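This amplification can be illustrated with a toy “rich-get-richer” simulation – a sketch of the dynamic only, not a model of YouTube’s actual systems. Each new view goes to a video with probability proportional to the views it already has :

```python
import random

# Toy rich-get-richer simulation: every new view picks a video with
# probability proportional to its current view count (plus one, so
# unseen videos keep a chance). Attention concentrates on a few "hits".

def simulate(n_videos=500, n_views=10_000, seed=42):
    random.seed(seed)
    views = [0] * n_videos
    for _ in range(n_views):
        weights = [v + 1 for v in views]
        winner = random.choices(range(n_videos), weights=weights)[0]
        views[winner] += 1
    return sorted(views, reverse=True)

views = simulate()
average = sum(views) / len(views)
print(f"top video: {views[0]} views, average video: {average:.0f} views")
```

Run it and the top handful of videos ends up with many times the average view count, while most videos stay near the bottom – a miniature version of the hitlist effect described above.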
As the amount of information on the internet grows (millions of new websites are created every month globally), the mainstream problem becomes a greater and greater problem for our access to relevant information on the web. The information may well be accessible somewhere on the net, but it is no good if no one sees it or is capable of finding it – or rather, if the people who want it don’t see it or aren’t capable of finding it.
Even Google will have a problem showing search results which are more than just moderately interesting for the websurfer, unless he or she has the patience to trawl the search results for the results which are passionately interesting. A main component of Google’s PageRank algorithm is how many incoming links a given website has. This makes Google vulnerable to the same problem. The more who link to a website, the more visible the site will be on Google, all other things being equal. The more visible it becomes, the more people will likely link to the site.
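The feedback loop can be seen in a toy power-iteration version of a PageRank-style score. This is a simplified illustration of the idea, not Google’s actual algorithm; the four-page “web” below is invented for the sketch :

```python
# Toy PageRank-style power iteration: a page's score grows with the
# scores of the pages linking to it. A heavily-linked "hub" outranks
# a page with a single incoming link, which outranks unlinked pages.

def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1 - damping) / n for p in pages}
        for page, outgoing in links.items():
            share = damping * rank[page] / len(outgoing)
            for target in outgoing:
                new[target] += share
        rank = new
    return rank

# "hub" is linked to by everyone; "niche" only by the hub; a and b by no one.
web = {
    "hub":   ["niche"],
    "a":     ["hub"],
    "b":     ["hub"],
    "niche": ["hub"],
}
ranks = pagerank(web)
print(sorted(ranks, key=ranks.get, reverse=True))  # → ['hub', 'niche', 'a', 'b']
```

The more incoming links, the higher the score – and since a higher score means more visibility and thus more future links, the loop feeds itself.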
What is interesting to us is what happens when the economics change. Because they have already changed, and they are changing fast. There are no longer expensive means of production to justify the limitations imposed on cultural production. The means of cultural production today equal the costs of a computer and an internet connection. But it is only slowly dawning on us. We have become so accustomed to the economics of limitations, that it is difficult adjusting to the economics of abundance.
Tags : attention, Chris Anderson, Google, limited shelf space, niche ecology, search
Posted by Morten Blaabjerg, January 5th, 2008 in The mainstream problem
“By the end of 1992 there were only 50 web-sites in the World and a year later the number was still no more than 150. … In 1994 there were 3.2 mln hosts and 3,000 web-sites. Twelve months later the number of hosts had doubled and the number of web-sites had climbed to 25,000.” (Griffiths, 2002)
In this way, internet historian Richard Griffiths accounts for the explosive growth of the web from 1994 onward, following the development of the first popular graphical browser, Mosaic. Mosaic was created by Netscape founder Marc Andreessen, who went on to other projects after the browser wars. In recent times he’s begun writing a terrific blog on entrepreneurship, and he has co-founded and funded the social network service provider NING, a project and company we’ll keep a close eye on and get back to later, as NING opens up vast new opportunities for niche communities.
I recently wrote about attributing value to the context of finding information, rather than to any particular piece of information (which is what copyright is based on). One type of service which has so far been very good at attributing value to the context of finding information is the search engine. Search engines have so far provided great ease and comfort in obtaining information online, which has given them a prominent position on the web.
What search engines have so far been able to deliver, however, they’ll find increasingly difficult, as the amount of accessible information increases. Here’s what search engines have to deal with :
Total Sites Across All Domains August 1995 – December 2007.
Netcraft is a well-respected British internet company which among other things performs regular web server surveys. One of the nice side results of this work is a pretty decent idea of how many websites there really are in the world.
‘In the December 2007 survey’, Netcraft reports, ‘we received responses from 155,230,051 sites. This is an increase of 5.4 million sites since last month, continuing the very strong growth seen during this year; the web has grown by nearly 50 million sites since December 2006.’ The curve of ‘active sites’, excluding Blogger sites and MySpace accounts, shows an even more solid exponential tendency. This kind of growth in accessible information on the internet spells huge challenges for search engines, which already show the strain, especially if you do ‘weak’ searches on little known subjects.
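A quick back-of-the-envelope calculation on those Netcraft figures shows what kind of growth rate this implies :

```python
# Back-of-the-envelope check on the Netcraft numbers quoted above:
# 155,230,051 sites in December 2007, up 5.4 million in one month,
# and up nearly 50 million since December 2006.

dec_2007 = 155_230_051
monthly_gain = 5_400_000
yearly_gain = 50_000_000

monthly_rate = monthly_gain / (dec_2007 - monthly_gain)  # growth over last month's base
yearly_rate = yearly_gain / (dec_2007 - yearly_gain)     # growth over last year's base

print(f"{monthly_rate:.1%} per month, {yearly_rate:.1%} per year")
# → 3.6% per month, 47.5% per year
```

Roughly 3.6% compounded monthly is consistent with the near-50% yearly growth – which is exactly the exponential tendency the survey curve shows.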
If, for example, you do a search for the girl’s name ‘Britney’ on Google, 9 out of 10 results relate to the pop singer Britney Spears. This relates to what we may term the mainstream problem, which is basically the problem of a hit-driven industrial economy : limited shelf space. Google can display only a limited number of results on the first page (10 results as standard). Further down the search results, more interesting Britneys may hide the one page you’re looking for, but you won’t find it with any ease and comfort. But let’s say, which is not unreasonable, that this girl’s name is the one piece of information you’re in possession of before you search for something a Britney did, or for someone named Britney (but not Spears). If one has to browse 17 pages of search results before finding one’s particular piece of information (as is often the case if you look for obscure information online), one quickly loses one’s patience with “search” as an effective means of finding information online.
Thus, you have a great market for social bookmarking services such as del.icio.us, Digg, Reddit, and to some extent, Wikipedia and other great collaborative databases of ‘further information’. Kaplak will add to this palette of services by attributing value to the context of finding information. This happens when a producer consciously designates a percentage of his price in kaplak to pay for the context in which his product is sold.
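That last step can be sketched as a simple settlement rule. The 30% figure, the function name and the field names are hypothetical illustrations for the sketch, not Kaplak’s actual terms :

```python
# Minimal sketch of the settlement idea: the producer designates a
# percentage of the price as "kaplak", which pays for the context in
# which the product is sold. Numbers and names are hypothetical.

def settle_sale(price, kaplak_pct):
    """Split one sale between the producer and the selling context."""
    kaplak = round(price * kaplak_pct, 2)
    return {"producer": round(price - kaplak, 2), "context": kaplak}

print(settle_sale(100.0, 0.30))  # → {'producer': 70.0, 'context': 30.0}
```

The point of the design is that the context site only earns when a sale actually happens, so its incentive is to filter well for its own niche, rather than to maximize raw traffic.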
Tags : browser, Google, Marc Andreessen, mosaic, netcraft, search, understand premises