Kaplak is changing its course again. Since the inception of the first Kaplak idea, we’ve come a long, humbling way, only to realize over and over again how much we still have to learn. But slowly, we are also realizing what kind of know-how we have and are building, and how Kaplak can help crack the problems and meet the challenges we originally set out to tackle. Hence we are also beginning to understand what kind of value we add – and, just as importantly, what we don’t add. Among many other things, this is key to learning what kind of business model we want to build – and, just as importantly, what kind of business we don’t want.
Let’s take a look at what has happened to our traffic since the somewhat bumpy rollout of Kaplak Stream in 2008, from November 1st last year to February 1st this year:
The above is a screenshot from the Google Analytics Dashboard for Kaplak.com including subdomains. Following the launch of Kaplak Stream, our traffic started to take off sometime in November. Kaplak Stream basically consists of the present WordPress MU installation, of which the Kaplak Blog is also part, along with a handful of customized plugins, the most important of which is FeedWordPress. The idea (as sketched out in this previous blog post) is that items in the stream can be “fed out” from the stream again, revealing new contexts that didn’t exist before. When two separate items which are both tagged “Barack Obama” are fed from the stream, they create a new “Barack Obama” context, even though the original items may have been produced and published in wildly different contexts.
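To make the “fed out” idea a little more concrete, here is a minimal Python sketch of how items from unrelated feeds could end up in one shared context simply by carrying the same tag. This is not the actual FeedWordPress or WordPress MU code. The `build_contexts` function, the item fields and the sample items are all made up for illustration.

```python
from collections import defaultdict


def build_contexts(items):
    """Group item titles by normalized tag; each tag becomes one 'context'.

    Hypothetical helper for illustration only, not part of any Kaplak plugin.
    """
    contexts = defaultdict(list)
    for item in items:
        for tag in item["tags"]:
            # Normalize so "Barack Obama" and "barack obama" share a context.
            contexts[tag.strip().lower()].append(item["title"])
    return dict(contexts)


# Invented sample items, standing in for entries from different feeds.
items = [
    {"title": "Campaign video", "source": "youtube", "tags": ["Barack Obama", "Video"]},
    {"title": "Election book review", "source": "niche-blog", "tags": ["Barack Obama", "Books"]},
    {"title": "Gadget roundup", "source": "boingboing", "tags": ["Gadgets"]},
]

contexts = build_contexts(items)
print(contexts["barack obama"])
```

The two “Barack Obama” items come from different sources, yet they land in the same list, which is the new context the stream exposes as a feed.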
The first installment of Kaplak Stream came with just about fifteen feeds, of which a handful were submitted by owners of niche websites. Others were feeds from sites such as YouTube, Amazon.com, Twitter (tracking particular subjects or keywords) and Boing Boing. Enough to provide the stream with some variety and “head” which would also test the autotagging performed by Open Calais via a modified version of Dan Grossman’s WordPress plugin.
Kaplak Stream managed to aggregate about 15,000 items, i.e. about 1,000 items from each feed on average. Far more tweets than regular blog posts were aggregated, but posts attracted the greater share of traffic, given that they worked much better with the autotagging functionality in place. Since they had more text, the tagging tended to be more precise, although sometimes tags were wildly misleading and out of place. Room for lots of improvement. Most of our traffic, about 90-95%, came from search, notably Google. Visitors tended not to stay long, but were quickly on their way again. This could suggest that only a few found what they were looking for. However, reports also came in from feed owners that our traffic managed to produce a meaningful sample of visits to the actual sites aggregated. This was really good news, as it suggests that a sample of our visitors actually found what they were looking for, or were curious enough to click through.
So what drove the rise in traffic? No subject in particular, but the variety of subjects covered. What attracted users were, more often than not, pretty obscure pages and topics. For example, the top result was the “tag page” for the tag “university-of-illinois-arctic-climate-research-center” with 641 views, and there was absolutely no recognizable pattern in the rest of the more popular pages reached by visitors. I have not given our sample here substantial analysis, but my guess would be that there would be a neat power law graph, if one plotted the number of visits to each page in Kaplak Stream and ranked them beside each other. But there is no discernible pattern as to what determined which aggregated items were more popular than others.
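The power law guess could be checked roughly along these lines. The Python sketch below ranks pages by views and fits a straight line to the log-log rank/views plot. Apart from the 641 views for the top tag page, the page names and numbers are invented for illustration, and a roughly straight line on a log-log plot is only a hint of a power law, not proof of one.

```python
import math


def rank_views(page_views):
    """Return (rank, views) pairs, most-viewed page first."""
    ranked = sorted(page_views.values(), reverse=True)
    return list(enumerate(ranked, start=1))


def loglog_slope(pairs):
    """Least-squares slope of log(views) against log(rank)."""
    xs = [math.log(rank) for rank, _ in pairs]
    ys = [math.log(views) for _, views in pairs]
    n = len(pairs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Only the first figure comes from our stats; the rest are made-up examples.
page_views = {
    "tag/university-of-illinois-arctic-climate-research-center": 641,
    "hypothetical-page-b": 320,
    "hypothetical-page-c": 214,
    "hypothetical-page-d": 160,
    "hypothetical-page-e": 128,
}

pairs = rank_views(page_views)
slope = loglog_slope(pairs)
print(pairs)
print(round(slope, 2))
```

A slope that stays close to a constant negative value across the whole ranking would be consistent with the long tail I suspect is there. With real data one would export the full page list from the analytics tool rather than type in five numbers.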
While some things seem to work, albeit still just barely, there are also problems. One of these is that apparently something happened on January 26th which made our traffic drop drastically to pre-Kaplak Stream levels. Presumably this drop was caused by a Google penalty for duplicate content, which Google has been known to give websites that carry identical content across different domains. While Kaplak’s goals are somewhat aligned with Google’s, although not completely, I’m not sure the penalty (if there was one) wasn’t “right”, in the sense that there were clearly limits to how informative and appropriate the search results which led visitors to our site were. They were hardly good enough to justify the dramatically beneficial position we gained by aggregating just 15 feeds.
Another problem is the level of “noise” in our tagging, and in the combinations of feed items tagged with similar tags. Tags can be, and mostly are, very local. A post only remotely connected with a person and a piece which is solely about that person are usually tagged identically. My instinct tells me we need to use automated tools for what they are good at, and leave filtering more in the hands of expert users, in the contexts where it matters.
Clearly, more experiments are needed, and we need much more sustained analysis and better methods to analyze our data. All this takes time and costs money. Right now Kaplak has no business model except what we can put into it out of our own pockets (meaning mine) – and these are rapidly being emptied. This means that for the time being, i.e. for several months now and for several months (perhaps even years) ahead, I will not be able to work on and develop Kaplak full time. Thanks to the benevolence of our host, we can keep and continue to work on all Kaplak’s sites and projects, but we’ll make some changes which best prepare us to run Kaplak as a part-time operation.
We’ll convert the Kaplak setup to one more similar to that of the UMW Edublogs set up by Jim Groom at the University of Mary Washington. Among other things, this means we’ll focus more on building each smaller site in the network, and keep each site focused on its subject or theme. We’ll focus more on aggregating what happens within the Kaplak network of sites than on what is going on outside the Kaplak WPMU install. We’ll still use aggregation tools to track very particular subjects, keywords and tags, but each subject will be treated in a site of its own, to make things more manageable (it’s a mess cleaning up a large site based on aggregated items). In other words, we’ll run a network of small, very low-maintenance sites, and delay bigger experiments and improvements for a while. Meanwhile, Kaplak Stream will still be able to track tags across all sites and offer feeds from particular tags used in the network.
Reducing the amount of my time which goes into actual development of Kaplak also means I can focus better on building a new constellation of resourceful people and (real) investors, which we will need in order to come back stronger with a revived Kaplak at a later time. This is what I hope to achieve, while I work simultaneously on other things, making a living.
However, there is also a risk that we don’t. That our ways may go in other directions. This is not necessarily all bad. See this video with Tim O’Reilly in a previous post to see why. I will try very hard to keep an open mind and attitude and not get stuck in ideas I would do better to leave behind. That said, I can’t see any companies or services which presently really crack the problems we set out to solve – and this means we still need to fill that space, one way or the other. And more than anything, I can’t stay away.

