The Scary Part of Risking Yourself on the Web

Satheesh Kumar, developer of the Yet Another Autoblogger plugin, recently wrote this post on the difficulty of conveying your enthusiasm for blogging to the people around you. I can relate a lot to Satheesh’s experience, as he describes it here:

I have made a lot of fruitless attempts to bring them to the world of blogging. I have offered them free blog resources, free themes, add-ons etc. But no one was interested.

I found that my school has a high page rank .gov.in website kept useless with only a few HTML files. I have asked the principal to set up blog hosting and offer free blogs to the students. It will not only develop their communication skills but inculcate a new culture in them. I have offered all helps. But none was interested ( both students, teachers, and the administration )

I have asked a lot of senior doctors with good practice and knowledge to start blogs in their favourite topics. Most of them said some unclear reasons for not blogging. One of these senior guys ( he was my teacher too ) said ” I knows how to send emails and to use orkut, but I haven’t entered complex things like blogging.” !!

I tried a lot to confess him that its simple like email and Orkut. I clarified that he can publish a blog by just sending an email to a secret email id. But no one was interested !!!

Resistance to new technology, new services and new ways of thinking is natural. We are all creatures of habit, who hate unnecessary disturbances and like the rhythms and customs we have become accustomed to. It’s easy to perceive the internet, or particular phenomena related to the internet, as threats best avoided.

On a personal level, one reason blogging is scary is that you put yourself on the line. If you write something and put it out for public consumption, you risk looking stupid, ignorant or otherwise exposed. Most people don’t like to be exposed. They like to hide. They like to let others go first, so that they can watch from a distance and enter the new domain once it’s been defined and secured by others.

But does this work for the internet? I doubt there is or will be such a thing as a defined and secure internet. You have to risk it. You have to expose yourself. There’s no getting around it, no hiding behind others. Because the internet is about meeting other people. Some of these you already know, others you enjoy more distant relations with, and yet others you have yet to meet. You can’t hide if you want to connect with someone. It is the real you that you want to show, if you want to be taken seriously. And it is the real you that others want to connect with.

At least if you want to wield the power of this new space and learn to embrace new ways of thinking, working and communicating, you have to risk yourself, like Satheesh, myself and millions of other bloggers, twitterers, wiki editors, and other participants of the digitally networked information economy.

There’s a slight danger that the prejudices and fears about online activities such as blogs, twittering or wikis will widen the gulf between people who resist new technology and those of us who are rapidly getting sucked in and fast learning new ways.

On the other hand, I’m hoping we can do a lot to attract others to “jump in”, even though it’s uphill a long way. I find Facebook is a good place to start, so I use every opportunity to post links there for my blogposts, and to crosspost tweets to Facebook as well, in order to make people in my network curious about what’s going on in other places. Curiosity is king, I hope. But ultimately, I want people I know to leave the confines and false safety of Facebook and enjoy the full range of opportunities available to them, once they learn to embrace them. Because this, I feel, will empower them. They can be the ones who define who they are in this space, and what they’ll use this new space for.

Ultimately, resistance is futile. However, there’s nothing to be scared of. How could there be?

We’re not going to be senseless web junkies. On the contrary, what is happening is an awakening, an image often invoked by Lawrence Lessig, like in this great, thoughtful article on Lessig’s talk in Doha, Qatar, in 2007. We’re in the process of extending our methods and communication on a truly global scene and an unprecedented scale. There are grand shifts in power taking place right now, from those who rely on the tried and tested methods and institutions of yesterday to those who embrace and develop new methods and institutions, rooted in new technology and in the new social opportunities which arise from its clever use. The order of the political landscape is changing. And it is changed by you and me.

Then again, this is really scary to a lot of people, especially if you insist on your old ways in spite of what’s going on. This is scary, if you do not feel anything in your heart. If you have become so accustomed to living by another man’s rules and definitions of the world. If you are not curious to learn about the world. If you’ve got enough in yourself and do not want to embrace other people. But I can’t believe that is really the case.

Contextualized Search

I’ve previously written about the merits of attributing value to the context of finding information, rather than to any particular piece of information. This makes sense in an environment which literally explodes with new information, and shows no sign of stopping in the foreseeable future.

Google seems to think so too. After all, this is what Google does, and does really well. But it’s true no less of a somewhat overlooked product of Google’s. I’m talking about Google’s Custom Search. This service allows anyone to compose their own search engine and place it on their own website. More accurately, your custom search engine filters Google’s index of webpages. Say you want the search engine on your niche site to return only results which relate to your site. It’s simple: type in your site name, and allow Google to show results from your site as well as all the sites your site links to. Or you can be even more specific, and list a range of sites you want results to be taken from. Or perhaps you’d like Google to still show results from the whole web, but emphasize results from your own site. This is also easily doable.
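To make the idea of a search engine that "filters an index" a bit more concrete, here is a minimal sketch in Python. It is not Google's API or its actual method: the tiny `INDEX`, the `.test` site names, the scores and the `custom_search` function are all hypothetical, made up only to illustrate the two options described above (restricting results to a list of sites, or keeping the whole web but boosting one site).

```python
from urllib.parse import urlparse

# Hypothetical stand-in for a general web index: (url, base_score) pairs.
# None of these sites or scores come from a real search engine.
INDEX = [
    ("https://example-blog.test/niche-post", 0.40),
    ("https://partner.test/related", 0.55),
    ("https://bigsite.test/popular", 0.90),
]

def custom_search(index, allowed_sites=None, boost_site=None, boost=2.0):
    """Filter an index to a set of sites and optionally boost one site."""
    results = []
    for url, score in index:
        host = urlparse(url).hostname
        # Option 1: drop everything outside the hand-picked list of sites.
        if allowed_sites is not None and host not in allowed_sites:
            continue
        # Option 2: keep the result, but emphasize our own site.
        if host == boost_site:
            score *= boost
        results.append((url, score))
    # Highest score first.
    return sorted(results, key=lambda r: r[1], reverse=True)

# Only results from my own site and a site I link to.
restricted = custom_search(INDEX, allowed_sites={"example-blog.test", "partner.test"})

# Results from the whole index, with my own site emphasized.
emphasized = custom_search(INDEX, boost_site="example-blog.test", boost=3.0)
```

In the restricted case the big, popular site simply disappears from the results; in the emphasized case it stays, but the niche post is lifted above it.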

The only problem so far with Google’s Custom Search has been, on the one hand, that Google’s crawlers don’t seem to index every website very thoroughly or very frequently, and on the other, that results are still based on PageRank. Say you want your users to find a great piece on your blog about a particular subject when they search for that subject, but that piece isn’t widely linked to by other sites or articles. Chances are that Custom Search will show a largely irrelevant but heavily linked article from another site, or simply not show that post at all, if it hasn’t been properly indexed. Your built-in blog search, such as WordPress’ search, will find that article very fast, because it searches your database directly. For smaller sites, local search as we know it is still much more effective.
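The difference between the two approaches can be sketched in a few lines of Python. This is a toy under loud assumptions: the two posts are invented, a raw count of inbound links stands in very crudely for PageRank (which is a far more elaborate calculation), and `local_search` is only a rough analogue of what a blog's built-in database search does.

```python
# Hypothetical posts: title, body text and number of inbound links.
POSTS = [
    {"title": "Niche guide", "text": "a detailed guide to contextual search", "inlinks": 1},
    {"title": "Popular news", "text": "general news about search engines", "inlinks": 250},
]

def link_ranked(posts, query):
    """Match loosely on any query word, then rank by inbound links only."""
    words = query.lower().split()
    hits = [p for p in posts if any(w in p["text"].lower() for w in words)]
    return sorted(hits, key=lambda p: p["inlinks"], reverse=True)

def local_search(posts, query):
    """Return only the posts whose text contains the whole query phrase."""
    return [p for p in posts if query.lower() in p["text"].lower()]

by_links = link_ranked(POSTS, "contextual search")
by_text = local_search(POSTS, "contextual search")
```

With the query "contextual search", the link-ranked list puts the heavily linked news item first even though it only mentions "search", while the local search returns just the niche guide that actually contains the phrase.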

However, as sites grow and we as internet users and bloggers spread our activities over many sites and platforms, platform-specific search becomes too limited. We begin to look for more tailor-made solutions. Google’s Custom Search is one, but there are others who want a piece of the action.

New kid on the block

Lijit is an internet startup based in Boulder, Colorado, which offers a promising version of “local” or “contextualized search”, which searches one’s blog, “content” (on sites such as YouTube, Flickr and many others) and the network of sites and “friends” your online activities connect you to. We’ve already created a Kaplak search engine powered by Lijit, and the Lijit widget is featured in the outer right column on this blog. I think Lijit could potentially be a very useful addition to the Kaplak toolbox. I plan to expand this search engine with further feeds and sites as our network and activities grow.

When I first tried Lijit, I wasn’t satisfied with the search results. I searched for a direct title in one of our blog posts, and it didn’t come up. As the impatient web customer I am, not hesitant to make a fuss about my problems with a free online service – on another free online service, I posted my quibbles on Twitter. It turns out, Lijit is on Twitter too, and so is Micah Baldwin, who works for Lijit and took time out to answer my quibbles.

It turned out Lijit based their first version on Google’s Custom Search, while developing their own web crawler. Switching Kaplak’s search to Lijit’s own crawler was a huge improvement over Google’s occasional crawl, and made me look much more enthusiastically at what this small team of extremely talented people is doing. I take my hat off to a company which acts so swiftly in response to “customer” sentiments, and makes it a priority to help its users along with such friendliness. There are a lot of companies who could learn so much from Lijit. Micah and Lijit give the expression “listening to the groundswell” a whole new meaning.

I like the freshness of Lijit and I like the results after being switched to their own crawler. I have only a few quibbles with it now. It’s got what I’d call some weaknesses in the versatility department, because I can’t control and fine-tune texts, messages and included sites/webpages as much as I’d like to, and as much as I was quickly getting accustomed to in my short period of experience using Google’s Custom Search. For instance, I found all of my del.icio.us network automatically included in the search engine, where I’d like the opportunity to handpick whose links get included. Lijit’s search engine also wants to categorize results very neatly into “my blog” (even though the Kaplak Blog is not precisely “mine”; it’s the company blog and maintained by me, but not “mine”), “my content” and “my network”. What if we put the widget on our wiki (which we’re probably going to)? That’s not exactly “mine” either. Our Kaplak universe is not so neatly organized, and while I do like the “Lijit picks” category, I’d prefer being able to scrap all categorization schemes altogether, get our own AdSense stuff on the search results and just get on with fine-tuning and putting in more sites and feeds to give our visitors the best possible experience.

Lijit can potentially be a great key to tying together the many different platforms we operate on in Kaplak, and one we’d even pay for, if they included the premium options we need. As a company, we still do need search, and if Lijit could even crawl user and product profile pages on our upcoming Kaplak Marketplace, we’d have something here which we’d probably like to pay good human money for.

Conversational search

You can find most of my conversation with Micah via Summize, an online service which has built a search engine on top of Twitter, searching conversations on Twitter in realtime.

Imagine a service which has taken upon itself the daunting task of searching all things on Twitter instantly and is capable of threading and translating posts to and from numerous languages, globally. Then you have Summize.

Using Twitter a lot these last few months, I’ve found Summize indispensable for keeping track of tweets, users and subjects. I’ve also used it for market research, i.e. “listening” to what other users are twittering. I find this stuff utterly incredible. There are a lot of things happening in the search business these days.

I’m sure this is only the beginning.

[EDIT : Twitter's acquisition of Summize has broken the above link to the Summize search with my conversation with Micah. Here's a similar search on the new http://search.twitter.com which supposedly replaces Summize...]

Is there a future for Search?

Or does algorithmic search really scale better, work faster and ensure better quality than ’socially produced’ services? A few days ago, I had an interesting exchange of views with Danish SEO Mikkel deMib Svendsen, known among other things for the SEO radio show Strikepoint.

I replied to Mikkel’s blog post on ‘Search – before, now and in the future’, where I tried to make the point that search as a communications solution suffers from key preconditions which are far from optimal. Among these is the fact that in order to search for something, you need to know what to look for before you search, and you need to deliberately and consciously use a search engine to look for it.

In other words, search is a deliberate and conscious affair. This makes it difficult, for instance, to use search to market products, which are not well known, such as niche products, or to address problems or needs, which are not yet consciously thought or expressed.

Add to this the present growth rates of information on the web. With each new website added to the web, the risk grows that your information will never reach the queries seeking it. We’re talking exponential growth rates, with several million new websites added to the web each month. As Search faces these ever greater amounts of information, this problem, which we’ve so far dubbed the mainstream problem, will only become more apparent.

Mikkel, however, firmly believes in the future of algorithmic search, so these claims didn’t go uncommented. First, he argues that machines will always work faster and scale better than social services, which have great filtering and quality challenges:

I am completely in line with Louis Monier [founder of Altavista], and am 100% certain that algorithmic search will remain dominant. Manual data processing, like in the [online] social services, simply suffers from too many scaling and quality assessment-issues to compete in the long run. Only machines are scalable on the necessary scale and with a continued central quality assessment. … [my own translation, MB]

I have a few problems with these arguments for the quality of search as a communications method. I wanted to analyze them a bit here, in order to make them part of our process to find out more about the effectiveness of online communication and its niche/longtail effects. Over the past months, I’ve come to question the widespread naturalization of search as ‘the best’ and ‘natural’ method of making information available and visible online. True, right now Search is the dominant method for obtaining information online. It is also a billion dollar business, although this is mostly due to the success of Google AdWords/AdSense, which is quite a different product. However, this may not be so in the future.

The part of Mikkel’s argument which makes a distinction between social services on the one hand, which are manually created and processed (by humans), and search on the other, which uses machines and algorithms and therefore scales better etc., is fundamentally flawed.

First, search results are ‘peer production’ almost as much as any online social bookmarking service is, i.e. they are socially produced. Peer production is a term coined by Harvard professor Yochai Benkler (in The Wealth of Networks (2006), which can be downloaded freely here). That search results are peer production means that they create value by putting together websites from different peers (i.e. companies, organizations or individuals) in order to respond to a search query. Google does this without even asking the peers first (it’s an opt-out, not an opt-in system), and so the peers used to create value may not even know that they contribute value to this system. However, this doesn’t detract from the fact that it is the sourcing and pooling together of the work of different peers, as a response to a human search query, which creates value in a search result. A search result is socially produced, even though the work of filtering and presenting it in a few seconds is done by advanced programming, software, hardware and cables.

These advanced architectures, however, are also created by humans, which means there is someone sitting and using their human capabilities to decide what categories and what variables should factor in, and with how much weight, in the algorithms which control the process of finding and delivering information when somebody searches for something in particular.

The problem with this is not that the process is just as human-influenced as any online social bookmarking service for instance, but that the someone deciding what variables should factor in, is (most likely) not an expert on what the someone at the other end typing in a search query is looking for. The other someone is. It’s one size fits all. One architecture (in principle) for all queries in the world.

I tried asking Mikkel how he could be sure that a query actually met with a usable result. Even if a query is answered by a number of search results, this doesn’t mean that the search results are actually usable and deliver the answer to the query. If this user experience is bad, search fails to deliver an answer, even if there are a million hits on the query.

Let’s take a look at something completely different, i.e. this page at Wikipedia. Notice the edits happening to the article “Tsunami” in December 2004? A page which before December 2004 had seen minimal contributions and edits literally exploded with new information when a tsunami devastated the coasts of Sri Lanka and Thailand that month. Everything was frequently updated as events rolled along and people in different parts of the world found out new things about what had happened, complete with a small animation to go along with it.

Wikipedia aims to make knowledge freely accessible to anyone on the planet. Like providers of algorithmic search, Wikipedia uses lots of machinery to deliver its information, as well as an advanced complex of software architectures. Wikipedia’s articles are peer produced, but much more directly and consciously so than the algorithmically created search result we saw earlier. Even the software is peer produced. MediaWiki is free software, which can be copied and worked upon by anyone who wishes to do so, and any changes may be adopted by the main package.

A second point is the difference in value created. With an example from his own work, Mikkel illustrates how Search Engine Optimization (SEO) done right directly creates a great surplus of value for the companies he and other SEOs work for. Regular SEO maneuvers help direct lots of relevant traffic to the corporate websites.

However, the fact that SEO helps create value by targeting traffic more directly at corporate websites is not an argument for the quality of search as a communications solution in and of itself, but rather for the quality of the work done by Mikkel and his colleagues. There’s a lot of money in SEO, and that’s not because search is a brilliant solution to a communications problem. It is rather because search is inherently insufficient as a solution to the problem of connecting a query/demand with an answer/product, especially for a company which wants to stay alive and gain a competitive edge. And this problem will grow a lot bigger. I predict that Mikkel and his SEO colleagues will be paid even better in years to come.

It is first and foremost a problem of visibility, not particularly of search. We need to create better ways to make information accessible to the people who need it, without swamping those who don’t. Second, it is a problem of speed, because we need information fast, to better meet the challenges we face, as individuals, organizations and societies.

As a non-profit, Wikipedia doesn’t make any money on the processes involved in creating and building a quality article, but the value that an improved Wikipedia article (such as the tsunami article) provides for millions of journalists, for instance, and the newspapers and media companies which employ them, is indispensable. I know for a fact that reporters use Wikipedia a lot, and with good reason. It is the fastest and most scalable source of information online, beyond any doubt. And when as many contributors as in the tsunami article come together, it also proves a highly reliable and credible source. It beats the crap out of trawling search results pages without finding what you’re looking for. But it is only a small example of what peer production is capable of, given the right architectures and tools.