Contextualized Search

I’ve previously written about the merits of attributing value to the context in which information is found, rather than to any particular piece of information. This makes sense in an environment where the amount of new information is exploding and shows no sign of slowing down in the foreseeable future.

Google seems to think so too. After all, this is what Google does, and does really well. But it’s no less true of a somewhat overlooked Google product: Custom Search. This service allows anyone to compose their own search engine and place it on their own website. More accurately, your custom search engine filters Google’s index of webpages. Say you want a search engine on your site about your niche subject to return only results which relate to your site. It’s simple: type in your site name, and allow Google to show results from your site as well as all the sites your site links to. Or you can be even more specific, or list a range of sites you want results to be taken from. Or perhaps you’d like Google to still show results from the whole web, but emphasize results from your own site – this is also easily doable.
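To make the two modes described above concrete, here is a minimal sketch in Python. It is not how Google’s Custom Search is implemented or configured. The result format (dictionaries with `url` and `score` keys), the function names and the `boost` factor are all assumptions made up for illustration.

```python
from urllib.parse import urlparse


def host(url):
    """Return the lower-cased host part of a URL."""
    return urlparse(url).netloc.lower()


def restrict(results, allowed_sites):
    """Keep only results whose host is exactly one of the allowed sites."""
    allowed = {site.lower() for site in allowed_sites}
    return [r for r in results if host(r["url"]) in allowed]


def emphasize(results, preferred_sites, boost=2.0):
    """Keep all results, but multiply the score of preferred sites by `boost`
    (a hypothetical factor) before sorting from highest to lowest score."""
    preferred = {site.lower() for site in preferred_sites}

    def boosted_score(r):
        factor = boost if host(r["url"]) in preferred else 1.0
        return r["score"] * factor

    return sorted(results, key=boosted_score, reverse=True)


# Hypothetical results from a general web index.
web_results = [
    {"url": "http://example.org/a", "score": 0.9},
    {"url": "http://myblog.example/post", "score": 0.5},
    {"url": "http://other.example/x", "score": 0.7},
]

only_mine = restrict(web_results, ["myblog.example"])
mine_first = emphasize(web_results, ["myblog.example"])
```

The first function corresponds to listing the sites results may come from, the second to searching the whole web while favouring one’s own site.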

The problem so far with Google’s Custom Search has been, on the one hand, that Google’s crawlers don’t seem to index every website very thoroughly or very frequently, and on the other, that results are still based on PageRank. Say you want your users to find a great piece on your blog when they search for its subject, but that piece isn’t widely linked to by other sites or articles. Chances are that Custom Search will show a largely irrelevant but heavily linked article from another site instead, or simply not show your post at all if it hasn’t been properly indexed. Your built-in blog search, such as WordPress’ search, will find that article very fast, because it searches your database directly. For smaller sites, local search as we know it is still much more effective.
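The difference can be illustrated with a toy sketch in Python. It is a deliberate simplification: real PageRank is computed over the whole link graph, whereas here a plain `inbound_links` count stands in for it, and the `indexed` flag, the field names and both functions are assumptions invented for this example.

```python
def link_ranked_search(pages, query):
    """Toy stand-in for a link-based engine: only indexed pages can match,
    and matches are ordered by how many inbound links they have."""
    q = query.lower()
    hits = [p for p in pages if p["indexed"] and q in p["text"].lower()]
    return sorted(hits, key=lambda p: p["inbound_links"], reverse=True)


def local_search(posts, query):
    """Toy stand-in for a built-in blog search: look directly at every post
    in the blog's own database, whether or not anyone links to it."""
    q = query.lower()
    return [p for p in posts
            if q in p["title"].lower() or q in p["text"].lower()]


# Hypothetical data: a popular page elsewhere and our own unlinked post.
popular = {"title": "Roundup", "text": "a passing mention of custom search",
           "inbound_links": 500, "indexed": True, "site": "big.example"}
ours = {"title": "Custom search in depth", "text": "all about custom search",
        "inbound_links": 1, "indexed": False, "site": "myblog.example"}

web_hits = link_ranked_search([popular, ours], "custom search")
blog_hits = local_search([ours], "custom search")
```

In this sketch the unindexed post never appears in `web_hits`, and even once indexed it would sort below the heavily linked page, while `blog_hits` finds it immediately.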

However, as sites grow and we as internet users and bloggers spread our activities over many sites and platforms, platform-specific search becomes too limited. We begin to look for more tailor-made solutions. Google’s Custom Search is one, but there are others who want a piece of the action.

New kid on the block

Lijit is an internet startup based in Boulder, Colorado, which offers a promising version of “local” or “contextualized” search. It searches your blog, your “content” (on sites such as YouTube, Flickr and many others) and the network of sites and “friends” your online activities connect you to. We’ve already created a Kaplak search engine powered by Lijit, and the Lijit widget is featured in the outer right column on this blog. I think Lijit could potentially be a very useful addition to the Kaplak toolbox. I plan to expand this search engine with further feeds and sites as our network and activities grow.

When I first tried Lijit, I wasn’t satisfied with the search results. I searched for the exact title of one of our blog posts, and it didn’t come up. As the impatient web customer I am, not hesitant to make a fuss about my problems with a free online service (on another free online service), I posted my quibbles on Twitter. It turns out Lijit is on Twitter too, and so is Micah Baldwin, who works for Lijit and took the time to answer them.

It turned out Lijit based their first version on Google’s Custom Search while developing their own web crawler. Switching Kaplak’s search to Lijit’s own crawler was a huge improvement over Google’s occasional crawl, and made me look much more enthusiastically at what this small team of extremely talented people is doing. I take my hat off to a company which acts so swiftly in response to “customer” sentiments, and makes it a priority to help its users along with such friendliness. There are a lot of companies who could learn a great deal from Lijit. Micah and Lijit give the expression “listening to the groundswell” a whole new meaning.

I like the freshness of Lijit and I like the results after being switched to their own crawler. I have only a few quibbles with it now. It has what I’d call some weaknesses in the versatility department: I can’t control and fine-tune texts, messages and included sites and webpages as much as I’d like to, and as much as I was quickly getting accustomed to in my short time using Google’s Custom Search. For instance, I found all of my network automatically included in the search engine, where I’d like the opportunity to handpick whose links get included. Lijit’s search engine also wants to categorize results very neatly into “my blog” (even though the Kaplak Blog is not precisely “mine” – it’s the company blog, maintained by me), “my content” and “my network”. What if we put the widget on our wiki, which we’re probably going to do? That’s not exactly “mine” either. Our Kaplak universe is not so neatly organized, and while I do like the “Lijit picks” category, I’d prefer to be able to scrap all categorization schemes altogether, get our own AdSense ads on the search results and just get on with fine-tuning and adding more sites and feeds to give our visitors the best possible experience.

Lijit can potentially be a great key to tying together the many different platforms we operate on in Kaplak, and one we’d even pay for if it included the premium options we need. As a company, we still do need search, and if Lijit could even crawl user and product profile pages on our upcoming Kaplak Marketplace, we’d have something here which we’d probably like to pay good human money for.

Conversational search

You can find most of my conversation with Micah via Summize, an online service which has built a search engine on top of Twitter, searching conversations on Twitter in realtime.

Imagine a service which has taken upon itself the daunting task of searching all things on Twitter instantly, and which is capable of threading posts and translating them to and from numerous languages, globally. Then you have Summize.

Using Twitter a lot these last few months, I’ve found Summize indispensable for keeping track of tweets, users and subjects. I’ve also used it for market research, i.e. “listening” to what other users are twittering. I find this stuff utterly incredible. There are a lot of things happening in the search business these days.

I’m sure this is only the beginning.

[EDIT : Twitter's acquisition of Summize has broken the above link to the Summize search with my conversation with Micah. Here's a similar search on the new which supposedly replaces Summize...]


Google Torrent Search

As some readers will be aware, earlier this month a Danish court ordered the Internet Service Provider Tele2 to block its users’ access to the BitTorrent-sharing site The Pirate Bay. Mike Masnick sums it up pretty well.

One may very well wonder (as Masnick does too) whether, if The Pirate Bay, which is essentially a search engine and consists of nothing but metadata, should be blocked, other search engines where one may find torrent files leading to copyright-infringing material ought to be blocked too. Now Mikkel DeMib Svendsen, renowned Danish SEO expert, internet entrepreneur and columnist, has responded in kind to illustrate precisely this point. His column is available here, in Danish only, but his point transcends all languages.

Torrent Search is simply a custom search engine built using Google’s own tools, which trawls all of Google’s index for torrent files. DeMib’s point is, of course, to illustrate the absurdity of the block and of the court’s findings. If The Pirate Bay should be blocked, so should Google. And so should every other search engine or index of metadata which allows one to find hyperlinks to material that someone deems to infringe someone’s copyright.
