… implemented in an hour (!)
blekko’s search engine has a feature called slashtags, which can be used either to restrict a search to a list of websites or to remove that list of websites from the results. We typically use this feature for human curation, for example, picking out the best health websites. Hm, I thought, what an interesting hack! I’ll take that list of the most popular websites, and make slashtags which can be used to either search or exclude the most popular 10, 100, 1000, 10,000, or 100,000 websites. Our current effective limit on slashtag size is 100,000 websites, so I couldn’t do the most popular 1,000,000 sites.
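To make the idea concrete, here is a minimal Python sketch of how tiered site lists and a restrict-or-exclude filter could work. It is not blekko’s implementation: `build_tiers`, `host_of`, and `apply_slashtag` are hypothetical names, and matching on the exact host (ignoring subdomains) is a simplifying assumption.

```python
from urllib.parse import urlparse

SLASHTAG_LIMIT = 100_000  # effective slashtag size limit mentioned in the post
CANDIDATE_SIZES = (10, 100, 1_000, 10_000, 100_000, 1_000_000)


def build_tiers(ranked_sites):
    """Map each tier size to the set of the N most popular hosts.

    `ranked_sites` is assumed to be ordered most popular first.
    Tiers larger than SLASHTAG_LIMIT are skipped.
    """
    tiers = {}
    for size in CANDIDATE_SIZES:
        if size > SLASHTAG_LIMIT:
            continue  # the top-1,000,000 tier does not fit
        tiers[size] = set(ranked_sites[:size])
    return tiers


def host_of(url):
    """Return the lowercased host of a URL, without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def apply_slashtag(result_urls, sites, exclude=False):
    """Keep results whose host is in `sites`, or drop them if exclude=True."""
    return [u for u in result_urls if (host_of(u) in sites) != exclude]


# Toy data for illustration only, not real rankings or results.
results = ["http://www.amazon.com/x", "http://davidbrin.com/hugo"]
print(apply_slashtag(results, {"amazon.com"}))
print(apply_slashtag(results, {"amazon.com"}, exclude=True))
```

The same site set serves both directions, which is why one list per tier is enough for both "search the top N" and "search all but the top N".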
The way you search with a slashtag on blekko is to add /slashtagname to your search. In this case the top-100,000 site slashtag is named /top10/top100k, and we put a minus sign in front of it to exclude those sites, as in -/top10/top100k.
And, after about an hour of work, voilà! Here are a couple of examples of our new bottom-website search engine in action.
This query happens to show another blekko feature in action, autoboosted slashtags. When you search for great concert, we automatically figure out that you’re interested in music, and so we use our curated /music slashtag to improve the results.
But, for the purposes of this experiment, let’s turn off this autoboost feature by adding /web on the end of our query.
Now let’s look at the web, minus the top 10, 100, etc. websites:
great concert -/top10/top10 – Voilà! amazon.com disappears.
great concert -/top10/top10k – mtv.com and emusic.com disappear.
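The query strings used in these examples follow a simple pattern, which can be sketched as a small Python helper. The slashtag names come from this post; the `TIER_TAGS` table and the `slashtag_query` function are hypothetical names introduced only for illustration.

```python
# Slashtag names as they appear in this post, keyed by tier size.
TIER_TAGS = {
    10: "/top10/top10",
    100: "/top10/top100",
    1_000: "/top10/top1000",
    10_000: "/top10/top10k",
    100_000: "/top10/top100k",
}


def slashtag_query(terms, tier, exclude=False):
    """Append a tier slashtag to the search terms.

    With exclude=True the tag is prefixed with a minus sign,
    which removes those sites from the results.
    """
    tag = TIER_TAGS[tier]
    prefix = "-" if exclude else ""
    return f"{terms} {prefix}{tag}"


print(slashtag_query("great concert", 10, exclude=True))
print(slashtag_query("great concert", 10_000))
```

The first call reproduces the "all but the top 10" query above; the second restricts the search to the top 10,000 sites instead.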
The Hugo Awards are given annually for the best science fiction and fantasy works of the previous year, including best novel. Let’s see what the bottom of the web thinks about them:
hugo winner -/top10/top100 – flickr and about.com disappear; davidbrin.com appears.
hugo winner -/top10/top1000 – goodreads disappears; librarything appears.
hugo winner -/top10/top10k – io9, boingboing, and Esquire disappear; dpsinfo, sfwriter.com, and mabfan.livejournal.com appear.
hugo winner -/top10/top100k – no change
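The "disappears / appears" notes above amount to comparing two result lists. A minimal Python sketch of that comparison follows, assuming each result has already been reduced to its host name. `diff_results` is a hypothetical helper, and the two short lists are made-up toy data that only echo site names from this post, not actual blekko results.

```python
def diff_results(before, after):
    """Return (disappeared, appeared) hosts, preserving result order."""
    before_set, after_set = set(before), set(after)
    disappeared = [h for h in before if h not in after_set]
    appeared = [h for h in after if h not in before_set]
    return disappeared, appeared


# Toy lists for illustration only.
before = ["goodreads.com", "davidbrin.com"]
after = ["davidbrin.com", "librarything.com"]

gone, new = diff_results(before, after)
print("disappeared:", gone)
print("appeared:", new)
```

An empty pair of lists corresponds to the "no change" case.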
Try it for yourself!
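Search the top 100,000: add /top10/top100k to your query.
Search all but the top 100,000: add -/top10/top100k to your query.
Search the top 10,000: add /top10/top10k to your query.
Search all but the top 10,000: add -/top10/top10k to your query.
Search the top 1000: add /top10/top1000 to your query.
Search all but the top 1000: add -/top10/top1000 to your query.
Search the top 100: add /top10/top100 to your query.
Search all but the top 100: add -/top10/top100 to your query.
Search the top 10: add /top10/top10 to your query.
Search all but the top 10: add -/top10/top10 to your query.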