blekko donates search data to Common Crawl

At blekko, we believe the web and search should be open and transparent; it’s number one in the blekko Bill of Rights. To make web data accessible, we give away our search results to innovative applications through our API. Today, we’re happy to announce the ongoing donation of our search engine ranking metadata for 140 million websites and 22 billion webpages to the Common Crawl Foundation.

Common Crawl has built an open crawl of the web that can be accessed and analyzed by everyone. The goal is building a truly open web, with open access to information that enables more innovation in research, business, and education. Common Crawl will use blekko’s metadata to improve its crawl quality, while avoiding webspam, porn, and the influence of excessive SEO (search engine optimization). This will ensure that Common Crawl’s resources and engineering time are spent on webpages that are written by, and are useful to, humans.

We’re putting our full support behind Common Crawl’s crawl and mission with this donation. We’re not doing this because it makes us feel good (OK, it makes us feel a little good), or because it makes us look good (OK, it makes us look a little good). We’re helping Common Crawl because it is taking strides towards our shared vision of an open and transparent Internet.

Just take a look at this excerpt from Common Crawl’s website:

“As the largest and most diverse collection of information in human history, the web grants us tremendous insight if we can only understand it better. For example, web crawl data can be used to spot trends and identify patterns in politics, economics, health, popular culture and many other aspects of life. It provides an immensely rich corpus for scientific research, technological advancement, and innovative new businesses. It is crucial for our information-based society that the web be openly accessible to anyone who desires to utilize it.”

Who could disagree with that?

(Discuss on Hacker News)

MIT Technology Review: A Free Database of the Entire Web May Spawn the Next Google (Hey, isn’t blekko the next Google?!)

About Greg

Greg is the CTO of blekko
This entry was posted in Partnerships, Technology.

4 Responses to blekko donates search data to Common Crawl

  1. Pingback: SearchCap: The Day In Search, December 17, 2012

  2. Pingback: SearchCap: The Day In Search, December 17, 2012 : eMarketing Wall

  3. Erik S says:

    Thanks Greg and thanks Blekko! This is very nice of you.

  4. Dave Mackey says:

    Very impressed! Thanks so much for doing this.