Google Blog Search: 93 Percent “Pollution”?

by Steve Broback on October 21, 2007

Our client Jott launched their Jott the Vote initiative on Oct 17, and I’ve been comparing the blog buzz surrounding it with that of other launches that occurred that day. Lumenos Health was one of the companies that also announced a new product on the 17th. I plugged their name into Google Blog Search and was surprised by the results: 69 entries found, almost all of them spam, or “pollution” as Jason Calacanis called it at Gnomedex.

[Image: google_blog_search_spam]

I’ve seen a gradual degradation in the quality of results since the service appeared, but this seemed over the top to me. I entered the result set into Excel and categorized each result. Then I set it up for filtering and made pivot tables and charts. The file is here.
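For anyone who would rather script this than build it in a spreadsheet, here is a minimal sketch of the same kind of host-by-category pivot. The rows and the category names are hypothetical placeholders, not the actual result set.

```python
from collections import Counter

# Hypothetical (host, category) rows for illustration only;
# the real spreadsheet is not reproduced here.
rows = [
    ("blogspot", "splog"),
    ("blogspot", "splog"),
    ("blogspot", "legitimate"),
    ("wordpress.com", "splog"),
    ("other", "legitimate"),
]

# Counting (host, category) pairs gives the cells of the pivot table.
pivot = Counter(rows)
hosts = sorted({host for host, _ in rows})
categories = sorted({category for _, category in rows})

# Print one row per host and one column per category.
print("host".ljust(15) + "".join(c.ljust(12) for c in categories))
for host in hosts:
    cells = "".join(str(pivot[(host, c)]).ljust(12) for c in categories)
    print(host.ljust(15) + cells)
```

A missing combination simply counts as zero, so the table stays rectangular even when a host has no results in a category.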

The result set included a significant number of Blogspot blogs (44), and almost all of them were splogs (43). Sadly, all of the WordPress.com-hosted results (6) were worthless(!) Hopefully, Matt and the gang aren’t being overwhelmed and will be able to delete these offensive sites quickly.
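To put those counts in proportion, here is a quick sketch that uses only the figures stated above. Treating the "worthless" WordPress.com results as junk alongside the splogs is my assumption, and the results on other hosts are left unclassified because the text does not break them down.

```python
# Figures taken from the post; nothing here comes from the spreadsheet itself.
TOTAL_RESULTS = 69
stated = {
    "blogspot": {"results": 44, "junk": 43},
    "wordpress.com": {"results": 6, "junk": 6},  # assumption: "worthless" counts as junk
}

def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)

# Junk rate within each host.
for host, counts in stated.items():
    rate = share(counts["junk"], counts["results"])
    print(f"{host}: {counts['junk']}/{counts['results']} = {rate}%")

# How much of the full result set these two hosts' junk accounts for.
known_junk = sum(c["junk"] for c in stated.values())
known_results = sum(c["results"] for c in stated.values())
print(f"junk from these two hosts: {known_junk}/{TOTAL_RESULTS} = {share(known_junk, TOTAL_RESULTS)}%")
print(f"results on other hosts (not broken down here): {TOTAL_RESULTS - known_results}")
```

These two hosts alone account for well over two thirds of everything returned.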

One result was not a blog but a bulletin board entry whose link led to a generic entry page on the site. I gave that one partial credit.

Why can’t algorithms be built to solve this? SpamSieve is terrific at filtering the crap out of my mail algorithmically. Is human-filtered search the answer?
