Tagged web brings whole new form of keyword spam

I just noticed a trackback on a recent post which led me to a site called news.blogcarnival.com. The trackback comment itself was kind of nonsensical:

Don’t take it from me Excuse me, but how do you get this again:…

and so, it turns out, was the site - just dozens of quotes from other blogs in syndicated blog format, each beginning with a very generic intro line like “Check this out before you go to sleep:” or “Yeah:”. There was a Google AdSense section at the very top, and an Amazon ad on the side, and a selection of topics which closely matched what can only be described as the Google Zeitgeist, with a political slant. Soon it dawned on me what was happening - the blogcarnival blog software was scanning one or more of the popular tagged web sites, such as Technorati, for all new posts tagged with the keywords president, first_lady, election, pope_john_paul, politics, etc., fetching the first few paragraphs of those articles and adding them as a post. I guess when they called it blog-carnival they meant one of those spooky, “haunted” carnivals where the rides start up by themselves.
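To make the mechanism concrete, here is a rough Python sketch of what I imagine the software is doing. It is only my guess from the outside, not their actual code: the Post class, the build_reposts function and the in-memory feed are all hypothetical names of mine, and the feed stands in for whatever a tag site really returns.

```python
from dataclasses import dataclass

# Hypothetical stand-in for one entry pulled from a tag site's feed.
@dataclass
class Post:
    url: str
    tags: set
    body: str

# Keywords and intro lines taken from what I saw on the site.
WATCHED_TAGS = {"president", "first_lady", "election", "pope_john_paul", "politics"}
GENERIC_INTROS = ["Check this out before you go to sleep:", "Yeah:"]

def excerpt(body, max_paragraphs=2):
    """Return the first few non-empty paragraphs of a post body."""
    paragraphs = [p.strip() for p in body.split("\n\n") if p.strip()]
    return "\n\n".join(paragraphs[:max_paragraphs])

def build_reposts(feed, watched=WATCHED_TAGS, intros=GENERIC_INTROS):
    """Keep posts carrying a watched tag and wrap each excerpt in a canned intro."""
    matching = [p for p in feed if p.tags & watched]
    reposts = []
    for i, post in enumerate(matching):
        intro = intros[i % len(intros)]  # cycle through the generic intro lines
        reposts.append(f"{intro}\n\n{excerpt(post.body)}\n\nSource: {post.url}")
    return reposts

# A made-up feed: one post on a watched tag, one that should be ignored.
feed = [
    Post("http://example.com/a", {"election", "opinion"}, "First.\n\nSecond.\n\nThird."),
    Post("http://example.com/b", {"gardening"}, "Tomatoes."),
]
reposts = build_reposts(feed)
```

No judgement, no reading and no writing is involved anywhere in that loop, which is exactly why it scales so cheaply.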

Not only that - a visit to penis.blogcarnival.com reveals much the same thing, but with one important difference: the introductory sentences are tailored to the content rather than generic. Clearly the blogcarnival proprietor has more interest in some topics than others. There are also blogs on their domain for holiday gifts, fast cars, skin care, music, etc.

In tangling itself so deeply in the web of tagged blogs, the site is piggybacking on the stream of human traffic that flows through it. A smart move, and one that I would imagine is generating plenty of ad revenue. It could even be argued that blogcarnival is performing a service by aggregating these links. So the question is: Is this a problem?

Hell yes, and for one simple reason: It reduces the signal-to-noise ratio of the web. Unless this marketing tactic is tackled somehow, its continued profitability will damage people’s ability to find unique, authored content. We’d end up with a few lone original voices and a cacophony of echoes, all repeating each other until the blogosphere drowns in useless automated content.

I doubt very much that this scenario will be allowed to happen, but the question remains: who will stop it, and how?

P.S. A horrible irony of this situation is that this kind of “intelligent” spam is only technically feasible because of the high signal-to-noise ratio of tagged blogs. They are normally a safe haven from the fragility of automated search engine ranking systems. I guess nature abhors a vacuum.

Update: The blogcarnival website is run by two guys called Brad Rubenstein and Steve Damron (read their respective announcements here and here).

You know, they seem like nice enough guys. They both did serious time at Sun Microsystems. Brad has a PhD in Computer Science and sings opera pretty well. Steve is also musical, playing the violin, and is active in the gay pride movement. But from my point of view it’s people like them who are screwing up the web for everyone else, by chasing that destructive, manipulative “American Dream”: Have one good idea, and then sit back and watch the money roll in. It’s a sociopathic “passive income” mindset, and it sucks that two creative, intelligent people are resorting to this kind of lazy snake-oil merchantry when there is so much amazing, interesting, useful stuff still to be done on the web.

Update 2: Brad has mentioned me on his site (see trackback below). Sorry Brad, I haven’t heard you sing, but feel free to quote me in publicity. I once dated a beautiful up-and-coming opera singer called Elena Xanthoudakis (damn, how did I let her get away - oh, that’s right, I never appreciate people’s worth until they’re on the other side of the world…), and my mother is a choral singer, so perhaps that gives my comments some weight. I just wish some of their talent had rubbed off on my own primitive voicebox.