I woke up to ten or so first-time comments* in the moderation queue at Speed Force this morning. As I started reading them I was briefly confused: they were well-written, specific comments about comic books…that had nothing to do with the posts they were attached to. Complaining about Bendis’ writing on an interview with Paul Ryan (the artist, not the politician). Gushing about an Ultra-Humanite figure on a review of a Flash comic. Tips on finding exclusive Aquaman figures on a Flash TV episode review.

Then I felt strangely nostalgic, because I hadn’t seen this sort of spam in a long time.

As near as I can tell, the spammer finds a related site, scrapes comments from it, and pastes them into the target site. Sure enough, when I searched for phrases from the spammy comments, I found the originals on a Daredevil fan blog, an action figure site, an artist’s blog, and so on. To what end, I’m not sure: most comment spam seems to be about link generation to prop up a spamvertised site in search rankings, but these comments all linked to Facebook profiles.

I’ve got to give the spammer a little credit for two things:

  1. Finding actual comics-related blogs to scrape comments from.
  2. Inserting typos to make it harder to match. Though Google’s pretty good at fixing those.
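
For the curious, here is a minimal sketch of how a typo-tolerant match might work, using Python's standard `difflib`. This is not what Google does, and not how I actually found the originals (I just searched for phrases). The function names and the 0.85 threshold are assumptions made up for illustration.

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and keep only word characters, so case and punctuation don't matter."""
    return " ".join(re.findall(r"[a-z0-9']+", text.lower()))


def similarity(a: str, b: str) -> float:
    """Return a ratio between 0 and 1 describing how alike two texts are."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def looks_copied(comment: str, candidates: list[str], threshold: float = 0.85) -> bool:
    """True if the comment is a near-duplicate of any candidate text.

    The threshold is a guess, not a tuned value.
    """
    return any(similarity(comment, c) >= threshold for c in candidates)
```

A few dropped letters barely move the ratio, which is why inserting typos only helps the spammer against exact-match searches.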

In the end, though…

*plonk!*

*I have WordPress set up so that first-time commenters always go through moderation, while returning commenters are allowed through unless they trip a filter.
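
In rough terms, that policy boils down to the logic below. This is a sketch of the rule as described, not WordPress's actual code. The `ModerationQueue` class, its method names, and the blocked-terms list are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ModerationQueue:
    # Authors who already have at least one approved comment.
    approved_authors: set[str] = field(default_factory=set)
    # Hypothetical filter list; a real site would use something much richer.
    blocked_terms: tuple[str, ...] = ("watch full movies",)

    def decide(self, author_email: str, body: str) -> str:
        """Return "publish" or "hold" for a newly submitted comment."""
        text = body.lower()
        # Anything that trips a filter is held, even from a known commenter.
        if any(term in text for term in self.blocked_terms):
            return "hold"
        # Returning commenters go straight through.
        if author_email.lower() in self.approved_authors:
            return "publish"
        # First-time commenters always wait for a human.
        return "hold"

    def approve(self, author_email: str) -> None:
        """Record that a moderator approved a comment from this author."""
        self.approved_authors.add(author_email.lower())
```

The upshot is that scraped comments like these never appear on the site by themselves: a brand-new commenter has to get past a human at least once.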

I found a comment in the spam folder for Speed Force that, at first glance, looked like an actual, relevant comment…to a different post. It was a coherently written paragraph about how someone had “considered getting a second Captain Cold” action figure to customize it, but it was posted to an article about stalled miniseries. The author’s name and link were obvious spam, though (seriously, “watch full movies” is the best you can do?).

My first thought: They’d copied the text from another comment on the site. I’ve seen that happen before, but usually it’s comments on the same post. A search through existing comments didn’t turn up any matches, though.

So then I did a search on the rest of the web, and found the original comment on a review of an Atom Smasher toy.

Someone had gone looking for a site with a similar topic (comic books about super-heroes, action figures based on super-heroes), copied text from there, and pasted it onto mine…and yet they hadn’t bothered to match up specifics (like pasting it on a post about action figures or Captain Cold). So it’s not quite as sneaky as the one who followed a link in my post and pasted in text from the other page, but it’s pretty close.

Judging by a quartet of comments posted this evening, 3 of which slipped past Spam Karma, someone’s started outsourcing comment spam to India. (I’m serious, the IP addresses were assigned to Bharti Airtel and BSNL Internet, both ISPs based in New Delhi.)

They were posted quickly, as if they’d been composed in another editor and pasted into the form. More importantly, they were actually posted through the form, not just sending data directly to the handler. And most tellingly, the posters had gone to the effort to fill out the CAPTCHA that Spam Karma provides to allow human commenters to recover from a false positive.

The one I liked best, from a technical perspective, was posted on Tall Ships of San Diego. The spammer had followed my link to the San Diego Maritime Museum, then followed that to a page describing one of the ships, the Californian, and generated a post by stringing together sentences from that page. The whole thing linked to a student loan site.

At first glance, it looked like a garbled, on-topic comment from someone who maybe didn’t speak English as their first language. That happens, and if it’s a legit comment, I leave it. In fact, I considered leaving the comment but deleting the author URL, until I looked up the ship. (It wasn’t one of the ships we toured on our visit, and I didn’t recognize the name.) As I looked at the ship’s profile, I started recognizing text from the comment. At that point it became clear what was going on, and I started looking at the other comments posted over the last few hours.
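
What I did by eye could be roughed out in code: take the comment, split it into sentences, and see how many of them show up verbatim on a page the post links to. The sketch below assumes you already have the linked page's text as a plain string. The function names and the four-word minimum are my own inventions for illustration, not part of any plugin.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def normalize(text: str) -> str:
    """Lowercase and drop punctuation so formatting differences don't matter."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def lifted_fraction(comment: str, page_text: str, min_words: int = 4) -> float:
    """Fraction of the comment's sentences that appear word-for-word in page_text.

    Sentences shorter than min_words are ignored, since short phrases
    match by coincidence. Returns 0.0 if nothing is long enough to check.
    """
    # Pad with spaces so matches start and end on word boundaries.
    page = f" {normalize(page_text)} "
    sentences = [normalize(s) for s in split_sentences(comment)]
    sentences = [s for s in sentences if len(s.split()) >= min_words]
    if not sentences:
        return 0.0
    hits = sum(1 for s in sentences if f" {s} " in page)
    return hits / len(sentences)
```

A high fraction doesn't prove spam on its own (people do quote the pages they're discussing), but a comment made up almost entirely of another page's sentences, plus an unrelated author link, is a strong hint.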

Project Honeypot recently started tracking comment spammers as well as email harvesting bots. Oddly enough, even though they have data going back to March 22, and even though Bad Behavior and Spam Karma have blocked an incredible number of spam comments on this site (Bad Behavior has blocked 3807 connections in the past week alone)…none of the honeypots I manage have trapped a single piece of comment spam.

And no, the honeypot on this site isn’t protected by those plugins.