I found a sneaky type of spambot this morning. It was impersonating regular commenters on Speed Force, using their names and (at first glance) email addresses to blend in.

The names weren’t terribly surprising, but the email addresses were. Where had it gotten them? WordPress shouldn’t reveal them, unless there’s a bug somewhere. Was one of my plugins accidentally leaking email addresses? Had someone figured out a way to correlate Gravatar hashes with another database of emails?

As I looked through the comments, I realized that in most cases, it wasn’t the commenter’s usual email address. Here’s what the spambot was doing:

  1. Extract the author’s name and website from an existing comment.
  2. Construct an email address using the author’s first name and the website’s domain name.
  3. Post a comment using the extracted name, the constructed email, and a link to the spamvertised site.
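
To make that concrete, here's a rough Python sketch of steps 1 and 2. It's a reconstruction from what showed up in the comment queue, not the bot's actual code: the function name `guess_email` and the stripping of a leading `www.` are my own assumptions for illustration.

```python
from typing import Optional
from urllib.parse import urlparse


def guess_email(author_name: str, author_url: str) -> Optional[str]:
    """Build an address from a commenter's first name and website domain.

    Hypothetical reconstruction of the pattern described above.
    """
    # Take the first whitespace-separated piece of the name as the "first name".
    parts = author_name.split()
    # Pull the host out of the commenter's website URL (None if there is no scheme/host).
    host = urlparse(author_url).hostname
    if not parts or not host:
        return None
    # Assumption: drop a leading "www." so the address uses the bare domain.
    if host.startswith("www."):
        host = host[4:]
    return f"{parts[0].lower()}@{host}"


if __name__ == "__main__":
    print(guess_email("Jane Doe", "http://www.example.com/blog"))
```

The result looks plausible at a glance, which is the whole point: it only falls apart when you compare it with the address the real commenter normally uses.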

The actual content (if you can call it that) of the comments was just a random string of numbers, and the site was a variation on “hello world,” leading me to suspect that it might be a trial run. Certainly they could have been a lot sneakier: I’ve seen comment spam that extracts text from other comments, or from outbound links, or even from related sites to make it look like an actual relevant comment.

I’d worry about giving them ideas, but I suspect it’s already the next step in the design.

Update: They came back for a second round, this time here at K2R, and I noticed something else: the bot builds the address from the first name only, and it does so naively, just splitting the name on spaces and taking the first piece. This is particularly amusing with names like “Mr. So-and-so,” where it creates an address like mr@example.com, and with pingbacks, where the “name” is really the title of a post.

Some suspicious pingbacks this morning tipped me off that there’s a splog (spam blog) automatically copying posts from K-Squared Ramblings to its own site. I sent the owners a complaint, but they don’t seem to care much: They’ve scraped the RSS feed again, and reposted the same 15 articles nine times today!

It seems extremely likely that they’ll repost this article as well. If you’re reading this on “Attorney Legal Blog” (great irony there), the site is ripping off content from other websites — and clumsily, too!

For the record, the site doing the copying, which I won’t link to directly, is “www – dot – legal – dash – attorney – dot – info”. And it looks like a lot of other sites are being copied…just as badly, repeats and all.

Wow. Email addresses really do stay on spam lists forever. The postmaster account just picked up a non-delivery report for a message sent to a server that’s been offline for 7 years!

I hate receiving an email ad from a company that I recognize when I can’t remember for certain whether I signed up for email or not. Did they just reactivate an old list that I’d forgotten, or have they been acting shady, picking up email addresses but not permission?

If it’s an old, forgotten list that I really did sign up for, I should just unsubscribe and have done with it. Reporting them as spam would be irresponsible and would actually make spam filters less reliable.

On the other hand, if they harvested or bought my email address, I should report it as spam…and I shouldn’t trust the unsubscribe link.

While cleaning out the comment spam folder on Speed Force, I found this gem:

Hi this is a attempt to get noticed on the world wide web and hopefully spread the word about our services. It would be kind of you if you allow me to share my online marketing one the site. The company name is [REDACTED]. Thanks

I suppose you’ve got to give them points for honesty.