How Thunderbird’s Scam Detection Works (2005)

NOTE: This article is out of date and likely obsolete.

Since upgrading to Mozilla Thunderbird 1.5 beta 2, I’ve seen a number of messages slapped with a warning label that “Thunderbird thinks this message might be an email scam.” It appears at the top of the message, in the same style as the junk mail notice bar or the warning that remote images have been blocked, and there’s a button to mark the message as “Not a Scam.”

There’s only one problem: SpamAssassin and ClamAV catch phishing scams so reliably before they reach my inbox that Thunderbird has yet to catch an actual phish. What it has produced is a lot of false positives. It’s flagged LiveJournal reply notices, newsletters from IEEE and Golden Key, a Spam Karma notice from my own blog, and both outbid notices and saved-search updates from eBay.

I found myself wondering just how Thunderbird’s phishing detection decides that a message is suspicious—and how to teach it that the next LJ notice isn’t a scam.

The Thunderbird support website doesn’t seem to have been updated yet. Most of the articles I’ve found only talk about TB adding the feature, not how it works. The best information I found was this Mozillazine forum thread, which included a link to the actual code that makes the decision, in phishingDetector.js. Thunderbird looks at the following:

Links that only use an IP address, including dotted decimal, octal, hex, dword, or some mixed encoding.
Links that claim to go to one site, but actually go to another. (Phishers do this to fool you into going to their site. Legit mailing lists sometimes do this with redirectors for tracking purposes.)
Forms embedded in the email. (This explains the LiveJournal notices.)
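
To make those three checks concrete, here is a rough Python sketch of them. It is not a port of phishingDetector.js: the names (host_is_ip_address, LinkCollector, looks_suspicious) and the exact matching rules are my own assumptions for illustration, and the real code may well differ in the details.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlsplit


def parse_ip_component(part):
    """Value of one host component written in hex (0x..), octal (leading 0)
    or decimal; None if the component is not a number at all."""
    if re.fullmatch(r"0[xX][0-9a-fA-F]+", part):
        return int(part, 16)
    if re.fullmatch(r"0[0-7]*", part):
        return int(part, 8)
    if re.fullmatch(r"[1-9][0-9]*", part):
        return int(part, 10)
    return None


def host_is_ip_address(host):
    """True if host is an IPv4 address in dotted, dword or mixed notation."""
    parts = host.split(".")
    if not 1 <= len(parts) <= 4:
        return False
    values = [parse_ip_component(p) for p in parts]
    if any(v is None for v in values):
        return False
    # Every component but the last must fit in one byte; the last one fills
    # the remaining bytes, so a single number is a 32-bit "dword".
    *leading, last = values
    return all(v <= 255 for v in leading) and last < 256 ** (5 - len(parts))


class LinkCollector(HTMLParser):
    """Collects (href, visible text) pairs and notes whether a form appears."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.has_form = False
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self.has_form = True
        elif tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def hostname_of(url):
    return (urlsplit(url).hostname or "").lower()


def looks_suspicious(html_body):
    """Return the list of reasons a message would be flagged (empty if none)."""
    collector = LinkCollector()
    collector.feed(html_body)
    collector.close()
    reasons = []
    if collector.has_form:
        reasons.append("form")
    for href, text in collector.links:
        target = hostname_of(href)
        if target and host_is_ip_address(target):
            reasons.append("ip-link")
        # Only compare hosts when the visible text itself looks like a URL.
        if re.match(r"https?://", text, re.IGNORECASE):
            shown = hostname_of(text)
            if shown and target and shown != target:
                reasons.append("mismatch")
    return reasons
```

Feeding it a link whose text reads http://www.ebay.com/ but whose href points somewhere else yields a "mismatch" reason. A tracking redirector in a legitimate newsletter would trip the same check, which is exactly the false-positive pattern I’ve been seeing.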

It also appears to trap text URLs containing HTML-escaped characters, which explains the Spam Karma reports. In this case the report includes a spammer’s link with an HTML entity in the hostname. The message is plain text, so Thunderbird leaves the entity as-is when displaying it…but decodes it when it creates the link. Result: a link where the text and URL don’t match.
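
Here is a small Python model of that mismatch, based only on the behavior I observed rather than on Thunderbird’s actual linkifying code. The function name linkify_plain_text is hypothetical, and the entity &#46; (an encoded dot) in a made-up hostname is a stand-in, not the one from the real report.

```python
import html
import re


def linkify_plain_text(body):
    """Return (displayed_text, link_target) pairs for URLs in a plain-text body."""
    pairs = []
    for match in re.finditer(r"https?://\S+", body):
        displayed = match.group(0)         # shown to the reader exactly as typed
        target = html.unescape(displayed)  # entities decoded when the link is built
        pairs.append((displayed, target))
    return pairs


# Hypothetical report line; the hostname and entity are invented for illustration.
pairs = linkify_plain_text("Blocked: http://spam&#46;example/buy now")
mismatched = [(shown, target) for shown, target in pairs if shown != target]
```

Here the displayed text keeps the literal entity while the target has a real dot in its place, so a text-versus-URL comparison sees two different strings even though nobody was trying to disguise anything.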

The easiest way to prevent it from freaking out over the next message? Add the sender to your address book. I’m not sure that’s a great idea, since a phisher could guess which addresses you have saved and spoof them, but it’s at least simple. I guess I’ll find out whether it works the next time I get a reply notice from LJ. Update: Adding the sender to your address book doesn’t seem to have any effect.

Update 2 (July 12, 2006): The comment thread’s gotten long enough that I can see people might miss this, so here’s how to disable it:

Open Options or Preferences (this will be under the Tools menu on Windows, the Thunderbird menu on Mac, or the Edit menu on Linux).
Click on Privacy (there should be a big padlock icon).
Click on the E-mail Scams tab.
Disable the “Check mail messages for email scams” option and click on Close.

That’s it.