I’ve been meaning to disconnect from Jetpack for a while now. This seems like a good time to do it, and to finally clear out the older Tumblr and WordPress.com blogs I don’t use anymore.

Tumblr and WordPress to Sell Users’ Data to Train AI Tools404 Media

It’s the kind of thing that you expect from Google or Facebook, or from any number of start-ups, but there’s been this sense that Automattic should know better — and with Tumblr being login-walled and ad-saturated, and the push to upsell in their WordPress plugins, and now this…it’s looking like they don’t.

I don’t think they’ve hit the “trust thermocline” yet, but selling user data is a pretty clear line.

As for AI access to the Firehose: My previous understanding of the firehose is that it’s basically an aggregation of what you’d see in a bunch of blogs’ public RSS feeds. Which, OK, fine. Analyze your heart out. Display my posts in your RSS reader. Just make sure private posts and comments don’t leak.

But LLM training isn’t the same as analytics, or showing a properly attributed post in a reader. And quietly changing the terms to allow more kinds of re-use on something most people using the service don’t know about? Not cool.

And not making it clear what is and isn’t included for which purposes? That breaks down trust.

Before this, I wasn’t worried about the Firehose. But now I’m not sure I can trust Akismet, never mind Jetpack, and I’m looking for a new spam filter.

Originally posted across several threads through my GoToSocial test site.

Update: Automattic did clarify that self-hosted blogs with Jetpack are not included in the training data. Only company-hosted blogs on Tumblr and WordPress.com. But I still uninstalled Jetpack from this site, just to be sure. Like I said, I’d been meaning to for a while.

Experimenting with the new Automattic Stats Plugin that uses the WordPress.com statistics infrastructure to track traffic. So far, so good… except for one problem. Titles and links are missing from all the “most visited” posts. They’re just listed as numeric IDs.

Update: Actually, today’s posts seem OK. The plugin seems to just send the blog ID and post ID. I’ve been trying to figure out how the central server is retrieving the permalink and title. It doesn’t look like Bad Behavior is blocking it. And it doesn’t seem to be using the RSS feed, since posts that are still on the front page (and presumably still in the feed) are also showing up as numbers. *grumble*

Update 2: I just noticed that all of the number-only posts show the same placeholder graph showing “Region A” vs. “Region B” for 2003-2005.

Update 3: It’s a problem with WordPress’ XMLRPC interface, and affects other uses (like connecting with Flock). I’ve got a workaround, though (see comments).

Update 4 (May 10): Thanks to the pingback below from dot unplanned, it’s confirmed to be a bug in PHP 5.2.2. With any luck, the workaround will cease to be necessary when the next PHP bugfix is released.