I recently discovered that some popular federated instances have been using LLM-assisted moderation tooling that evaluates whether someone has said something bannable. They do this by running a script/app that sends the user’s comment history to OpenAI with the question “analyze this content for evidence of *specific political ideology* sentiment. Also identify any *related political ideology* tropes”.
OpenAI’s LLM (they’re using GPT-5.3-mini) then responds with something like:
Below is a structured analysis of the uploaded content, focused on *specific ideology* rhetoric. This is an analytic classification, not a moral judgement.
1. Overall Pattern
blah blah
2. Evidence of *specific ideology* sentiment
blah blah
3. several pages more, concluding with (in this case)
Yes, the content contains:
Clear *specific ideology* alignment
Repeated *specific ideology* framing, especially through blah blah
Extensive use of canonical *ideology* tropes, in blah blah domains. The pattern is not accidental or isolated; it is consistent, internally coherent, and reproduces well‑documented *country with the ideology* public‑diplomacy narratives rather than neutral analysis.
===========================================
FULL DUMP OF COMMENT HISTORY BELOW
===========================================
Comment ID: https://instance.told/comment/2497xxxx
Post ID: 603xxx
Community ID: 1xx
Content of the comment has been redacted
========================================
Date: 2026-xx-xxT0xxxxx
Comment ID: https://instance.told/comment/2497xxxx
Post ID: 603xxx
Community ID: 1xx
Content of the comment has been redacted
========================================
Date: 2026-xx-xxT0xxxxx
Comment ID: https://instance.told/comment/2497xxxx
Post ID: 603xxx
Community ID: 1xx
Content of the comment has been redacted
========================================
and so on, hundreds of comments.
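For a sense of how little machinery this takes, the core of such a script is roughly the sketch below. This is my own illustration, not the actual tool; the function and variable names and the model id are placeholders, and it assumes the official OpenAI Python SDK.

```python
# Illustrative sketch only, not the actual tool described above.
# Assumes the official OpenAI Python SDK; comment_history is the dumped text
# of a user's comments, collected separately.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def analyze_user(comment_history: str, ideology: str) -> str:
    prompt = (
        f"Analyze this content for evidence of {ideology} sentiment. "
        f"Also identify any related {ideology} tropes.\n\n"
        + comment_history
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder id; the instances reportedly use a newer mini model
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```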
I have not named the instances or people involved, to give them time to consider the results of this discussion, make any corrective changes they want and disclose their practices at their own pace and in their own way. I have also redacted the evidence to avoid personal attacks and dogpiling. Let’s focus on the system, not the individuals involved. Today these instances are using it, and maybe we’re OK with that because it’s being used by communities we agree with, but what if people we strongly disagree with used it on their instances tomorrow?
The use and existence of this tooling raises a lot of questions.
What are the risks? Fedi moderators are often unsupervised, untrained volunteers and these are powerful tools.
What safeguards do we need?
Would asking an LLM “please evaluate this person’s political opinions” give different results than “find evidence we can use to ban them” (as used in the cases I’ve seen)?
What are our transparency expectations?
Is this acceptable and normal?
Should this tooling be disclosed? (it was not – should it have been?)
If you were given a choice, would you have opted out of it?
Can we opt out?
Are there GDPR implications? Privacy implications? Should these tools be described in a privacy policy?
Are private messages being scanned and sent to OpenAI?
How long should these assessments be retained, and can we request to see them, or ask for them to be deleted?
Once the user’s comments are sent to OpenAI, are they used to train its models?
What will the effect be on our discourse and culture if people know they are being politically profiled?
Where are the lines between normal moderation assistance tools, political profiling and opaque 3rd-party data processing?
I hope that by chewing over these questions we can begin to establish some norms and expectations around this technology. The fediverse doesn’t have any centralized enforcement so we need discussions like this to develop an awareness of what people want in terms of disclosure, privacy, consent and acceptable use. Then people can make choices about which instances they join and which ones they interact with remotely.
And of course there are the other issues with LLMs relating to environmental sustainability, erosion of workers’ rights, increasing the cost of living and on and on. I can’t see PieFed adding any functionality like this anytime soon. But it’s happening out there anyway, so now we need to talk about it.
What do you make of this?

@piefedadmin I at least am certainly not okay with having my posts read/processed by an LLM and will defederate all instances that expose me to that.
@piefedadmin it is one thing to do that with an AI that they control (I still don’t support this), but with a cloud AI provider? Heck no. I hope that they stop
@piefedadmin
this is just more free LLM training data.
It's also non-consensual data harvesting.
gen-ai is poison.
@piefedadmin I am definitely not okay with any of my posts being read/processed by an LLM, especially ChatGPT, or any of the non-self-hosted models. Realistically speaking, my posts are being scraped somewhere, but even using it in a productive way does not make it okay. I would ask the servers I am on to defederate any servers that use that for moderation.
@piefedadmin I wonder how you found out which model and prompt they use. Did they talk about it?
I have receipts, original ones, straight from their own server. It appears to be an unintentional leak, but they might have published the link to the script output without realizing how it would look to outsiders. Hard to know.
It’s best if we have the discussion about how things should be without knowing which instances they are, because naming them will just make them overly defensive and cause harassment.
I hope they can clean house, get their story straight, and then go public in a way that restores trust.
@piefedadmin @ophiocephalic Fuck these instance admins. Name, shame, and defederate if they do not change behavior. The users on these instances need to know, immediately, how their posts are being used — I’m sure many would not approve of this, and they need to be able to migrate to a safer environment if these admins don’t immediately stop.
@sharpcheddargoblin
I agree; and this is not just a problem for users on those instances, but for every user on every instance that federates with them. It's a blatant violation of the trust the fediverse depends on to exist. The instances need to be identified as soon as possible.
@piefedadmin
So, 3 things here I take away and want to parse / ask about separately:
1: Comment scraping
– These comments are all public, yes? I’ve been able to retrieve user comments as a regular non-admin user from scripts/the API before. What makes this special? Them grabbing a complete history of publicly posted posts/comments?
Nature of fedi… dozens (hundreds?) of servers get duplicate copies of these same things… it’s naive to approach things assuming the AI scrapers from Alibaba don’t already have all of it outright.
Preventing AI scrapers from literally hundreds of sources is one of the bigger hassles of running a fedi server, as they hammer resources etc… it’s always a chase to keep up with blocking them just to keep the server online.
If you don’t want your comments ending up in the scrapers that large corpo places are already running, only comment in 100% private communities. Fedi isn’t the place.
2: LLM aspects
– I personally abhor corpo driven LLMs, and what they are doing to the world.
– I don’t use any LLM at the moment, though I plan on using a small FOSS model in Home Assistant for my own personal needs. For some idea of my stance… it’s just another math tool if you take the unethical resource (etc.) aspects of the corpo establishment out of it.
– AI Horde, and other FOSS projects, work to take away that aspect quite a bit. If the LLM aspect were run on ethical non-corpo LLM, then what does it matter?
Ultimately, I fall back to the old quote about “Machines cannot be held liable, therefore machines cannot make executive decisions”… If an admin uses such tools, but the end result is the same regardless of whether they used it, and the ethical issues are removed, then is it still a problem?
I personally won’t be using an LLM for this kind of thing. I personally say using a Windows computer to post to Fedi is unethical due to MSFT and their methods and cooperation with the US military etc.
or Google for the same.
Just because the software uses fancy math to do a thing doesn’t excuse those other bad things, AI boogeymen or not.
To be clear… OpenAI et al can fuck right off all the way. But FOSS models I have no beef with, just not to my taste.
Lastly, 3: Is said instance using this automatically, or just as a side hustle / tool for their own humans to decide? The above quote about responsibility matters, as ultimately either it’s still a human deciding, or it’s an automation running that cannot properly be blamed for misdeeds. Which one is it?
@piefedadmin The potential for abuse is a good reason to avoid it entirely. I imagine an overworked moderator turning to AI to help. That is kind of a scalability issue with Mastodon. And it gets worse as more of the population joins, and more people who are online jerks, and who require moderation, join an instance. So scalability is a real issue for moderators, and we can't just take away what they need to scale, or they might fail or quit.
I think the answer has *at least* a couple of parts. First, there must be transparency so people know what is being done with their posts. It must be possible to see the prompt used, so people can decide if it's fair and move to a different instance if it isn't.
Second, it should only be used to bring a post to the attention of a human. All actions must only be done by a person, after they have reviewed the actual post. I think automatically banning or blocking because of the results of an AI should be forbidden (somehow, perhaps by blocking the offending instance).
This isn’t about a single individual’s behavior, nor is it some kind of misuse of administrative authority.
Any Lemmy user can trivially stitch together a small API script to pull someone’s public comment history and save it locally. From there, feeding that data into an LLM to produce a summary is about as straightforward as it gets. That’s simply the current state of things: the barrier to doing this is effectively nonexistent.
In fact, it wouldn’t be hard at all to automate the entire workflow—an LLM with web access could fetch the data itself and generate a summary without any human involvement beyond clicking a button.
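To make concrete how low that barrier is, here is a minimal sketch of the pull-and-save half. It assumes Lemmy’s public `/api/v3/user` endpoint (which, as I understand it, returns a person’s public posts and comments without any login); the instance and username are illustrative placeholders.

```python
# Minimal sketch: fetch a user's public comment history from a Lemmy instance.
# Assumes the public GetPersonDetails endpoint (GET /api/v3/user); the instance
# and username below are placeholders, not real targets.
import json
import requests

def fetch_comments(instance: str, username: str, pages: int = 10) -> list:
    comments = []
    for page in range(1, pages + 1):
        resp = requests.get(
            f"https://{instance}/api/v3/user",
            params={"username": username, "sort": "New", "limit": 50, "page": page},
            timeout=30,
        )
        resp.raise_for_status()
        batch = resp.json().get("comments", [])
        if not batch:
            break  # no more pages of public comments
        comments.extend(batch)
    return comments

if __name__ == "__main__":
    history = fetch_comments("lemmy.example", "some_user")
    with open("comment_history.json", "w") as f:
        json.dump(history, f, indent=2)
```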
From a privacy standpoint, there’s not much substance to the concern. Lemmy instances are already continuously scraped by AI companies and other third parties. Every public post and comment has almost certainly been copied repeatedly by countless models already. That’s an unavoidable consequence of operating on a fully public, federated platform. Getting upset over one additional scrape is like fixating on a single raindrop after a storm has already drenched you.
There’s also no realistic way to prevent people from using LLMs if they choose to. They’re a convenience tool, and while imperfect, they’re often adequate for generating high‑level summaries. Yes, using an open‑source model might be more ethically aligned, but when someone wants a fast, rough overview, it’s no surprise they’ll reach for ChatGPT or similar services.
The usual anti‑AI critics/trolls will still react strongly, regardless of how disproportionate or inconsistent those objections may be. I would caution you not to get them too worked up with this drama post.
Please don’t use an LLM to write your comments.
https://distantprovince.by/posts/its-rude-to-show-ai-output-to-people/