Honestly, whoever has an idea for a spam detection measure for Mastodon, and by that I do mean an implementation, get in touch with me, I'll pay for it.

I've been thinking about solutions for the past few days but the more I think about them the more they appear pointless.


Defining an account as suspicious when it has no local followers can be circumvented by just pre-following them, using account age can be circumvented with sleeper accounts, blacklisting URLs does nothing when the spam does not include URLs, checking for duplicate messages sent to different recipients can be circumvented by randomizing parts of the message...

E-mail deals with spam using Bayesian filters or machine learning. The more training data there is, the more accurate the results, so a monolith like GMail benefits from this greatly. Mastodon's decentralization means everyone has separate training data and starts from scratch, which means high inaccuracy. It also means someone spamming a username could potentially lead to any mention of that username being considered spam due to the low overall volume of data, unless you strip usernames.

However, if you strip usernames from the checked text, the spammer could write messages using usernames...
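
To make the trade-off concrete, here is a minimal sketch of a per-instance Bayesian filter that strips mentions before scoring. It is not anything Mastodon ships; the `NaiveBayes` class, the `tokenize` helper and the mention regex are all made up for illustration, and a real filter would need far more training data than a single instance usually has.

```python
import math
import re
from collections import Counter

# Hypothetical pattern for @user and @user@domain mentions.
MENTION = re.compile(r"@\w+(?:@[\w.-]+)?")

def tokenize(text, strip_mentions=True):
    # Optionally drop mentions so a spammed username does not
    # itself become a "spammy" token.
    if strip_mentions:
        text = MENTION.sub(" ", text)
    return re.findall(r"[a-z0-9']+", text.lower())

class NaiveBayes:
    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.counts[label].update(tokenize(text))
        self.docs[label] += 1

    def spam_probability(self, text):
        vocab = len(set(self.counts["spam"]) | set(self.counts["ham"])) or 1
        total_docs = sum(self.docs.values())
        scores = {}
        for label in ("spam", "ham"):
            # Laplace smoothing on the prior and on each token likelihood.
            score = math.log((self.docs[label] + 1) / (total_docs + 2))
            total = sum(self.counts[label].values())
            for tok in tokenize(text):
                score += math.log((self.counts[label][tok] + 1) / (total + vocab))
            scores[label] = score
        # Turn the two log scores into P(spam | text); clamp to avoid overflow.
        diff = max(min(scores["ham"] - scores["spam"], 700), -700)
        return 1 / (1 + math.exp(diff))
```

With mentions stripped, the filter cannot learn that a spammed username is "spammy", but as the post above says, it also cannot see a message that consists of nothing but usernames.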

@angristan @Gargron the current problem child is someone who makes accounts on all sorts of instances, this is a uniquely federation problem
@angristan @Gargron well its not just him, but thats how spam works on fedi in general, auditing registrations on mastosoc wont stop it

unless youre suggesting everyone introduces that to which id say eh why not

@angristan @Gargron ineffective as spammers are spinning up their own instances

@brunoph Spam-friendly instances are actually easy and cheap to detect and block.

A small number of spammers on larger, and poorly-administered, instances is far worse.

The collateral damage of instance-level countermeasures is high. And policing a large number of NEW user signups (and monitoring for sleepers and reputation harvesting) is expensive.

@angristan @Gargron

@angristan Yes, correct. However, it is not a defence against all the servers that are not using it!

@Gargron @angristan so essentially what you're stuck with is the problem of how to deal with *remote* spam?

well, that means whitelists or ocaps.

there is no other solution for push-based networks. email spam is just a thing we put up with. sms / phone spam is another thing that we can't really do anything about.

the only real way to *prevent* spam is to prevent unaudited and unapproved communications from being delivered to you... unfortunately. everything else is a half-measure.

@Gargron @angristan i don't particularly want the solution to be "whitelist only servers with approval-based signups" but this is the bare minimum required to be effective.

Imposing a low human cost is still preferable to imposing a high technical cost.

@Gargron @angristan If it absolutely has to be technical, these are the only things I see that can work:

- Require a cryptographic token before someone can POST a message to your inbox. (the ocap way; incompatible with the current fediverse + requires a larger transition)
- Implement a web-of-trust feature, where accounts with n>k degrees away are not trusted. (this prevents random spam by limiting interactions only to a certain degree of separation; perhaps following status = trust degree?)

@Gargron @angristan problem with the latter is it requires knowing at least the public keys of everyone a certain person follows. if a "following" collection isn't publicized, then it would require something like a "trusts" collection containing only the public keys and not the actor IDs themselves
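
A rough sketch of the web-of-trust check, assuming the follow graph is already available locally as a plain dict (which is exactly the hard part described above). The names `trust_distance` and `is_trusted` are hypothetical, and "following = one degree of trust" is just the guess from the earlier post.

```python
from collections import deque

def trust_distance(follows, start, target, max_depth):
    """Return hops from start to target along follow edges, or None if beyond max_depth."""
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        account, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand past the allowed degree of separation
        for followed in follows.get(account, ()):
            if followed == target:
                return depth + 1
            if followed not in seen:
                seen.add(followed)
                queue.append((followed, depth + 1))
    return None

def is_trusted(follows, recipient, sender, k=2):
    # A sender is trusted if the recipient reaches them within k follow hops.
    return trust_distance(follows, recipient, sender, k) is not None
```

For example, with `follows = {"me": ["a"], "a": ["b"], "b": ["c"]}`, account `b` is two hops from `me` and passes with `k=2`, while `c` does not.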

@trwnh @Gargron @angristan Well, the current wave of spam that I've seen was attached to existing interactions.
Unfortunately, the Fediverse has no controls on that level - sure, I can block that account, or I can report them (and hope the remote instance cares or isn't actively hostile) - but everyone else will still get to see it when they're looking at the affected thread on their instance. So spamming currently is super effective, at least until the originating account gets deleted.

@galaxis silencing an account locally should remove its replies from public view as well. but yes, this is why whitelisting or ocaps are the only effective counter -- they prevent the spam from occurring at all.
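
For the "cryptographic token before someone can POST to your inbox" idea, a very simplified sketch follows. This is not how ocaps are specified anywhere in ActivityPub; it only assumes the recipient hands an HMAC-based token to each sender it has approved and checks it on delivery. `InboxTokens` and its methods are invented names.

```python
import hashlib
import hmac
import secrets

class InboxTokens:
    def __init__(self):
        # Secret known only to the receiving side.
        self._secret = secrets.token_bytes(32)
        self._revoked = set()

    def grant(self, sender):
        """Issue a token the recipient hands to a sender it has approved."""
        return hmac.new(self._secret, sender.encode(), hashlib.sha256).hexdigest()

    def revoke(self, sender):
        self._revoked.add(sender)

    def accepts(self, sender, token):
        if sender in self._revoked:
            return False
        # Constant-time comparison of the expected and presented token.
        return hmac.compare_digest(self.grant(sender).encode(), token.encode())
```

A token granted to one sender is useless to another, and revoking a sender cuts off delivery without touching anyone else.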

@trwnh As a single-user instance, I have pretty good control over what appears on my instance and in its public web views. But that isn't the norm, and I still have no control over how the threads I've started look on remote instances.

@galaxis replies only federate out to that person's followers' instances (usually 0), and if you reply, then your followers' instances will fetch it (so don't reply).

aside from those two things, nothing should make it appear on other instances.

@angristan mstdn.io — only accepting new members if they can upload a video of themselves doing a backflip and then holding up a sign saying "mstdn.io 4 lyf"


check for duplicate mass messages by text matching over a period of time and if the same post happens more than x times in a row, mute the account for review and retroactively send a delete request to the duplicated posts.

@gargron you don't need a perfect match, something like 80% would do. even if the messages alternate, an account that shows a high enough match rate over a short period of time gets flagged
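
Something like this, as a sketch: fuzzy-match each new post against the same account's recent posts and flag the account once enough of them are roughly 80% similar. The `DuplicateWatcher` class and its default thresholds are assumptions; the similarity measure is Python's standard `difflib.SequenceMatcher`.

```python
from collections import defaultdict, deque
from difflib import SequenceMatcher

class DuplicateWatcher:
    def __init__(self, similarity=0.8, window_seconds=600, max_matches=3):
        self.similarity = similarity
        self.window_seconds = window_seconds
        self.max_matches = max_matches
        self.recent = defaultdict(deque)  # account -> deque of (timestamp, text)

    def observe(self, account, text, now):
        """Record a post; return True if the account should be flagged for review."""
        history = self.recent[account]
        # Forget posts that fell out of the time window.
        while history and now - history[0][0] > self.window_seconds:
            history.popleft()
        matches = sum(
            1 for _, earlier in history
            if SequenceMatcher(None, earlier, text).ratio() >= self.similarity
        )
        history.append((now, text))
        return matches >= self.max_matches
```

Because every post in the window is compared, alternating between two templates still accumulates matches. Randomizing large parts of each message would still get around it, as noted at the top of the thread.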

@Gargron there's a pleroma MRF module that drops any messages containing urls from any accounts the server is unfamiliar with which has worked pretty well for them. a real MRF in mastodon seems like the way forward.
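
The rule itself is tiny. A sketch in Python (not Pleroma's actual MRF API, which is Elixir; the `filter_incoming` function, the activity dict shape and the `known_accounts` set are assumptions for illustration):

```python
import re

URL = re.compile(r"https?://\S+", re.IGNORECASE)

def filter_incoming(activity, known_accounts):
    """Return the activity to accept it, or None to drop it."""
    author = activity["actor"]
    # Drop link-bearing posts from accounts this server has never seen.
    if author not in known_accounts and URL.search(activity.get("content", "")):
        return None
    return activity
```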

@uncletrunks Our spammer has stopped using URLs in messages. It's just text now

@Gargron Actually Discourse does some basic prevention here by having different user levels which are bound to rate-limits and the ability to post external links.

Not perfect of course, but maybe also worth thinking about.
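
Roughly what such user levels could look like, as a sketch. The level names, thresholds and limits below are invented and are not Discourse's real values.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrustLevel:
    name: str
    posts_per_hour: int
    may_post_links: bool

# Hypothetical levels; the numbers are placeholders.
LEVELS = [
    TrustLevel("new", 5, False),
    TrustLevel("basic", 30, True),
    TrustLevel("regular", 120, True),
]

def level_for(account_age_days, accepted_posts):
    if account_age_days >= 30 and accepted_posts >= 50:
        return LEVELS[2]
    if account_age_days >= 1 and accepted_posts >= 5:
        return LEVELS[1]
    return LEVELS[0]

def may_post(level, posts_in_last_hour, contains_link):
    # New accounts cannot post links; everyone is rate-limited by level.
    if contains_link and not level.may_post_links:
        return False
    return posts_in_last_hour < level.posts_per_hour
```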

In general I guess the answer is: Small instances, because that tends to increase the number of moderators per user. And hope that the community can take care of it.

@Gargron Have you looked into federating block lists? Possibly making instance wide blocks more transparent and allowing others to subscribe to them?

It's not an instant fix, but I don't even think this problem is NP complete. There are too many variables and opinions.

I could easily see a few instances maintaining these lists and everyone else just following them.

@rune Spam comes from innocent servers where the spammer signs up. This has little to do with domain blocks.

@Gargron Can instance wide bans only target entire domains?

@rune Trust me, you don't want a globally shared account blocklist. Nobody bothers to oversee those when copying/subscribing. Your name put on there by an enemy? That might actually ruin most of the network for you.

@Gargron I suppose it could focus a lot of power if everyone followed one list.

What if the list isn't curated by individuals, but rather created from blocks happening on the instance?
Anyone getting blocked by a significant % of the population on an instance for example. Then removed from the list later.

I still like the concept of federating because the times I do see spam it's usually on another instance and I'm sure lots of people already blocked it, but I still have to block it too.
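
A sketch of what "created from blocks happening on the instance" might mean in practice. The function name, the 5% threshold and the minimum blocker count are assumptions; recomputing the set periodically is what would take accounts back off the list later.

```python
def derived_blocklist(blocks_by_user, active_users, threshold=0.05, min_blockers=3):
    """blocks_by_user maps a local user to the set of remote accounts they blocked."""
    tally = {}
    for blocked in blocks_by_user.values():
        for account in blocked:
            tally[account] = tally.get(account, 0) + 1
    # Keep accounts blocked by enough people, both absolutely and as a share.
    return {
        account
        for account, count in tally.items()
        if count >= min_blockers and count / active_users >= threshold
    }
```

The minimum blocker count is there so that a couple of users on a tiny instance cannot put someone on the list by themselves, which is the abuse case raised above.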

@Gargron This was specifically the wilw problem on Twitter.

Corollary: blocklists of any sort need an appeals process.


I don't think it's realistic to think there can be a technical solution to completely eliminate spam. But raising its cost, which can be done by each of these solutions, is still worthwhile because they will make spamming harder.

@lunar The event that sparked this discussion is one dedicated person spamming the network. There is suspicion that the person is somehow keeping up with development discussions and changing tactics accordingly. Therefore, unless the solution can help against that type of spammer, it's kind of pointless. There are plenty of tools against more mundane spam.

@Gargron Ok. Forcing spammers to create sleeper accounts and sleeper instances would still help reduce the rate of abuses after previous instances and accounts have been blocked. Especially if the amount of messages they can send to people they haven't interacted with before is made proportional to their age. Or am I missing something?
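
To illustrate the age-proportional limit, a small sketch. The function names and all the numbers are placeholders, and it assumes the server can tell whether two accounts have interacted before.

```python
def cold_contact_quota(account_age_days, per_day=2, base=1, cap=50):
    """Messages per day an account may send to people it has never interacted with."""
    return min(cap, base + per_day * max(0, int(account_age_days)))

def allow_cold_message(account_age_days, cold_messages_today):
    return cold_messages_today < cold_contact_quota(account_age_days)
```

A brand-new account gets one unsolicited message per day here, a ten-day-old one gets 21, and the cap stops old sleeper accounts from getting unlimited reach.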

(We've also discussed shared blocklists already. I'm now convinced they come with a lot of problems.)

I wonder if such behavior should even be lumped in with "spam"? What you outline sounds particularly adversarial (but then again, of course all spam adapts to countermeasures)

One person gaming existing mechanisms definitely sounds more like a problem suited for (better?) moderation mechanisms to me.

Trying to combat a dedicated person with ML or regexes or anything like that sounds utterly hopeless to me.

@Gargron do what WTDWTF does

there's no secret magic to it

users require a published post to edit their profile
users with zero or negative upvotes require mod approval to post
registering an account from an IP that is already associated with an account requires admin approval

about a month into this policy the spammers completely gave up
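
Those three rules written out as a sketch. The `User` fields and function names are made up for the example; this is not the forum's actual code.

```python
from dataclasses import dataclass

@dataclass
class User:
    published_posts: int
    net_upvotes: int
    registration_ip: str

def can_edit_profile(user):
    # Profile edits unlock only after the first published post.
    return user.published_posts >= 1

def post_needs_mod_approval(user):
    # Zero or negative upvotes means posts go through a moderator.
    return user.net_upvotes <= 0

def signup_needs_admin_approval(ip, known_ips):
    # A second account from an already-seen IP waits for an admin.
    return ip in known_ips
```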

@ben We don't have a true emergency with spammers signing-up on a given instance. Approval-only registrations mode is a good tool for weeding those out. The problem we are experiencing is the spammer signing up on random open instances and sending spam remotely. Therefore, solutions based on IPs or captchas are not appropriate. Even if we release the perfect protection against local spammers, servers that haven't upgraded will continue to make this a problem.

@Gargron @ben We need to stop thinking about handling spam going out and start thinking about spam coming in, then. My instinct here is to read individual posts on their way in and handle spam detection at that level (likely on a separate lower-priority thread/task/whatever to prevent lagging out incoming posts).

@bclindner @Gargron @ben That imposes the cost on the victim of spam, which leads to an arms race. Better to try to impose the cost on the spammer.
Perhaps allow an instance to enable a setting that says if sending instance is n versions behind, reject messages?
Zombie instances would get gradually de-federated.
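
A sketch of that setting, assuming the receiving server somehow knows the sender's software version (NodeInfo might be one source, but that is an assumption) and keeps its own list of known releases. All function names here are hypothetical.

```python
def parse_version(text):
    # "2.7.1" -> (2, 7, 1); suffixes such as "+glitch" or "rc1" are ignored.
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if digits == "":
            break
        parts.append(int(digits))
    return tuple(parts)

def releases_behind(sender_version, known_releases):
    """Count known releases that are newer than the sender's version."""
    sender = parse_version(sender_version)
    return sum(1 for release in known_releases if parse_version(release) > sender)

def accept_from(sender_version, known_releases, max_behind=3):
    return releases_behind(sender_version, known_releases) <= max_behind
```

This only makes sense between servers running the same software, and a hostile server could simply lie about its version.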

@daedalus That might help as an intermediate step but currently our problem exists with no real spam filtering existing on the Mastodon system whatsoever save for some rate limiting.

I'm honestly glad nobody's set up an auto-spammer script. We might be well and truly fucked if that happens before we can implement proper spam detection systems.

@gargron @ben if an instance has open registration and refuses to update their service to deal with spam, I don't think it's unfair to defederate with them.

admins are responsible for the servers they run, and if those servers are the source of a disproportionate amount of spam, it doesn't matter whether the root cause is malice or simply inactivity from the admins. the end result is the same.

@ben @Gargron I hate reputation systems, you will just disgust new users if you do that.

@darckcrystale @Gargron we have never had a user with negative net reputation who was not banned for spamming

@gargron Surely a message containing tons of usernames and nothing else would be spam 99% of the time, though, so that doesn't sound like a problem for a Bayesian model.
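
That case does not even need a model. A simple heuristic sketch, with a made-up mention pattern and arbitrary thresholds:

```python
import re

# Hypothetical pattern for @user and @user@domain mentions.
MENTION = re.compile(r"@\w+(?:@[\w.-]+)?")

def mention_ratio(text):
    tokens = text.split()
    if not tokens:
        return 0.0
    mentions = sum(1 for token in tokens if MENTION.fullmatch(token.strip(",.;:!?")))
    return mentions / len(tokens)

def looks_like_mention_spam(text, min_mentions=5, ratio=0.8):
    # Flag only messages that are both mention-heavy and mostly mentions.
    count = len(MENTION.findall(text))
    return count >= min_mentions and mention_ratio(text) >= ratio
```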

@Gargron Honestly I'd pay to see someone do that, and then promptly ban them for it 😂

The more I think about email-like detection systems, the more I think that as long as the implementation is sound, it will help a lot with curbing common spam as the network grows and older instances and instances with lots of users amass bigger datasets and higher confidence levels on spam detection.

Imperfect? Yeah. An arms race? Yeah. But it's a start.

@Gargron I can only assume this must be how the early adopters of email must have felt when the system started getting big.

@gargron would it be possible to provide some kind of built in trainable spam detector for Mastodon, and have an opt-in option to share data with a global pool of training data? that way instances could collaborate to fight spam

@Gargron I'm going to hazard a guess that > 90% of spammers aren't going to try to be clever.

An idea for spam containing links 

@Gargron when I was working on online advertising, some of the ads we had in our inventory came from other big platforms, like Tubemogul. To know which brand to invoice for the ad display, we used the clickthrough (the url where the user is sent on click on the ad) to determine the domain of the url, after all the redirections.

You can use a similar system to list which domains are often shared in toots and blacklist them if the number of toots containing them is increasing at a specific time. Then, instance admins would receive a notification and could whitelist them if they want / if it's not spam.
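
A sketch of the counting part, comparing one time window of toots with the previous one. Following redirects to the final domain is left out here (it needs network access), and the function names and thresholds are assumptions.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

URL = re.compile(r"https?://\S+", re.IGNORECASE)

def domains_in(text):
    found = set()
    for url in URL.findall(text):
        try:
            host = urlsplit(url).hostname
        except ValueError:
            continue  # malformed URL, ignore it
        if host:
            found.add(host.lower())
    return found

def spiking_domains(previous_posts, current_posts, allowlist=(), min_count=5, growth=3.0):
    before = Counter(d for post in previous_posts for d in domains_in(post))
    now = Counter(d for post in current_posts for d in domains_in(post))
    # A domain spikes if it is frequent now and grew sharply since the last window.
    return {
        domain
        for domain, count in now.items()
        if domain not in allowlist
        and count >= min_count
        and count >= growth * max(1, before[domain])
    }
```

The `allowlist` argument stands in for the admin whitelisting step described above.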

@Gargron We don't need to start from scratch on each instance. Tools like rspamd and spamassassin come with pre-trained sets.

That means we can make community efforts to build a repository of spam messages in order to pre-train filters.

And of course it's not super effective, but if we really want spam protection, we have to start somewhere.

@sheogorath @gargron I think Mastodon/ActivityPub also benefits from being able to build a reputation on instances, rather than individual users. Older instances, or at least ones that don't remove spambots when reported, can accrue a poor reputation and their incoming messages can be more highly scrutinized. And instance admins will be incentivized to decrease spam (the same way email sending servers are) so that their legitimate outgoing messages won't be ignored.

@gargron and in a way Mastodon has already been practicing for this and has already established mechanisms. Servers that don't moderate for harassment are silenced or eventually defederated by other instances, based on each instance's tolerance for dealing with an instance that doesn't sufficiently handle reports.

@Gargron #wordpress is dealing with a lot of spam too, why don't you implement #akismet on #mastodon?
It's free for noncommercial use.

Another free alternative on wordpress is: github.com/pluginkollektiv/ant

@Gargron Dealing with spam is hard. I worked at an email anti-spam company for 10 years. Bayesian was one method, but was troublesome and needed some hand-holding; we often cleared the bayesian data because it started to false-positive a lot. Most of our effective spam blocking came from: greylisting, DNS blacklists, a spam rules database that we added to based on the spam we saw, and users reporting messages (which we had tooling to analyse to then create more rules).
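
Of those, greylisting is the easiest to sketch: turn away the first delivery from an unknown sender and accept it only when the sender retries after a delay. Whether this carries over to the fediverse depends on sending servers retrying failed deliveries, which is an assumption here. The `Greylist` class and its defaults are illustrative only.

```python
class Greylist:
    def __init__(self, min_delay=300, expiry=86400):
        self.min_delay = min_delay
        self.expiry = expiry
        self.first_seen = {}  # (sender, recipient) -> time of first attempt

    def check(self, sender, recipient, now):
        """Return 'accept' or 'retry_later' for a delivery attempt at time now (seconds)."""
        key = (sender, recipient)
        seen = self.first_seen.get(key)
        if seen is None or now - seen > self.expiry:
            # Unknown or stale pair: remember it and ask the sender to retry.
            self.first_seen[key] = now
            return "retry_later"
        if now - seen >= self.min_delay:
            return "accept"
        return "retry_later"
```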

@gargron Have you asked Evan Prodromou how he dealt with those issues in the antispam system he created and ran? While I'm sure spam and antispam are different today, some insights might be helpful.

@Gargron what about distributed model training, where models are trained on each instance and then shared between them? Kinda like Google does with assistant on phones ...
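
What this describes resembles federated averaging: each instance trains locally and shares only model weights, which are then combined weighted by how much data each instance had. A bare-bones sketch of the combining step, assuming every instance uses the same model shape (`federated_average` is a made-up name):

```python
def federated_average(updates):
    """updates: list of (weights, sample_count); returns the sample-weighted mean of the weights."""
    total = sum(count for _, count in updates)
    if total == 0:
        raise ValueError("no training samples")
    size = len(updates[0][0])
    return [
        sum(weights[i] * count for weights, count in updates) / total
        for i in range(size)
    ]
```

The sketch says nothing about who collects the updates or how to keep a hostile instance from submitting poisoned weights, which would be the hard parts.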

not an implementation 

@Gargron Maybe an idea to allow spam detector bots, and instances can indicate which ones they accept (including ones just run by the same org/person running the instance).

The bots could send toots giving probabilities, which might be shown in the UI, where users can respond, or cause spam to be removed if the certainty is high enough (or get the opinion of other spam detectors).

PMs/follower-only.. uh maybe it'll need a way to invite the spam bots.
