Started training a bot that tries to analyze and identify hate-speech on Twitter.
A few things became quite apparent after only a few days:
1. There are huge networks of (seemingly) fake accounts that like and retweet each other's posts. Someone is operating this at a _massive_ scale.
2. Reporting and banning fake accounts seems futile. You're fighting a hydra that spawns new accounts quicker than one can report them.
3. I'm feeling sick to my stomach just browsing through the logs.
What you really need is an account-reputation filter sitting in front of Twitter.
If an account is part of the hydra, just ignore it.
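A minimal sketch of what such a frontend filter could look like. Everything here is hypothetical: `FLAGGED_IDS`, the tweet dicts, and the field names are stand-ins, not Twitter's actual API shapes.

```python
# Hypothetical account-reputation filter: hide any tweet whose author
# (or amplifier) belongs to a known bot network. FLAGGED_IDS would be
# fed by whatever reputation system identifies the hydra.

FLAGGED_IDS = {"badbot1", "badbot2", "hydra_node_77"}

def reputation_filter(timeline, flagged=FLAGGED_IDS):
    """Yield only tweets not authored or retweeted by flagged accounts."""
    for tweet in timeline:
        involved = {tweet["author_id"], tweet.get("retweeted_by")}
        if involved & flagged:
            continue  # part of the hydra: just ignore it
        yield tweet

timeline = [
    {"author_id": "alice"},
    {"author_id": "badbot1"},
    {"author_id": "bob", "retweeted_by": "hydra_node_77"},
]
print([t["author_id"] for t in reputation_filter(timeline)])  # → ['alice']
```

The point is that the filtering happens client-side, on top of Twitter, so it works even if Twitter itself never acts on the reports.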
@fribbledom haha sweet awesome this is fine
@fribbledom I'm pretty sure many people are doing it.
@fribbledom It would be interesting if you bought some number of followers, then tracked those accounts to see if you can find a connection between those accounts and the networks that they are part of. Perhaps get some idea of how widespread this activity is.
At a guess: government organs trying to build a library of faces.
@fribbledom Some of those networks are for hire - somebody wants a particular narrative pushed or a trend, they'll do it for a couple of BTC. When I was still in the cryptocurrency community, I got propositioned once or twice a week to do that after HOPE that year. Mostly price manipulation. Their political activities are bought and paid for, and they don't much seem to care as long as it spends.
Some of those networks are definitely political only in nature. A bunch of bots during the '16 election were vulnerable to SQLi via DM. Lots of the table names were in Cyrillic. They were patched pretty fast after the back end databases were restored, and those are the ones you're seeing (that Lifeline filters out for me, because getting them taken down was futile).
Birbsite is lost.
It is particularly notable because it seems to be very badly done. There were a few cases where lots of accounts posted angry replies to unrelated topics... because misspelled words looked like the president's name.
@fribbledom maybe an analysis of what comes out of that bubble (i.e. who reads those bots except for the other bots) might answer that. (Mind that the posts may appear anywhere, not only to their followers.)
@fribbledom very possibly someone who gets paid to offer "a massively influential twitter network" like native bookface videos had "massive reach, you should totally move all your content to us." There are similar networks of fake sites full of ads being visited and clicked on by bots ... the owners get money, and the advertisers get some impressive-looking numbers to show their bosses.
@fribbledom there is a massive system of parasites and rent seekers in online advertisement, like tens or hundreds of billions of dollars a year ... "ad fraud" is one term for the kind where bots write sites and bots from the same server farm click the ads on them
@fribbledom The rich and powerful are historically the ones who benefit from keeping the poor focused on fighting each other rather than on looking at the root cause of their poverty. (Wealth being accumulated rather than distributed.)
@fribbledom Nobody good, I fear :(
@fribbledom whole country is being run via twitter now.. so no doubt the scale is massive.
@fribbledom I once had an idea of creating a framework for blocking accounts based on group reputation, but it was beyond my skill to implement. Things like this would have been great to feed into it.
I hope something good comes from it, though you risk PTSD getting there.
I'll certainly open-source the framework I've built for it and will publish the data I've gathered.
I also get the feeling there are a few lessons here for the Fediverse about how to protect against such scenarios.
@fribbledom like tumblr pornbots...
@fribbledom alternate solution to 2: fight for better wage distribution worldwide so that running a troll farm becomes way too expensive.
(at least until AI becomes good/cheap enough to automate the process entirely of course)
@fribbledom If possible, at some point in time, it would be good if you could publish the figures. I think the IT Cells run these farms, each person having multiple accounts. Just a thought, may not be worth it.
What we should worry about is if they can replicate the same here - the admins would have a full time job cleaning out the bots.
@fribbledom Thank you for your yeoman work so far!
1) I would like to know how to train a bot (or help yours work better) like this one. Where can I start?
2) You should write about your findings in detail. With anonymised data. This would be great info to be out in the public domain.
@fribbledom Sadly not surprised. Social media has jumped the shark, HARD.
@fribbledom This is becoming its own pretty major area of research. See people like Kate Starbird (University of Washington), Renee DiResta, or ... chap at the Institute for the Future. Cluster and network analysis is a large part of their work, and IMO ***VASTLY*** more productive than semantic content analysis.
Content is NOT king, channel is.
(And Sumner Redstone knew this, he was practicing misdirection.)
I've done some incidental analysis myself, small scale.
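To illustrate the network-over-content point: a toy sketch that flags accounts by *how* they amplify each other, without looking at what they post. The edge list is entirely made up; real input would be a retweet graph pulled from the API.

```python
# Toy network analysis: find tightly knit mutual-retweet clusters,
# regardless of content. Bot rings tend to reciprocate amplification
# far more than organic accounts do.
from collections import defaultdict

# (retweeter, retweeted) pairs — a ring of three bots plus two ordinary users
edges = [("b1", "b2"), ("b2", "b3"), ("b3", "b1"),
         ("b2", "b1"), ("b3", "b2"), ("b1", "b3"),
         ("alice", "bob")]

graph = defaultdict(set)
for a, b in edges:
    graph[a].add(b)

def reciprocity(node):
    """Fraction of a node's retweet targets that retweet it back."""
    out = graph[node]
    if not out:
        return 0.0
    return sum(1 for b in out if node in graph.get(b, set())) / len(out)

suspicious = sorted(n for n in list(graph) if reciprocity(n) > 0.8)
print(suspicious)  # → ['b1', 'b2', 'b3']
```

A 0.8 reciprocity threshold is an arbitrary choice for the example; the channel structure, not the text, is what gives the ring away.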
@fribbledom Sam Woolley is the IFTF guy. Doing Computational Propaganda.
Really good podcast episode here:
"The Future of Computational Propaganda" — episode: http://tracking.feedpress.it/link/14189/8181529 — audio: http://tracking.feedpress.it/link/14189/8181530/395101773-institute-for-the-future-the-future-of-computational-propaganda.mp3
@fribbledom Renee DiResta's got a few Wired pieces up, this is among the better:
@fribbledom Kate Starbird, who is all kinds of awesome, has a number of YouTube videos, mostly lecture presentations.
This one was posted just a week ago, and should have her most recent work. (I'm queueing it up for watching right now).
Earlier content that's quite good dates to, IIRC, either December 2018 or December 2017 (I think it's '18). Her weariness and the mind-warping nature of the material were a big element of that series.
@fribbledom Tom Scott's Royal Society talk isn't based on rigorous research, but talks around many of these issues as well, both in the main talk:
"There Is No Algorithm for Truth"
And the Q&A follow-up:
The views are based on experience and some research / discussion with experts.
Also, NB, I somewhat disagree with his premise. Our chief algorithm for truth is Bacon's Novum Organum, a/k/a the #ScientificMethod
There's no such thing as *the* scientific method: there are a bunch of methods that are used more or less well (as general guidelines) by a bunch of people who are more or less conscious that they're trying to follow a principled approach to finding answers 😉
This is in no way an algorithm for truth, it's just the next best thing we've found so far...
@silmathoron "Algorithm" in the loose sense of a process followed, rather than a precise mathematical process or rule.
Most of what's described as "algorithm" in AI / ML / online tools is really much more heuristics.
@dredmorbius I think heuristics might actually be a good qualifier for scientific methods 😂
It sounds like abuse prevention (and spam prevention, because that's a subset of abuse) could really benefit from access to the account graph. That would be both easier (because we own the code) and harder (because we're a federation of not fully trusted instances) on the fediverse. It'd be interesting to do, though!
@fribbledom Such is the battle that social networking sites face. It's time consuming to identify and terminate fake accounts. Automate it and you suspend innocent accounts because of false positives. I suspect there are a handful of servers somewhere that spawn these accounts. Take them down and you've solved a lot of the problem.
@fribbledom Also, besides obvious hate speech, what else is the bot hydra out there pushing the same way? I.e. are they targeting younger Dems with apathy?
It's the war against our minds.
@fribbledom Frightening! Tell someone please. Like heise, or so, at least that _some_ people are more aware
@fribbledom This seems like really important work. Thank you!
Nobody could pay me enough these days to get back on Twitter. Though I do remember that, right before I shut my account down (early 2017), I tried manually searching for and reporting the bot-rings and re-tweets and fake MAGA spewing garbage accounts that weren't actually associated with any actual human. It was (as you mentioned) overwhelming.
Not a fun job. Twitter really is nothing but a sick toy of the spoiled, though inexplicably angry billionaires trying to defeat democracy and overpower the hive mind ... all for shits and giggles.
@fribbledom is it possible to bring down this network of bots sending hate speech?
Only with the help of Twitter.
@fribbledom Report them, I'll close what I can.
@fribbledom I saw an interesting tactic for individuals to reasonably block all of this - identify a single spam account, programmatically block it, all its followers, and all the followers' followers. Repeat to whatever depth you want.
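That tactic is essentially a breadth-first traversal of the follower graph. A sketch under stated assumptions: `get_followers` is a stand-in for a real API call, and `FOLLOWERS` is made-up data.

```python
# "Block followers-of-followers" as breadth-first search with a depth cap.
from collections import deque

FOLLOWERS = {  # hypothetical data: account -> accounts following it
    "spambot": ["bot_a", "bot_b"],
    "bot_a": ["bot_c"],
    "bot_b": [],
    "bot_c": ["bot_d"],
}

def get_followers(account):
    """Stand-in for a followers lookup against the real API."""
    return FOLLOWERS.get(account, [])

def block_network(seed, max_depth=2):
    """Collect the seed account plus its followers out to max_depth hops."""
    blocked = {seed}
    queue = deque([(seed, 0)])
    while queue:
        account, depth = queue.popleft()
        if depth == max_depth:
            continue  # don't expand past the chosen depth
        for follower in get_followers(account):
            if follower not in blocked:
                blocked.add(follower)
                queue.append((follower, depth + 1))
    return blocked

print(sorted(block_network("spambot", max_depth=2)))
# → ['bot_a', 'bot_b', 'bot_c', 'spambot']
```

The depth cap matters: each extra hop grows the set multiplicatively and also raises the odds of sweeping in innocent accounts.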
@fribbledom Do you have code available? I'd like to hack on this and build tools to automate reporting hate speech accounts at the scale or greater than they can re-spawn. This is a really cool idea with lots of great applications.
Yeah, I will open-source this!
That being said, I kinda flinch from doing automated reports. That's bound to also hit someone who doesn't deserve it.
@fribbledom yeah, but reporting gets double-checked on Twitter's end, so hopefully not too damaging. Could also build an interface to help train a neural net manually to get better at identifying hate speech. If Twitter is going to insist on being a site with millions of users, I lean toward it being their responsibility to make the call on what to remove.
@fribbledom do you have a repo of the logs or a git of the bot?
@fribbledom what other patterns did you observe?
@fribbledom Thanks for the labor on this front. Get your emotional decontamination on whenever you can, those fuckers emit some seriously harmful memetic radiation without even thinking.
I run a myBB bulletin board for a user base of something like a mere dozen users. Every day, hundreds of fake spam accounts assault it, using a variety of techniques to get past the filters.
I can't imagine myBB boards are a lucrative target. The cost to run these account farms must be minuscule.
So yes, I believe the massive scale. It must cost less than a Disney Plus subscription to run. And when you throw subscriptions and patronage in the mix... 😬