
@mcc So developers will stop sharing information on #StackOverflow and future #Copilot and friends will be forever stuck in the past, answering questions about historically relevant frameworks and languages.
#LLM #StuckOverflow

@chris Yeah. But for this to be true, we need a Stack Overflow replacement. And when Reddit went evil, the move to Lemmy doesn't seem to have succeeded as well as the move from Twitter to Mastodon.

@mcc IIRC Mastodon is older than Lemmy and the current move to Mastodon/Fedi happened in multiple waves, so it may be too early for higher expectations.
For stackoverflow I expect some degradation of quality since they accept “AI” generated content. This may additionally frustrate high quality authors and motivate them to leave. We’ll see.
What would a federated stack overflow look like if we were to invent it?

@chris I don't know. It's an interesting question because Stack Overflow is inherently more search-focused than Lemmy or Mastodon.

A good model for a distributed/ownerless SO might wind up looking more like bluesky than mastodon.

@chris And, of course, there's the weird element that the SO license *already* does not permit AI on a facial reading, and a distributed SO would probably be *easier* to scrape than the centralized one. So you're not actually preventing AI exploitation, you're only punishing one corporation (SO) for the AI bait-and-switch.

@chris … which is enough for ME to do a bunch of work and change my usage patterns, but may not be for other people.

@mcc I personally see less problem in scraping a federated pool of knowledge but I absolutely hate that stackoverflow now owns this knowledge and can keep people from using it but sell “AI” as a service to them.

@chris I suppose one thing to consider is that if a federated pool of knowledge is CC-BY-SA, then we only need a court ruling that OpenAI violates CC-BY-SA and the federated pool becomes AI-safe. Whereas SO can change (or already has changed) the TOS so they own the rights to relicense all content.

…but of course, CC-BY-SA is also incredibly inconvenient for a SO clone because everyone will generally want to copypaste sample code!

@mcc So we’d be looking for Schrödinger's license, allowing and forbidding closed derivative works at the same time :-)

(I have a feeling that a lot of licenses only work because nobody takes a close look at how the licensed works are used.)

@chris If I were actually trying to create a stackoverflow clone, I'd have the default license be something like "all code blocks are CC0 but all human text outside the code blocks is CC-BY-SA". That would I think match the unspoken expectations both contributors and readers have.
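A minimal sketch of how that split-license default could be applied mechanically, assuming posts are written in Markdown with fenced code blocks. The function name `split_by_license` and the fence convention are illustrative assumptions, not something proposed in the thread:

```python
from typing import List, Tuple

FENCE = "`" * 3  # Markdown code fence marker (assumed post format)


def split_by_license(post: str) -> List[Tuple[str, str]]:
    """Return (license, text) segments for a Markdown-style post.

    Text inside fenced code blocks is tagged CC0; all other text is
    tagged CC-BY-SA, following the default suggested above.
    """
    segments: List[Tuple[str, str]] = []
    buffer: List[str] = []
    in_code = False

    def flush(license_name: str) -> None:
        # Emit the buffered lines as one segment, skipping empty ones.
        text = "\n".join(buffer).strip("\n")
        if text.strip():
            segments.append((license_name, text))
        buffer.clear()

    for line in post.splitlines():
        if line.strip().startswith(FENCE):
            # A fence line ends the current segment and toggles the mode.
            flush("CC0" if in_code else "CC-BY-SA")
            in_code = not in_code
        else:
            buffer.append(line)

    # An unterminated fence is treated as code up to the end of the post.
    flush("CC0" if in_code else "CC-BY-SA")
    return segments
```

Inline code spans and indented code blocks are not handled here; a real implementation would have to decide which license those fall under.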

@chris I *am* worried about the effect "AI" scraping is gonna have on copyleft in general, tho. I think people have for many years released copyleft on the rule of "hey, why not" and now the answer is "bc AI". (More thoughts: mastodon.social/@mcc/112209121 ) Like, my proposed license in the last post would be very AI-friendly.

@mcc @chris this would be a situation in which the FSF could have a beneficial effect. Bring a test case against OpenAI for infringement against the codebases FSF fully owns. It'd answer the question one way or the other.

@mcc That seems like a good and very straightforward approach; it would at least meet my expectations exactly.

@mcc I don't think contract law has (yet) gotten to the stage where a site can change a ToS and make it retroactively apply to people who no longer use the site, making their contributions from many years ago retroactively no longer CC-BY-SA.

@mcc @chris practically speaking, duplicating a single CC-BY-SA code snippet is never going to be actionable, because the damages payable would be minuscule. There's also a strong argument to be made that a whole software package is not a derivative work of a small snippet, although I wouldn't want to be the one paying for that judgement.

@mcc @chris the degree to which that means that an AI model, created from millions of CC-BY-SA fragments, but also from billions of other data points, might also not be a derivative work of any of them, is another interesting question I'm glad I'm not paying to answer.

@womble @chris As a person putting up sample code, I want that sample code to be useful to other people. I think the license should be picked to maximize that utility. The way I see it, one of the ways to maximize the utility is to make the license *unambiguous*. If the recipient has to *wonder* whether they can use the code, I am causing them unnecessary problems even if they eventually do use the code.

@mcc there is that. Finding a licence wording that explicitly allows the "good" uses, without allowing the "bad" uses, that doesn't have a billion unintended consequences, is probably something beyond human capacity. Quick, get an AI to write it!

@womble I actually do generally use CC0 these days if it's meant to be sample code rather than "open source".

@chris @mcc SO publishes database dumps so we could all make a fork and start from there with something more libre

@hey Good idea!
I was wondering if they still did; I expected that they had already stopped doing this.
I had this tool that indexed local copies of SO for referencing but I keep forgetting to reinstall it and update the database.
Thanks for reminding me!

@chris they still do (archive.org/details/stackexcha) and still out of their own infrastructure.

IIRC they made Stack Exchange in response to the enshittification of another Q&A service, and when they designed it they promised to put the content under an open license and make it publicly available, so that once they go evil people can move on somewhere else, taking the content with them.

Which, I guess, is the direction things might be heading.

Internet Archive: Stack Exchange Data Dump (Stack Exchange, Inc.). This is an anonymized dump of all user-contributed content on the Stack Exchange network. Each site is formatted as a separate archive consisting of XML files...
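
For anyone wanting to start a fork from those XML files, here is a rough sketch of streaming questions out of one of them. It assumes a posts file whose root element contains `<row>` elements with `Id`, `PostTypeId` and `Title` attributes, and that `PostTypeId="1"` marks a question. That layout is an assumption about the dump, not something stated in the thread, and `iter_questions` is a hypothetical helper name:

```python
import io
import xml.etree.ElementTree as ET
from typing import Dict, Iterator

# Tiny stand-in for a real posts file; the attribute names are assumed.
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<posts>
  <row Id="1" PostTypeId="1" Title="How do I parse XML?" Body="..." />
  <row Id="2" PostTypeId="2" ParentId="1" Body="Use a streaming parser." />
</posts>"""


def iter_questions(source) -> Iterator[Dict[str, str]]:
    """Yield id and title for each question row in a posts XML file.

    `source` may be a file path or a binary file-like object. Rows are
    streamed with iterparse so a large file is not loaded all at once.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "row" and elem.get("PostTypeId") == "1":
            yield {"id": elem.get("Id", ""), "title": elem.get("Title", "")}
        # Drop the element's attributes and children once processed.
        elem.clear()


if __name__ == "__main__":
    for question in iter_questions(io.BytesIO(SAMPLE)):
        print(question["id"], question["title"])
```

Check the schema description shipped with the dump before relying on any of these attribute names.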