Aaron Patterson ✅

This might be a dumb question, but I can't seem to find the answer. What is the license of code generated by Copilot? Is it owned by GitHub, or the user? For example: Bison is GPL, but the parsers it generates aren't (and it specifically says so). Would be nice if Copilot was specific about ownership, but I can't find any info

The reason I'm curious about this is because I'd like to know whether or not it's OK to accept OSS contributions from people that used Copilot. Does the author actually have permission to give me the code they sent? (Also I wish I didn't have to think about this)

@tenderlove what I expect to be the official response from Copilot:

@tenderlove you won't find any because Copilot has been repeatedly shown to generate code based on multiple licenses without crediting them. sometimes it just copies one and applies the wrong license. sometimes it mushes them together.

it's legally dangerous to use anything that it generates in anything other than a private personal project because there's a non-trivial chance you're violating 1 or more licenses which you are unaware of.

@tenderlove I have also wondered a lot about this topic. The fact that they removed copyright and author info (while embarrassingly containing some earlier on) makes me really wonder if this is a legal minefield.

@enebo seems like there must be some kind of fair use involved depending on the code, but idk. I'm not a lawyer and honestly I don't even want to think about this problem

@tenderlove If you are generating rspec tests from a prompt, I doubt it would lead to being accused of infringement. But with a prompt like "Give me a dtoa implementation in Java", I would be very worried.

@enebo @tenderlove as someone who translated a lot of code and was still super careful about noting provenance and getting author consent for license changes even for a translation of the original code, Copilot is mind boggling to me.

Basically “code laundering” in a way that abstracts away the original copyright

@tenderlove I think it's all undefined until it's tested in court.

@jordan just accept patches until I end up in court 🤣

@tenderlove It'd be ironic if a GitHub employee used their Hyatt Legal Plan lawyer to sue a 3rd party for incorporating their code that Copilot regurgitated

@tenderlove it’s whatever you license it as, according to the faqs (github.com/features/copilot):

“GitHub does not own the suggestions GitHub Copilot provides to you. You are responsible for the code you write with GitHub Copilot’s help.”

obviously that leaves out all the potential legal trouble/concerns if the code used for training the model included GPL, etc. - but i guess courts will have to decide those issues.

@srecnig seems like as a maintainer it's probably safe to merge someone's code if they use Copilot? At least, it seems like I wouldn't be held responsible (I think??)

@tenderlove i’m not even close to being a lawyer, so i will just not say anything 🙊

@tenderlove haha, maybe i should’ve only quoted the faqs in my first reply, and not add any interpretation 😬

@tenderlove I can definitely see how there would be concern, especially given that Copilot has sometimes reproduced lines of code verbatim from its training material.

But how safe is it really to accept *any* contributions? Humans are definitely capable of copying code verbatim, taking a snippet and adapting it, or writing code that's structurally similar to things we've seen.

@tenderlove »The code, functions, and other output returned to you by GitHub Copilot are called “Suggestions.” GitHub does not claim any rights in Suggestions, and you retain ownership of and responsibility for Your Code, including Suggestions you include in Your Code.«

docs.github.com/en/site-policy

@lumaxis @tenderlove They can say that, just like a book can say that lending or re-sale are forbidden. Doesn’t make it legally valid. US Copyright office guidance: govinfo.gov/content/pkg/FR-202

@lumaxis Considering that the US Copyright Office published “Copyright Registration Guidance: Works Containing Material Generated by Artificial Intelligence” on 2023-03-16, the docs about how the legal code functions may not be in sync with the current implementation ;-)

@josephholsten I’m not a lawyer either, but as I understand it, the guidance reiterates and clarifies practices and interpretations that already existed in similar form, especially “only humans can produce copyrightable material”.
And I'm still not sure how that relates to the previous discussion. If anything, wouldn't that doc reinforce the statement in GitHub's documentation?

@dataKnightmare #FYI This applies to Copilot but also to the whole merry GPT family: under which license should the generated code be considered? 😅

@olistik
it's a mess, since the whole gang hoovered up everything it could without giving a damn about licenses. Now they tell you the rights aren't theirs, which is like saying that if you claim the rights are yours, you are taking on responsibility for the fact that they may have violated licenses left and right.
not to mention that, since it's an LLM, there is no guarantee about the quality of the code.
in my opinion that's reason enough to avoid this garbage like the plague.

@dataKnightmare not only are the rights not theirs, they can't even tell you what the licenses are.
They could easily have violated a whole host of licenses. 😅

@tenderlove The answer is a moving target. For example the US only ruled last week that code generated by an AI cannot be copyrighted.

@tenderlove I've been asking myself this ever since GH started sending me unsolicited PRs with dependency updates...

@petko @tenderlove Dependabot isn't AI generated though, it's all programmatically generated without any sort of training model

@BobbyMcWho, yes, you would think it would be clear for such trivial programmatically generated contributions... Do I assume the contribution is licensed under my repo's license? Who do I add as a contributor holding copyright over the contribution? GitHub? Microsoft? The authors of Dependabot?

And these are questions for the trivial case, let alone for the ML model that rips off all open source code on GH... // @tenderlove

@tenderlove this is all pending a bunch of court cases, and it is unclear what the rulings will be, e.g. petapixel.com/2023/02/07/getty. Once some precedent is set in the legal system, the industry will have to adapt. If you lift 21% of the code you wrote off a method in a GPL-licensed repo, where do you stand? What if you only lift the concept? It's a brave new world.

@tenderlove Given there's a lawsuit about whether Copilot-produced code is GPL if it's trained on GPL code, I suspect the silence is deliberate until these sorts of things are worked out…