Copilot Copyright Debate

This cluster covers debates about whether GitHub Copilot violates software licenses such as the GPL by training on public repositories and reproducing code snippets verbatim, along with related questions of fair use and copyright infringement.

📉 Falling 0.2x Legal
Comments: 4,013
Years Active: 11
Top Authors: 5
Topic ID: #5433

Activity Over Time

2016: 1
2017: 1
2018: 6
2019: 3
2020: 3
2021: 1,043
2022: 1,524
2023: 751
2024: 283
2025: 336
2026: 62

Keywords

EULA MS AI ycombinator.com OSS GPL github.blog wikipedia.org StackOverflouw CoPilot copilot code github license copyright gpl fair use copyrighted licenses source

Sample Comments

samtheprogram Jul 2, 2021

Anything posted to Stack Overflow has a specific (Creative Commons IIRC) license associated with it. The same is not true of GitHub Copilot, and in fact their FAQ doesn’t specify a license at all, probably because they are technically unable to since it is trained on a wide variety of code from differing licenses (and code not written by a human is currently a grey area for copyright). The FAQ simply says to use it at your own risk.

noselasd Apr 21, 2023

How much of this is due to someone ripping off GPL code and stuffing it in a repo under a different license that got fed to Copilot training?

GoblinSlayer Jun 23, 2022

Copilot is not a competing solution, it's a knowledge base about text, like an encyclopedia. As for the snippets it produces, those might be copyrightable if they pass the copyrightability threshold. If it provides you kilobytes of text at once, that would be bad. A middle ground would be Copilot tracking how much code under incompatible licenses it pasted and stopping at, say, 200 LOC.

eloisius May 8, 2023

Copilot was not trained only on permissively licensed code. It's trained on all public repos, even if the code is copyrighted (which is the default absent a more permissive license).

ThePhysicist Jun 23, 2022

I mean, I'm not an expert, but it's a valid point, as people share code under a given license, and as far as I'm aware Copilot does not make this knowledge available. Nothing to do with the fact that Copilot is an amazing technological achievement. If I, as a human, go to a public repository on GitHub and copy/paste a non-trivial 200-line code snippet into my proprietary code base, I have to abide by the license of that original code, even if I slightly modify it. I don't

FeepingCreature Jul 20, 2021

Copilot generally (excepting rare cases where it produces snippets verbatim) does not steal code. The GPL restricts distribution, not usage. And (to my knowledge) no open-source license restricts learning from code. I cannot see anyone who doesn't want others to learn from their code ever release code as open-source.

carom Oct 18, 2022

There is no obvious violation or obvious non-violation. It is a matter of fair use, and it will be settled in court. Using copyrighted code and not open sourcing the derivative work (Copilot's model) may very well be a violation.

alerighi Jul 9, 2021

Can you copy 10 lines of code from an open source project into your software? Yes you can, it's considered fair use. Nobody will ever sue for that. If it were otherwise, websites like Stack Overflow, where developers post code probably taken from projects with restrictive licenses and other developers copy it into their projects, would not exist. Copilot will not write an entire software module, it will provide you with snippets. I see using GPL code for training as fair use. If a developer reads the source co

severino Jul 12, 2021

Can't they just say the code was randomly generated by Copilot so copyright and stuff like that doesn't apply?

dundarious Jul 4, 2021

Without using Copilot, I can search the Internet for help, find some source-available code, take it verbatim and place it in my project. For most licenses, this is a violation if I ever externally ship a product from this code, and for some licenses, it is even a violation if I only expose it as a service over the Internet. A succinct way to rephrase this might be "theft". Using Copilot, I can do the same (GitHub acknowledges that, according to their testing, verbatim reproduction happ