Why Generative AI Guidance is Essential to Contributors of Open Source 

Roman Shaposhnik, VP of Legal Affairs at the ASF

Generative AI is rapidly evolving and changing the software industry as we know it, and the open source community must adapt accordingly. Generative AI has the potential to automate significant portions of code creation, which could lead to faster development cycles, increased productivity, and reduced human error. It could also provide novel solutions to longstanding problems, encourage experimentation, and help developers explore unconventional approaches within open source projects. However, it also raises significant concerns about the originality of generated code and carries licensing implications for open source projects.

Open source software typically operates under specific licenses that dictate how the code can be used, modified, and distributed. Every contributor to open source is doing so by either exercising their personal right to license out the fruits of their own labor or by donating something that they didn’t create themselves but have the right to share. For example, any committer contributing to one of the Apache Software Foundation’s (ASF) 300+ active projects is doing so only after explicitly signing an Individual Contributor License Agreement (ICLA) that says, “You represent that each of Your Contributions is Your original creation (see section 7 for submissions on behalf of others).”

While what constitutes “your original creation” is easy enough to understand if you literally just typed a page of code in a fit of inspiration, the line quickly becomes blurred with AI-generated code. Even before the advent of generative AI, open source developers were struggling with questions like “what if my IDE provided a code completion hint that I didn’t type – was that line of code really my original creation?” or “what if I picked up a trivial-looking one-liner for a particularly gnarly issue from a StackOverflow thread – am I allowed to include that?”

Determining the rightful ownership of code and giving it proper attribution is critical to open source projects. Generative AI not only makes intellectual property management more confusing, but it also challenges the traditional understanding of authorship altogether. For example, generative AI might create code that closely resembles existing open source projects, blurring the line between innovation and derivation. Generative AI also raises concerns over quality and reliability that influence how contributions are evaluated and integrated; this stands to test the trust, collaboration, and shared goals within an open source community.

Intellectual property ramifications are perhaps the most immediate risk generative AI poses to open source software, but as AI’s role in code generation expands, legal and ethical questions around ownership, bias, and accountability will only spur more heated debate.

As a result, clear communication and a spirit of collaboration are more important than ever in order to harness the potential of generative AI without disturbing the ethos of open source software development. The ASF, for example, has created dedicated Generative Tooling Guidance to help contributors to its software projects who plan to use any generative tools (AI-powered or not). But as with any open source effort, it takes a village of collaborators to find common terms and policies that serve the open source development community at large. The conversation about generative AI is evolving, and more input from the open source community is needed to ensure the guidance is pragmatic and beneficial to the industry at large.