

BigCode: Open and responsible research on code-generating AI systems

Hosted by Harm de Vries and Leandro von Werra.


While code-generating AI systems like Copilot have emerged as powerful tools for professional developers, there are growing legal and ethical concerns around the development of these models. Questions have been raised as to whether these AI models respect current free software licenses, both in model training and in generation, and what the social impact of this technology is on the free software community. The BigCode project is a scientific collaboration (with over 350 participants) working on the responsible development of code-generating AI systems. In this talk, we discuss how we navigate the legal, ethical, and governance aspects of developing these models, including how we developed a permissively licensed code dataset, how we give developers the option to remove their code from the training data, how we redact personally identifiable information (PII), and how we attribute generated programs to the original code snippets.


Harm de Vries is a staff research scientist at ServiceNow.

Leandro von Werra is a machine learning engineer at Hugging Face.

Video version


10 months ago

Tagged with

charting-the-course · libreplanet-conference · lp2023 · LibrePlanet · LibrePlanet 2023 audio · LibrePlanet 2023 · FSF · LibrePlanet audio · audio


CC BY-SA 4.0