enh(blog): Add blog post on generative AI peer review policy #734
Conversation
For some contributors, these tools make open source more accessible.

## Challenges we must address
I know you mention it above in the human oversight section, but maybe it's important to add another section here explaining that LLMs frequently perform programming tasks incorrectly (especially those that are slightly more complex!).
This was an interesting report from last year on this https://arxiv.org/html/2407.06153v1
And this is a great study on how LLMs can actually slow down developers https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/
Full paper here https://arxiv.org/abs/2507.09089
@crhea93, I am very open to adding a new section, and if you'd like to suggest the changes or write a few sentences/paragraph with links and resources, I welcome that too 👐🏻 This is up to you, but it's a great suggestion. They are definitely frequently wrong in their suggestions, will use dated dependencies, dated and/or wrong approaches, etc.
I knew suggesting this could be dangerous !! 😝😝😝
I'll write up a proposed section :)
Incorrectness of LLMs and Misleading Time Benefits
Although it is commonly stated that LLMs help improve the productivity of experienced developers, recent scientific explorations of this hypothesis indicate the contrary (see https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/ for an excellent discussion of this). What's more, the responses of LLMs for complex coding tasks tend to be incorrect (e.g., https://arxiv.org/html/2407.06153v1). Therefore, it is crucial that, if an LLM is used to help produce code, the correctness of the code is evaluated separately from the LLM.
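To make that last point concrete, here is a minimal sketch of what evaluating correctness separately from the LLM can look like in practice. The `moving_average` function and its expected values are hypothetical, invented for this example and not part of the proposed policy text; the idea is that the tests encode expectations the human author worked out independently, so they fail if the generated code is wrong, regardless of what produced it.

```python
# Hypothetical example: a small helper an LLM might have drafted, verified by
# tests whose expected values were computed by hand, independently of the model.
import numpy as np
import pytest


def moving_average(values, window):
    """Return the simple moving average of `values` over `window` points."""
    values = np.asarray(values, dtype=float)
    if window < 1 or window > values.size:
        raise ValueError("window must be between 1 and len(values)")
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")


def test_moving_average_matches_hand_computed_values():
    # Expected values worked out by hand: (1+2)/2, (2+3)/2, (3+4)/2.
    result = moving_average([1.0, 2.0, 3.0, 4.0], window=2)
    assert np.allclose(result, [1.5, 2.5, 3.5])


def test_moving_average_rejects_bad_window():
    # Edge cases the author chose deliberately, not ones the model proposed.
    with pytest.raises(ValueError):
        moving_average([1.0, 2.0], window=0)
```

Running `pytest` on a file like this gives a pass/fail signal that does not depend on the LLM's own assessment of its output.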
## Generative AI meets scientific open source

It has been suggested that for some developers, using AI tools for tasks can increase efficiency by as much as 55%. But in open source scientific software, speed isn't everything: transparency, quality, and community trust matter just as much. So do the ethical questions these tools raise.
I wouldn't air such conjecture without citation (and believing in the integrity of that work). It could say that studies and perception are mixed and perhaps that perception of efficacy appears to exceed reality (citing the METR study).
## Our Approach: Transparency and Disclosure

We know that people will continue to use LLMs. We also know they can meaningfully increase productivity and lower barriers to contribution for some. We also know that there are significant ethical, societal, and other challenges that come with the development and use of LLMs.
Conjectures about the future depend greatly on legal outcomes and how society processes this moment. I would not say it's inevitable, but perhaps that pyOpenSci's policy will not on its own change the behavior of the community, especially those who aren't thinking about pyOpenSci.
### Licensing awareness

LLMs may be trained on mixed-license corpora. Outputs can create **license compatibility questions**, especially when your package uses a permissive license (MIT/BSD-3).
LLM output does not comply with the license of the input package, even when the input is permissively licensed (MIT, CC-BY), because it fails to comply with the attribution requirement of the license. The license of the package incorporating LLM output does not matter.
License compatibility only matters after an egregious violation is discovered: if the licenses are compatible, one could become compliant merely by adding attribution.
* Acknowledge potential license ambiguity in your disclosure.
* Avoid pasting verbatim outputs that resemble known copyrighted code.
How would someone determine this? Due diligence is to never use the output of an LLM directly, but that isn't how LLM-based coding products are marketed or used.
## Benefits and opportunities

LLMs are already helping developers:
Suggested change:
LLMs are already perceived as helping developers:
Many maintainers/developers would claim they are misleading and cause more harm than good, even for these tasks.
* In some cases, simplifying language barriers for participants in open source around the world
* Speeding up everyday workflows

For some contributors, these tools make open source more accessible.
Suggested change:
Some contributors perceive these products as making open source more accessible.
---
layout: single
title: "Navigating LLMs in Open Source: pyOpenSci's New Peer Review Policy"
excerpt: "Generative AI tools are making is easier to generate large amounts of code which in some cases is causing a strain on volunteer peer review programs like ours. Learn about pyOpenSci's policy on generative AI in peer review in this blog post."
excerpt: "Generative AI tools are making is easier to generate large amounts of code which in some cases is causing a strain on volunteer peer review programs like ours. Learn about pyOpenSci's policy on generative AI in peer review in this blog post." | |
excerpt: "Generative AI products are reducing the effort and skill necessary to generate large amounts of code, which in some cases is causing a strain on volunteer peer review programs like ours. Learn about pyOpenSci's policy on generative AI in peer review in this blog post." |
Calling it a "tool" endorses some fitness for purpose, which is debatable.
### Ethical and legal complexities

LLMs are often trained on copyrighted or licensed material. Outputs may create conflicts when used in projects under different licenses. They can also reflect extractive practices, like data colonialism, and disproportionately harm underserved communities.
Suggested change:
LLMs are often trained on copyrighted material with varying (or no) licenses. Outputs may constitute copyright infringement and/or ethical violations such as plagiarism. They can also reflect extractive practices, like data colonialism, and disproportionately harm underserved communities.
The licenses do not need to be different to be a license violation (and copyright infringement and/or plagiarism).
### Environmental impacts

Training and running LLMs [requires massive energy consumption](https://www.technologyreview.com/2019/06/06/239031/training-a-single-ai-model-can-emit-as-much-carbon-as-five-cars-in-their-lifetimes/), raising sustainability concerns that sit uncomfortably alongside much of the scientific research our community supports.
The numbers are way higher now, and training is only one component.
## What you can do now

* **Be transparent.** Disclose LLM use in your README and modules.
* **Be accountable.** Thoroughly review, test, and edit AI-assisted code.
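As a rough illustration of the first bullet, a module-level disclosure might look something like the following. The module name, tool description, and exact wording here are assumptions made up for this sketch, not required text from the policy:

```python
"""Utilities for cleaning spectral data.

Note on generative AI use: portions of the parsing helpers in this module were
drafted with the assistance of an LLM-based coding assistant. All code was
reviewed, tested, and edited by the maintainers before merging; see
tests/test_cleaning.py for the accompanying tests.
"""
```

A similar note in the README gives reviewers the same information at the repository level.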
What does it mean to be "accountable"? What sort of lapses would constitute misconduct and what are the consequences? (E.g., a lawyer can lose their job and be disbarred when their use of LLMs undermine the integrity of the court.)
Our community’s expectation is simple: **be open about it**.
Our community’s expectation is simple: be open about any AI usage.
* Run tests and confirm correctness.
* Check for security and quality issues.
* Ensure style, readability, and clear docstrings.
Ensure style, readability, and clear, concise docstrings.
Depending on the AI tool, generated docstrings can sometimes be overly verbose without adding meaningful understanding.
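A small sketch of the kind of trimming described here, using a made-up function (nothing below comes from the blog post itself), might look like this:

```python
# Verbose, generated-sounding docstring: restates the signature without adding insight.
def scale_verbose(values, factor):
    """Scale values by factor.

    This function takes an input called `values`, which is a list of values,
    and an input called `factor`, which is a factor, and multiplies every
    value in `values` by `factor`, returning a new list of scaled values.
    """
    return [v * factor for v in values]


# Concise alternative: states what the caller actually needs to know.
def scale(values, factor):
    """Return a new list with each element of `values` multiplied by `factor`."""
    return [v * factor for v in values]
```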
This blog post outlines pyOpenSci's new peer review policy regarding the use of generative AI tools in scientific software, emphasizing transparency, ethical considerations, and the importance of human oversight in the review process.
It is codeveloped by the pyOpenSci community and relates to a discussion here:
pyOpenSci/software-peer-review#331