
Conversation

@JoelSpeed (Contributor)

This adds an initial AGENTS.md configuration for how to perform API reviews via an AI agent such as Claude.

It also implements a /api-review command that can be used locally to review PRs, for anyone who has Claude installed.

I hope we can get folks using this to self-help, but my long-term goal is to integrate this into coderabbit or some other review tool that can post the comments directly on the PR.

As an example of the output, see #2488 (comment)

Current highlights of its instructions (see the sketch after the list):

  • Explaining valid values
  • Ensuring markers are documented in the godoc
  • Identifying cross-field validations that aren't enforced
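
To make these concrete, here is a hypothetical field (not from this PR) showing the conventions the first two bullets target: valid values spelled out in the godoc, matching the enum marker that enforces them.

```go
package example

// OperatorSpec is a hypothetical type, included only to illustrate the conventions above.
type OperatorSpec struct {
	// logLevel sets the verbosity of logs produced by the operator.
	// Valid values are "Normal", "Debug" and "Trace".
	// When omitted, the platform chooses a default, which is currently "Normal".
	// +kubebuilder:validation:Enum=Normal;Debug;Trace
	// +optional
	LogLevel string `json:"logLevel,omitempty"`
}
```

A review following these rules would, for example, flag the enum marker above if the godoc did not list the valid values.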


openshift-ci bot commented Sep 19, 2025

Hello @JoelSpeed! Some important instructions when contributing to openshift/api:
API design plays an important part in the user experience of OpenShift and as such API PRs are subject to a high level of scrutiny to ensure they follow our best practices. If you haven't already done so, please review the OpenShift API Conventions and ensure that your proposed changes are compliant. Following these conventions will help expedite the API review process for your PR.

@openshift-ci bot added the size/L label (denotes a PR that changes 100-499 lines, ignoring generated files) Sep 19, 2025

@everettraven (Contributor) left a comment


Just some general questions for learning purposes.

Nothing worth blocking this on IMO, especially if you are finding it useful.

> **Explanation:** [Why this change is needed]


> I'll run a comprehensive API review for the OpenShift API changes in the specified GitHub PR.
Contributor

This looks like something an LLM would spit out when you ask for a review - is it necessary to include this in the instructions?

Contributor

Ah, this whole section looks like it is something an LLM spit out as an execution plan. Should something like this be hand-rolled with explicit instructions on how to conduct the review and important considerations?

I guess my curiosity here is that if we made an explicit guidelines-type document that humans could follow, an LLM should be able to follow along relatively easily, and we could potentially enforce more nuanced guardrails.

Not worth blocking this on - more so asking questions for myself as I've not worked with LLMs in this capacity before.

@JoelSpeed (Contributor, Author)

This is an artefact of how these documents are built. Using Claude locally, I've given it text-based prompts, and it has converted those into instructions that it can read. So 95% of this document is it translating my instructions and feedback into rules that it can later apply.

@everettraven (Contributor) Sep 19, 2025

Hmm. That is interesting. So this file is automatically updated with more detailed instructions by Claude as the AGENTS.md file is updated?

If you update this file by hand to "improve it", what happens?

@JoelSpeed (Contributor, Author)

You can improve it by hand, that's also fine, and I have made a few edits here and there. But the bulk was generated (then pruned; it was more verbose), and tested again to see if it was producing good results.

@theobarberbany (Contributor) Sep 22, 2025

> is it necessary to include this in the instructions?

I've found giving explicit examples, when you want a particular format of output, to work quite well. Unfortunately, it seems to do best with monkey-see-monkey-do style learning in this department :(

> I guess my curiosity here is if we made an explicit guidelines-type document that humans could follow, an LLM should be able to follow along relatively easily and we can potentially enforce more nuanced guardrails.

I think this should work, and is sort of what the `### API review` section in AGENTS.md is, but you would still need something similar to this document for the command, to give it a guide on how you want the output and communication style, alongside extra prompts for when it fails to follow the rules...

> If you update this file by hand to "improve it", what happens?

I've found that often, what makes sense to me (as a human reading the document), the AI will ignore, or it may degrade performance. Conversing with the model and asking it what seems to be the issue (why didn't you listen to my instructions?) does seem to help, but yields different structures to what we'd normally expect, e.g. for documentation to be consumed by humans.

This does lead into one of the bigger problems with automating tasks like this, which is: how do you ensure consistent output, and avoid regressions when experimenting with prompts? You've got a probabilistic system you need to build confidence in, and you don't want to rely on manual checks or 'vibes'. Evals are the way the industry seems to be moving, but the OSS tooling all seems pretty heavyweight for automating internal tasks/reviews.

One approach for API review that may work could be directing the command (or a version of it) to output JSON:

```json
{
  "summary": "",
  "issues": [
    {"rule": "fieldDocumentation", "msg": "...", "lines": ""},
    {"rule": "optionalFieldBehaviour", "msg": "...", "lines": ""}
  ],
  "meta": {"model": "claude-xxx", "rules_version": "<commit?>"}
}
```
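
As a minimal sketch, assuming the JSON shape above, the output could then be consumed with Go types like these (all names hypothetical):

```go
package review

// ReviewResult mirrors the hypothetical JSON shape sketched above.
type ReviewResult struct {
	Summary string  `json:"summary"`
	Issues  []Issue `json:"issues"`
	Meta    Meta    `json:"meta"`
}

// Issue is a single rule violation reported by the agent.
type Issue struct {
	Rule  string `json:"rule"`  // e.g. "fieldDocumentation"
	Msg   string `json:"msg"`   // human-readable explanation
	Lines string `json:"lines"` // affected line range in the diff
}

// Meta records provenance, so eval runs are comparable over time.
type Meta struct {
	Model        string `json:"model"`
	RulesVersion string `json:"rules_version"`
}
```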

We could then have sets of real API PRs, against which we can write units that expect the correct issues to be caught. Something like:

  • `golden/`: real API doc chunks + your expected issue set (ground truth).
  • `synthetic/`: synthetic snippets, each targeting exactly one rule. These are your 'unit tests'.

Then you can compare the existing rules/prompt against a changed version, and hopefully catch any regressions. There are lots of ways to extend this too, e.g. mutating synthetic snippets and still expecting issues to be caught, or using existing reviews (e.g. merged PRs with comment chains + changes requested) with an LLM-as-a-judge style system.
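
As a rough sketch of what one of those synthetic 'unit tests' might look like in Go, building on the ReviewResult type above; runReview, the file paths, and the rule names are all hypothetical:

```go
package review

import "testing"

// runReview is a hypothetical helper: it would invoke the /api-review
// command against a single snippet and unmarshal its JSON output.
func runReview(t *testing.T, snippet string) ReviewResult {
	t.Helper()
	t.Skipf("not implemented: would review %s", snippet) // placeholder body
	return ReviewResult{}
}

// TestSyntheticSnippets checks that each synthetic snippet triggers
// the one rule it was written to target.
func TestSyntheticSnippets(t *testing.T) {
	cases := []struct {
		snippet  string // path under synthetic/
		wantRule string // the single rule this snippet targets
	}{
		{"synthetic/missing_godoc.go", "fieldDocumentation"},
		{"synthetic/optional_no_behaviour.go", "optionalFieldBehaviour"},
	}

	for _, tc := range cases {
		result := runReview(t, tc.snippet)

		found := false
		for _, issue := range result.Issues {
			if issue.Rule == tc.wantRule {
				found = true
				break
			}
		}
		if !found {
			t.Errorf("%s: expected rule %q to fire, got %+v", tc.snippet, tc.wantRule, result.Issues)
		}
	}
}
```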

Either way, it's not simple... unfortunately :(



> I'll run a comprehensive API review for the OpenShift API changes in the specified GitHub PR.

Contributor

Do we need to create a separate H1 heading to explain to the LLM that these are the steps for how to actually conduct the review?

Contributor

Weirdly, "I'll run a comprehensive API review for the OpenShift API changes in the specified GitHub PR." seems to work just fine. I've had instances (e.g. on ccapio) where adding headings led to the agent ignoring them 🤦

It's the sort of thing you want to build something to allow you to test :/

### Testing
```bash
make test-unit   # Run unit tests
make integration # Run integration tests (in tests/ directory)
```
Contributor

Should we teach it how to run the integration tests with more focused arguments?

That way, running the integration tests doesn't take longer than necessary when reviewing a change that only impacts a subset of the APIs/tests.

@JoelSpeed (Contributor, Author)

Yes, that's a good idea, I'll have a go at that

@theobarberbany (Contributor)

/lgtm

As this will be an iterative process of working out what works, this seems like a good place to start.

/hold for focussing integration tests

@openshift-ci bot added the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) Sep 22, 2025
@openshift-ci bot added the lgtm label (indicates that a PR is ready to be merged) Sep 22, 2025
@openshift-ci-robot

/test remaining-required

Overriding unmatched contexts:
/override ci/prow/e2e-aws-ovn ci/prow/e2e-aws-ovn-hypershift ci/prow/e2e-aws-ovn-hypershift-conformance ci/prow/e2e-aws-ovn-techpreview ci/prow/e2e-aws-serial-1of2 ci/prow/e2e-aws-serial-2of2 ci/prow/e2e-aws-serial-techpreview-1of2 ci/prow/e2e-aws-serial-techpreview-2of2 ci/prow/e2e-azure ci/prow/e2e-gcp ci/prow/e2e-upgrade ci/prow/e2e-upgrade-out-of-change


openshift-ci bot commented Sep 22, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: theobarberbany
Once this PR has been reviewed and has the lgtm label, please assign joelspeed for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


openshift-ci bot commented Sep 22, 2025

@openshift-ci-robot: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

/test build
/test e2e-aws-ovn
/test e2e-aws-ovn-hypershift
/test e2e-aws-ovn-hypershift-conformance
/test e2e-aws-ovn-techpreview
/test e2e-aws-serial-1of2
/test e2e-aws-serial-2of2
/test e2e-aws-serial-techpreview-1of2
/test e2e-aws-serial-techpreview-2of2
/test e2e-azure
/test e2e-gcp
/test e2e-upgrade
/test e2e-upgrade-out-of-change
/test images
/test integration
/test lint
/test minor-e2e-upgrade-minor
/test minor-images
/test okd-scos-images
/test unit
/test verify
/test verify-client-go
/test verify-crd-schema
/test verify-crdify
/test verify-deps
/test verify-feature-promotion

The following commands are available to trigger optional jobs:

/test okd-scos-e2e-aws-ovn

Use /test all to run the following jobs that were automatically triggered:

pull-ci-openshift-api-master-build
pull-ci-openshift-api-master-images
pull-ci-openshift-api-master-integration
pull-ci-openshift-api-master-lint
pull-ci-openshift-api-master-minor-images
pull-ci-openshift-api-master-okd-scos-images
pull-ci-openshift-api-master-unit
pull-ci-openshift-api-master-verify
pull-ci-openshift-api-master-verify-client-go
pull-ci-openshift-api-master-verify-crd-schema
pull-ci-openshift-api-master-verify-crdify
pull-ci-openshift-api-master-verify-deps
pull-ci-openshift-api-master-verify-feature-promotion

In response to this:

/test remaining-required

Overriding unmatched contexts:
/override ci/prow/e2e-aws-ovn ci/prow/e2e-aws-ovn-hypershift ci/prow/e2e-aws-ovn-hypershift-conformance ci/prow/e2e-aws-ovn-techpreview ci/prow/e2e-aws-serial-1of2 ci/prow/e2e-aws-serial-2of2 ci/prow/e2e-aws-serial-techpreview-1of2 ci/prow/e2e-aws-serial-techpreview-2of2 ci/prow/e2e-azure ci/prow/e2e-gcp ci/prow/e2e-upgrade ci/prow/e2e-upgrade-out-of-change

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.


openshift-ci bot commented Sep 22, 2025

@openshift-ci-robot: Overrode contexts on behalf of openshift-ci-robot: ci/prow/e2e-aws-ovn, ci/prow/e2e-aws-ovn-hypershift, ci/prow/e2e-aws-ovn-hypershift-conformance, ci/prow/e2e-aws-ovn-techpreview, ci/prow/e2e-aws-serial-1of2, ci/prow/e2e-aws-serial-2of2, ci/prow/e2e-aws-serial-techpreview-1of2, ci/prow/e2e-aws-serial-techpreview-2of2, ci/prow/e2e-azure, ci/prow/e2e-gcp, ci/prow/e2e-upgrade, ci/prow/e2e-upgrade-out-of-change

In response to this:

/test remaining-required

Overriding unmatched contexts:
/override ci/prow/e2e-aws-ovn ci/prow/e2e-aws-ovn-hypershift ci/prow/e2e-aws-ovn-hypershift-conformance ci/prow/e2e-aws-ovn-techpreview ci/prow/e2e-aws-serial-1of2 ci/prow/e2e-aws-serial-2of2 ci/prow/e2e-aws-serial-techpreview-1of2 ci/prow/e2e-aws-serial-techpreview-2of2 ci/prow/e2e-azure ci/prow/e2e-gcp ci/prow/e2e-upgrade ci/prow/e2e-upgrade-out-of-change

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.


openshift-ci bot commented Sep 22, 2025

@JoelSpeed: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@JoelSpeed (Contributor, Author)

A couple of TODOs, which I'll leave as a note here, so keep the hold for now:

  • Determine if we can API review the local changes since master, so folks can run this before they create a PR
  • Check that if I have existing changes checked out locally, Claude won't throw those away while running the /api-review command
