Conversation

lharrison13
Contributor

@lharrison13 lharrison13 commented Mar 8, 2025

What kind of change does this PR introduce?

This PR adds the contributorsFromCodeOwners probe mentioned in #3931 (comment) to supplement the Contributors check.

It also adds a listCodeOwners method to the GitHub repo client to support further checks and probes that use code owners, as per #3931.

What is the current behavior?

Currently, the Contributors check only tests whether recent contributors are from companies and organizations.

What is the new behavior (if this is a feature change)?

Now the Contributors check also tests whether code owners listed in the CODEOWNERS file are included in the recent contributors list.

Which issue(s) this PR fixes

Contributes to #3931

Special notes for your reviewer

This is my first time creating a probe so let me know if I missed something.

I also tried to solve the issue of "verified external contributors" by allowing users to add an # @verified comment to their CODEOWNERS file along with the user names of the verified external contributors as per #3931 (comment).
This feature is not used in the probe I wrote but can be used if you want to write the new OSPO check described in the above issue.

Furthermore, this does not actually implement the check described in the original issue, as I just wanted to get your input on what I've done so far. I can implement it in a later PR if this one ends up getting merged.

To test the new probe you can run the following:

go run main.go --repo=https://github.com/carbon-design-system/carbon --checks=Contributors
5/10

go run main.go --repo=https://github.com/apache/superset --checks=Contributors
10/10

This probe fails on a lot of repos because many orgs keep their teams private. I'm not sure whether making org teams public is a security concern, but many orgs seem to keep them private. If you do have access to see the team, the probe should pass.

Lastly, to parse the CODEOWNERS file I am using this package by @hmarr. I could parse it manually with regex, but the package was easy to use and made things a lot simpler.

Does this PR introduce a user-facing change?

adds new contributorsFromCodeOwners probe

@lharrison13 lharrison13 requested a review from a team as a code owner March 8, 2025 22:40
@lharrison13 lharrison13 requested review from spencerschrock and raghavkaul and removed request for a team March 8, 2025 22:40

codecov bot commented Mar 8, 2025

Codecov Report

Attention: Patch coverage is 44.20290% with 154 lines in your changes missing coverage. Please review.

Project coverage is 66.53%. Comparing base (353ed60) to head (2d4c9a7).
Report is 161 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4551      +/-   ##
==========================================
- Coverage   66.80%   66.53%   -0.27%     
==========================================
  Files         230      251      +21     
  Lines       16602    19056    +2454     
==========================================
+ Hits        11091    12679    +1588     
- Misses       4808     5540     +732     
- Partials      703      837     +134     

@spencerschrock
Member

Just wanted to acknowledge the PR. I am back from travel, and this is on my list of things to review this week.

Member

@spencerschrock spencerschrock left a comment

Let's discuss this a little before iterating on the specifics (skipped most files in this review).

My individual comments tried to focus on the big picture, which to me is availability of the data, and focusing on this as something part of an OSPO type check.

Comment on lines 52 to +53
  ListContributors() ([]User, error)
+ ListCodeOwners() ([]User, error)
Member

Looking at the GitHub implementation of ListCodeOwners() I see a few different logical chunks:

  1. Where to find the CODEOWNERS file and how to parse it. I understand this is partly forge-specific, but there's also overlap between forges. At a glance, the syntax is similar (won't say identical), and the locations are almost equivalent.
     • GitHub: CODEOWNERS, docs/CODEOWNERS, .github/CODEOWNERS
     • GitLab: CODEOWNERS, docs/CODEOWNERS, .gitlab/CODEOWNERS
  2. Calls to map teams/usernames/emails in CODEOWNERS to users.
  3. A call to handler.ghClient.Organizations.List to get info about each discovered codeowner. A similar call already happens in ListContributors.
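
As an illustration of the first chunk above, here is a minimal sketch of forge-aware CODEOWNERS lookup and parsing. It is not the code from this PR: `candidatePaths`, `findCodeowners`, and `parseOwners` are hypothetical names, the in-memory `files` map stands in for a repository checkout, and the parser deliberately ignores escaped `#` characters and any precedence rules between locations.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// candidatePaths holds the CODEOWNERS locations named in the review comment.
// The slice order is illustrative and is not a claim about forge precedence.
var candidatePaths = map[string][]string{
	"github": {"CODEOWNERS", "docs/CODEOWNERS", ".github/CODEOWNERS"},
	"gitlab": {"CODEOWNERS", "docs/CODEOWNERS", ".gitlab/CODEOWNERS"},
}

// findCodeowners returns the first candidate path present in files, where
// files is a hypothetical in-memory stand-in for a repository checkout.
func findCodeowners(forge string, files map[string]string) (string, bool) {
	for _, p := range candidatePaths[forge] {
		if _, ok := files[p]; ok {
			return p, true
		}
	}
	return "", false
}

// parseOwners collects the unique owner tokens (users, teams, emails) from a
// CODEOWNERS body. Simplification: only tokens containing "@" count as owners.
func parseOwners(content string) []string {
	seen := make(map[string]bool)
	var owners []string
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue // blank line or full-line comment
		}
		fields := strings.Fields(line)
		for _, tok := range fields[1:] { // fields[0] is the path pattern
			if strings.HasPrefix(tok, "#") {
				break // rest of the line is treated as a trailing comment
			}
			if !strings.Contains(tok, "@") || seen[tok] {
				continue
			}
			seen[tok] = true
			owners = append(owners, tok)
		}
	}
	return owners
}

func main() {
	files := map[string]string{
		".github/CODEOWNERS": "# global owners\n* @alice @org/core\n/docs/ docs@example.com @alice # writers\n",
	}
	if path, ok := findCodeowners("github", files); ok {
		fmt.Println(path, parseOwners(files[path]))
	}
}
```

Only the path table differs per forge in this sketch, which is the overlap the comment points at; mapping the returned tokens to actual users (chunks 2 and 3) would still need forge-specific API calls.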

My immediate thought reviewing this is "Do we need to introduce a new interface method which would be a breaking change?"

I don't think there's any way to currently do 2. or 3. with the current methods, unless we bundle the functionality under ListContributors.

Member

There are also ways of doing this without adding ListCodeowners to this interface for right now, but keeping it in the GitHub client.

For example if we add the probe as an independent probe for now (meaning it doesn't use any underlying check data), and do a type check to a GitHub client:

ghClient, ok := checkRequest.RepoClient.(*githubrepo.Client)
if !ok {
	// something to skip or indicate not supported
}
ghClient.ListCodeowners()
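
For readers unfamiliar with the pattern, here is a self-contained sketch of the same idea. Note one deliberate difference: instead of asserting to the concrete `*githubrepo.Client` as in the snippet above, it asserts to a small optional interface. All type names here (`RepoClient`, `codeownerLister`, `githubClient`, `otherClient`) are hypothetical stand-ins, not scorecard types.

```go
package main

import "fmt"

// User is a simplified stand-in for the client user type.
type User struct{ Login string }

// RepoClient stands in for the shared client interface, which stays unchanged.
type RepoClient interface {
	ListContributors() ([]User, error)
}

// codeownerLister is a hypothetical optional interface that only some
// clients implement, so the shared interface does not have to grow.
type codeownerLister interface {
	ListCodeowners() ([]User, error)
}

// githubClient implements both the shared and the optional interface.
type githubClient struct{}

func (githubClient) ListContributors() ([]User, error) { return []User{{Login: "alice"}}, nil }
func (githubClient) ListCodeowners() ([]User, error)   { return []User{{Login: "bob"}}, nil }

// otherClient implements only the shared interface.
type otherClient struct{}

func (otherClient) ListContributors() ([]User, error) { return nil, nil }

// codeowners reports supported=false when the client cannot list code owners,
// letting the caller skip the probe instead of failing.
func codeowners(c RepoClient) (users []User, supported bool, err error) {
	l, ok := c.(codeownerLister)
	if !ok {
		return nil, false, nil
	}
	users, err = l.ListCodeowners()
	return users, true, err
}

func main() {
	for _, c := range []RepoClient{githubClient{}, otherClient{}} {
		users, supported, err := codeowners(c)
		fmt.Println(users, supported, err)
	}
}
```

Either form of the assertion avoids a breaking change to the shared interface; the interface-based variant additionally lets another forge client opt in later without touching the probe.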

Comment on lines 220 to -221
  type ContributorsData struct {
- 	Users []clients.User
Member

another breaking change

Comment on lines +363 to +379
+ short: Determines if the project has a set of contributors from multiple organizations (e.g., companies) and has contibutors who are also listed in the projects CODEOWNERS file.
  description: |
  Risk: `Low` (lower number of trusted code reviewers)
  This check tries to determine if the project has recent contributors from
- multiple organizations (e.g., companies). It is currently limited to
+ multiple organizations (e.g., companies) and has contibutors who are
+ also listed in the projects CODEOWNERS file. It is currently limited to
  repositories hosted on GitHub, and does not support other source hosting
  repositories (i.e., Forges).
  The check looks at the `Company` field on the GitHub user profile for authors of
- recent commits. To receive the highest score, the project must have had
+ recent commits. It also compares the users specified in the codeowners file
+ to the list of active contributors. To receive the highest score, the project must have had
  contributors from at least 3 different companies in the last 30 commits; each of
- those contributors must have had at least 5 commits in the last 30 commits.
+ those contributors must have had at least 5 commits in the last 30 commits.
+ Furthermore, the project should also have a CODEOWNERS file with at least 3 different users
+ and have commits in the last 30 commits.
Member

I'm not sure I'm interpreting the linked comment the same way:
#3931 (comment)

Instead of scoring if recent commits are from contributors, I thought it was saying anyone listed as a codeowner should be counted as a contributor regardless of recent commits.

Member

cc @raghavkaul for details maybe?

Contributor Author

@lharrison13 lharrison13 Mar 20, 2025

> I'm not sure I'm interpreting the linked comment the same way: #3931 (comment)
>
> Instead of scoring if recent commits are from contributors, I thought it was saying anyone listed as a codeowner should be counted as a contributor regardless of recent commits.

That interpretation makes sense to me. Also, if that's the case, it means we could remove the listCodeOwners method from the client interface and move the logic into the listContributors method, like you mentioned here:

> I don't think there's any way to currently do 2. or 3. with the current methods, unless we bundle the functionality under ListContributors.

Perhaps we could add a new field to the clients.user struct called RepoAssociation of type RepoAssociation and set it to RepoAssociationOwner so we can identify the contributor as an owner for future probes/checks? We could also get rid of the new probe and just have the code owner contributors count to the contributorsFromOrgOrCompany probe like you mentioned above.

In summary:

  • Get rid of listCodeOwners in repo_client interface
  • Move logic of listCodeOwners into listContributors
  • Add RepoAssociation field to clients.user and set it to RepoAssociationOwner for owners in listContributors method
  • Get rid of new contributorsFromCodeOwners probe
  • The contributorsFromOrgOrCompany probe will count these code owners toward the final score. Basically, a code owner's orgs/companies will count towards the 3 required orgs/companies?

@spencerschrock I think this could resolve the three comments you made, but let me know.

Member

> Perhaps we could add a new field to the clients.user struct called RepoAssociation of type RepoAssociation and set it to RepoAssociationOwner so we can identify the contributor as an owner

RepoAssociationOwner has a slightly different meaning as it's currently used. But yes we could add some sort of metadata to the clients.User struct. Maybe a Maintainer or Codeowner bool?

But yes other than that, it sounds reasonable to me (pending Raghav's interpretation).
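
To make the proposal concrete, here is a rough sketch of folding code owners into the contributor list with a boolean marker. Everything in it is an assumption for illustration: the `User` struct is a simplified stand-in for clients.User, the `Codeowner` field follows the bool suggested above but does not exist in the source, and `mergeCodeowners` and `distinctCompanies` are hypothetical helpers, not the probe's actual logic.

```go
package main

import (
	"fmt"
	"strings"
)

// User is a simplified stand-in for clients.User.
type User struct {
	Login     string
	Companies []string
	Codeowner bool // hypothetical metadata field, as floated in the review
}

// mergeCodeowners returns contributors plus any code owners not already
// present, marking every code owner with Codeowner=true.
func mergeCodeowners(contributors, owners []User) []User {
	merged := append([]User(nil), contributors...) // copy; inputs stay untouched
	index := make(map[string]int, len(merged))
	for i, u := range merged {
		index[u.Login] = i
	}
	for _, o := range owners {
		if i, ok := index[o.Login]; ok {
			merged[i].Codeowner = true
			continue
		}
		o.Codeowner = true
		index[o.Login] = len(merged)
		merged = append(merged, o)
	}
	return merged
}

// distinctCompanies counts unique company names, ignoring case and
// surrounding whitespace.
func distinctCompanies(users []User) int {
	seen := make(map[string]bool)
	for _, u := range users {
		for _, c := range u.Companies {
			if name := strings.ToLower(strings.TrimSpace(c)); name != "" {
				seen[name] = true
			}
		}
	}
	return len(seen)
}

func main() {
	contributors := []User{
		{Login: "alice", Companies: []string{"Acme"}},
		{Login: "bob", Companies: []string{"Initech"}},
	}
	owners := []User{
		{Login: "bob", Companies: []string{"Initech"}},
		{Login: "carol", Companies: []string{"Globex"}},
	}
	all := mergeCodeowners(contributors, owners)
	fmt.Println(len(all), distinctCompanies(all))
}
```

In this sketch a code owner with no recent commits (carol) still adds a company to the tally, which is the "counted as a contributor regardless of recent commits" reading discussed above.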

Contributor Author

@raghavkaul thoughts?

Member

let's just go ahead and proceed.

Contributor Author

Okay, I'll probably take a look again this weekend. I'll have to figure out where I left off lol.

Contributor Author

@spencerschrock sorry for the delay. I worked on it today and will probably open a new PR tomorrow with the new changes. It should be a lot simpler to review than this one.

Comment on lines +72 to +81
+ if numberOfTrueCompanies >= numberCompaniesForTopScore && numberOfTrueOwners >= numberCodeOwnersForTopScore {
  	return checker.CreateMaxScoreResult(name, reason)
  }

- return checker.CreateProportionalScoreResult(name, reason, numberOfTrue, numberCompaniesForTopScore)
+ return checker.CreateProportionalScoreResult(
+ 	name,
+ 	reason,
+ 	numberOfTrueCompanies+numberOfTrueOwners,
+ 	numberCompaniesForTopScore+numberCodeOwnersForTopScore,
+ )
Member

I'm hesitant to make half the check something that usually relies on a privileged token.
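
To show why code owners would make up half the score, here is a small standalone sketch of the combined scoring in the diff above. It does not call the scorecard checker package: `proportional` is a hypothetical helper that assumes integer scaling to a 0-10 range with a cap, and both thresholds are assumed to be 3, based on the "at least 3" wording in this PR's documentation change.

```go
package main

import "fmt"

const (
	maxScore                    = 10
	numberCompaniesForTopScore  = 3 // from the check description
	numberCodeOwnersForTopScore = 3 // assumed from the PR's doc change
)

// proportional scales success/total to the 0..maxScore range using integer
// arithmetic. This is an assumption about how proportional scoring behaves.
func proportional(success, total int) int {
	if total == 0 {
		return 0
	}
	s := maxScore * success / total
	if s > maxScore {
		return maxScore
	}
	return s
}

// score mirrors the shape of the diff: the maximum when both thresholds are
// met, otherwise a proportional result over the summed counts.
func score(companies, owners int) int {
	if companies >= numberCompaniesForTopScore && owners >= numberCodeOwnersForTopScore {
		return maxScore
	}
	return proportional(companies+owners, numberCompaniesForTopScore+numberCodeOwnersForTopScore)
}

func main() {
	fmt.Println(score(3, 0)) // companies threshold met, no code owners found
	fmt.Println(score(3, 3)) // both thresholds met
}
```

Under these assumptions, a repository that meets the company threshold but whose code owners cannot be resolved (for example, because team membership is private) gets half marks. That would be consistent with the 5/10 reported in the PR description, though the source does not confirm that breakdown.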

@spencerschrock
Member

Continued in #4611

Labels
None yet
Projects
Status: Done
2 participants