✨ Add contributorsFromCodeOwners probe and listCodeOwners github client method #4551
Signed-off-by: Luke Harrison <[email protected]>
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #4551 +/- ##
==========================================
- Coverage 66.80% 66.53% -0.27%
==========================================
Files 230 251 +21
Lines 16602 19056 +2454
==========================================
+ Hits 11091 12679 +1588
- Misses 4808 5540 +732
- Partials 703 837 +134
Just wanted to acknowledge the PR. I am back from travel, and this is on my list of things to review this week.
Let's discuss this a little before iterating on the specifics (skipped most files in this review).
My individual comments try to focus on the big picture, which to me is the availability of the data and treating this as part of an OSPO-type check.
ListContributors() ([]User, error)
ListCodeOwners() ([]User, error)
Looking at the GitHub implementation of `ListCodeOwners()`, I see a few different logical chunks:
1. Where to find the CODEOWNERS file and how to parse it. I understand this is partly forge specific, but there's also overlap between forges. At a glance, the syntax is similar (I won't say identical), and the locations are almost equivalent.
   - GitHub: `CODEOWNERS`, `docs/CODEOWNERS`, `.github/CODEOWNERS`
   - GitLab: `CODEOWNERS`, `docs/CODEOWNERS`, `.gitlab/CODEOWNERS`
2. Calls to map teams/usernames/emails in CODEOWNERS to users.
3. A call to `handler.ghClient.Organizations.List` to get info about each discovered codeowner. A similar call already happens in `ListContributors`.
My immediate thought reviewing this is "Do we need to introduce a new interface method which would be a breaking change?"
I don't think there's any way to currently do 2. or 3. with the existing methods, unless we bundle the functionality under ListContributors.
There are also ways of doing this without adding ListCodeowners to this interface for right now, but keeping it in the GitHub client.
For example if we add the probe as an independent probe for now (meaning it doesn't use any underlying check data), and do a type check to a GitHub client:
ghClient, ok := checkRequest.RepoClient.(*githubrepo.Client)
if !ok {
// something to skip or indicate not supported
}
ghClient.ListCodeowners()
type ContributorsData struct {
Users []clients.User
another breaking change
short: Determines if the project has a set of contributors from multiple organizations (e.g., companies) and has contibutors who are also listed in the projects CODEOWNERS file.
description: |
Risk: `Low` (lower number of trusted code reviewers)
This check tries to determine if the project has recent contributors from
multiple organizations (e.g., companies). It is currently limited to
multiple organizations (e.g., companies) and has contibutors who are
also listed in the projects CODEOWNERS file. It is currently limited to
repositories hosted on GitHub, and does not support other source hosting
repositories (i.e., Forges).
The check looks at the `Company` field on the GitHub user profile for authors of
recent commits. To receive the highest score, the project must have had
recent commits. It also compares the users specified in the codeowners file
to the list of active contributors. To receive the highest score, the project must have had
contributors from at least 3 different companies in the last 30 commits; each of
those contributors must have had at least 5 commits in the last 30 commits.
those contributors must have had at least 5 commits in the last 30 commits.
Furthermore, the project should also have a CODEOWNERS file with at least 3 different users
and have commits in the last 30 commits.
I'm not sure I'm interpreting the linked comment the same way:
#3931 (comment)
Instead of scoring if recent commits are from contributors, I thought it was saying anyone listed as a codeowner should be counted as a contributor regardless of recent commits.
cc @raghavkaul for details maybe?
> I'm not sure I'm interpreting the linked comment the same way: #3931 (comment)
> Instead of scoring if recent commits are from contributors, I thought it was saying anyone listed as a codeowner should be counted as a contributor regardless of recent commits.
That interpretation makes sense to me. Also, if that's the case, it means we could remove the listCodeOwners method from the client interface and move the logic into the listContributors method, like you mentioned here:
> I don't think there's any way to currently do 2. or 3. with the current methods, unless we bundle the functionality under ListContributors.
Perhaps we could add a new field to the clients.user struct called RepoAssociation of type RepoAssociation and set it to RepoAssociationOwner so we can identify the contributor as an owner for future probes/checks? We could also get rid of the new probe and just have the code owner contributors count toward the contributorsFromOrgOrCompany probe, like you mentioned above.
In summary:
- Get rid of listCodeOwners in repo_client interface
- Move logic of listCodeOwners into listContributors
- Add RepoAssociation field to clients.user and set it to RepoAssociationOwner for owners in listContributors method
- Get rid of new contributorsFromCodeOwners probe
- The contributorsFromOrgOrCompany probe will count these code owners toward the final score. Basically, a code owner's orgs/companies will count toward the 3 required orgs/companies?
@spencerschrock I think this could resolve the three comments you made, but let me know.
> Perhaps we could add a new field to the clients.user struct called RepoAssociation of type RepoAssociation and set it to RepoAssociationOwner so we can identify the contributor as an owner
RepoAssociationOwner has a slightly different meaning as it's currently used. But yes, we could add some sort of metadata to the clients.User struct. Maybe a Maintainer or Codeowner bool?
But yes, other than that, it sounds reasonable to me (pending Raghav's interpretation).
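One way the proposal could look, as a rough sketch only. The `User` struct below is a trimmed, hypothetical stand-in for `clients.User`, and the `IsCodeOwner` field and `mergeCodeOwners` helper are assumptions made up for illustration. It marks existing contributors that appear in CODEOWNERS and appends owners who have no recent commits, so they still count as contributors.

```go
package main

import (
	"fmt"
	"strings"
)

// User is a trimmed, hypothetical stand-in for clients.User.
type User struct {
	Login            string
	Companies        []string
	NumContributions int
	IsCodeOwner      bool // hypothetical metadata field, not in the real struct
}

// mergeCodeOwners flags contributors that are also code owners and appends
// code owners that are missing from the contributor list. Logins are
// compared case-insensitively. The input slices are not modified.
func mergeCodeOwners(contributors, owners []User) []User {
	merged := make([]User, len(contributors), len(contributors)+len(owners))
	copy(merged, contributors)

	index := make(map[string]int, len(merged))
	for i, u := range merged {
		index[strings.ToLower(u.Login)] = i
	}
	for _, o := range owners {
		key := strings.ToLower(o.Login)
		if i, ok := index[key]; ok {
			merged[i].IsCodeOwner = true
			continue
		}
		o.IsCodeOwner = true
		index[key] = len(merged)
		merged = append(merged, o)
	}
	return merged
}

func main() {
	contributors := []User{{Login: "alice", Companies: []string{"acme"}, NumContributions: 7}}
	owners := []User{{Login: "Alice"}, {Login: "bob", Companies: []string{"globex"}}}
	for _, u := range mergeCodeOwners(contributors, owners) {
		fmt.Printf("%s owner=%v commits=%d\n", u.Login, u.IsCodeOwner, u.NumContributions)
	}
}
```

A plain bool keeps the change additive, whereas reusing `RepoAssociationOwner` would overload a value that already has another meaning.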
@raghavkaul thoughts?
let's just go ahead and proceed.
Okay, I'll probably take a look again this weekend. I'll have to figure out where I left off lol.
@spencerschrock sorry for the delay. I worked on it today and will probably open a new PR tomorrow with the new changes. It should be a lot simpler to review than this one.
if numberOfTrueCompanies >= numberCompaniesForTopScore && numberOfTrueOwners >= numberCodeOwnersForTopScore {
return checker.CreateMaxScoreResult(name, reason)
}

return checker.CreateProportionalScoreResult(name, reason, numberOfTrue, numberCompaniesForTopScore)
return checker.CreateProportionalScoreResult(
name,
reason,
numberOfTrueCompanies+numberOfTrueOwners,
numberCompaniesForTopScore+numberCodeOwnersForTopScore,
)
I'm hesitant to make half the check something that usually relies on a privileged token.
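To make the "half the check" concern concrete, here is a small sketch of the combined scoring arithmetic from the hunk above. It assumes both thresholds are 3 (as the check description states), a maximum score of 10, and that the proportional result scales linearly with integer division and is capped at the maximum. `proportionalScore` and `combinedScore` are hypothetical stand-ins, not the real `checker` functions.

```go
package main

import "fmt"

const (
	maxScore                    = 10 // assumed maximum check score
	numberCompaniesForTopScore  = 3  // from the check description
	numberCodeOwnersForTopScore = 3  // from the check description
)

// proportionalScore is a hypothetical stand-in for the scaling assumed to
// happen inside checker.CreateProportionalScoreResult: linear, integer
// division, capped at maxScore.
func proportionalScore(success, total int) int {
	if total <= 0 {
		return 0
	}
	s := maxScore * success / total
	if s > maxScore {
		return maxScore
	}
	return s
}

// combinedScore mirrors the control flow of the new hunk: max score when both
// thresholds are met, otherwise a proportional score over the summed counts.
func combinedScore(companies, owners int) int {
	if companies >= numberCompaniesForTopScore && owners >= numberCodeOwnersForTopScore {
		return maxScore
	}
	return proportionalScore(companies+owners,
		numberCompaniesForTopScore+numberCodeOwnersForTopScore)
}

func main() {
	fmt.Println(combinedScore(3, 3)) // both thresholds met
	fmt.Println(combinedScore(3, 0)) // companies met, no owners resolved
	fmt.Println(combinedScore(0, 0)) // nothing found
}
```

Under these assumptions, a repository that meets the company threshold but has no resolvable code owners (for example, because team membership is private to the token in use) gets half of the maximum score.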
Continued in #4611
What kind of change does this PR introduce?
This PR adds the contributorsFromCodeOwners probe mentioned in #3931 (comment) to supplement the contributor check.
It also adds a listCodeOwners method in the github repo client to support further checks and probes using code owners. As per #3931
What is the current behavior?
Currently the contributors check only checks if recent contributors are from companies and organizations.
What is the new behavior (if this is a feature change)?
Now the contributors check also tests to see if code owners from the code owners file are included in the recent contributors list.
Which issue(s) this PR fixes
Contributes to #3931
Special notes for your reviewer
This is my first time creating a probe so let me know if I missed something.
I also tried to solve the issue of "verified external contributors" by allowing users to add a `# @verified` comment to their CODEOWNERS file along with the user names of the verified external contributors, as per #3931 (comment). This feature is not used in the probe I wrote but can be used if you want to write the new OSPO check described in the above issue.
Furthermore this does not actually implement the check described in the original issue as I just wanted to get your input on what I did so far. I can implement it in another PR later if this one ends up getting merged.
To test the new probe you can run the following:
go run main.go --repo=https://github.com/carbon-design-system/carbon --checks=Contributors
5/10
go run main.go --repo=https://github.com/apache/superset --checks=Contributors
10/10
This probe fails on a lot of repos because many orgs keep their teams private. I'm not really sure if making org teams public is a security issue but it just seems a lot of orgs do this. If you do have access to see the team, the probe should pass.
Lastly to parse the CODEOWNERS file I am using this package by @hmarr. I can manually parse it with regex but this was pretty easy to use and made things a lot simpler.
Does this PR introduce a user-facing change?