ci: Remove spelling checks #3130
Conversation
|
Note: I don't know if there's some consensus building needed for this to happen, yet. |
|
I couldn't find anything in dev-docs or in the community repo that would require these checks (though I'm happy to learn otherwise). Hence I think it's for each repo's owners to decide. |
RobotSail
left a comment
LGTM
|
Personally I'm against it but I won't block if there's a consensus from others |
|
@nathan-weinberg should I bring this up to some venue to make sure there's consensus? Not sure which it would be. |
Could you give more reasoning as to why? Currently the spell-checker serves to increase the amount of developer time spent contributing changes as it will fail and then more work is needed to figure out what went wrong, add it to the list of "approved" words, and then retest. If we're paying a high cost for this, what are we getting in return as a benefit? We will still have typos in our actual codebase, and it doesn't prevent other semantic or grammatical errors from going into the documentation which passed the spellchecker. |
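To make the fail, allowlist, retest loop described above concrete, here is a minimal Python sketch. It is not how the project's spellcheck action is implemented; `find_unknown_words`, the tiny `base_dictionary`, and the sample sentence are all hypothetical and only illustrate why project-specific terms keep failing until someone adds them to an approved-words list.

```python
import re


def find_unknown_words(text, dictionary, allowlist):
    """Return the sorted, unique, lowercased words found in neither word set."""
    known = {w.lower() for w in dictionary} | {w.lower() for w in allowlist}
    # Treat runs of letters (with an optional apostrophe part) as words.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    return sorted({w.lower() for w in words} - known)


# Hypothetical stand-in for a real dictionary; an actual checker ships a full one.
base_dictionary = {"run", "the", "tests", "on", "with"}
doc = "Run the tests on PyTorch with vLLM"

# First CI run: project-specific terms are flagged and the check fails.
first_pass = find_unknown_words(doc, base_dictionary, allowlist=set())

# The contributor adds the flagged terms to the approved-words list and retests.
second_pass = find_unknown_words(doc, base_dictionary, allowlist={"PyTorch", "vLLM"})

print(first_pass)   # ['pytorch', 'vllm']
print(second_pass)  # []
```

Each new technical term costs one extra round trip of this kind, which is the overhead being weighed against the typos the check catches.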
|
I also find that this wastes more time than it gives value. |
The spellchecker only is a factor if you are changing Markdown files as it only runs against those. In the instances which I'm doing that, I've never found the difficulty of running |
I think this PR is enough - like I said I won't block on this, just giving my two cents. |
But why do we need it? This answer tells me what it is and how it works, but I'm not understanding why it's something we need in CI. |
I don't follow - do we not want words spelled correctly? Like I said, if people find it this difficult/too much time/etc I'm fine to get rid of it, but I feel like it's a pretty straightforward docs check. You can actually configure it to be more targeted/do spellchecking for code as well (https://github.com/rojopolis/spellcheck-github-actions) - but that wasn't included in the implementation of the action (pre-dated me, it was very early on in the project): #564 |
The question is more about - why do we want this as a CI check? What pain point for development does this solve? It sounds like improving developer velocity greatly outweighs the cost of having And if nobody notices, did it even matter in the first place? |
|
My take is: I think spelling is important, but we should also apply common sense when enforcing it (e.g. be more stringent in release notes and the changelog, less stringent in docs for contributors). And the drawbacks, a worse experience for new contributors and more fragile CI, may outweigh the benefit. If this ends up controversial, I'll abandon it. Please speak up. Otherwise, I'll remove the hold from the patch in a week. |
CI isn't about solving development pain points - it's about ensuring that when we take a contribution from anyone, from a first-time contributor to a project maintainer, the codebase - inclusive of documentation - is the highest quality it can be. I already stated why I personally do not find this such a hindrance to velocity, given that if you run the local check the CI run is a moot point.
If I'm the only one who disagrees, please feel free to remove the hold. I think this is a minor enough decision we can go with a simple majority here, no need for unanimous consensus. |
|
@nathan-weinberg I have had to add to the dictionary on almost every single PR. Things like I am certain that removing it will not cause a deluge of unreadable text. |
If it's a matter of quality - why is it that most popular open source projects lack spell checkers? For instance, this repository largely builds on technology from HuggingFace and PyTorch which are widely known and respected in the AI community, with thousands of contributors. But I don't see their repos making use of spellcheckers.
This claim is simply untrue. Any filter you add on contributions requires effort on the contributor's end to:
If they want to figure out the tooling and run this locally, this itself involves more steps of:
If you are a first-time contributor, these sorts of things are not obvious and often take more time in order to make something go through. Any barrier that you add asks the contributor to do more work to make things go through. From the perspective of first-time contributors, they are mentally asking themselves: Why would I spend the time contributing to this project, when I can simply go build up another project that will be less restrictive and more respectful of my time? |
|
So, I think every CI check here has some value -- there's no question about that. Spell checking is generally a very good thing. However, I think the issue here is that the spell check in our CI seems to lower developer productivity without giving us much benefit in return. Many contributors are regularly forced to update the dictionary with words that theoretically shouldn't even have to be added. So I'm pretty in favor of removing this particular CI check. One could argue that yes, it's important to have "good spelling" and "no typos", but if we truly want pristine, professional documentation, then we should technically add a grammar checker into the mix as well. |
The evidence of absence does not equal evidence that something should be absent.
This entirely ignores the fact you can run this locally. If you choose not to run it locally, as with any CI check, is the project to be faulted that you didn't run local checks? Would we say the same about unit tests?
This is a problem either of contributors not having access to proper resources to contribute, or contributors choosing to ignore such documentation. Again, would we do the same for unit tests? Why should I as a contributor learn how they work? I will cede to @courtneypacheco's point and what @RobotSail and others have raised here - people don't want to deal with it, they don't see the value, ergo let's remove it - I've said time and again that folks should feel free to proceed with it. I just find the development velocity argument to be weak personally. |
I would disagree, there are inherent patterns we converge to for making things work well. Something not being adopted for this long means that people either overlooked it, or they tried it and found it not to add value. By deduction, we have clearly not overlooked it. Therefore we can conclude that most likely others have also tried it and found it not to have added value. In our case, we have tried it, and most have reported it to provide an overall negative experience without actually helping to maintain the quality of our codebase. From my personal experience, there's a lot that I'd like to do in a day but only so much I can do. Because I know that dealing with a spellchecker in CI is just going to spend more time, I tend to not want to update any of our markdown documents as much, because it's going to take time away from other tasks. As a result, our documents end up drifting out of date.
This is ignoring the original point being made, which is that new contributors will not be familiar with our tooling. So their choices are to either spend more time learning how a spellchecker can be run locally, or parsing CI logs. Either way, it's not a good use of their time and will decrease the chances of them coming back. |
If you're contributing to a new project, are you really going to take all of the time to read the entirety of their documentation before solving a problem you found? That sounds like a lot of time to invest just to fix or update something small. |
|
@nathan-weinberg What if, instead of deleting this from the CI completely, we ran a trial for a month where we keep the check disabled in CI? This way, if at the end of the month we have found that lots of spelling errors have gone through and reduced the readability of the documents, we can simply turn the check back on. Otherwise, we can just agree to keep it off. How does that sound? |
|
I agree with @RobotSail - let's try 1 month without. Also, if tons of typos and grammatical errors get through without this CI spell check in place, I'd argue that perhaps we have a reviewer problem where reviewers aren't paying close enough attention to documentation updates in general. I'm not saying reviewers should be expected to catch 100% of typos or grammatical errors, but if not a single typo or grammatical error gets caught during a review, then that's worrying and a CI spell check will only be a bandaid at best for a much larger, underlying reviewer issue. |
|
Sent #3133; I hope this is a good compromise. |
We'd like to see if disabling spell check will make us agregeously ilitarate. ;) If the experiment doesn't go well, we can revert this patch later.
This is a follow-up to #3130
Signed-off-by: Ihar Hrachyshka <[email protected]>
**Checklist:**
- [ ] **Commit Message Formatting**: Commit titles and messages follow guidelines in the [conventional commits](https://www.conventionalcommits.org/en/v1.0.0/#summary).
- [ ] [Changelog](https://github.com/instructlab/instructlab/blob/main/CHANGELOG.md) updated with breaking and/or notable changes for the next minor release.
- [ ] Documentation has been updated, if necessary.
- [ ] Unit tests have been added, if necessary.
- [ ] Functional tests have been added, if necessary.
- [ ] E2E Workflow tests have been added, if necessary.
Approved-by: RobotSail
Approved-by: courtneypacheco
|
It's been a month. I think it's time to revisit whether we struggle without the spellcheck job. I rebased the patch. |
I find them not useful and mostly distracting, esp. to new contributors. Of course, this is not an endorsement to stop caring about egregious spelling issues, or about spelling where it's important (in user-facing docs, changelog, etc.). Even then, some of these could be handled at particular moments in the release schedule (when prepping a new release cut). Signed-off-by: Ihar Hrachyshka <[email protected]>