Fix: allow dots in GitHub repository names (closes #7776) #7934
base: main
Added a new file under changelog/ to document the fix for taskcluster#7776.
for more information, see https://pre-commit.ci
Many thanks @pinakiz!
Unfortunately the fix will need to be a bit more sophisticated. See lines 5 and 6 above your change, which say: "Specifically, the length limitation and the fact that identifiers can't contain dots `.` is critical."
The issue is that GitHub repositories appear in AMQP routing keys, for example in Pull Request exchanges.
You see, the GitHub repository is the third (index 2) entry in the routing key, so having a `.` in the name would break the message binding.
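To illustrate the concern, here is a minimal sketch. The routing key layout `primary.<organization>.<repository>.<action>`, the helper names, and the sample values are assumptions made for illustration; only the fact that the repository sits at index 2 comes from the comment above.

```javascript
// Hypothetical routing key layout (an assumption for illustration):
//   primary.<organization>.<repository>.<action>
const buildRoutingKey = (organization, repository, action) =>
  ['primary', organization, repository, action].join('.');

// AMQP topic exchanges treat '.' as the separator between routing key words.
const words = key => key.split('.');

const plain = words(buildRoutingKey('octo-org', 'my-repo', 'opened'));
const dotted = words(buildRoutingKey('octo-org', 'my.repo', 'opened'));

console.log(plain.length, plain[2]);   // 4 'my-repo'
console.log(dotted.length, dotted[2]); // 5 'my'
```

With the dotted name, the key gains an extra word and index 2 holds only part of the repository name, so a consumer bound with a four-word pattern would no longer match.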
To fix that, we would need to change the routing key to use, for example, an encoded form of the GitHub repository name, which could also break existing clients and software that rely on the raw (unencoded) repository name being used in the routing key.
Your fix would have been fine if it weren't for this, but this rather complicates matters. Probably the real solution is for all routing key entries to escape the `.` somehow, but that would be a major change.
However, we appreciate your offer of a contribution here, and sorry that it did not work out this time!
Many thanks again.
@petemoore Thanks a lot for the detailed explanation 🙏. Makes perfect sense about the routing key issue, I definitely learned something new here! Looking forward to contributing again in the future.
Hey @pinakiz, I checked the code as well and found two places suggesting that dots might actually be fine: taskcluster/services/github/src/exchanges.js Lines 16 to 32 in 28d3aa0
And there's a sanitization that is being used in API: taskcluster/services/github/src/api.js Lines 21 to 25 in 28d3aa0
So it might be that this restriction can be lifted at the schema level, but the challenge would be to actually test it and see whether the existing tests cover all the cases where this might happen. Maybe we can reopen this and make sure sanitization is used everywhere.
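A minimal sketch of what such a sanitization step could look like. The function name `sanitizeForRoutingKey` is hypothetical, and the replacement character `%` is taken from a later comment in this thread; the actual code in exchanges.js and api.js may differ.

```javascript
// Hypothetical helper, not the actual taskcluster implementation.
// Replaces every '.' with '%' so the value stays a single routing key word.
const sanitizeForRoutingKey = name => name.replace(/\./g, '%');

console.log(sanitizeForRoutingKey('my.repo'));    // my%repo
console.log(sanitizeForRoutingKey('plain-repo')); // plain-repo
```

Applying one helper like this at every place where a repository name enters a routing key is what "sanitization is used everywhere" would amount to in practice.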
Ah, I just saw in the description of
So indeed, tests here are key, to ensure the dots are replaced as desired and that anything consuming messages from the message bus is listening for the converted repository name rather than the raw name. An orthogonal concern from this comment above (not from the change in this PR) is about two different repositories sharing the same identifier (e.g. repositories
# all common identifiers. It's not personal, it's just that without these
# limitation, the identifiers won't be useful as routing keys in RabbitMQ
# topic exchanges. Specifically, the length limitation and the fact that
# identifiers can't contain dots `.` is critical.
We should update this comment to say that the github service sanitizes identifiers and that all dots are replaced.
  # topic exchanges. Specifically, the length limitation and the fact that
  # identifiers can't contain dots `.` is critical.
- github-identifier-pattern: "^([a-zA-Z0-9-_%]*)$"
+ github-identifier-pattern: "^([a-zA-Z0-9-_.%]*)$"
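The effect of the changed pattern can be checked with a short sketch. It assumes the schema pattern behaves like an ordinary JavaScript regular expression; the sample repository names are made up.

```javascript
// The two pattern strings are copied from the diff above.
const oldPattern = new RegExp('^([a-zA-Z0-9-_%]*)$');
const newPattern = new RegExp('^([a-zA-Z0-9-_.%]*)$');

// Hypothetical sample names.
for (const name of ['my-repo', 'my.repo']) {
  console.log(name, oldPattern.test(name), newPattern.test(name));
}
// my-repo true true
// my.repo false true
```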
Can you please run `yarn generate` to update all autogenerated schemas as well? Or just let me know and I can run it.
Also, I didn't spot any tests that would cover such checks. Would you be comfortable checking whether you could add some new tests for repositories with dots?
It might be tricky to set up data for the tests though; I suggest starting with api_test.js.
Thanks
Should I allow `%`? Since `.` is later transformed into `%`, allowing `%` could create ambiguity.
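The ambiguity can be shown with a small sketch. The `sanitize` helper is a hypothetical stand-in for the dot-to-`%` transformation described in this comment, and the two input names are made up; whether `%` can occur in real repository names is outside this sketch.

```javascript
// Hypothetical stand-in for the dot-to-% transformation.
const sanitize = name => name.replace(/\./g, '%');

const fromDotted = sanitize('my.repo');  // 'my%repo'
const fromPercent = sanitize('my%repo'); // 'my%repo' (unchanged)

// Two different inputs end up as the same identifier.
console.log(fromDotted === fromPercent); // true
```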
The `github-identifier-pattern` also allows whitespace-only strings. Is this something we should be concerned about?
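A quick check may help answer this. Assuming the pattern is evaluated as an ordinary regular expression (sketched here with JavaScript's `RegExp`), the `*` quantifier lets the empty string through, while a string made only of spaces does not match because a space is not in the character class.

```javascript
// Pattern string copied from the updated schema line in this PR.
const pattern = new RegExp('^([a-zA-Z0-9-_.%]*)$');

console.log(pattern.test(''));    // true  (empty string is accepted)
console.log(pattern.test('   ')); // false (spaces are not in the class)
```

If empty identifiers are a concern, replacing `*` with `+` would be one option, though that is a separate change from this PR.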
Fix: allow `.` in github-identifier-pattern

This updates the regex in the schema so that repository identifiers can include the `.` character. Previously, such identifiers were rejected.

Changelog:
github-identifier-pattern regex to include `.`.

Github Issue: Fixes #7776