
slashexx (Member) commented Feb 7, 2025

Description

This PR extends the AlertmanagerConfig CRD by adding support for a RocketChat receiver.

Fixes #7320

Type of change

What type of changes does your code introduce to the Prometheus operator? Put an x in the boxes that apply.

  • CHANGE (fix or feature that would cause existing functionality to not work as expected)
  • FEATURE (non-breaking change which adds functionality)
  • BUGFIX (non-breaking change which fixes an issue)
  • ENHANCEMENT (non-breaking change which improves existing functionality)
  • NONE (if none of the other choices apply. Example, tooling, build system, CI, docs, etc.)

Verification

Please check the Prometheus-Operator testing guidelines for recommendations about automated tests.

Changelog entry

Please put a one-line changelog entry below. This will be copied to the changelog file during the release process.

Add RocketChat receiver support in the AlertManagerConfig CRD.

@slashexx slashexx requested a review from a team as a code owner February 7, 2025 18:47
heliapb (Member) commented Feb 7, 2025

You also need to add tests to amcfg_test.go.

heliapb (Member) commented Feb 7, 2025

It also requires validation inside globalConfig to ensure Alertmanager >= v0.28.0; see https://github.com/prometheus/alertmanager/pull/3600/files

slashexx (Member Author) commented Feb 8, 2025

Thanks for pointing out the changes, @heliapb!

A quick clarification on this part: we need to add validation for the mutual exclusion (token vs. tokenFile, and tokenID vs. tokenIDFile) as well, right?

Additionally, this includes adding apiUrl, token, tokenFile, tokenID, and tokenIDFile to the globalConfig struct, if I'm not wrong?

type globalConfig struct {
	// ResolveTimeout is the time after which an alert is declared resolved
	// if it has not been updated.
	ResolveTimeout *model.Duration `yaml:"resolve_timeout,omitempty" json:"resolve_timeout,omitempty"`

heliapb (Member) commented Feb 8, 2025

Hi @slashexx, I would say so. You can take a look at the validations done for VictorOpsAPIKey | VictorOpsAPIKeyFile, for example, as they follow a similar pattern.
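
For readers following along, the mutual-exclusion check discussed above could be sketched roughly as follows. This is only an illustration: the `rocketChatGlobals` type, the `validateExclusive` helper, and the choice to return an error are all assumptions made for the example, not the operator's actual code, which may resolve such conflicts differently (for instance by preferring one field over the other).

```go
package main

import (
	"errors"
	"fmt"
)

// rocketChatGlobals is a hypothetical stand-in for the RocketChat-related
// fields discussed in this thread; it is not the operator's real
// globalConfig type.
type rocketChatGlobals struct {
	Token       string
	TokenFile   string
	TokenID     string
	TokenIDFile string
}

// validateExclusive returns an error when both the inline value and its
// file-based counterpart are set. The helper name is an assumption.
func validateExclusive(name, inline, file string) error {
	if inline != "" && file != "" {
		return fmt.Errorf("at most one of %s and %sFile may be set", name, name)
	}
	return nil
}

// validate checks both pairs and joins any resulting errors.
// errors.Join returns nil when every argument is nil.
func (g rocketChatGlobals) validate() error {
	return errors.Join(
		validateExclusive("token", g.Token, g.TokenFile),
		validateExclusive("tokenID", g.TokenID, g.TokenIDFile),
	)
}

func main() {
	// Mixing an inline token with a file-based token ID is allowed,
	// because the two pairs are checked independently.
	ok := rocketChatGlobals{Token: "secret", TokenIDFile: "/etc/rocketchat/id"}
	// Setting both halves of the same pair is rejected.
	bad := rocketChatGlobals{Token: "secret", TokenFile: "/etc/rocketchat/token"}

	fmt.Println(ok.validate() == nil)
	fmt.Println(bad.validate())
}
```

Checking each pair separately keeps the rule easy to extend if more inline/file pairs are added later.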

slashexx (Member Author) commented

@slashpai, would you mind taking a look as well?

heliapb (Member) commented Feb 24, 2025

Hi @slashexx, could you please rebase? The e2e tests are failing due to an issue that has since been resolved.

heliapb (Member) commented Feb 25, 2025

Hi @slashexx, thanks for the work thus far. I think we are close to being done, but as I said before, the amcfg tests are still missing.

slashexx (Member Author) commented Feb 25, 2025

Thanks for helping so far!

I'll add the test cases back like I did previously, but I thought we had cleared that up, since the Sanitize function for RocketChat handles those cases pretty well?

func TestSanitizeRocketChatConfig(t *testing.T) {
	logger := newNopLogger(t)
	versionRocketChatAllowed := semver.Version{Major: 0, Minor: 28}
	versionRocketChatNotAllowed := semver.Version{Major: 0, Minor: 27}
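
The two version constants in that test exercise the Alertmanager >= v0.28.0 gate requested earlier in the review. A minimal, dependency-free sketch of such a gate might look like the following. The `version` type, `atLeast`, and `checkRocketChatSupported` are hypothetical names invented for this example (the real test uses a semver library), and returning an error is an assumption; the operator's sanitize step may react to an unsupported version in another way.

```go
package main

import "fmt"

// version is a hypothetical, minimal stand-in for a semantic version.
// The real test uses semver.Version from a third-party package.
type version struct {
	Major, Minor, Patch uint64
}

// atLeast reports whether v is greater than or equal to floor,
// comparing major, then minor, then patch.
func (v version) atLeast(floor version) bool {
	if v.Major != floor.Major {
		return v.Major > floor.Major
	}
	if v.Minor != floor.Minor {
		return v.Minor > floor.Minor
	}
	return v.Patch >= floor.Patch
}

// rocketChatMinVersion is v0.28.0, the minimum named in the review.
var rocketChatMinVersion = version{Major: 0, Minor: 28}

// checkRocketChatSupported returns an error when the given Alertmanager
// version predates RocketChat support. The function name and the decision
// to return an error are assumptions for illustration.
func checkRocketChatSupported(am version) error {
	if !am.atLeast(rocketChatMinVersion) {
		return fmt.Errorf(
			"rocketchat receiver requires Alertmanager >= v%d.%d.%d, got v%d.%d.%d",
			rocketChatMinVersion.Major, rocketChatMinVersion.Minor, rocketChatMinVersion.Patch,
			am.Major, am.Minor, am.Patch,
		)
	}
	return nil
}

func main() {
	// Mirror the two cases from the test: 0.27 is rejected, 0.28 is accepted.
	for _, v := range []version{{0, 27, 0}, {0, 28, 0}} {
		if err := checkRocketChatSupported(v); err != nil {
			fmt.Println("not allowed:", err)
			continue
		}
		fmt.Printf("allowed: v%d.%d.%d\n", v.Major, v.Minor, v.Patch)
	}
}
```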

slashexx (Member Author) commented Mar 8, 2025

gentle ping @slashpai @simonpasquier, ptal :)

heliapb (Member) commented Mar 10, 2025

Hi @slashexx, my bad, I guess we had already discussed that but I forgot. Still, could you review the tests, as the pipeline is failing? Thanks.

slashexx (Member Author) commented

Hi, I checked and the failing tests don't seem to be related to my RocketChat changes.

image

RocketChat ones are passing fine

image

heliapb (Member) commented Mar 25, 2025

Hi @slashexx, could you try to rebase again? I just ran the tests from the main branch with go test -v ./pkg/alertmanager/ -run TestInitializeFromAlertmanagerConfig and all is good.

slashexx (Member Author) commented Jul 7, 2025

@slashexx can you run make --always-make format generate to get rid of the failures in the CI?

Seems like it was an issue that a rebase fixed!

mviswanathsai (Contributor) commented

@slashexx could you resolve the comments and rebase?

slashexx (Member Author) commented

@mviswanathsai done 👍🏻

This was referenced Jul 11, 2025
slashexx (Member Author) commented Aug 20, 2025

@simonpasquier the following check fails

2 errors:
		Documentation/getting-started/introduction.md:43: "https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/" not accessible even after retry; status code 0: context deadline exceeded (Client.Timeout or context cancellation while reading body)
		Documentation/getting-started/introduction.md:45: "https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#label-selectors" not accessible even after retry; status code 0: net/http: request canceled (Client.Timeout or context cancellation while reading body)
2 errors:
		Documentation/platform/rbac-crd.md:14: "https://kubernetes.io/docs/reference/access-authn-authz/rbac/#default-roles-and-role-bindings" not accessible even after retry; status code 0: Get "https://kubernetes.io/docs/reference/access-authn-authz/rbac/#default-roles-and-role-bindings": http2: timeout awaiting response headers
		Documentation/platform/rbac-crd.md:16: "https://kubernetes.io/docs/reference/access-authn-authz/rbac/#aggregated-clusterroles" not accessible even after retry; status code 0: Get "https://kubernetes.io/docs/reference/access-authn-authz/rbac/#aggregated-clusterroles": http2: timeout awaiting response headers
		Documentation/platform/troubleshooting.md:83: "https://kubernetes.io/docs/tasks/access-application-cluster/port-forward-access-application-cluster/" not accessible even after retry; status code 0: Get "https://kubernetes.io/docs/tasks/access-application-cluster/port-forward-access-application-cluster/": http2: timeout awaiting response headers
		Documentation/platform/exposing-prometheus-and-alertmanager.md:179: "https://kubernetes.io/docs/concepts/services-networking/ingress/" not accessible even after retry; status code 0: Get "https://kubernetes.io/docs/concepts/services-networking/ingress/": http2: timeout awaiting response headers
make: *** [Makefile:336: docs] Error 1

It doesn't fail locally for some reason. Rest of the PR should be gtg 🚀

slashpai (Contributor) commented

I have re-run the test and it's passing. It's flaky on these links; we need to look into that separately.

simonpasquier (Contributor) left a comment

Thanks!

@simonpasquier simonpasquier merged commit 81475dc into prometheus-operator:main Aug 21, 2025
31 of 33 checks passed
Successfully merging this pull request may close these issues.

Support RocketChat receiver in AlertManagerConfig CRD