
[BUG] Getting the error "An extra return was detected from minion" with Multimaster Syndic setup post upgrade to 3006.3 one dir #65516

@anandarajan-vivekanandam-agilysys

Description

We are using a M.O.M (Master of Masters) with a multimaster syndic setup. On Salt 3004.2 we were able to scale up to 20000 minions with the minion swarm without any issues.

We recently upgraded the Salt master and the syndics to the 3006.3 onedir release. Since the upgrade, the M.O.M is being flooded with the error message "salt-master[22011]: [ERROR ] An extra return was detected from minion XXXXXXX-XXXXXX-XXXXXX, please verify the minion, this could be a replay attack"

This degrades the master's performance, and eventually the syndics start reporting errors as well. The syndic log is pasted below.
Nov 6 10:43:26 mi-syndic-master-01-perf salt-syndic: [ERROR ] Unable to call _return_pub_multi on cnc-perf-mi.hospitalityrevolution.com, trying another...
Nov 6 10:43:41 mi-syndic-master-01-perf salt-syndic: [WARNING ] The minion failed to return the job information for job 20231106102101948872. This is often due to the master being shut down or overloaded. If the master is running, consider increasing the worker_threads value.
Nov 6 10:43:41 mi-syndic-master-01-perf salt-syndic: [ERROR ] Unable to call _return_pub_multi on cnc-perf-mi.hospitalityrevolution.com, trying another...
Nov 6 10:44:04 mi-syndic-master-01-perf salt-syndic: [WARNING ] The minion failed to return the job information for job 20231106102101948872. This is often due to the master being shut down or overloaded. If the master is running, consider increasing the worker_threads value.
Nov 6 10:44:05 mi-syndic-master-01-perf salt-syndic: [ERROR ] Unable to call _return_pub_multi on cnc-perf-mi.hospitalityrevolution.com, trying another...
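To put numbers on the log spam, the messages quoted above can be tallied per minion, per upstream master, and per job. This is a minimal sketch that only matches the three message texts shown in this report; the function name `summarize_salt_log` and the summary keys are hypothetical, not part of Salt.

```python
import re
from collections import Counter

# Patterns taken from the message texts quoted in this report.
EXTRA_RETURN = re.compile(r"An extra return was detected from minion (\S+),")
SYNDIC_FAIL = re.compile(r"Unable to call (\w+) on (\S+),")
JOB_STALLED = re.compile(r"failed to return the job information for job (\d+)")


def summarize_salt_log(lines):
    """Count the three error/warning kinds seen in master and syndic logs."""
    summary = {
        "extra_returns": Counter(),    # keyed by minion id
        "syndic_failures": Counter(),  # keyed by (function, upstream master)
        "stalled_jobs": Counter(),     # keyed by job id
    }
    for line in lines:
        match = EXTRA_RETURN.search(line)
        if match:
            summary["extra_returns"][match.group(1)] += 1
            continue
        match = SYNDIC_FAIL.search(line)
        if match:
            summary["syndic_failures"][(match.group(1), match.group(2))] += 1
            continue
        match = JOB_STALLED.search(line)
        if match:
            summary["stalled_jobs"][match.group(1)] += 1
    return summary
```

Feeding it the lines of the master and syndic log files (for example `summarize_salt_log(open(path))`, where `path` is wherever the logs are written on your hosts) gives a quick view of which minions and jobs dominate the errors.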

Moreover, even after shutting down, we still see the same message flooding the logs, and the minion returns are still being captured on the Salt event bus by event_bus.get_event( tag='salt/job//ret/',match_type='fnmatch')
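One way to confirm that returns really arrive more than once is to count event tags per job and minion. The sketch below uses only the standard library and works on a list of tags already collected from the event bus. It assumes job return tags have the shape `salt/job/<jid>/ret/<minion_id>` and uses the pattern `salt/job/*/ret/*`; that pattern is an assumption, since the tag appears as 'salt/job//ret/' in this report. The helper name `duplicate_returns` is hypothetical.

```python
from collections import Counter
from fnmatch import fnmatch

# Assumed pattern; the report shows the tag as 'salt/job//ret/'.
RET_PATTERN = "salt/job/*/ret/*"


def duplicate_returns(tags, pattern=RET_PATTERN):
    """Return {(jid, minion_id): count} for returns seen more than once."""
    seen = Counter()
    for tag in tags:
        if not fnmatch(tag, pattern):
            continue
        parts = tag.split("/")
        # Keep only tags shaped like salt/job/<jid>/ret/<minion_id>.
        if len(parts) != 5 or parts[3] != "ret":
            continue
        seen[(parts[2], parts[4])] += 1
    return {key: count for key, count in seen.items() if count > 1}
```

Any key in the result is a (job id, minion id) pair whose return was observed at least twice, which is the condition the "extra return" error complains about.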

Setup
We have a single M.O.M and 4 syndic masters (all the syndic masters share the same pub key since they are set up in multimaster mode). We also have a minion swarm setup for scale testing.

The M.O.M and the syndic masters run Salt 3006.3 (onedir).
The swarm minions run salt-minion 3004.2. The minion swarm Python code is attached.
MinionSwarm.zip

  • The M.O.M and syndic masters are VMs running on a cloud service. The minion swarm is a VM scale set in the cloud.

Steps to Reproduce the behavior
Run the minion swarm against the M.O.M and multimaster syndic setup.

Expected behavior
Minions should connect without issues.


Versions Report
M.O.M:
Salt Version:
Salt: 3006.3

Python Version:
Python: 3.10.13 (main, Sep 6 2023, 02:11:27) [GCC 11.2.0]

Dependency Versions:
cffi: 1.14.6
cherrypy: unknown
dateutil: 2.8.1
docker-py: Not Installed
gitdb: Not Installed
gitpython: Not Installed
Jinja2: 3.1.2
libgit2: Not Installed
looseversion: 1.0.2
M2Crypto: Not Installed
Mako: Not Installed
msgpack: 1.0.2
msgpack-pure: Not Installed
mysql-python: Not Installed
packaging: 22.0
pycparser: 2.21
pycrypto: 3.16.0
pycryptodome: Not Installed
pygit2: Not Installed
python-gnupg: 0.4.8
PyYAML: 6.0.1
PyZMQ: 23.2.0
relenv: 0.13.10
smmap: Not Installed
timelib: 0.2.4
Tornado: 4.5.3
ZMQ: 4.3.4

System Versions:
dist: centos 7.9.2009 Core
locale: utf-8
machine: x86_64
release: 3.10.0-1160.88.1.el7.x86_64
system: Linux
version: CentOS Linux 7.9.2009 Core

Syndic Master:
Salt Version:
Salt: 3006.3

Python Version:
Python: 3.10.13 (main, Sep 6 2023, 02:11:27) [GCC 11.2.0]

Dependency Versions:
cffi: 1.14.6
cherrypy: unknown
dateutil: 2.8.1
docker-py: Not Installed
gitdb: Not Installed
gitpython: Not Installed
Jinja2: 3.1.2
libgit2: Not Installed
looseversion: 1.0.2
M2Crypto: Not Installed
Mako: Not Installed
msgpack: 1.0.2
msgpack-pure: Not Installed
mysql-python: Not Installed
packaging: 22.0
pycparser: 2.21
pycrypto: 3.16.0
pycryptodome: Not Installed
pygit2: Not Installed
python-gnupg: 0.4.8
PyYAML: 6.0.1
PyZMQ: 23.2.0
relenv: 0.13.10
smmap: Not Installed
timelib: 0.2.4
Tornado: 4.5.3
ZMQ: 4.3.4

System Versions:
dist: centos 7.9.2009 Core
locale: utf-8
machine: x86_64
release: 3.10.0-1160.95.1.el7.x86_64
system: Linux
version: CentOS Linux 7.9.2009 Core

Metadata


Labels

  • Regression: The issue is a bug that breaks functionality known to work in previous releases.
  • Salt-Syndic
  • bug: broken, incorrect, or confusing behavior
  • needs-triage
