PowerFlex/ScaleIO - Wait after SDC service start/restart/stop, and retry to fetch SDC id/guid #11099

Open
wants to merge 1 commit into
base: 4.20
from

Conversation

@sureshanaparti (Contributor) commented Jun 27, 2025

Description

This PR adds a wait after the SDC service (scini) is started, restarted, or stopped, and retries fetching the SDC id/guid for PowerFlex/ScaleIO storage.

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • build/CI
  • test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

How did you try to break this feature and the system with this change?

@sureshanaparti
Contributor Author

@blueorangutan package

@blueorangutan

@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✖️ el9 ✔️ debian ✖️ suse15. SL-JID 13932

@sureshanaparti changed the title from "PowerFlex/ScaleIO - Keep wait time after SDC service start/restart/stop, and retry attempts to fetch SDC id/guid" to "PowerFlex/ScaleIO - Wait after SDC service start/restart/stop, and retry to fetch SDC id/guid" on Jun 27, 2025
@sureshanaparti marked this pull request as ready for review on June 27, 2025 17:59
@sureshanaparti
Contributor Author

@blueorangutan package

@blueorangutan

@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 13938

@Copilot Copilot AI left a comment

Pull Request Overview

This PR adds explicit wait periods after starting, stopping, or restarting the SDC service and introduces a retry loop to fetch the SDC ID or GUID in the PowerFlex/ScaleIO integration.

  • Introduce waitForSecs helper in ScaleIOUtil and invoke it after SDC service commands
  • Update prepareStorageClient to restart or start the SDC service and then retry fetching SDC details
  • Adjust existing test to expect failure when no SDC details are found
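The retry flow summarized above could look roughly like the following. This is a minimal sketch, not the code in this PR: the class `SdcDetailRetrySketch`, the method `fetchWithRetry`, and the constant `MAX_ATTEMPTS` are hypothetical names, and the pause is passed in as a `Runnable` so that the loop can be exercised without actually sleeping.

```java
import java.util.function.Supplier;

// Hypothetical illustration only; none of these names come from the PR.
final class SdcDetailRetrySketch {

    // Assumed default; the PR's actual attempt count is not shown in this conversation.
    static final int MAX_ATTEMPTS = 5;

    /**
     * Calls the fetcher up to maxAttempts times and returns the first
     * non-empty value. Runs the pause between attempts (not after the
     * last one). Returns null if every attempt came back empty.
     */
    static String fetchWithRetry(Supplier<String> fetcher, int maxAttempts, Runnable pause) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String value = fetcher.get();
            if (value != null && !value.isEmpty()) {
                return value;
            }
            if (attempt < maxAttempts) {
                pause.run();
            }
        }
        return null;
    }
}
```

A caller would pass something that reads the SDC id or guid as the fetcher and a fixed sleep as the pause, and treat a null result as the failure path that the updated test expects.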

Reviewed Changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated no comments.

File Description
plugins/storage/volume/scaleio/.../ScaleIOUtil.java Added waitForSecs method and invoked delays in start/stop/restart
plugins/hypervisors/kvm/.../ScaleIOStorageAdaptor.java Refactored client preparation flow, added retry loop for SDC details
plugins/hypervisors/kvm/.../ScaleIOStorageAdaptorTest.java Updated assertions to match new failure path
Comments suppressed due to low confidence (5)

plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/storage/ScaleIOStorageAdaptor.java:641

  • [nitpick] The variable name sdcGuId is inconsistent (mixed casing). Consider renaming it to sdcGuid for clarity and consistency with the constant SDC_GUID.
                String sdcGuId = ScaleIOUtil.getSdcGuid();

plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/storage/ScaleIOStorageAdaptor.java:629

  • StringUtils may not be imported in this file, leading to a compile error. Add the appropriate import (e.g., org.apache.commons.lang3.StringUtils).
        if (StringUtils.isEmpty(storageSystemId)) {

plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/storage/ScaleIOStorageAdaptor.java:650

  • Catching broad Exception inside the retry loop can mask unexpected errors. Narrow this to InterruptedException or handle specific exceptions only.
            } catch (Exception ignore) {

plugins/storage/volume/scaleio/src/main/java/org/apache/cloudstack/storage/datastore/util/ScaleIOUtil.java:250

  • Suppressing InterruptedException without restoring the thread's interrupt status can lead to unpredictable thread behavior. Consider calling Thread.currentThread().interrupt() in the catch block.
        } catch (InterruptedException ignore) {

plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/storage/ScaleIOStorageAdaptor.java:633

  • [nitpick] The hardcoded value (5) is a magic number. Extract it into a named constant or configuration to clarify its purpose and facilitate changes.
        int waitTimeInSecs = 5;
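Two of the suppressed comments above (restoring the interrupt status, and naming the hardcoded 5) could be addressed along these lines. This is a sketch under assumptions rather than the PR's implementation: `InterruptAwareWait`, `pauseSeconds`, and `DEFAULT_WAIT_SECS` are hypothetical names, and the real `waitForSecs` signature is not shown in this conversation.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not the ScaleIOUtil code from the PR.
final class InterruptAwareWait {

    // Named constant instead of a bare literal; the value 5 mirrors the quoted line.
    static final long DEFAULT_WAIT_SECS = 5;

    /**
     * Sleeps for the given number of seconds.
     * Returns true if the full wait elapsed, false if the thread was
     * interrupted. On interruption the interrupt flag is set again so
     * that callers further up the stack can still observe it.
     */
    static boolean pauseSeconds(long secs) {
        try {
            TimeUnit.SECONDS.sleep(secs);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Catching only `InterruptedException` here, rather than a broad `Exception`, also lets unexpected runtime errors propagate instead of being silently swallowed.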

@DaanHoogland
Contributor

@blueorangutan test

@blueorangutan

@DaanHoogland a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@blueorangutan

[SF] Trillian test result (tid-13632)
Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
Total time taken: 54776 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr11099-t13632-kvm-ol8.zip
Smoke tests completed. 140 look OK, 1 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File
test_01_sys_vm_start Failure 0.14 test_secondary_storage.py

3 participants