Conversation

@skartikey
Contributor

Summary

The OPC UA plugin encountered an alternating success/failure pattern when using the use_unregistered_reads = true setting. This manifested as:

  1. First read would succeed
  2. Second read would fail with StatusBadSessionIDInvalid error
  3. Third read would succeed
  4. Fourth read would fail again

This occurred because, even with use_unregistered_reads = true, the plugin did not handle session invalidation between gather cycles. When a server invalidated the session (which most OPC UA servers do periodically), the client still tried to use the same session on the next read. Because that session was no longer valid, the read failed, and the pattern repeated so that every other gather cycle returned an error.

Fix Details

The fix addresses this by:

  1. Adding a lastSessionError flag to track when session errors occur
  2. Modifying the connection management to force reconnection when a session error is detected
  3. Adding a consecutive error counter to detect and recover from persistent connection issues
  4. Improving error messages to accurately reflect whether registered or unregistered nodes are being used

These changes ensure that after a session error, the next gather cycle performs a full reconnection and establishes a fresh session, rather than trying to reuse the invalid one.
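
The recovery logic described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: `fakeServer`, `reader`, `gather`, `errorThreshold`, and `errSessionInvalid` are hypothetical names, and only the `lastSessionError` flag and the consecutive error counter come from the description.

```go
package main

import (
	"errors"
	"fmt"
)

// errSessionInvalid stands in for the StatusBadSessionIDInvalid status code.
var errSessionInvalid = errors.New("StatusBadSessionIDInvalid")

// fakeServer is a hypothetical server whose session can be invalidated.
type fakeServer struct{ sessionValid bool }

// reader is a hypothetical stand-in for the plugin's read client.
type reader struct {
	srv               *fakeServer
	connected         bool
	lastSessionError  bool // set when the previous read hit a session error
	consecutiveErrors int  // number of failed reads in a row
	errorThreshold    int  // assumed limit before a reconnect is forced
	reconnects        int  // counts how often a fresh session was created
}

// gather performs one read cycle and reconnects first when needed.
func (r *reader) gather() error {
	// Drop the connection after a session error or too many failures in a row.
	if r.lastSessionError || r.consecutiveErrors >= r.errorThreshold {
		r.connected = false
	}
	if !r.connected {
		r.srv.sessionValid = true // a new connection creates a fresh session
		r.connected = true
		r.lastSessionError = false
		r.consecutiveErrors = 0
		r.reconnects++
	}
	if !r.srv.sessionValid {
		r.consecutiveErrors++
		r.lastSessionError = true
		return errSessionInvalid
	}
	r.consecutiveErrors = 0
	return nil
}

func main() {
	srv := &fakeServer{}
	r := &reader{srv: srv, errorThreshold: 3}
	fmt.Println(r.gather())   // first cycle connects and reads
	srv.sessionValid = false  // server invalidates the session
	fmt.Println(r.gather())   // read fails and sets lastSessionError
	fmt.Println(r.gather())   // forced reconnect, read succeeds again
	fmt.Println(r.reconnects) // initial connect plus one forced reconnect
}
```

Note that in this sketch the first read after an invalidation still fails; the flag only guarantees that the following cycle starts from a fresh session.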

Checklist

  • No AI generated code was used in this PR

Related issues

resolves #16735

@telegraf-tiger (bot) added the area/opcua, fix, and plugin/input labels on Apr 22, 2025
@skartikey force-pushed the inputs_opcua_session_error branch from 056d039 to 10a0cec on April 22, 2025 14:39
@srebhan self-assigned this on May 1, 2025
@srebhan (Member) left a comment

Thanks @skartikey for the fix! Some comments/questions in the code...

skartikey and others added 3 commits May 6, 2025 17:20
When use_unregistered_reads=true is configured, the client still attempts to use the previous session on subsequent reads, leading to alternating success/failure patterns when the server invalidates the session (StatusBadSessionIDInvalid errors).

This fix adds session state tracking to properly reconnect after session errors, ensuring consistent data collection regardless of whether registered or unregistered reads are used. Also adds tests to verify session recovery behavior.
@skartikey force-pushed the inputs_opcua_session_error branch from 3fa005f to 55a5213 on May 6, 2025 16:29
@srebhan (Member) left a comment

Thanks for the update @skartikey! Some more comments...

@srebhan (Member) left a comment

Thanks @skartikey!

@srebhan added the ready for final review label on May 14, 2025
@srebhan assigned mstrandboge and unassigned srebhan on May 14, 2025
@mstrandboge merged commit 36055ab into influxdata:master on May 16, 2025
26 of 27 checks passed
@github-actions (bot) added this to the v1.34.4 milestone on May 16, 2025
@marcv81
Contributor

marcv81 commented May 17, 2025

FYI: #17014

@jminardi

jminardi commented May 22, 2025

I tried running the latest code (1.34.4) and I am still seeing the alternating success/failure loop. Reading through the code it seems like this PR didn't really fix the core problem as it still needs to detect a failure before performing a full reconnect attempt. When I am doing long polling intervals (greater than 1m) I always get a failure in between each success.

Maybe I am misunderstanding how to properly use this new code. What specific settings should I be using to prevent the alternating failure scenario?

If I am polling with long intervals I would like to be able to FORCE a full reconnect on every read attempt, is that possible?

@skartikey
Contributor Author

@jminardi I did some investigation and this is what's happening.

The reconnect_error_threshold setting controls how many consecutive errors trigger a forced reconnection, but it still requires at least one failure to detect session issues. With long polling intervals (>1 minute), OPC UA sessions often become stale between reads, leading to the alternating pattern you're experiencing:

  1. First read fails (stale session detected)
  2. Reconnection is forced for next cycle
  3. Second read succeeds (fresh session)
  4. Session becomes stale again during long wait
  5. Cycle repeats
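
To make the cycle concrete, here is a small simulation. It is a sketch under stated assumptions rather than plugin code: `simulate` and its `proactive` parameter are hypothetical, and the model assumes the session always goes stale during the wait between two reads.

```go
package main

import "fmt"

// simulate models gather cycles in which the session goes stale during
// every wait between reads (an assumption for long polling intervals).
// With proactive=false a reconnect happens only after a failed read;
// with proactive=true a reconnect happens before every read.
func simulate(cycles int, proactive bool) []bool {
	results := make([]bool, 0, cycles)
	sessionFresh := false  // the session is already stale before the first read
	needReconnect := false // set by a failed read, consumed by the next cycle
	for i := 0; i < cycles; i++ {
		if proactive || needReconnect {
			sessionFresh = true
			needReconnect = false
		}
		ok := sessionFresh
		results = append(results, ok)
		needReconnect = !ok
		sessionFresh = false // long wait: the session becomes stale again
	}
	return results
}

func main() {
	fmt.Println(simulate(4, false)) // [false true false true]
	fmt.Println(simulate(4, true))  // [true true true true]
}
```

The reactive variant reproduces the alternating pattern, while reconnecting before every read removes it at the cost of a new session per cycle.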

Currently, the code prevents setting reconnect_error_threshold = 0 because it forces any zero value to default to 1. However, I'll submit a change that allows reconnect_error_threshold = 0 to enable reconnection before every read attempt - this will be particularly useful for long polling intervals like yours.

The fix will distinguish between "not configured" (uses default of 1) versus "explicitly set to 0" (forces reconnection every cycle), giving you the proactive reconnection behavior you need to eliminate the alternating success/failure pattern.
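
One common way to tell "not configured" apart from "explicitly set to 0" in Go is a pointer field, which stays nil when the option is absent. The sketch below assumes that approach; the `config` struct, `effectiveThreshold`, and `forceReconnect` are hypothetical names, and the follow-up change may be implemented differently.

```go
package main

import "fmt"

// config is a hypothetical settings struct. A nil pointer means the
// option was not configured at all.
type config struct {
	ReconnectErrorThreshold *uint64
}

// effectiveThreshold returns the default of 1 when the option is unset
// and the configured value, including 0, otherwise.
func effectiveThreshold(c config) uint64 {
	if c.ReconnectErrorThreshold == nil {
		return 1
	}
	return *c.ReconnectErrorThreshold
}

// forceReconnect reports whether a reconnect should happen before the
// next read. With a threshold of 0 the condition is always true.
func forceReconnect(c config, consecutiveErrors uint64) bool {
	return consecutiveErrors >= effectiveThreshold(c)
}

func main() {
	zero := uint64(0)
	unset := config{}
	explicit := config{ReconnectErrorThreshold: &zero}

	fmt.Println(effectiveThreshold(unset), forceReconnect(unset, 0))       // 1 false
	fmt.Println(effectiveThreshold(explicit), forceReconnect(explicit, 0)) // 0 true
}
```

A plain integer field cannot express this difference because its zero value is indistinguishable from an explicit 0.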

@jminardi

@skartikey Thank you, that sounds perfect!


Labels

area/opcua, fix, plugin/input, ready for final review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

OPCUA Input Plugin reading registered nodes failed after 1 attempts even when setting use_unregistered_reads = true

5 participants