Description
In our app we are seeing a significant number of subscription failures with the StatusBadSequenceNumberUnknown error. After some digging we've seen two problems:
ACK handling
The current code does not handle the ACKs and their errors correctly IMO. If I understand the spec in Part 4, 5.3.13 correctly then
a PublishRequest contains
- The list of messages which can be removed from the server's retransmission queue (subscriptionAcknowledgements)
and a PublishResponse contains
- The data notification (notificationMessage)
- The list of sequence numbers which can be acknowledged (availableSequenceNumbers)
- The status codes of the previous acknowledgements (results)
The current code checks the results for errors before forwarding a data notification (https://github.com/gopcua/opcua/blob/v0.1.11/client.go#L685-L700). However, the results in the PublishResponse are not related to the data notification in that response at all. They only report the outcome of the subscription ACKs, so they need to be handled in the main loop, and StatusBadSequenceNumberUnknown should not be bubbled up to the caller.
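To make the separation concrete, here is a rough sketch of what I have in mind. It is not the gopcua code: PublishResponse, handlePublishResponse and the deliver callback are simplified stand-ins I made up for illustration, and the numeric status code value is from memory of the spec's status code table. The point is only that the notification is always forwarded, while failed ACK results are returned separately for the main loop to handle.

```go
package main

import "fmt"

// StatusCode is a local stand-in for an OPC UA status code.
type StatusCode uint32

const (
	StatusOK StatusCode = 0
	// Assumed value, taken from memory of the OPC UA status code table.
	StatusBadSequenceNumberUnknown StatusCode = 0x807A0000
)

// PublishResponse is a simplified, hypothetical model of the fields listed above.
type PublishResponse struct {
	NotificationSeq          uint32       // sequence number of notificationMessage
	Values                   []string     // payload of notificationMessage, simplified
	AvailableSequenceNumbers []uint32     // what the server still holds for retransmission
	Results                  []StatusCode // one status per ACK sent in an earlier PublishRequest
}

// handlePublishResponse always forwards the notification and returns the
// failed ACK results separately so the main loop can deal with them.
func handlePublishResponse(res PublishResponse, deliver func(seq uint32, values []string)) []StatusCode {
	var ackErrs []StatusCode
	for _, code := range res.Results {
		if code != StatusOK {
			ackErrs = append(ackErrs, code)
		}
	}
	// The ACK results say nothing about this notification, so deliver it regardless.
	deliver(res.NotificationSeq, res.Values)
	return ackErrs
}

func main() {
	res := PublishResponse{
		NotificationSeq:          3,
		Values:                   []string{"temperature=21.5"},
		AvailableSequenceNumbers: []uint32{3},
		Results:                  []StatusCode{StatusOK, StatusBadSequenceNumberUnknown},
	}
	ackErrs := handlePublishResponse(res, func(seq uint32, values []string) {
		fmt.Println("notification", seq, values)
	})
	fmt.Printf("failed ACKs: %d\n", len(ackErrs))
}
```

With this split the caller of the subscription never sees the ACK status codes; only the main loop decides whether a failed ACK is worth logging or acting on.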
Siemens PLC
This is triggered by a Siemens PLC with the following sequence:
- Server sends notif 1 and seq nr 1 for ACK
- Client ACKs notif 1
- Server sends notif 2 and seq nrs 1&2 for ACK
- Client ACKs notif 1&2
- Server sends notif 3 and seq nr 3 for ACK and StatusBadSequenceNumberUnknown for the Client ACKing notif 1 twice
This should not happen according to the spec IMO since the list of available sequence numbers for ACK'ing should be compiled after the server has processed the ACKs from the client.
Mitigation
I can think of several solutions but in all cases we need to remove the wrong error check from notifySubscription:
- Ignore BadSequenceNumberUnknown and assume that this is because of the client sending a double ACK
- Keep track of the ACKs and the sequence numbers and maybe trigger a republish
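For the second option, a minimal sketch of the bookkeeping could look like the following. ackTracker and its next method are hypothetical names, not existing gopcua API, and the sketch ignores sequence number wrap-around and the republish part. It only shows how remembering what was already ACKed avoids sending the same ACK twice when the server repeats a sequence number in availableSequenceNumbers.

```go
package main

import "fmt"

// ackTracker remembers which of the currently available sequence numbers
// have already been acknowledged. The zero value is ready to use.
type ackTracker struct {
	acked map[uint32]bool
}

// next returns the sequence numbers from available that still need an ACK.
// Numbers the server no longer lists are forgotten, so the map stays small.
func (t *ackTracker) next(available []uint32) []uint32 {
	seen := make(map[uint32]bool, len(available))
	var toAck []uint32
	for _, seq := range available {
		if !t.acked[seq] {
			toAck = append(toAck, seq)
		}
		seen[seq] = true
	}
	t.acked = seen
	return toAck
}

func main() {
	tr := &ackTracker{}
	// Replays the availableSequenceNumbers from the sequence described above.
	fmt.Println(tr.next([]uint32{1}))    // [1]
	fmt.Println(tr.next([]uint32{1, 2})) // [2]
	fmt.Println(tr.next([]uint32{3}))    // [3]
}
```

In the sequence above the client would then ACK 1, 2 and 3 exactly once each, so the server has no double ACK to complain about.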
@alexbrdn @dwhutchison any of you have an opinion?
I don't know what happens if the client does not ACK the notifications at all.