Remove expired ack_ids from lease management. #786

@acocuzzo

Possible root cause of: #593

Steps to reproduce

  1. High-latency subscriber (e.g. 2 hours)
  2. Flow control with a low message count
  3. Exactly-once delivery subscription. The solution only works for exactly-once delivery subscriptions, since they return invalid_ack_id errors.

Code example


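import time
from multiprocessing import Pool

import psutil


def f(x):
    """Busy-loop for set_time minutes to simulate a slow, CPU-bound callback."""
    set_time = 60
    timeout = time.time() + 60 * float(set_time)  # set_time minutes from now
    print(timeout)
    while True:
        if time.time() > timeout:
            print("Time to ack!")
            print(time.time())
            break
        x * x  # Throwaway work to keep the core busy.


def create_load(percent_load):
    """Occupy roughly percent_load percent of the available cores."""
    processes = psutil.cpu_count()
    to_use = int(processes * (percent_load / 100))
    print('utilizing %d cores\n' % to_use)
    pool = Pool(to_use)
    pool.map(f, range(to_use))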

def receive_messages_with_flow_control() -> None:
    """Receives messages from a pull subscription with flow control."""
    from concurrent.futures import TimeoutError
    from google.cloud import pubsub_v1

    subscriber = pubsub_v1.SubscriberClient()
    # project_id and subscription_id must be defined by the caller.
    subscription_path = subscriber.subscription_path(project_id, subscription_id)

    def callback(message: pubsub_v1.subscriber.message.Message) -> None:
        print(f"Received {message.data!r} at {time.time()}.")
        create_load(50)
        message.ack()

    # Limit the subscriber to only one outstanding message at a time.
    flow_control = pubsub_v1.types.FlowControl(max_messages=1)

    streaming_pull_future = subscriber.subscribe(
        subscription_path, callback=callback, flow_control=flow_control
    )

    # Wrap subscriber in a 'with' block to automatically call close() when done.
    with subscriber:
        try:
            # When `timeout` is not set, result() will block indefinitely,
            # unless an exception is encountered first.
            streaming_pull_future.result()
        except TimeoutError:
            streaming_pull_future.cancel()  # Trigger the shutdown.
            streaming_pull_future.result()  # Block until the shutdown is complete.

Error trace:

google.api_core.exceptions.InvalidArgument: 400 Some acknowledgement ids in the request were invalid. This could be because the acknowledgement ids have expired or the acknowledgement ids were malformed. [reason: "EXACTLY_ONCE_ACKID_FAILURE"
domain: "pubsub.googleapis.com"
metadata {
  value: "PERMANENT_FAILURE_INVALID_ACK_ID"
}

Even while flow control is active and the CPS client library is not receiving messages in streaming_pull_manager._on_response, CPS still delivers messages to the bidi stream, where they are buffered. These messages are not lease managed (because the library does not know about them), so they expire before they are ever sent to a callback.
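To make the timing concrete, here is a minimal sketch of the situation described above. It is a simplified model, not library code: `expired_before_callback` is a hypothetical helper, and it assumes that all messages arrive on the stream at once, that callbacks run one at a time (as with max_messages=1), and that buffered messages get no lease extension. The 60-second ack deadline is an illustrative value; the 3600-second processing time matches the one-hour busy loop in the example.

```python
from typing import List


def expired_before_callback(
    num_buffered: int, ack_deadline_s: float, processing_time_s: float
) -> List[int]:
    """Return indexes of messages whose ack deadline passes before dispatch.

    Simplified model (assumptions): every message arrives at t=0, only one
    callback runs at a time, and messages waiting in the stream buffer are
    not lease managed, so their deadline is never extended.
    """
    expired = []
    for i in range(num_buffered):
        # Message i must wait for the i callbacks ahead of it to finish.
        dispatch_time_s = i * processing_time_s
        if dispatch_time_s > ack_deadline_s:
            expired.append(i)
    return expired


# Three messages, a 60 s ack deadline, and a one-hour callback.
print(expired_before_callback(3, ack_deadline_s=60, processing_time_s=3600))
```

Under these assumptions only the first message reaches its callback with a valid ack_id; every message behind it has already expired while buffered.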

When one of these expired messages is passed to _on_response, a receipt modack is sent for it via send_lease_modacks. If that modack fails with an EOD-related (exactly-once delivery) INVALID_ACK_ID error, we should not pass the message on to be processed, as it has already expired.
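The proposed behaviour could be sketched roughly as follows. This uses plain Python stand-ins rather than the library's internals: `ReceivedMessage`, `filter_expired`, and `fake_receipt_modack` are hypothetical names, and the sketch assumes the receipt modack step can report which ack_ids the service rejected as invalid. The real signatures of _on_response and send_lease_modacks are not shown in this issue and may differ.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Set


@dataclass(frozen=True)
class ReceivedMessage:
    """Hypothetical stand-in for a message delivered on the stream."""

    ack_id: str
    data: bytes


def filter_expired(
    messages: Iterable[ReceivedMessage],
    send_receipt_modacks: Callable[[List[str]], Set[str]],
) -> List[ReceivedMessage]:
    """Return only the messages whose receipt modack succeeded.

    Assumption: send_receipt_modacks returns the set of ack_ids that the
    service rejected with an invalid-ack-id error.
    """
    batch = list(messages)
    invalid = send_receipt_modacks([m.ack_id for m in batch])
    # Messages with rejected ack_ids have already expired: drop them
    # instead of handing them to the user callback.
    return [m for m in batch if m.ack_id not in invalid]


def fake_receipt_modack(ack_ids: List[str]) -> Set[str]:
    """Test double: pretend ack_ids prefixed with 'expired' are rejected."""
    return {a for a in ack_ids if a.startswith("expired")}


batch = [
    ReceivedMessage("fresh-1", b"a"),
    ReceivedMessage("expired-1", b"b"),
    ReceivedMessage("fresh-2", b"c"),
]
deliverable = filter_expired(batch, fake_receipt_modack)
print([m.ack_id for m in deliverable])
```

Only the messages that pass this filter would then be added to lease management and dispatched to the callback.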

Labels

api: pubsub (Issues related to the googleapis/python-pubsub API.)
priority: p2 (Moderately-important priority. Fix may not be included in next release.)
type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)
