WIP: Lambda invocation loop rework #8508

Closed
wants to merge 61 commits into from
61 commits
f23642b
wip
dominikschubert Jun 13, 2023
00b7e9f
First working invoke
joe4dev Jun 14, 2023
9c2c3b9
Only execute lambda tests (temporarily)
joe4dev Jun 14, 2023
743a2cb
Add stop version todo
joe4dev Jun 14, 2023
129818b
fix circleci config
dominikschubert Jun 15, 2023
a11eef2
fix formatting
dominikschubert Jun 15, 2023
5622eb2
wip
dfangl Jul 5, 2023
7e369b6
wip
dominikschubert Jul 11, 2023
f1906cc
Rework reserved and unreserved concurrency
joe4dev Jul 11, 2023
a545c9f
Add discussion comments
joe4dev Jul 11, 2023
37edee0
Add invocation encoder WIP
joe4dev Jul 12, 2023
556dd9a
Create internal async queue infrastructure
joe4dev Jul 12, 2023
320430f
Add provisioned concurrency tracker
joe4dev Jul 28, 2023
fa2d979
Fix payload JSON encoding
joe4dev Jul 28, 2023
78a85a4
Remove debug sleep
joe4dev Jul 28, 2023
3ccacb9
Re-use environments
joe4dev Jul 28, 2023
fbabc75
Add provisioned concurrency planning (WIP)
joe4dev Jul 28, 2023
cb0f65b
Put provisioned concurrency working
joe4dev Aug 2, 2023
d1ee050
Add most simple provisioned concurrency update
joe4dev Aug 2, 2023
0d022ba
Notify assignment service upon function keepalive timeout
joe4dev Aug 2, 2023
0afc45c
Fix linter error
joe4dev Aug 2, 2023
3bc6628
Fix resource cleanup upon stopping environments
joe4dev Aug 2, 2023
4212d3b
Fix lambda cleanup of active function breaking CI
joe4dev Aug 2, 2023
9c3ebbc
First queue-based invoke working
joe4dev Aug 2, 2023
dc13f7d
Add SQS invocation with retry field
joe4dev Aug 2, 2023
84ddfa5
Async SQS message handling (WIP)
joe4dev Aug 3, 2023
e5c19ff
Complete async failure handling (retries need fixing)
joe4dev Aug 3, 2023
537985b
Add hacky workaround for broken delay seconds
joe4dev Aug 3, 2023
1c460b9
Disable sleep workaround for broken delay seconds
joe4dev Aug 3, 2023
e471a2a
Fix delay seconds and add thread pool
joe4dev Aug 4, 2023
6789966
Handle and log exceptions
joe4dev Aug 4, 2023
74f1e66
Clarify defaults and sources of event handling implementation
joe4dev Aug 4, 2023
f8f232a
Handle event_invoke_config == None
joe4dev Aug 8, 2023
7884e58
Fix approx invocation count for reserved concurrency 0
joe4dev Aug 8, 2023
738d789
Handle exception retries (WIP)
joe4dev Aug 8, 2023
15ef084
Stop event manager and handle exception cases
joe4dev Aug 8, 2023
d97339c
Fix event source listener callback
joe4dev Aug 9, 2023
0a6fb31
Fix SQS => Lambda DLQ test by reducing retries
joe4dev Aug 9, 2023
b879880
Fix service exception types
joe4dev Aug 9, 2023
c1a21a0
Fix stopping Lambda environment for provisioned concurrency
joe4dev Aug 9, 2023
1699a3e
Draft locking design
joe4dev Aug 9, 2023
4579a7a
readd shutdown, refactor counting service to allow locking
dfangl Aug 9, 2023
2a44107
Fix warn logging deprecations
joe4dev Aug 10, 2023
d287c4a
Remove implemented event manager todo.py
joe4dev Aug 10, 2023
460c678
Fix Lambda => SNS DLQ => SQS test by reducing Lambda retries
joe4dev Aug 10, 2023
e9ad77c
Fix provisioned concurrency tests and exceptions
joe4dev Aug 10, 2023
b762929
Re-activate other AWS tests
joe4dev Aug 10, 2023
fd2c662
Fix concurrency quota assumptions for provisioned concurrency test
joe4dev Aug 11, 2023
8ccbaa6
Fix limits testing for reserved concurrency
joe4dev Aug 11, 2023
ff2fa93
Re-enable all tests
joe4dev Aug 11, 2023
e0f4057
Add more logging info for Lambda poller shutdown error
joe4dev Aug 11, 2023
9642b84
Add test for invoking non-existing function
joe4dev Aug 11, 2023
80abdf3
Fix locking scope and cleanup concurrency tracking
joe4dev Aug 11, 2023
064ae15
Remove draft of irrelevant counting service view
joe4dev Aug 11, 2023
4548ca1
Remove dead code in lambda service
joe4dev Aug 11, 2023
19cd208
Fix snapshot skips for old provider
joe4dev Aug 11, 2023
a4bfe09
Remove planning notes file
joe4dev Aug 11, 2023
5968ced
Fix init lock and exception handling
joe4dev Aug 22, 2023
11fea6f
Skip failing SQS DLQ test for old provider
joe4dev Aug 22, 2023
aebad97
Fixing poller shutdown (WIP)
joe4dev Aug 22, 2023
cb21d7f
add more debug output, reorder to avoid missing cleanups
dfangl Aug 22, 2023
54 changes: 25 additions & 29 deletions localstack/services/lambda_/event_source_listeners/adapters.py
@@ -3,7 +3,6 @@
import logging
import threading
from abc import ABC
from concurrent.futures import Future
from functools import lru_cache
from typing import Callable, Optional

@@ -13,7 +12,7 @@
from localstack.aws.protocol.serializer import gen_amzn_requestid
from localstack.services.lambda_ import api_utils
from localstack.services.lambda_.api_utils import function_locators_from_arn, qualifier_is_version
from localstack.services.lambda_.invocation.lambda_models import InvocationError, InvocationResult
from localstack.services.lambda_.invocation.lambda_models import InvocationResult
from localstack.services.lambda_.invocation.lambda_service import LambdaService
from localstack.services.lambda_.invocation.models import lambda_stores
from localstack.services.lambda_.lambda_executors import (
@@ -23,6 +22,7 @@
from localstack.utils.aws.client_types import ServicePrincipal
from localstack.utils.json import BytesEncoder
from localstack.utils.strings import to_bytes, to_str
from localstack.utils.threads import FuncThread

LOG = logging.getLogger(__name__)

@@ -143,29 +143,26 @@ def __init__(self, lambda_service: LambdaService):
self.lambda_service = lambda_service

def invoke(self, function_arn, context, payload, invocation_type, callback=None):
def _invoke(*args, **kwargs):
# split ARN ( a bit unnecessary since we build an ARN again in the service)
fn_parts = api_utils.FULL_FN_ARN_PATTERN.search(function_arn).groupdict()

# split ARN ( a bit unnecessary since we build an ARN again in the service)
fn_parts = api_utils.FULL_FN_ARN_PATTERN.search(function_arn).groupdict()

ft = self.lambda_service.invoke(
# basically function ARN
function_name=fn_parts["function_name"],
qualifier=fn_parts["qualifier"],
region=fn_parts["region_name"],
account_id=fn_parts["account_id"],
invocation_type=invocation_type,
client_context=json.dumps(context or {}),
payload=to_bytes(json.dumps(payload or {}, cls=BytesEncoder)),
request_id=gen_amzn_requestid(),
)

if callback:
result = self.lambda_service.invoke(
# basically function ARN
function_name=fn_parts["function_name"],
qualifier=fn_parts["qualifier"],
region=fn_parts["region_name"],
account_id=fn_parts["account_id"],
invocation_type=invocation_type,
client_context=json.dumps(context or {}),
payload=to_bytes(json.dumps(payload or {}, cls=BytesEncoder)),
request_id=gen_amzn_requestid(),
)

def mapped_callback(ft_result: Future[InvocationResult]) -> None:
if callback:
try:
result = ft_result.result(timeout=10)
error = None
if isinstance(result, InvocationError):
if result.is_error:
error = "?"
callback(
result=LegacyInvocationResult(
@@ -187,7 +184,8 @@ def mapped_callback(ft_result: Future[InvocationResult]) -> None:
error=e,
)

ft.add_done_callback(mapped_callback)
thread = FuncThread(_invoke)
thread.start()

def invoke_with_statuscode(
self,
@@ -204,7 +202,7 @@ def invoke_with_statuscode(
fn_parts = api_utils.FULL_FN_ARN_PATTERN.search(function_arn).groupdict()

try:
ft = self.lambda_service.invoke(
result = self.lambda_service.invoke(
# basically function ARN
function_name=fn_parts["function_name"],
qualifier=fn_parts["qualifier"],
@@ -218,11 +216,10 @@ def invoke_with_statuscode(

if callback:

def mapped_callback(ft_result: Future[InvocationResult]) -> None:
def mapped_callback(result: InvocationResult) -> None:
try:
result = ft_result.result(timeout=10)
error = None
if isinstance(result, InvocationError):
if result.is_error:
error = "?"
callback(
result=LegacyInvocationResult(
@@ -243,11 +240,10 @@ def mapped_callback(ft_result: Future[InvocationResult]) -> None:
error=e,
)

ft.add_done_callback(mapped_callback)
mapped_callback(result)

# they're always synchronous in the ASF provider
result = ft.result(timeout=900)
if isinstance(result, InvocationError):
if result.is_error:
return 500
else:
return 200
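
Note on the hunks above: the adapter stops chaining a `Future` callback and instead calls a blocking `invoke` inside a background thread, checking `result.is_error` rather than the result's type. The following is a minimal sketch of that pattern only, not the PR's code. `Result`, `fake_invoke`, and `invoke_in_background` are hypothetical names, and a plain `threading.Thread` stands in for LocalStack's `FuncThread`.

```python
import threading
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Result:
    # Hypothetical stand-in for InvocationResult; only the fields used here.
    payload: bytes
    is_error: bool = False


def fake_invoke(payload: bytes) -> Result:
    # Hypothetical blocking invoke: returns the result directly, not a Future.
    return Result(payload=payload.upper(), is_error=(payload == b"boom"))


def invoke_in_background(
    payload: bytes,
    callback: Optional[Callable[[Optional[Result], Optional[str]], None]] = None,
) -> threading.Thread:
    def _invoke() -> None:
        try:
            result = fake_invoke(payload)
            # Error detection via a flag on the result, not isinstance().
            error = "?" if result.is_error else None
        except Exception as e:
            result, error = None, str(e)
        if callback:
            callback(result, error)

    # The caller returns immediately; the callback fires from the worker thread.
    thread = threading.Thread(target=_invoke, daemon=True)
    thread.start()
    return thread
```

With this shape the caller never handles a `Future`; anything that needs the outcome has to go through the callback, which runs on the worker thread.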
148 changes: 148 additions & 0 deletions localstack/services/lambda_/invocation/assignment.py
@@ -0,0 +1,148 @@
import contextlib
import logging
from collections import defaultdict
from concurrent.futures import Future, ThreadPoolExecutor
from typing import ContextManager

from localstack.services.lambda_.invocation.execution_environment import (
    ExecutionEnvironment,
    InvalidStatusException,
)
from localstack.services.lambda_.invocation.lambda_models import (
    FunctionVersion,
    InitializationType,
    OtherServiceEndpoint,
)

LOG = logging.getLogger(__name__)


class AssignmentException(Exception):
    pass


class AssignmentService(OtherServiceEndpoint):
    """
    scope: LocalStack global
    """

    # function_version (fully qualified function ARN) => runtime_environment_id => runtime_environment
    environments: dict[str, dict[str, ExecutionEnvironment]]

    # Global pool for spawning and killing provisioned Lambda runtime environments
    provisioning_pool: ThreadPoolExecutor

    def __init__(self):
        self.environments = defaultdict(dict)
        self.provisioning_pool = ThreadPoolExecutor(thread_name_prefix="lambda-provisioning-pool")

    @contextlib.contextmanager
    def get_environment(
        self, function_version: FunctionVersion, provisioning_type: InitializationType
    ) -> ContextManager[ExecutionEnvironment]:
        version_arn = function_version.qualified_arn
        applicable_envs = (
            env
            for env in self.environments[version_arn].values()
            if env.initialization_type == provisioning_type
        )
        for environment in applicable_envs:
            try:
                environment.reserve()
                execution_environment = environment
                break
            except InvalidStatusException:
                pass
        else:
            if provisioning_type == "provisioned-concurrency":
                raise AssignmentException(
                    "No provisioned concurrency environment available despite lease."
                )
            elif provisioning_type == "on-demand":
                execution_environment = self.start_environment(function_version)
                self.environments[version_arn][execution_environment.id] = execution_environment
                execution_environment.reserve()
            else:
                raise ValueError(f"Invalid provisioning type {provisioning_type}")

        try:
            yield execution_environment
            execution_environment.release()
        except InvalidStatusException as invalid_e:
            LOG.error("Should not happen: %s", invalid_e)
        except Exception as e:
            LOG.error("Failed invocation %s", e)
            self.stop_environment(execution_environment)
            raise e

    def start_environment(self, function_version: FunctionVersion) -> ExecutionEnvironment:
        LOG.debug("Starting new environment")
        execution_environment = ExecutionEnvironment(
            function_version=function_version,
            initialization_type="on-demand",
            on_timeout=self.on_timeout,
        )
        try:
            execution_environment.start()
        except Exception as e:
            message = f"Could not start new environment: {e}"
            LOG.error(message, exc_info=LOG.isEnabledFor(logging.DEBUG))
            raise AssignmentException(message) from e
        return execution_environment

    def on_timeout(self, version_arn: str, environment_id: str) -> None:
        """Callback for deleting environment after function times out"""
        del self.environments[version_arn][environment_id]

    def stop_environment(self, environment: ExecutionEnvironment) -> None:
        version_arn = environment.function_version.qualified_arn
        try:
            environment.stop()
            self.environments.get(version_arn).pop(environment.id)
        except Exception as e:
            LOG.debug(
                "Error while stopping environment for lambda %s, environment: %s, error: %s",
                version_arn,
                environment.id,
                e,
            )

    def stop_environments_for_version(self, function_version: FunctionVersion):
        # We have to materialize the list before iterating due to concurrency
        environments_to_stop = list(
            self.environments.get(function_version.qualified_arn, {}).values()
        )
        for env in environments_to_stop:
            self.stop_environment(env)

    def scale_provisioned_concurrency(
        self, function_version: FunctionVersion, target_provisioned_environments: int
    ) -> list[Future[None]]:
        version_arn = function_version.qualified_arn
        current_provisioned_environments = [
            e
            for e in self.environments[version_arn].values()
            if e.initialization_type == "provisioned-concurrency"
        ]
        # TODO: refine scaling loop to re-use existing environments instead of re-creating all
        # current_provisioned_environments_count = len(current_provisioned_environments)
        # diff = target_provisioned_environments - current_provisioned_environments_count

        # TODO: handle case where no provisioned environment is available during scaling
        # Most simple scaling implementation for now:
        futures = []
        # 1) Re-create new target
        for _ in range(target_provisioned_environments):
            execution_environment = ExecutionEnvironment(
                function_version=function_version,
                initialization_type="provisioned-concurrency",
                on_timeout=self.on_timeout,
            )
            self.environments[version_arn][execution_environment.id] = execution_environment
            futures.append(self.provisioning_pool.submit(execution_environment.start))
        # 2) Kill all existing
        for env in current_provisioned_environments:
            # TODO: think about concurrent updates while deleting a function
            futures.append(self.provisioning_pool.submit(self.stop_environment, env))

        return futures
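
Note on `get_environment`: the method tries to reserve an existing environment and only creates one when the loop finishes without a `break`, which is what the `for`/`else` expresses. The sketch below isolates that reserve-or-create flow under simplifying assumptions. `Env`, `InvalidStatus`, and `Pool` are hypothetical stand-ins, there is no provisioning-type filter, and a failed invocation simply drops the environment from the pool instead of stopping a real runtime.

```python
import contextlib
from collections import defaultdict
from typing import Iterator


class InvalidStatus(Exception):
    """Hypothetical stand-in for InvalidStatusException."""


class Env:
    """Hypothetical, heavily simplified execution environment."""

    _next_id = 0

    def __init__(self) -> None:
        Env._next_id += 1
        self.id = Env._next_id
        self.busy = False

    def reserve(self) -> None:
        if self.busy:
            raise InvalidStatus(f"env {self.id} is busy")
        self.busy = True

    def release(self) -> None:
        self.busy = False


class Pool:
    def __init__(self) -> None:
        # version ARN => environment id => environment
        self.environments: dict[str, dict[int, Env]] = defaultdict(dict)

    @contextlib.contextmanager
    def get_environment(self, version_arn: str) -> Iterator[Env]:
        for env in list(self.environments[version_arn].values()):
            try:
                env.reserve()
                break  # found a free environment
            except InvalidStatus:
                continue
        else:
            # Runs only when the loop ended without `break`: nothing was free.
            env = Env()
            self.environments[version_arn][env.id] = env
            env.reserve()
        try:
            yield env
        except Exception:
            # Assumption: a failed invocation discards the environment.
            self.environments[version_arn].pop(env.id, None)
            raise
        else:
            env.release()
```

Nested `with pool.get_environment(arn)` blocks for the same ARN receive distinct environments, and a later call reuses one that has been released.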
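
Note on `scale_provisioned_concurrency`: the method submits starts for a full new set of environments, then submits stops for the old set, and returns all futures so the caller decides when to wait. Below is a rough sketch of that replace-then-retire approach, assuming the caller blocks with `concurrent.futures.wait`. `Env` and `scale` are hypothetical, and unlike the diff this sketch removes old entries from the dict in the calling thread rather than inside the worker.

```python
from concurrent.futures import Future, ThreadPoolExecutor, wait


class Env:
    """Hypothetical stand-in for a provisioned execution environment."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.state = "created"

    def start(self) -> None:
        self.state = "running"

    def stop(self) -> None:
        self.state = "stopped"


def scale(
    pool: ThreadPoolExecutor, current: dict[str, Env], target: int, generation: int
) -> list[Future]:
    # Materialize the old set before mutating the dict.
    old = list(current.values())
    futures: list[Future] = []
    # 1) Start a complete new set of environments.
    for i in range(target):
        env = Env(f"gen{generation}-{i}")
        current[env.name] = env
        futures.append(pool.submit(env.start))
    # 2) Retire every environment from the previous set.
    for env in old:
        current.pop(env.name, None)
        futures.append(pool.submit(env.stop))
    return futures
```

A caller would typically run `wait(scale(pool, envs, target, generation))` before reporting the new concurrency as ready. Reusing existing environments instead of recreating them, as the TODO in the diff suggests, would reduce the work submitted per update.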