Access current update info with ID inside update handler #544

Merged
merged 5 commits into from
Jun 6, 2024

Conversation

cretz
Member

@cretz cretz commented Jun 5, 2024

What was changed

  • Added temporalio.workflow.UpdateInfo, similar to temporalio.workflow.Info, with workflow-accessible information for an update (currently id and name)
  • Added temporalio.workflow.current_update_info(), similar to temporalio.workflow.info(), with the current in-context update info
    • This uses contextvars so it works inside the handler and things it starts but nowhere else
    • current_update_info() is preferred over info() because the latter is always present and there is nothing "current" about it, whereas the former can return None if you're not in an update
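The behaviour described in the bullets above can be modelled with the standard library alone. This is a minimal sketch, not the SDK's code: `_current_update_info`, `run_update_handler`, and the local `UpdateInfo` class are hypothetical stand-ins, and the only assumption carried over from the description is that the getter reads a context variable that is set when a handler starts.

```python
import asyncio
import contextvars
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UpdateInfo:
    """Hypothetical stand-in for temporalio.workflow.UpdateInfo."""

    id: str
    name: str


# Hypothetical module-private variable. The default makes the getter return
# None outside an update instead of raising LookupError.
_current_update_info = contextvars.ContextVar("current_update_info", default=None)


def current_update_info() -> Optional[UpdateInfo]:
    """Return the info for the update running in this context, if any."""
    return _current_update_info.get()


async def child_task() -> Optional[UpdateInfo]:
    # A task copies the context of its creator, so the handler's value is visible.
    return current_update_info()


async def handler() -> Optional[UpdateInfo]:
    return await asyncio.create_task(child_task())


async def run_update_handler(info: UpdateInfo) -> Optional[UpdateInfo]:
    """Hypothetical dispatcher: set the variable, then run the handler."""
    _current_update_info.set(info)
    return await handler()


async def main():
    # The handler runs in its own task, so its set() does not leak back here.
    seen = await asyncio.create_task(
        run_update_handler(UpdateInfo(id="upd-1", name="my_update"))
    )
    outside = current_update_info()
    return seen, outside


seen, outside = asyncio.run(main())
```

Here `seen` is the `UpdateInfo` observed by the child task and `outside` is what the getter returns once the handler's task has finished.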

Checklist

  1. Closes Expose UpdateID in an update handler #542

@cretz cretz requested a review from a team as a code owner June 5, 2024 20:09
Member

@Sushisource Sushisource left a comment

Naming works for me.

I think current_update_info could just be update_info still, since I dunno if current necessarily tells you much about how it might not be present at all. I'm fine with either tho.

    class UpdateInfo:
        """Information about a workflow update."""

        id: str
Member Author

@cretz cretz Jun 5, 2024

We use workflow_id to differentiate from all the other _id fields in Info, should we call this update_id? A simple id I think works here but can change if needed.

In Go we did ID, but update_id is also fine.

Member Author

👍 I think id works best here.

@cretz
Member Author

cretz commented Jun 5, 2024

since I dunno if current necessarily tells you much about how it might not be present at all. I'm fine with either tho.

Yeah I think the better justification for "current" is that it is not fixed and therefore only matters to the "current" context it's called within.

Contributor

@dandavison dandavison left a comment

There's one aspect of this implementation that isn't ideal -- the SDK is making use of the userspace contextvars feature and there's some abstraction leakage -- the SDK's current_update_info() API will suddenly stop working if a user sets the context themselves in Python >= 3.11. This is unfortunate since in general our philosophy is to encourage users to use all features of the language, so we'd like them to be able to control contexts manually in tasks spawned in update handlers.

I'm hoping that we can solve this by modifying our implementation of create_task, something like this:

    def create_task(
        self,
        coro: Union[Awaitable[_T], Generator[Any, None, _T]],
        *,
        name: Optional[str] = None,
        context: Optional[contextvars.Context] = None,
    ) -> asyncio.Task[_T]:
        # Context only supported on newer Python versions
        if sys.version_info >= (3, 11):
            if context:
                if update_info := temporalio.workflow.current_update_info():
                    context.run(
                        lambda: temporalio.workflow._set_current_update_info(
                            update_info
                        )
                    )

Here's an illustration of the problem I'm suggesting we try to solve: Suppose Alice initially writes workflow code like

    @workflow.update
    async def my_handler(self):
        asyncio.create_task(self.my_child_task_of_handler())

    async def my_child_task_of_handler(self):
        update_info = workflow.current_update_info()

Then later Bob changes it to:

        asyncio.create_task(self.my_child_task_of_handler(), context=my_context)

Now update_info has suddenly become None in Alice's handler, and the reason won't be obvious to the users (they might guess, but confirming it requires looking at SDK implementation).
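The Alice and Bob scenario can be reproduced without the SDK. In this sketch `_update_id` is a hypothetical stand-in for the SDK's internal variable and `spawn` is a hypothetical helper that exists only so the example also runs on Python versions older than 3.11, where `create_task` has no `context` parameter.

```python
import asyncio
import contextvars
import sys

# Hypothetical stand-in for the SDK's internal context variable.
_update_id = contextvars.ContextVar("update_id", default=None)


def spawn(coro, context=None):
    """Create a task, optionally in an explicit context (hypothetical helper)."""
    if context is None:
        return asyncio.create_task(coro)
    if sys.version_info >= (3, 11):
        return asyncio.create_task(coro, context=context)
    # Older Pythons: a task copies whichever context is current at creation.
    return context.run(asyncio.get_running_loop().create_task, coro)


async def child():
    return _update_id.get()


async def handler():
    _update_id.set("upd-1")
    # Alice's version: the child inherits a copy of the handler's context.
    inherited = await spawn(child())
    # Bob's version with a copied context: the value is still there.
    copied = await spawn(child(), context=contextvars.copy_context())
    # Bob's version with a fresh context: the value is gone.
    fresh = await spawn(child(), context=contextvars.Context())
    return inherited, copied, fresh


inherited, copied, fresh = asyncio.run(handler())
```

Only the task started with a brand-new `contextvars.Context()` loses the value. A context obtained from `copy_context()` behaves like the default.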

    """Information about a workflow update."""

    id: str
    """Update ID."""
Contributor

Stray string (should use a comment, not a string, if it's explaining the field).

Member Author

For dataclasses we have used docstrings like this in the past. For example, see https://python.temporal.io/temporalio.client.WorkflowExecution.html.
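For readers unfamiliar with the convention: a string literal placed directly after a field is an ordinary expression statement that Python evaluates and discards, so it changes nothing at runtime, while documentation tools that read the source can attach it to the field. A small sketch (this local `UpdateInfo` only mirrors the snippet under review):

```python
import dataclasses


@dataclasses.dataclass
class UpdateInfo:
    """Information about a workflow update."""

    id: str
    """Update ID."""

    name: str
    """Update type name."""


# The per-field strings do not create fields or attributes of their own.
field_names = [f.name for f in dataclasses.fields(UpdateInfo)]
class_doc = UpdateInfo.__doc__
info = UpdateInfo(id="upd-1", name="my_update")
```

Only the class-level docstring is stored in `__doc__`. The strings after each field are visible to source-reading tools but not to the interpreter.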

@cretz
Member Author

cretz commented Jun 6, 2024

@dandavison - This is why we document it as using contextvars. We do the same with activity context too. We tell people if they need to copy context they can. In your sample you have my_context but you don't show how that context is created. Can you show that? It's usually via copy_context. We should not override people's/Python's chosen context propagation. Same for activities.

@dandavison
Contributor

We should not override people's/Python's chosen context propagation.

Agreed -- I'm not suggesting overriding it; I'm suggesting ensuring that the special Temporal-internal keys are always set on their context, whether it's a fresh context created via contextvars.Context() or by copy_context from the existing context.

@cretz
Member Author

cretz commented Jun 6, 2024

I'm suggesting ensuring that the special Temporal-internal keys are always set on their context, whether it's a fresh context created via contextvars.Context() or by copy_context from the existing context.

I don't think we should. If a user is choosing not to propagate the context (and we clearly state this is based on context var), that is their choice. This is the same for activity context. And this will be the same in other programming languages (e.g. Go contexts and .NET AsyncLocal) when we add this feature there. Developers in these languages can intentionally discard the context, this is normal context var behavior.

@dandavison
Contributor

dandavison commented Jun 6, 2024

Yes, I can see two design decisions here:

A

The rule is simply that workflow.current_update_info() can be used in a handler coroutine or in any child task thereof. Users do not need to know the SDK's implementation of this.

B

The rule is that workflow.current_update_info() can be used, and that users should understand that the implementation uses Python's contextvars mechanism, and therefore that it won't work if they replace the context with one that does not inherit from the current context.

If we're going with B, then I wonder whether we should expose the contextvar directly for public use rather than wrapping it in a getter.

    @workflow.update
    async def my_handler(self):
        # The user knows they are in a handler, so LookupError cannot occur here.
        update_info = workflow.current_update_info.get()

That way, rather than having to explain the implementation to users, it follows naturally from the language that you can't expect this to work if you replace the context with one that doesn't inherit from the parent. And users can choose to allow LookupError, or provide a default.
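The two choices mentioned here (allow `LookupError`, or provide a default) are standard `ContextVar.get` behaviour. A sketch with a hypothetical public variable declared without a default:

```python
import contextvars

# Hypothetical public variable, declared without a default.
current_update_info = contextvars.ContextVar("current_update_info")


def read_strict():
    # Raises LookupError when the variable is unset in the active context.
    return current_update_info.get()


def read_or_none():
    # Passing a default to get() suppresses the LookupError.
    return current_update_info.get(None)


# An empty context models code running outside any update handler.
empty = contextvars.Context()
try:
    empty.run(read_strict)
    strict_outcome = "value"
except LookupError:
    strict_outcome = "LookupError"

lenient_outcome = empty.run(read_or_none)

# A context where the value has been set models code inside a handler.
inside = contextvars.copy_context()
inside.run(current_update_info.set, "upd-1")
inside_value = inside.run(read_strict)
```

Exposing the variable this way also exposes its `set` method to user code.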

The docs would read something like

workflow.current_update_info is a contextvar that you can use to obtain an UpdateInfo object containing information about the current update. The SDK takes care of setting the value when your handler starts executing. Beyond that, it's just a normal contextvar so, for example, its value is propagated to child tasks by default, and also if you create your own context for the child task using copy_context.
Example usage:
<like snippet above>

That said, I see we do it as follows for activities:

During activity execution, an implicit activity context is set as a context variable. The context variable itself is not visible, but calls in the temporalio.activity package make use of it.
...
With the exception of in_activity(), if any of the functions are called outside of an activity context, an error occurs.

@cretz
Member Author

cretz commented Jun 6, 2024

If we're going with B, then I wonder whether we should expose the contextvar directly for public use rather than wrapping it in a getter.

I think we can go with B and not expose the mutable context var. It is better to explain to our users the implementation than give them raw access. Most users aren't going to care that it's a context var any more than most users care that activity context is (and we clearly document that too). This is the same in other languages. We won't give the Go context key to access this update info, we will make a getter that accesses it on the context for them. Same for other languages too IMO.

That said, I see we do it as follows for activities:

Yes, intentionally (and similar in other languages).

@cretz cretz merged commit 58d6951 into temporalio:main Jun 6, 2024
12 checks passed
@cretz cretz deleted the current-update-info branch June 6, 2024 21:37
    """Update ID."""

    name: str
    """Update type name."""

This meaning is unclear to me. Is name the function name (unscoped by the class name) unless it's overridden with the attribute?

Member Author

@cretz cretz Jun 7, 2024

Yes. This is also used in the decorator and the client calls. The meaning of update ID above it may be unclear to users too.

The "update name" is a Temporal concept like "workflow type" in "Info.workflow_type". Ideally the general documentation would detail this, but if necessary we can update all of the Python docs for signal/update/query name, all the workflow info stuff, and all of that if we want. But this one field isn't unique (it applies to the field above it, where "name" is used elsewhere, all the other fields we delegate to docs to describe, etc).

Development

Successfully merging this pull request may close these issues.

Expose UpdateID in an update handler
5 participants