[Feature Request] Disabling Deadlock detection in data converters #823

Open
CtrlAltGiri opened this issue Apr 10, 2025 · 5 comments
Labels
enhancement New feature or request

Comments

@CtrlAltGiri

Looking to get a workflow.disabledeadlockdetection context manager for data converters, similar to what other language SDKs offer:

https://github.com/temporalio/rules/blob/main/rules/TMPRL1101.md

@CtrlAltGiri CtrlAltGiri added the enhancement New feature or request label Apr 10, 2025
@CtrlAltGiri CtrlAltGiri changed the title [Feature Request] Deadlock detection in data converters [Feature Request] Disabling Deadlock detection in data converters Apr 10, 2025
@cretz
Member

cretz commented Apr 10, 2025

That feature exists in the Go and Java SDKs because they don't have a clear separation of payload codecs and payload converters, but other SDKs do. Deadlock detection is already disabled for the payload codec part of the data converter (because it runs outside the sandbox), just not for the payload converter part. This is by intention. You should not do any heavy/IO/async work inside a payload converter; rather, do it in a payload codec. Sometimes a custom payload codec can be combined with a custom payload converter so the converter can set some info in payload metadata that the codec needs.

Does this help? If not, can you share your use case for why deadlock detection needs to be disabled for your payload converter instead?
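
For reference, a minimal sketch of that split against the temporalio.converter.PayloadCodec interface (the class name, the "binary/my-codec" encoding key, and the server address below are illustrative, not something from this thread):

    import dataclasses
    from typing import List, Sequence

    import temporalio.converter
    from temporalio.api.common.v1 import Payload
    from temporalio.client import Client


    class MyCodec(temporalio.converter.PayloadCodec):
        """Runs outside the sandbox, so heavy/IO/async work here is not
        subject to deadlock detection."""

        async def encode(self, payloads: Sequence[Payload]) -> List[Payload]:
            # Wrap each payload; metadata set by a custom payload converter is
            # visible on p.metadata and can drive how the codec encodes it.
            return [
                Payload(
                    metadata={"encoding": b"binary/my-codec"},
                    # A real codec would compress/encrypt here.
                    data=p.SerializeToString(),
                )
                for p in payloads
            ]

        async def decode(self, payloads: Sequence[Payload]) -> List[Payload]:
            return [
                Payload.FromString(p.data)
                if p.metadata.get("encoding", b"") == b"binary/my-codec"
                else p
                for p in payloads
            ]


    async def connect() -> Client:
        # Attach the codec to the default data converter; the payload
        # converter part is untouched and still runs inside the sandbox.
        return await Client.connect(
            "localhost:7233",
            data_converter=dataclasses.replace(
                temporalio.converter.default(), payload_codec=MyCodec()
            ),
        )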

@CtrlAltGiri
Author

Got it.

Thanks @cretz

I've written a custom data converter which serializes generics and pydantic objects myself. This doesn't take more than 100-200ms, but for large payloads I get this 2s deadlock error.

I've given my worker pod 1 CPU core, so it's definitely not a scheduling issue.

Trying to understand if something else is the issue here.

@cretz
Member

cretz commented Apr 21, 2025

This doesn't take more than 100-200ms at best, but for large payloads, I get this 2s deadlock error.

Yes, there may be something else tripping deadlock detection. Deadlock detection just means a workflow task did not finish processing within 2s. This could mean Python is not able to process the code quickly enough, or something like a spinning loop or accidental thread blocking in the workflow. It could just be a simple case of too many things on one core to process workflow tasks fast enough.
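
As a contrived illustration of that last point (a hypothetical workflow, not the reporter's code), accidental thread blocking looks like this:

    import time

    from temporalio import workflow


    @workflow.defn
    class BlockingWorkflow:
        @workflow.run
        async def run(self) -> None:
            # time.sleep blocks the workflow thread instead of yielding to the
            # event loop, so the workflow task cannot complete within 2s and
            # deadlock detection fires even though nothing is stuck forever.
            # The non-blocking equivalent would be `await asyncio.sleep(5)`.
            time.sleep(5)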

@CtrlAltGiri
Author

Thanks @cretz. I'm moving my data converter code into the codec to ensure I have enough time to serialize.

One quick question before I close the issue: Can you point me to the logic that whitelists the codec from the sandbox? I tried digging in the SDK, but couldn't find anything specific.

@cretz
Member

cretz commented Apr 28, 2025

Can you point me to the logic that whitelists the codec from the sandbox?

It's not whitelisted per se, it just runs before/after the sandbox. So we decode at

    if self._data_converter.payload_codec:
        await temporalio.bridge.worker.decode_activation(
            act, self._data_converter.payload_codec
        )

then we let the workflow task be processed (which is subject to deadlock detection and does payload converter stuff within it), then we encode it on the way out at

    if self._data_converter.payload_codec:
        try:
            await temporalio.bridge.worker.encode_completion(
                completion, self._data_converter.payload_codec
            )

So the logic for codecs is on the boundaries, not within the task processing. It is of course still subject to task timeouts (default 10s), so we recommend trying to make it as fast as possible (e.g. if using a KMS for encryption keys, try to cache keys locally where it makes sense instead of downloading every time).
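
As a sketch of that caching suggestion (the kms_client and its get_data_key call below are hypothetical placeholders, and the actual crypto is elided), the codec could refresh a cached key instead of calling the KMS on every encode/decode:

    import time
    from typing import List, Optional, Sequence

    import temporalio.converter
    from temporalio.api.common.v1 import Payload


    class CachingEncryptionCodec(temporalio.converter.PayloadCodec):
        def __init__(self, kms_client, key_ttl_seconds: float = 300.0) -> None:
            self._kms = kms_client
            self._key_ttl = key_ttl_seconds
            self._key: Optional[bytes] = None
            self._fetched_at = 0.0

        async def _data_key(self) -> bytes:
            # Only hit the KMS when the cached key is missing or stale.
            if self._key is None or time.monotonic() - self._fetched_at > self._key_ttl:
                self._key = await self._kms.get_data_key()  # hypothetical KMS call
                self._fetched_at = time.monotonic()
            return self._key

        async def encode(self, payloads: Sequence[Payload]) -> List[Payload]:
            key = await self._data_key()
            # A real implementation would encrypt each payload with `key` here.
            return list(payloads)

        async def decode(self, payloads: Sequence[Payload]) -> List[Payload]:
            key = await self._data_key()
            # A real implementation would decrypt each payload with `key` here.
            return list(payloads)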
