Releases: dapr/dapr
Dapr Runtime v1.16.5
Dapr 1.16.5
This update includes bug fixes:
- Trace information not populated in pubsub component using gRPC as transport
- Allow for OIDC clientSecret to be rotated when token is refreshed
Trace information not populated in pubsub component using gRPC as transport
Problem
The pubsub component did not correctly propagate tracing information when delivering messages over gRPC.
Impact
Distributed traces were incomplete or missing links between publishers and subscribers. This prevented users from reliably correlating pubsub messages with their originating requests and spans.
Root Cause
The gRPC metadata used for pubsub calls did not include the tracing headers expected by downstream services and OpenTelemetry tooling. In particular, the trace context was not consistently attached to outgoing gRPC calls.
Solution
The trace context is now explicitly added to the outgoing gRPC metadata for pubsub calls. This ensures that downstream services receive the necessary tracing information and that spans can be correctly correlated across pubsub message flows.
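As an illustration of the approach, here is a minimal sketch of injecting the current trace context into outgoing gRPC metadata using the OpenTelemetry Go SDK. The carrier type and helper name are illustrative, not the actual Dapr implementation.

```go
package pubsubtrace

import (
	"context"

	"go.opentelemetry.io/otel"
	"google.golang.org/grpc/metadata"
)

// grpcMetadataCarrier adapts gRPC metadata.MD to the OpenTelemetry
// TextMapCarrier interface so a propagator can write trace headers into it.
type grpcMetadataCarrier metadata.MD

func (c grpcMetadataCarrier) Get(key string) string {
	vals := metadata.MD(c).Get(key)
	if len(vals) == 0 {
		return ""
	}
	return vals[0]
}

func (c grpcMetadataCarrier) Set(key, value string) {
	metadata.MD(c).Set(key, value)
}

func (c grpcMetadataCarrier) Keys() []string {
	keys := make([]string, 0, len(c))
	for k := range c {
		keys = append(keys, k)
	}
	return keys
}

// withTraceMetadata attaches the current span context (traceparent/tracestate)
// to the outgoing gRPC metadata, so the subscriber side can link its spans
// back to the publisher.
func withTraceMetadata(ctx context.Context) context.Context {
	md, ok := metadata.FromOutgoingContext(ctx)
	if ok {
		md = md.Copy()
	} else {
		md = metadata.MD{}
	}
	otel.GetTextMapPropagator().Inject(ctx, grpcMetadataCarrier(md))
	return metadata.NewOutgoingContext(ctx, md)
}
```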
Allow for OIDC clientSecret to be rotated when token is refreshed in the Pulsar PubSub component
Problem
The Pulsar OAuth2 client in the Go SDK only loads the client secret once at startup, and the Dapr Pulsar component only supports providing the clientSecret as a static value.
This combination prevents rotating the OAuth2 client secret via a file path and breaks authentication when the clientSecret is changed.
Impact
Environments with strict security policies that require periodic rotation of the Pulsar OAuth2 client secret cannot safely rotate secrets.
Once the clientSecret file is updated, token refresh operations may fail because the running client continues using the old secret, leading to authentication errors and potential message flow interruption.
Root Cause
The Dapr Pulsar component exposes clientSecret only as a literal value in metadata, not as a file path, so it cannot take advantage of secret rotation mechanisms based on files.
Solution
The Dapr Pulsar component adds support for specifying clientSecret (privateKey) via a file path in its metadata.
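A minimal sketch of the idea, assuming a file-backed secret: the client secret is re-read from disk whenever a token refresh needs it, so rotating the mounted file takes effect without restarting the sidecar. Type and method names are hypothetical.

```go
package pulsarauth

import (
	"os"
	"strings"
	"sync"
)

// fileClientSecret re-reads the OAuth2 client secret from disk each time it
// is requested, so rotating the file (e.g. a mounted Kubernetes Secret) takes
// effect on the next token refresh without restarting the sidecar.
type fileClientSecret struct {
	mu   sync.Mutex
	path string
}

func newFileClientSecret(path string) *fileClientSecret {
	return &fileClientSecret{path: path}
}

// Secret returns the current contents of the secret file.
func (f *fileClientSecret) Secret() (string, error) {
	f.mu.Lock()
	defer f.mu.Unlock()
	b, err := os.ReadFile(f.path)
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(string(b)), nil
}
```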
Dapr Runtime v1.16.4
Dapr 1.16.4
This update includes bug fixes:
- Workflow logging loop when no pending task completed
- Deleted Jobs in all prefix matching deleted Namespaces
Workflow logging loop when no pending task completed
Problem
When daprd becomes unhealthy during a workflow execution, activity tasks that complete while no pending tasks remain cause daprd to log in a loop.
Impact
Daprd continually prints log messages indicating that an activity task result was completed even though there are no pending tasks.
Root Cause
Daprd holds a streaming connection to Schedulers which handles job execution for the Jobs API, Actor Reminders, and workflow execution.
Each stream established has a single set of types which the client supports.
When the app reports as unhealthy, the stream to Schedulers needs to be re-established because daprd no longer supports the Jobs API and Actor Reminders while the app is unhealthy.
This restarts the workflow runtime, which clears all pending activity tasks.
Task completion results from the previous execution are then received with no pending tasks, causing an internal error.
This error is intentionally retried indefinitely, resulting in a logging loop.
Solution
The error occurring from no pending tasks is now typed as a non-retryable error, preventing the logging loop.
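A minimal sketch of the pattern, with illustrative names: the "no pending task" condition is wrapped in a non-retryable error type, and the retry loop gives up when it sees one instead of retrying (and logging) forever.

```go
package workflowerr

import (
	"errors"
	"fmt"
	"time"
)

// errNoPendingTask marks a completion result that arrives when no matching
// pending task exists; retrying cannot succeed, so it is typed non-retryable.
var errNoPendingTask = errors.New("no pending task for completed activity result")

type nonRetryableError struct{ err error }

func (e *nonRetryableError) Error() string { return e.err.Error() }
func (e *nonRetryableError) Unwrap() error { return e.err }

func asNonRetryable(err error) error { return &nonRetryableError{err: err} }

func isNonRetryable(err error) bool {
	var nr *nonRetryableError
	return errors.As(err, &nr)
}

// retry keeps calling fn until it succeeds or returns a non-retryable error,
// so a permanent condition does not produce an endless logging loop.
func retry(fn func() error) error {
	for {
		err := fn()
		if err == nil {
			return nil
		}
		if isNonRetryable(err) {
			return fmt.Errorf("giving up: %w", err)
		}
		time.Sleep(time.Second)
	}
}
```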
Deleted Jobs in all prefix matching deleted Namespaces
Problem
Deleting a namespace in Kubernetes will delete all the associated jobs in that namespace.
If there are any other namespaces with a name which has a prefix matching the deleted namespace, the jobs in those namespaces will also be deleted (i.e. deleting namespace "test" will also delete jobs in namespace "test-1" or "test-abc").
Impact
Deleting a namespace will delete jobs in other namespaces with prefix matching the deleted namespace.
Root Cause
The prefix-matching logic did not terminate the namespace prefix with a delimiter to force an exact namespace match, so deleting a namespace also matched, and deleted, jobs in any namespace whose name started with the deleted namespace's name.
Solution
The prefix logic has been updated to ensure that only jobs in the exact deleted namespace are deleted.
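For illustration, a sketch of the fix under an assumed key layout (namespace, app ID, and job name joined by a separator): terminating the prefix with the separator makes the namespace segment an exact match, so deleting "test" no longer matches "test-1" or "test-abc".

```go
package jobstore

import "strings"

// keySeparator joins the segments of a job key, e.g. "<ns>||<app>||<job>".
// The layout is illustrative, not the actual Scheduler key format.
const keySeparator = "||"

// jobKeyPrefix terminates the namespace with the separator so that only keys
// in exactly this namespace match.
func jobKeyPrefix(namespace string) string {
	return namespace + keySeparator
}

func jobsToDelete(allKeys []string, namespace string) []string {
	prefix := jobKeyPrefix(namespace)
	var out []string
	for _, k := range allKeys {
		if strings.HasPrefix(k, prefix) {
			out = append(out, k)
		}
	}
	return out
}
```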
Dapr Runtime v1.16.3
Dapr 1.16.3
This update includes bug fixes:
SFTP binding not handling reconnections
Problem
The SFTP binding, introduced in v1.15.0, did not correctly handle reconnections.
If the SFTP connection was closed externally (outside the Dapr sidecar), the sidecar would not attempt to reconnect.
Impact
In scenarios where the SFTP server or network closed the connection, the Dapr sidecar lost connectivity permanently and required a restart to restore SFTP communication.
Root Cause
The SFTP binding maintained a single long-lived connection and did not attempt to recreate it when operations failed due to network or server-side disconnects.
Once the underlying SFTP/SSH session was closed, subsequent binding operations continued to use the stale connection instead of establishing a new one, leaving the binding in a permanently broken state until the sidecar was restarted.
Solution
A new reconnection mechanism was added to the SFTP binding (PR).
When an SFTP action fails due to a connection issue, the binding now attempts to reconnect to the server and restore connectivity automatically, avoiding the need to restart the sidecar.
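A minimal sketch of such a reconnect-and-retry wrapper, assuming the github.com/pkg/sftp and golang.org/x/crypto/ssh packages; the struct and method names are illustrative rather than the binding's actual code.

```go
package sftpbinding

import (
	"fmt"
	"sync"

	"github.com/pkg/sftp"
	"golang.org/x/crypto/ssh"
)

// client wraps an SFTP session and re-establishes it when an operation fails,
// so an externally closed connection does not leave the binding permanently
// broken.
type client struct {
	mu   sync.Mutex
	addr string
	cfg  *ssh.ClientConfig
	ssh  *ssh.Client
	sftp *sftp.Client
}

func (c *client) connect() error {
	sshConn, err := ssh.Dial("tcp", c.addr, c.cfg)
	if err != nil {
		return fmt.Errorf("ssh dial: %w", err)
	}
	sftpConn, err := sftp.NewClient(sshConn)
	if err != nil {
		sshConn.Close()
		return fmt.Errorf("sftp session: %w", err)
	}
	c.ssh, c.sftp = sshConn, sftpConn
	return nil
}

// do runs an SFTP operation and, if it fails, reconnects once and retries,
// instead of keeping the stale session around.
func (c *client) do(op func(*sftp.Client) error) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.sftp == nil {
		if err := c.connect(); err != nil {
			return err
		}
	}
	if err := op(c.sftp); err != nil {
		// Drop the (possibly dead) session and retry once on a fresh one.
		c.sftp.Close()
		c.ssh.Close()
		if rerr := c.connect(); rerr != nil {
			return rerr
		}
		return op(c.sftp)
	}
	return nil
}
```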
Dapr Runtime v1.16.2
Dapr 1.16.2
This update includes bug fixes:
- HTTP API default CORS behavior
- Scheduler External etcd with multiple client endpoints
- Placement not cleaning internal state after host that had actors disconnects
- Blocked Placement dissemination during high churn
- Blocked Placement dissemination with high Scheduler dataset
- Fix panic during actor deactivation
- OpenTelemetry environment variables support
- Fixing goavro bug due to codec state mutation
- APP_API_TOKEN not passed in gRPC metadata for app callbacks
- Fixed Pulsar OAuth token renewal
- Fix Scheduler connection during non-graceful network interruptions
- Prevent infinite loop when workflow state is corrupted or destroyed
HTTP API default CORS behavior
Problem
The 1.16.0 release introduced a change to the default CORS behavior of the Dapr HTTP API: CORS headers were added to all HTTP responses by default, and this new behavior could not be disabled.
Impact
This caused problems in scenarios where CORS is handled outside of the Dapr sidecar, because the Dapr Sidecar always added CORS headers.
Solution
Part of the behavior introduced in that PR was reverted: the default value of the allowed-origins flag is now an empty string, which disables the CORS filter by default.
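A simplified sketch of the resulting behavior, not the Dapr implementation itself: the CORS filter is only applied when allowed-origins is non-empty, so with the default empty value requests pass through untouched and CORS can be handled outside the sidecar.

```go
package httpserver

import "net/http"

// withCORS wraps the handler with a CORS filter only when allowedOrigins is
// non-empty. With the default empty value the request passes through
// untouched, leaving CORS handling to whatever sits in front of the sidecar.
func withCORS(next http.Handler, allowedOrigins string) http.Handler {
	if allowedOrigins == "" {
		return next // CORS filter disabled by default
	}
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Access-Control-Allow-Origin", allowedOrigins)
		if r.Method == http.MethodOptions {
			w.WriteHeader(http.StatusNoContent)
			return
		}
		next.ServeHTTP(w, r)
	})
}
```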
Scheduler External etcd with multiple client endpoints
Problem
Using Scheduler in non-embed mode with multiple etcd client endpoints was not working.
Impact
It was not possible to use multiple etcd endpoints for high availability with an external etcd database for scheduler.
Root Cause
The Scheduler etcd client endpoints CLI flag was typed as a string array, rather than a string slice, causing the given value to be parsed as a single string rather than a slice of strings.
Solution
Changed the type of the etcd client endpoints CLI flag to be a string slice.
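The difference can be seen with spf13/pflag, which Dapr-style CLI flags are built on: a string array keeps each flag occurrence as one literal element, while a string slice splits comma-separated values. The flag names below are illustrative.

```go
package main

import (
	"fmt"

	"github.com/spf13/pflag"
)

func main() {
	fs := pflag.NewFlagSet("scheduler", pflag.ContinueOnError)
	asArray := fs.StringArray("endpoints-array", nil, "typed as a string array")
	asSlice := fs.StringSlice("endpoints-slice", nil, "typed as a string slice")

	args := []string{
		"--endpoints-array=http://etcd-0:2379,http://etcd-1:2379",
		"--endpoints-slice=http://etcd-0:2379,http://etcd-1:2379",
	}
	if err := fs.Parse(args); err != nil {
		panic(err)
	}

	fmt.Println(len(*asArray)) // 1: the whole value remains a single endpoint string
	fmt.Println(len(*asSlice)) // 2: parsed as two separate endpoints
}
```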
Placement not cleaning internal state after host that had actors disconnects
Problem
An actor host that hosted actors was not properly cleaned up from placement after the sidecar was scaled down and the placement stream was closed.
Impact
This results in the placement server iterating over namespaces that no longer exist for every tick of the disseminate ticker.
Root Cause
The function requiresUpdateInPlacementTables should not set isActorHost back to false once it has been set to true, because once a host has actors the placement server keeps internal state for it and cleanup logic must run when the host disconnects.
Solution
Update the logic in requiresUpdateInPlacementTables.
Blocked Placement dissemination during high churn
Problem
Placement would disseminate the actor table very slowly, or fail to disseminate it at all, in high daprd churn scenarios.
Impact
Actors or workflows would fail to be activated, and existing actors or workflows would fail.
Root Cause
Placement used a "small" (100) queue size which, when exhausted, would cause a deadlock. Placement would also wait for a fully consumed channel queue before disseminating, slowing down the dissemination process.
Solution
Increase the queue size to 10000 and change the dissemination logic to not wait for a fully consumed queue before disseminating.
Blocked Placement dissemination with high Scheduler dataset
Problem
Disseminations would hang for long periods of time when the Scheduler dataset was large.
Impact
Dissemination could take up to hours to complete, causing reminders to not be delivered for a long period of time.
Root Cause
The migration of reminders from the state store to the Scheduler does a full decoded scan of the Scheduler database, which takes a long time when there are many entries. During this time dissemination is blocked.
Solution
Limit the maximum time spent doing the migration to 3 seconds.
Expose a new global.reminders.skipMigration="true" helm chart value which will skip the migration entirely.
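A sketch of the time-bounding part, with illustrative names: the migration runs under a context with a 3 second budget, and the skip flag (the value the global.reminders.skipMigration Helm setting would map to) bypasses it entirely.

```go
package reminders

import (
	"context"
	"errors"
	"log"
	"time"
)

// migrateWithBudget bounds the reminder migration so that dissemination is
// not blocked for longer than maxMigration.
func migrateWithBudget(ctx context.Context, skipMigration bool, migrate func(context.Context) error) {
	if skipMigration {
		return // operator opted out of the state store -> Scheduler migration
	}
	const maxMigration = 3 * time.Second
	mctx, cancel := context.WithTimeout(ctx, maxMigration)
	defer cancel()
	if err := migrate(mctx); err != nil {
		if errors.Is(err, context.DeadlineExceeded) {
			log.Println("reminder migration budget exceeded; continuing with dissemination")
			return
		}
		log.Printf("reminder migration failed: %v", err)
	}
}
```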
Fix panic during actor deactivation
Problem
Daprd could panic during actor deactivation.
Impact
Daprd sidecar would crash, resulting in downtime for the application.
Root Cause
A race between the actor lock's cached-memory release and claiming logic meant a stale lock could be used during deactivation, closing it twice and causing a panic.
Solution
Tie the lock's lifecycle to the actor's lifecycle, ensuring the lock is only released when the actor is fully deactivated, and claimed with the actor itself.
OpenTelemetry environment variables support
Problem
OpenTelemetry OTEL_* environment variables were not fully respected, and dapr.io/env annotation parsing broke when values contained =.
Impact
OpenTelemetry resource attributes could not be reliably applied to the Dapr sidecar, degrading trace correlation with application containers, especially on Kubernetes. Configuring OTEL_RESOURCE_ATTRIBUTES via annotations did not work.
Root Cause
- Resource creation used manual logic instead of the OpenTelemetry SDK’s environment-based resource detection.
- The injector’s environment variable parsing treated = as a hard delimiter, breaking values that include =.
Solution
- Adopt the OpenTelemetry SDK’s env-based resource detection so OTEL_* variables (including OTEL_RESOURCE_ATTRIBUTES) are honored.
- Fix dapr.io/env parsing to allow values containing =.
- Keep the Dapr app ID as the default service name when not overridden.
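A minimal sketch of the env-based detection with the OpenTelemetry Go SDK (the function name is illustrative): the Dapr app ID is set as a default service name attribute, and resource.WithFromEnv picks up OTEL_RESOURCE_ATTRIBUTES and OTEL_SERVICE_NAME from the environment.

```go
package tracing

import (
	"context"

	"go.opentelemetry.io/otel/sdk/resource"
	semconv "go.opentelemetry.io/otel/semconv/v1.21.0"
)

// newResource builds the OpenTelemetry resource using the SDK's env-based
// detection instead of manual attribute construction.
func newResource(ctx context.Context, appID string) (*resource.Resource, error) {
	return resource.New(ctx,
		// Dapr app ID as the default service name.
		resource.WithAttributes(semconv.ServiceNameKey.String(appID)),
		// Honor OTEL_RESOURCE_ATTRIBUTES and OTEL_SERVICE_NAME from the environment.
		resource.WithFromEnv(),
	)
}
```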
Fixing goavro bug due to codec state mutation
Problem
The goavro library had a bug where the codec state was mutated during decoding, causing the decoder to panic.
Impact
The goavro library would panic, causing the application to crash.
Root Cause
The goavro library did not correctly handle the codec state, causing it to panic when the codec state was mutated during decoding.
Solution
Updated the goavro library to v2.14.1 to fix the bug, and took a more defensive approach by restoring the previous behavior of always creating a new codec.
APP_API_TOKEN not passed in gRPC metadata for app callbacks
Problem
When APP_API_TOKEN was configured, the token was not being passed in gRPC metadata for app callbacks including:
- PubSub subscriptions
- Bindings
- Jobs
This meant that applications using gRPC protocol could not authenticate incoming requests from Dapr when using the app API token security feature.
Impact
Applications that configured APP_API_TOKEN to secure their endpoints could not validate that incoming gRPC requests were from their Dapr sidecar. This broke the app API token authentication feature for gRPC applications.
Root Cause
The gRPC subscription delivery, binding, and job callback code paths were directly calling the app's gRPC client without going through the channel layer abstraction. The channel layer is responsible for injecting the APP_API_TOKEN in the dapr-api-token metadata header, but these direct calls bypassed this mechanism.
Solution
Centralized the APP_API_TOKEN injection logic in a helper function (AddAppTokenToContext) in the gRPC channel layer. Updated all gRPC app callback code paths (pubsub subscriptions, bindings, and job callbacks) to use this helper, ensuring the token is consistently added to the outgoing gRPC context metadata. Added comprehensive integration tests to verify token passing for all callback scenarios in both HTTP and gRPC protocols.
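A sketch of the helper's shape (lower-cased name here is illustrative): a single function appends the app API token to the outgoing gRPC metadata under the dapr-api-token header, and every app callback path goes through it instead of calling the app client directly.

```go
package grpcchannel

import (
	"context"

	"google.golang.org/grpc/metadata"
)

// addAppTokenToContext appends the app API token to the outgoing gRPC
// metadata, so pubsub delivery, binding, and job callbacks all carry the
// dapr-api-token header the application expects.
func addAppTokenToContext(ctx context.Context, appAPIToken string) context.Context {
	if appAPIToken == "" {
		return ctx
	}
	return metadata.AppendToOutgoingContext(ctx, "dapr-api-token", appAPIToken)
}
```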
Fixed Pulsar OAuth token renewal
Problem
The pulsar pubsub component was not renewing the OAuth token when it expired.
Impact
Applications using the pulsar pubsub component could not receive/publish messages when the OAuth token expired.
Root Cause
There was a bug in the component code that was preventing the OAuth token from being renewed when it expired.
Solution
Fixed the bug in the component code ensuring the OAuth token is renewed when it expires. Also added a test to verify the token renewal functionality. Fixed in dapr/components-contrib#4079
Fix Scheduler connection during non-graceful network interruptions
Problem
A catastrophic failure of the Scheduler connection during a non-graceful network interruption would not cause the Dapr runtime to attempt to reconnect to Scheduler.
Impact
A true host network interruption (e.g. unplugging the network cable) would cause the dapr runtime to only recover connections to Scheduler after roughly 2 hours.
Root Cause
The gRPC KeepAlive parameters were not set correctly, causing the gRPC client to not detect broken connections in a timely manner.
Solution
The server and client KeepAlive parameters are now set to 3 second intervals with a 5 second timeout.
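The client-side shape of this looks roughly like the following sketch using gRPC keepalive options; with a 3 second ping interval and 5 second timeout, a dead connection is detected within seconds rather than waiting for the OS-level TCP timeout.

```go
package schedulerclient

import (
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/keepalive"
)

// dialOptions returns keepalive settings matching the fix described above.
func dialOptions() []grpc.DialOption {
	return []grpc.DialOption{
		grpc.WithKeepaliveParams(keepalive.ClientParameters{
			Time:                3 * time.Second, // ping the server after 3s of inactivity
			Timeout:             5 * time.Second, // consider the connection dead 5s after an unanswered ping
			PermitWithoutStream: true,            // keep pinging even with no active RPCs
		}),
	}
}
```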
Prevent infinite loop when workflow state is corrupted or destroyed
Problem
Dapr workflows could enter an infinite reminder loop when the workflow state in the actor state store is corrupted or destroyed.
Impact
Dapr workflows would enter an infinite loop of reminder calls.
Root Cause
...
Dapr Runtime v1.15.13
Dapr 1.15.13
This update includes bug fixes:
APP_API_TOKEN not passed in gRPC metadata for app callbacks
Problem
When APP_API_TOKEN was configured, the token was not being passed in gRPC metadata for app callbacks including:
- PubSub subscriptions
- Bindings
- Jobs
This meant that applications using gRPC protocol could not authenticate incoming requests from Dapr when using the app API token security feature.
Impact
Applications that configured APP_API_TOKEN to secure their endpoints could not validate that incoming gRPC requests were from their Dapr sidecar. This broke the app API token authentication feature for gRPC applications.
Root Cause
The gRPC subscription delivery, binding, and job callback code paths were directly calling the app's gRPC client without going through the channel layer abstraction. The channel layer is responsible for injecting the APP_API_TOKEN in the dapr-api-token metadata header, but these direct calls bypassed this mechanism.
Solution
Centralized the APP_API_TOKEN injection logic in a helper function (AddAppTokenToContext) in the gRPC channel layer. Updated all gRPC app callback code paths (pubsub subscriptions, bindings, and job callbacks) to use this helper, ensuring the token is consistently added to the outgoing gRPC context metadata. Added comprehensive integration tests to verify token passing for all callback scenarios in both HTTP and gRPC protocols.
Fixed Pulsar OAuth token renewal
Problem
The pulsar pubsub component was not renewing the OAuth token when it expired.
Impact
Applications using the pulsar pubsub component could not receive/publish messages when the OAuth token expired.
Root Cause
There was a bug in the component code that was preventing the OAuth token from being renewed when it expired.
Solution
Fixed the bug in the component code ensuring the OAuth token is renewed when it expires. Also added a test to verify the token renewal functionality. Fixed in dapr/components-contrib#4079
Dapr Runtime v1.16.2-rc.2
This is the release candidate 1.16.2-rc.2
Dapr Runtime v1.16.2-rc.1
This is the release candidate 1.16.2-rc.1
Dapr Runtime v1.16.1
Dapr 1.16.1
This update includes bug fixes:
- Actor Initialization Timing Fix
- Sidecar Injector Crash with Disabled Scheduler
- Workflow actors reminders stopped after Application Health check transition
- Fix Scheduler Etcd client port networking in standalone mode
- Component initialization timeout check before using reporter
- Fix Regression in pubsub.kafka Avro Message Publication
- Ensure Files are Closed Before Reading in SFTP Component
- Fix AWS Secrets Manager YAML Metadata Parsing
- Reuse Kafka Clients in AWS v2 Migration
- Fix Kafka AWS Authentication Configuration Bug
- Enhanced debug logs for placement server
- Workflow actors never registered again after failed actors registration on GetWorkItems connection callback
- Fix DynamoDB not working as a workflow state store
Actor Initialization Timing Fix
Problem
When running Dapr with an --app-port specified but no application listening on that port (either due to no server or delayed server startup), the actor runtime would initialize immediately before the app channel was ready. This created a race condition where actors were trying to communicate with an application that wasn't available yet, resulting in repeated error logs:
WARN[0064] Error processing operation DaprBuiltInActorNotFoundRetries. Retrying in 1s…
DEBU[0064] Error for operation DaprBuiltInActorNotFoundRetries was: failed to lookup actor: api error: code = FailedPrecondition desc = did not find address for actor
Impact
This created a poor user experience with confusing error messages when users specified an --app-port but had no application listening on that port.
Root cause
The actor runtime initialization was occurring before the application channel was ready, creating a race condition where actors attempted to communicate with an unavailable application.
Solution
Defer actor runtime initialization until the application channel is ready. The runtime now:
- Defers actor runtime initialization until the application is listening on the specified port
- Provides informative "waiting for application to listen on port XXXX" messages instead of confusing error logs
- Prevents actor lookup errors during startup
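A minimal sketch of the wait-before-initializing idea, with illustrative names and log wording: the runtime polls the configured app port and only proceeds with actor initialization once something is actually listening.

```go
package approbe

import (
	"context"
	"fmt"
	"log"
	"net"
	"time"
)

// waitForAppPort blocks until something is listening on the configured app
// port, logging a friendly message instead of letting actor initialization
// race ahead and emit lookup errors.
func waitForAppPort(ctx context.Context, port int) error {
	addr := fmt.Sprintf("127.0.0.1:%d", port)
	for {
		conn, err := net.DialTimeout("tcp", addr, 500*time.Millisecond)
		if err == nil {
			conn.Close()
			return nil
		}
		log.Printf("waiting for application to listen on port %d", port)
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(time.Second):
		}
	}
}
```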
Sidecar Injector Crash with Disabled Scheduler
Problem
The sidecar injector crashes with error (dapr-scheduler-server StatefulSet not found) when the scheduler is disabled via Helm chart (global.scheduler.enabled: false).
Impact
The crash prevents the sidecar injector from functioning correctly when the scheduler is disabled, disrupting deployments.
Root cause
A previous change caused the dapr-scheduler-server StatefulSet to be removed when the scheduler was disabled, instead of scaling it to 0 as originally intended. The injector, hardcoded to check for the StatefulSet in the injector.go file, fails when it is not found.
Solution
Revert the behavior to scale the dapr-scheduler-server StatefulSet to 0 when the scheduler is disabled, instead of removing it, as implemented in the Helm chart.
Workflow actors reminders stopped after Application Health check transition
Problem
Application Health checks transitioning from unhealthy to healthy were incorrectly configuring the scheduler clients to stop watching for actor reminder jobs.
Impact
The misconfiguration in the scheduler clients caused workflows to stop executing because reminders no longer executed.
Root cause
On an Application Health change, daprd could trigger an actor types update with an empty slice, which caused a scheduler client reconfiguration. Because there were no actual changes to the actor types, daprd never received a new version of the placement table, leaving the scheduler clients misconfigured: when daprd sends an actor types update to the placement server it wipes out the known actor types in the scheduler client, and since placement never acknowledged with a new table version, the scheduler client was never updated back with the actor types.
Solution
Prevent any changes to hosted actor types if the input slice is empty.
Fix Scheduler Etcd client port networking in standalone mode
Problem
The Scheduler Etcd client port is not available when running in Dapr CLI standalone mode.
Impact
Cannot perform Scheduler Etcd admin operations in Dapr CLI standalone mode.
Root cause
The Scheduler Etcd client port was only listening on localhost.
Solution
The Scheduler Etcd client listen address is now configurable via the --scheduler-etcd-client-listen-address CLI flag, meaning the port can be exposed when running in standalone mode.
Fix Helm chart not honoring --etcd-embed argument
Problem
The Scheduler would always treat --etcd-embed as true, even when set to false in the context of the Helm chart.
Impact
Cannot use external etcd addresses since Scheduler would always assume embedded etcd is used.
Root cause
The Helm template format treated the boolean argument as a separate argument rather than inline.
Solution
The template format string was fixed to allow for .etcdEmbed to be set to false.
Component initialization timeout check before using reporter
Problem
The component init timeout was checked after using the component reporter.
Impact
This misalignment could lead to false positives: Dapr could report success even though it later returned an error due to the timeout check.
Solution
Move the timeout check to right after the actual component initialization and before the component reporter.
Fix Regression in pubsub.kafka Avro Message Publication
Problem
The pubsub.kafka component failed to publish Avro messages in Dapr 1.16, breaking existing workflows.
Impact
Avro messages could not be published correctly, causing failures in Kafka message pipelines and potential data loss or dead-lettering issues.
Root cause
The Kafka pubsub component did not correctly create codecs in the SchemaRegistryClient. Additionally, the goavro library had a bug converting default null values that broke legitimate schemas.
Solution
Enabled codec creation in the Kafka SchemaRegistryClient and upgraded github.com/linkedin/goavro/v2 from v2.13.1 to v2.14.0 to fix null value handling. Metadata options useAvroJson and excludeHeaderMetaRegex were validated to ensure correct message encoding and dead-letter handling. Manual tests confirmed Avro and JSON message publication works as expected.
Ensure Files are Closed Before Reading in SFTP Component
Problem
Some SFTP servers require files to be closed before they become available for reading. Without closing, read operations could fail or return incomplete data.
Impact
SFTP file reads could fail or return incomplete data on certain servers, causing downstream processing issues.
Root cause
The SFTP component did not explicitly close files after writing, which some servers require to make files readable.
Solution
Updated the SFTP component to close files after writing, ensuring they are available for reading on all supported servers.
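A sketch of the write-then-close pattern with github.com/pkg/sftp (function names are illustrative): the remote file is closed immediately after writing, and any subsequent read opens it fresh, which satisfies servers that only expose file contents after the writer closes.

```go
package sftpfiles

import (
	"io"

	"github.com/pkg/sftp"
)

// writeFile writes the payload and closes the remote file before anything
// reads it back, since some SFTP servers only make a file's contents visible
// to readers after the writer has closed it.
func writeFile(client *sftp.Client, path string, data []byte) error {
	f, err := client.Create(path)
	if err != nil {
		return err
	}
	if _, err := f.Write(data); err != nil {
		f.Close()
		return err
	}
	// Close explicitly so the server flushes and releases the file before any
	// subsequent read operation.
	return f.Close()
}

// readFile opens the file fresh for reading after the writer closed it.
func readFile(client *sftp.Client, path string) ([]byte, error) {
	f, err := client.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	return io.ReadAll(f)
}
```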
Fix AWS Secrets Manager YAML Metadata Parsing
Problem
The AWS Secrets Manager component failed to correctly parse YAML metadata, causing boolean fields like multipleKeyValuesPerSecret to be misinterpreted.
Impact
Incorrect metadata parsing could lead to misconfiguration, preventing secrets from being retrieved or handled properly.
Root cause
The component used a JSON marshal/unmarshal approach in getSecretManagerMetadata, which did not handle string-to-boolean conversion correctly for YAML metadata.
Solution
Replaced JSON marshal/unmarshal with kitmd.DecodeMetadata to correctly parse YAML metadata and convert string fields to their proper types, ensuring multipleKeyValuesPerSecret works as expected.
Reuse Kafka Clients in AWS v2 Migration
Problem
After migrating to the AWS v2 Kafka client, a new client was created for every message published, causing inefficiency and unnecessary resource usage.
Impact
Frequent client creation led to performance degradation, increased connection overhead, and potential resource exhaustion during high-throughput message publishing.
Root cause
The AWS v2 client integration did not implement client reuse, resulting in a new client being instantiated for each publish operation.
Solution
Updated the Kafka component to reuse clients instead of creating a new one for each message, improving performance and resource efficiency.
Fix Kafka AWS Authentication Configuration Bug
Problem
The Kafka AWS authentication configuration was not initialized correctly, causing authentication failures.
Impact
Kafka components using AWS authentication could fail to connect, preventing message publishing and consumption.
Root cause
A bug in the Kafka AWS auth config initialization prevented proper setup of authentication parameters.
Solution
Fixed the initialization...
Dapr Runtime v1.16.1-rc.3
This is the release candidate 1.16.1-rc.3
Dapr Runtime v1.16.1-rc.2
This is the release candidate 1.16.1-rc.2