Make exporter timeout encompass retries/backoffs, add jitter to backoffs, cleanup code a bit #4564
...orter-otlp-proto-grpc/src/opentelemetry/exporter/otlp/proto/grpc/metric_exporter/__init__.py
Running a basic example that sends a span and a metric to a nonexistent collector over gRPC:
Before (timeout not respected)
$ OTEL_EXPORTER_OTLP_TIMEOUT=5 uv run repro.py
2025-05-09 01:19:57 INFO [test] Hello world
2025-05-09 01:19:57 WARNING [opentelemetry.exporter.otlp.proto.grpc.exporter] Transient error StatusCode.UNAVAILABLE encountered while exporting metrics to localhost:4317, retrying in 1s.
2025-05-09 01:19:58 WARNING [opentelemetry.exporter.otlp.proto.grpc.exporter] Transient error StatusCode.UNAVAILABLE encountered while exporting metrics to localhost:4317, retrying in 2s.
2025-05-09 01:20:00 WARNING [opentelemetry.exporter.otlp.proto.grpc.exporter] Transient error StatusCode.UNAVAILABLE encountered while exporting metrics to localhost:4317, retrying in 4s.
2025-05-09 01:20:02 WARNING [opentelemetry.exporter.otlp.proto.grpc.exporter] Transient error StatusCode.UNAVAILABLE encountered while exporting traces to localhost:4317, retrying in 1s.
2025-05-09 01:20:03 WARNING [opentelemetry.exporter.otlp.proto.grpc.exporter] Transient error StatusCode.UNAVAILABLE encountered while exporting traces to localhost:4317, retrying in 2s.
2025-05-09 01:20:04 WARNING [opentelemetry.exporter.otlp.proto.grpc.exporter] Transient error StatusCode.UNAVAILABLE encountered while exporting metrics to localhost:4317, retrying in 8s.
2025-05-09 01:20:05 WARNING [opentelemetry.exporter.otlp.proto.grpc.exporter] Transient error StatusCode.UNAVAILABLE encountered while exporting traces to localhost:4317, retrying in 4s.
2025-05-09 01:20:09 WARNING [opentelemetry.exporter.otlp.proto.grpc.exporter] Transient error StatusCode.UNAVAILABLE encountered while exporting traces to localhost:4317, retrying in 8s.
2025-05-09 01:20:12 WARNING [opentelemetry.exporter.otlp.proto.grpc.exporter] Transient error StatusCode.UNAVAILABLE encountered while exporting metrics to localhost:4317, retrying in 16s.
Now (timeout is respected)
$ OTEL_EXPORTER_OTLP_TIMEOUT=5 uv run repro.py
2025-05-09 01:22:43 INFO [test] Hello world
2025-05-09 01:22:48 ERROR [opentelemetry.exporter.otlp.proto.grpc.exporter] Failed to export metrics to localhost:4317, error code: StatusCode.DEADLINE_EXCEEDED
2025-05-09 01:22:53 ERROR [opentelemetry.exporter.otlp.proto.grpc.exporter] Failed to export traces to localhost:4317, error code: StatusCode.DEADLINE_EXCEEDED
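A quick back-of-the-envelope check of the "Before" log, as a sketch only: it sums the backoff delays printed for the metrics export and ignores the time spent inside each RPC attempt, which is an assumption. The variable names are made up for illustration.

```python
from itertools import accumulate

# Backoff delays (seconds) read off the "Before" log: each retry doubles the wait.
backoffs = [1, 2, 4, 8, 16]
timeout = 5  # value of OTEL_EXPORTER_OTLP_TIMEOUT in the command above

# Running total of time spent sleeping after each retry.
elapsed = list(accumulate(backoffs))

# Number of the first retry whose sleep pushes the total past the timeout.
first_overrun = next(
    i for i, total in enumerate(elapsed, start=1) if total > timeout
)

print(elapsed)
print(first_overrun)
```

Under that assumption the cumulative sleep passes the 5 second budget on the third retry, yet the old behavior kept scheduling longer sleeps.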
When exporting metrics/traces in the same program I noticed that:
Is this expected?
$ OTEL_EXPORTER_OTLP_TRACES_TIMEOUT=5 uv run repro.py
2025-05-09 15:37:54 INFO [test] Hello world
2025-05-09 15:38:04 ERROR [opentelemetry.exporter.otlp.proto.grpc.exporter] Failed to export traces to localhost:4317, error code: StatusCode.DEADLINE_EXCEEDED
2025-05-09 15:38:04 ERROR [opentelemetry.exporter.otlp.proto.grpc.exporter] Failed to export metrics to localhost:4317, error code: StatusCode.DEADLINE_EXCEEDED
...lemetry-exporter-opencensus/src/opentelemetry/exporter/opencensus/trace_exporter/__init__.py
...pentelemetry-exporter-otlp-proto-grpc/src/opentelemetry/exporter/otlp/proto/grpc/exporter.py
...xporter-otlp-proto-http/src/opentelemetry/exporter/otlp/proto/http/_log_exporter/__init__.py
Can you add to the description which issues are fixed by this PR? https://docs.github.com/en/issues/tracking-your-work-with-issues/using-issues/linking-a-pull-request-to-an-issue
exporter/opentelemetry-exporter-otlp-proto-grpc/tests/test_otlp_exporter_mixin.py
I didn't see this in the code. If you updated it, can you update the description as well?
Updated the description.
Never mind, not going to use gRPC retry config for now.
...xporter-otlp-proto-grpc/src/opentelemetry/exporter/otlp/proto/grpc/_log_exporter/__init__.py
@emdneto any additional comments on this since I made changes to remove the retry config? Otherwise I think this can be merged.
…try/exporter/otlp/proto/grpc/exporter.py Co-authored-by: Emídio Neto <[email protected]>
@DylanRussell running griffe I got:
griffe check opentelemetry-exporter-otlp-proto-http -a main -s exporter
exporter/opentelemetry-exporter-otlp-proto-http/tests/test_proto_span_exporter.py:0: TestOTLPSpanExporter.test_exponential_backoff: Public object was removed
exporter/opentelemetry-exporter-otlp-proto-http/tests/test_proto_log_exporter.py:0: TestOTLPHTTPLogExporter.test_exponential_backoff: Public object was removed
exporter/opentelemetry-exporter-otlp-proto-http/tests/metrics/test_otlp_metrics_exporter.py:0: TestOTLPMetricExporter.test_exponential_backoff: Public object was removed
exporter/opentelemetry-exporter-otlp-proto-http/src/opentelemetry/exporter/otlp/proto/http/metric_exporter/__init__.py:205: OTLPMetricExporter.export(timeout_millis): Parameter default was changed: 10000 -> None
Good to go @emdneto?
@@ -16,9 +20,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

- typecheck: add sdk/resources and drop mypy
([#4578](https://github.com/open-telemetry/opentelemetry-python/pull/4578))
- Refactor `BatchLogRecordProcessor` to simplify code and make the control flow more
can we keep this?
Description
Make `timeout` encompass retries and backoffs, rather than being applied per HTTP request or gRPC RPC.
Added a +/- 20% jitter to each backoff (both gRPC/HTTP).
Cleaned up the exporter code some. I got rid of a pointless 32-second sleep that we would do after the last retry attempt before failing.
fixes: #3309, #4043, #2663
#4183 is similar to this PR and to what's discussed in #4043, but I implemented it in as minimal a way as I could.
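To illustrate the behavior described above, here is a minimal sketch of a retry loop whose timeout covers all attempts and backoffs, with +/- 20% jitter and no sleep after the final attempt. This is not the code from this PR: `export_with_deadline`, its parameters, the default of 6 retries, and the injectable `sleep`/`clock` hooks are all hypothetical names chosen for the example.

```python
import random
import time
from typing import Callable


def export_with_deadline(
    attempt: Callable[[float], bool],
    timeout: float,
    max_retries: int = 6,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Hypothetical helper: retry `attempt` until success or `timeout` seconds pass."""
    deadline = clock() + timeout
    for retry_num in range(max_retries):
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Each attempt only gets whatever is left of the overall budget.
        if attempt(remaining):
            return True
        # Exponential backoff (1s, 2s, 4s, ...) with +/- 20% jitter.
        backoff = 2**retry_num * random.uniform(0.8, 1.2)
        # Skip the sleep after the last attempt, and never sleep past the deadline.
        if retry_num + 1 == max_retries or backoff > deadline - clock():
            return False
        sleep(backoff)
    return False
```

A caller would pass a function that performs one HTTP request or gRPC call using the remaining time as its per-call timeout and returns whether it succeeded. Injecting `sleep` and `clock` is only there so the loop can be exercised in tests without real waiting.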
How Has This Been Tested?
Lots of unit tests.