Commit c84ba94

girishc13, srikanthccv, and lzchen authored
Implement shutdown procedure for OTLP grpc exporters (open-telemetry#3138)
* Implement shutdown procedure for OTLP grpc exporters
  - Add a `_shutdown` variable for checking whether the exporter has been shut down.
  - Prevent export if the `_shutdown` flag has been set. Log a warning message if the exporter has been shut down.
  - Use a thread lock to synchronize the last export call before the shutdown timeout. The `shutdown` method waits up to `timeout_millis` if there is an ongoing export. If there is no ongoing export, it sets the `_shutdown` flag to prevent further exports and returns.
  - Add unit tests for `OTLPExporterMixin` and the subclasses for traces and metrics.
* lint files
* add changelog entry for fix
* lint test files

---------

Co-authored-by: Srikanth Chekuri <[email protected]>
Co-authored-by: Leighton Chen <[email protected]>
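The procedure described in the commit message can be sketched in isolation. This is a minimal sketch, not the exporter's actual code: `SketchExporter`, `ExportResult`, and the `send` callable are hypothetical names introduced for illustration. The sketch converts the timeout to seconds because `threading.Lock.acquire` expects seconds, and it only releases the lock if the acquire succeeded.

```python
import threading
from enum import Enum


class ExportResult(Enum):
    SUCCESS = 0
    FAILURE = 1


class SketchExporter:
    """Hypothetical stand-in for an exporter; not the OTLP implementation."""

    def __init__(self, send):
        self._send = send  # callable that performs the actual export
        self._export_lock = threading.Lock()
        self._shutdown = False

    def export(self, batch):
        # After shutdown, further exports are rejected.
        if self._shutdown:
            return ExportResult.FAILURE
        # Hold the lock for the duration of the export so that
        # shutdown() can wait for it to finish.
        with self._export_lock:
            self._send(batch)
            return ExportResult.SUCCESS

    def shutdown(self, timeout_millis=30_000):
        if self._shutdown:
            return
        # Wait for an in-flight export, at most timeout_millis milliseconds.
        acquired = self._export_lock.acquire(timeout=timeout_millis / 1e3)
        self._shutdown = True
        if acquired:
            self._export_lock.release()


sent = []
exporter = SketchExporter(sent.append)
first = exporter.export(["span-1"])    # accepted
exporter.shutdown(timeout_millis=100)  # no export in flight, returns at once
second = exporter.export(["span-2"])   # rejected after shutdown
```

In this sketch the second batch never reaches `send`, which mirrors the "prevent export if the `_shutdown` flag has been set" item above.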
1 parent af582e9 commit c84ba94

7 files changed, with 277 additions and 72 deletions

CHANGELOG.md

Lines changed: 18 additions & 9 deletions
@@ -6,15 +6,17 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

 ## Unreleased
+
 - PeriodicExportingMetricReader will continue if collection times out
   ([#3100](https://github.com/open-telemetry/opentelemetry-python/pull/3100))
 - Fix formatting of ConsoleMetricExporter.
   ([#3197](https://github.com/open-telemetry/opentelemetry-python/pull/3197))
-
+- Implement shutdown procedure for OTLP grpc exporters
+  ([#3138](https://github.com/open-telemetry/opentelemetry-python/pull/3138))
 - Add exponential histogram
   ([#2964](https://github.com/open-telemetry/opentelemetry-python/pull/2964))

-## Version 1.16.0/0.37b0 (2023-02-15)
+## Version 1.16.0/0.37b0 (2023-02-17)

 - Change ``__all__`` to be statically defined.
   ([#3143](https://github.com/open-telemetry/opentelemetry-python/pull/3143))
@@ -398,7 +400,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - `opentelemetry-distro` & `opentelemetry-sdk` Moved Auto Instrumentation Configurator code to SDK
   to let distros use its default implementation
   ([#1937](https://github.com/open-telemetry/opentelemetry-python/pull/1937))
-- Add Trace ID validation to meet [TraceID spec](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/overview.md#spancontext) ([#1992](https://github.com/open-telemetry/opentelemetry-python/pull/1992))
+- Add Trace ID validation to
+  meet [TraceID spec](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/overview.md#spancontext) ([#1992](https://github.com/open-telemetry/opentelemetry-python/pull/1992))
 - Fixed Python 3.10 incompatibility in `opentelemetry-opentracing-shim` tests
   ([#2018](https://github.com/open-telemetry/opentelemetry-python/pull/2018))
 - `opentelemetry-sdk` added support for `OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT`
@@ -729,7 +732,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
   ([#1471](https://github.com/open-telemetry/opentelemetry-python/pull/1471))
 - Add support for Python 3.9
   ([#1441](https://github.com/open-telemetry/opentelemetry-python/pull/1441))
-- Added the ability to disable instrumenting libraries specified by OTEL_PYTHON_DISABLED_INSTRUMENTATIONS env variable, when using opentelemetry-instrument command.
+- Added the ability to disable instrumenting libraries specified by OTEL_PYTHON_DISABLED_INSTRUMENTATIONS env variable,
+  when using opentelemetry-instrument command.
   ([#1461](https://github.com/open-telemetry/opentelemetry-python/pull/1461))
 - Add `fields` to propagators
   ([#1374](https://github.com/open-telemetry/opentelemetry-python/pull/1374))
@@ -778,7 +782,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
   ([#1533](https://github.com/open-telemetry/opentelemetry-python/pull/1533))
 - `opentelemetry-sdk` The JaegerPropagator has been moved into its own package: `opentelemetry-propagator-jaeger`
   ([#1525](https://github.com/open-telemetry/opentelemetry-python/pull/1525))
-- `opentelemetry-exporter-jaeger`, `opentelemetry-exporter-zipkin` Update InstrumentationInfo tag keys for Jaeger and Zipkin exporters
+- `opentelemetry-exporter-jaeger`, `opentelemetry-exporter-zipkin` Update InstrumentationInfo tag keys for Jaeger and
+  Zipkin exporters
   ([#1535](https://github.com/open-telemetry/opentelemetry-python/pull/1535))
 - `opentelemetry-sdk` Remove rate property setter from TraceIdRatioBasedSampler
   ([#1536](https://github.com/open-telemetry/opentelemetry-python/pull/1536))
@@ -888,7 +893,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
   ([#1199](https://github.com/open-telemetry/opentelemetry-python/pull/1199))
 - Add Global Error Handler
   ([#1080](https://github.com/open-telemetry/opentelemetry-python/pull/1080))
-- Add support for `OTEL_BSP_MAX_QUEUE_SIZE`, `OTEL_BSP_SCHEDULE_DELAY_MILLIS`, `OTEL_BSP_MAX_EXPORT_BATCH_SIZE` and `OTEL_BSP_EXPORT_TIMEOUT_MILLIS` environment variables
+- Add support for `OTEL_BSP_MAX_QUEUE_SIZE`, `OTEL_BSP_SCHEDULE_DELAY_MILLIS`, `OTEL_BSP_MAX_EXPORT_BATCH_SIZE`
+  and `OTEL_BSP_EXPORT_TIMEOUT_MILLIS` environment variables
   ([#1105](https://github.com/open-telemetry/opentelemetry-python/pull/1120))
 - Adding Resource to MeterRecord
   ([#1209](https://github.com/open-telemetry/opentelemetry-python/pull/1209))
@@ -913,7 +919,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
   ([#1151](https://github.com/open-telemetry/opentelemetry-python/pull/1151))
 - Fixed OTLP events to Zipkin annotations translation.
   ([#1161](https://github.com/open-telemetry/opentelemetry-python/pull/1161))
-- Fixed bootstrap command to correctly install opentelemetry-instrumentation-falcon instead of opentelemetry-instrumentation-flask.
+- Fixed bootstrap command to correctly install opentelemetry-instrumentation-falcon instead of
+  opentelemetry-instrumentation-flask.
   ([#1138](https://github.com/open-telemetry/opentelemetry-python/pull/1138))
 - Update sampling result names
   ([#1128](https://github.com/open-telemetry/opentelemetry-python/pull/1128))
@@ -923,7 +930,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
   ([#1203](https://github.com/open-telemetry/opentelemetry-python/pull/1203))
 - Protect access to Span implementation
   ([#1188](https://github.com/open-telemetry/opentelemetry-python/pull/1188))
-- `start_as_current_span` and `use_span` can now optionally auto-record any exceptions raised inside the context manager.
+- `start_as_current_span` and `use_span` can now optionally auto-record any exceptions raised inside the context
+  manager.
   ([#1162](https://github.com/open-telemetry/opentelemetry-python/pull/1162))

 ## Version 0.13b0 (2020-09-17)
@@ -1000,7 +1008,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
   ([#959](https://github.com/open-telemetry/opentelemetry-python/pull/959))
 - Update default port to 55680
   ([#977](https://github.com/open-telemetry/opentelemetry-python/pull/977))
-- Add proper length zero padding to hex strings of traceId, spanId, parentId sent on the wire, for compatibility with jaeger-collector
+- Add proper length zero padding to hex strings of traceId, spanId, parentId sent on the wire, for compatibility with
+  jaeger-collector
   ([#908](https://github.com/open-telemetry/opentelemetry-python/pull/908))
 - Send start_timestamp and convert labels to strings
   ([#937](https://github.com/open-telemetry/opentelemetry-python/pull/937))

exporter/opentelemetry-exporter-otlp-proto-grpc/src/opentelemetry/exporter/otlp/proto/grpc/exporter.py

Lines changed: 72 additions & 59 deletions
@@ -14,16 +14,16 @@

 """OTLP Exporter"""

-from logging import getLogger
+import threading
 from abc import ABC, abstractmethod
 from collections.abc import Sequence
+from logging import getLogger
 from os import environ
 from time import sleep
 from typing import Any, Callable, Dict, Generic, List, Optional, Tuple, Union
 from typing import Sequence as TypingSequence
 from typing import TypeVar
 from urllib.parse import urlparse
-from opentelemetry.sdk.trace import ReadableSpan

 import backoff
 from google.rpc.error_details_pb2 import RetryInfo
@@ -37,6 +37,9 @@
     ssl_channel_credentials,
 )

+from opentelemetry.exporter.otlp.proto.grpc import (
+    _OTLP_GRPC_HEADERS,
+)
 from opentelemetry.proto.common.v1.common_pb2 import (
     AnyValue,
     ArrayValue,
@@ -51,12 +54,10 @@
     OTEL_EXPORTER_OTLP_INSECURE,
     OTEL_EXPORTER_OTLP_TIMEOUT,
 )
-from opentelemetry.sdk.resources import Resource as SDKResource
 from opentelemetry.sdk.metrics.export import MetricsData
+from opentelemetry.sdk.resources import Resource as SDKResource
+from opentelemetry.sdk.trace import ReadableSpan
 from opentelemetry.util.re import parse_env_headers
-from opentelemetry.exporter.otlp.proto.grpc import (
-    _OTLP_GRPC_HEADERS,
-)

 logger = getLogger(__name__)
 SDKDataT = TypeVar("SDKDataT")
@@ -92,7 +93,6 @@ def environ_to_compression(environ_key: str) -> Optional[Compression]:


 def _translate_value(value: Any) -> KeyValue:
-
     if isinstance(value, bool):
         any_value = AnyValue(bool_value=value)

@@ -135,7 +135,6 @@ def get_resource_data(
     resource_class: Callable[..., TypingResourceT],
     name: str,
 ) -> List[TypingResourceT]:
-
     resource_data = []

     for (
@@ -282,6 +281,9 @@ def __init__(
                 secure_channel(endpoint, credentials, compression=compression)
             )

+        self._export_lock = threading.Lock()
+        self._shutdown = False
+
     @abstractmethod
     def _translate_data(
         self, data: TypingSequence[SDKDataT]
@@ -302,6 +304,11 @@ def _translate_attributes(self, attributes) -> TypingSequence[KeyValue]:
     def _export(
         self, data: Union[TypingSequence[ReadableSpan], MetricsData]
     ) -> ExportResultT:
+        # After the call to shutdown, subsequent calls to Export are
+        # not allowed and should return a Failure result.
+        if self._shutdown:
+            logger.warning("Exporter already shutdown, ignoring batch")
+            return self._result.FAILURE

         # FIXME remove this check if the export type for traces
         # gets updated to a class that represents the proto
@@ -317,69 +324,75 @@ def _export(
         # exponentially. Once delay is greater than max_value, the yielded
         # value will remain constant.
         for delay in _expo(max_value=max_value):
-
-            if delay == max_value:
+            if delay == max_value or self._shutdown:
                 return self._result.FAILURE

-            try:
-                self._client.Export(
-                    request=self._translate_data(data),
-                    metadata=self._headers,
-                    timeout=self._timeout,
-                )
+            with self._export_lock:
+                try:
+                    self._client.Export(
+                        request=self._translate_data(data),
+                        metadata=self._headers,
+                        timeout=self._timeout,
+                    )

-                return self._result.SUCCESS
+                    return self._result.SUCCESS

-            except RpcError as error:
+                except RpcError as error:

-                if error.code() in [
-                    StatusCode.CANCELLED,
-                    StatusCode.DEADLINE_EXCEEDED,
-                    StatusCode.RESOURCE_EXHAUSTED,
-                    StatusCode.ABORTED,
-                    StatusCode.OUT_OF_RANGE,
-                    StatusCode.UNAVAILABLE,
-                    StatusCode.DATA_LOSS,
-                ]:
+                    if error.code() in [
+                        StatusCode.CANCELLED,
+                        StatusCode.DEADLINE_EXCEEDED,
+                        StatusCode.RESOURCE_EXHAUSTED,
+                        StatusCode.ABORTED,
+                        StatusCode.OUT_OF_RANGE,
+                        StatusCode.UNAVAILABLE,
+                        StatusCode.DATA_LOSS,
+                    ]:

-                    retry_info_bin = dict(error.trailing_metadata()).get(
-                        "google.rpc.retryinfo-bin"
-                    )
-                    if retry_info_bin is not None:
-                        retry_info = RetryInfo()
-                        retry_info.ParseFromString(retry_info_bin)
-                        delay = (
-                            retry_info.retry_delay.seconds
-                            + retry_info.retry_delay.nanos / 1.0e9
+                        retry_info_bin = dict(error.trailing_metadata()).get(
+                            "google.rpc.retryinfo-bin"
+                        )
+                        if retry_info_bin is not None:
+                            retry_info = RetryInfo()
+                            retry_info.ParseFromString(retry_info_bin)
+                            delay = (
+                                retry_info.retry_delay.seconds
+                                + retry_info.retry_delay.nanos / 1.0e9
+                            )
+
+                        logger.warning(
+                            (
+                                "Transient error %s encountered while exporting "
+                                "%s, retrying in %ss."
+                            ),
+                            error.code(),
+                            self._exporting,
+                            delay,
+                        )
+                        sleep(delay)
+                        continue
+                    else:
+                        logger.error(
+                            "Failed to export %s, error code: %s",
+                            self._exporting,
+                            error.code(),
                         )

-                    logger.warning(
-                        (
-                            "Transient error %s encountered while exporting "
-                            "%s, retrying in %ss."
-                        ),
-                        error.code(),
-                        self._exporting,
-                        delay,
-                    )
-                    sleep(delay)
-                    continue
-                else:
-                    logger.error(
-                        "Failed to export %s, error code: %s",
-                        self._exporting,
-                        error.code(),
-                    )
-
-                if error.code() == StatusCode.OK:
-                    return self._result.SUCCESS
+                    if error.code() == StatusCode.OK:
+                        return self._result.SUCCESS

-                return self._result.FAILURE
+                    return self._result.FAILURE

         return self._result.FAILURE

-    def shutdown(self) -> None:
-        pass
+    def shutdown(self, timeout_millis: float = 30_000, **kwargs) -> None:
+        if self._shutdown:
+            logger.warning("Exporter already shutdown, ignoring call")
+            return
+        # wait for the last export if any
+        self._export_lock.acquire(timeout=timeout_millis)
+        self._shutdown = True
+        self._export_lock.release()

     @property
     @abstractmethod
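A note on units for the `shutdown` method in this diff: Python's `threading.Lock.acquire(timeout=...)` interprets its timeout in seconds and returns `False` when the timeout expires, while the diff passes `timeout_millis` through unchanged and releases the lock unconditionally. Because a `threading.Lock` has no owner, calling `release()` after a failed acquire would unlock it on behalf of whichever thread holds it. The sketch below illustrates one possible way to handle both points; `acquire_millis` is a hypothetical helper, not part of the exporter.

```python
import threading
import time


def acquire_millis(lock: threading.Lock, timeout_millis: float) -> bool:
    """Hypothetical helper: convert milliseconds to the seconds Lock.acquire expects."""
    return lock.acquire(timeout=timeout_millis / 1000)


lock = threading.Lock()
lock.acquire()  # simulate an export that is still holding the lock

start = time.monotonic()
acquired = acquire_millis(lock, 50)  # waits roughly 50 ms, then gives up
elapsed = time.monotonic() - start

still_held = lock.locked()  # the simulated export still holds the lock

# Only release what was actually acquired.
if acquired:
    lock.release()

lock.release()  # the simulated export finishes and releases its own hold
```

Checking the return value keeps a timed-out shutdown from unlocking a lock that an in-flight export still relies on.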

exporter/opentelemetry-exporter-otlp-proto-grpc/src/opentelemetry/exporter/otlp/proto/grpc/metric_exporter/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -433,7 +433,7 @@ def _split_metrics_data(
             yield MetricsData(resource_metrics=split_resource_metrics)

     def shutdown(self, timeout_millis: float = 30_000, **kwargs) -> None:
-        pass
+        OTLPExporterMixin.shutdown(self, timeout_millis=timeout_millis)

     @property
     def _exporting(self) -> str:

exporter/opentelemetry-exporter-otlp-proto-grpc/src/opentelemetry/exporter/otlp/proto/grpc/trace_exporter/__init__.py

Lines changed: 3 additions & 0 deletions
@@ -290,6 +290,9 @@ def _translate_data(
     def export(self, spans: Sequence[ReadableSpan]) -> SpanExportResult:
         return self._export(spans)

+    def shutdown(self) -> None:
+        OTLPExporterMixin.shutdown(self)
+
     def force_flush(self, timeout_millis: int = 30000) -> bool:
         return True
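Both subclasses call `OTLPExporterMixin.shutdown(self)` explicitly rather than `super().shutdown()`. One plausible reason, assuming each exporter class lists an SDK base class with its own `shutdown` ahead of the mixin (the class declarations are not shown in this diff), is method resolution order. The sketch below uses hypothetical stand-in classes to show the difference.

```python
class BaseExporter:
    """Hypothetical stand-in for an SDK base class with a no-op shutdown."""

    def shutdown(self) -> None:
        pass


class ShutdownMixin:
    """Hypothetical stand-in for a mixin that implements the real shutdown."""

    def __init__(self) -> None:
        self._shutdown = False

    def shutdown(self, timeout_millis: float = 30_000) -> None:
        self._shutdown = True


class ViaSuper(BaseExporter, ShutdownMixin):
    def shutdown(self) -> None:
        # Resolves to BaseExporter.shutdown, which comes first in the MRO,
        # so the mixin's logic never runs.
        super().shutdown()


class Explicit(BaseExporter, ShutdownMixin):
    def shutdown(self) -> None:
        # Names the mixin directly, bypassing the MRO lookup.
        ShutdownMixin.shutdown(self)


via_super = ViaSuper()
via_super.shutdown()

explicit = Explicit()
explicit.shutdown()
```

With this base-class ordering, only the explicit call sets the flag.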