Reduce span construction overhead by switching to optimized TagMap #8589
AgentSpanContext traceContext =
    new TagContext(
        CIConstants.CIAPP_TEST_ORIGIN,
        Collections.emptyMap(),
I have mixed feelings about this particular change. In effect, the constructor previously required the user to pass a mutable map. However, if the provided Map was empty, the class would lazily construct a mutable Map to take the place of the empty Map.
Because TagMap does not have an O(1) isEmpty, I didn't want to stick with this pattern.
What could be done instead is to pass TagMap.EMPTY and then detect it with a reference equality check. If others prefer that, I can adjust accordingly.
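The reference-equality alternative mentioned above could look roughly like the sketch below. SketchTagMap and SketchTagContext are hypothetical stand-ins invented for illustration; they are not the tracer's TagMap or TagContext, and the real API may differ.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for TagMap; not the tracer's real class.
final class SketchTagMap {
  // Shared frozen sentinel: callers pass this instead of allocating an empty map.
  static final SketchTagMap EMPTY = new SketchTagMap(true);

  private final Map<String, Object> tags = new HashMap<>();
  private final boolean frozen;

  SketchTagMap() {
    this(false);
  }

  private SketchTagMap(boolean frozen) {
    this.frozen = frozen;
  }

  void put(String tag, Object value) {
    if (frozen) {
      throw new IllegalStateException("frozen map");
    }
    tags.put(tag, value);
  }

  Object get(String tag) {
    return tags.get(tag);
  }
}

// Hypothetical context that swaps the sentinel for a mutable map on first write.
final class SketchTagContext {
  private SketchTagMap tags;

  SketchTagContext(SketchTagMap tags) {
    this.tags = tags;
  }

  void putTag(String tag, Object value) {
    // A reference comparison is O(1) no matter how isEmpty is implemented.
    if (tags == SketchTagMap.EMPTY) {
      tags = new SketchTagMap();
    }
    tags.put(tag, value);
  }

  Object getTag(String tag) {
    return tags.get(tag);
  }
}
```

The sentinel only works if every caller that wants "no tags" passes the same EMPTY instance; a freshly allocated empty map would not be detected by the reference check.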
this.context = new TestContextImpl(coverageStore);

AgentSpanContext traceContext =
    new TagContext(CIConstants.CIAPP_TEST_ORIGIN, Collections.emptyMap());
Same mutable empty Map issue
private ConfigurationUpdater configurationUpdater;
private DefaultExceptionDebugger exceptionDebugger;
private TestSnapshotListener listener;
private Map<String, Object> spanTags = new HashMap<>();
This is the primary type of change that I've made throughout -- replacing HashMaps with TagMap-s.
To get the benefit of TagMap's quick Map-to-Map copying ability, both the source and destination Map need to be TagMap-s.
/** A set of tags that are added only to the application's root span */
private final Map<String, ?> localRootSpanTags;
private final TagMap localRootSpanTags;
private final boolean localRootSpanTagsNeedIntercept;
To get the full benefit of TagMap's fast copy ability, I need to be able to call TagMap.putAll.
However the TagInterceptor interferes with being able to do that, so I've added a method to TagInterceptor that can check the source Map in advance. If any tag in the source Map requires interception, then "needs intercept" is "true". If no tag needs interception, then "needs intercept" is false.
When "needs intercept" is false, setAllTags can then safely bypass the interceptor and use TagMap.putAll to do the fully optimized copy.
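A minimal sketch of this "analyze once, bypass later" idea follows. SketchInterceptor and SketchSpanContext are hypothetical names, and plain java.util maps stand in for TagMap; the real TagInterceptor and setAllTags signatures may differ.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical interceptor: only the tags listed in `intercepted` need special handling.
final class SketchInterceptor {
  private final Set<String> intercepted;
  int interceptCalls = 0;

  SketchInterceptor(Set<String> intercepted) {
    this.intercepted = intercepted;
  }

  // Analyze a reusable tag set once, ahead of time.
  boolean needsIntercept(Map<String, ?> tags) {
    for (String tag : tags.keySet()) {
      if (intercepted.contains(tag)) {
        return true;
      }
    }
    return false;
  }

  // Returns true when the interceptor consumed the tag itself.
  boolean intercept(String tag, Object value) {
    interceptCalls++;
    return intercepted.contains(tag);
  }
}

// Hypothetical span context with a fast path that skips the interceptor.
final class SketchSpanContext {
  final Map<String, Object> tags = new HashMap<>();
  private final SketchInterceptor interceptor;

  SketchSpanContext(SketchInterceptor interceptor) {
    this.interceptor = interceptor;
  }

  void setAllTags(Map<String, ?> source, boolean needsIntercept) {
    if (!needsIntercept) {
      // Fast path: no tag in the source needs interception, so copy in bulk.
      tags.putAll(source);
      return;
    }
    // Slow path: offer each tag to the interceptor first.
    for (Map.Entry<String, ?> e : source.entrySet()) {
      if (!interceptor.intercept(e.getKey(), e.getValue())) {
        tags.put(e.getKey(), e.getValue());
      }
    }
  }
}
```

The precomputed flag is only valid while the analyzed source map stays unchanged, which is why it pairs naturally with immutable, shared tag sets.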
public CoreTracerBuilder localRootSpanTags(Map<String, ?> localRootSpanTags) {
  this.localRootSpanTags = tryMakeImmutableMap(localRootSpanTags);
  this.localRootSpanTags = TagMap.fromMapImmutable(localRootSpanTags);
I kept some of the old methods that take Map instead of TagMap for ease of testing; however, the preferred way is to use TagMap now.
}

@Deprecated
private CoreTracer(
In a few places, I kept constructors that take Map-s instead of TagMap-s. This is mostly a concession to the Groovy tests, where using Map is far easier.
I may eliminate these over time, but I didn't want to create a giant diff where many test files had to be updated. This diff is big enough as it is.
  spanSamplingRules = SpanSamplingRules.deserializeFile(spanSamplingRulesFile);
}

this.tagInterceptor =
Had to do a bit of reordering to make tagInterceptor available earlier in the constructor, so I can analyze the shared tag sets.
For the curious, I was not able to move the setting of defaultSpanTags down because it has some odd interactions with other parts of the constructor.
// Builder attributes
private Map<String, Object> tags;
private TagMap.Builder tagBuilder;
TagMap.Builder serves as a stand-in for using LinkedHashMap to preserve order, although the semantics aren't exactly the same.
LinkedHashMap traverses entries in the original insertion order, meaning that if you insert tag1, tag2, tag1, the traversal is tag1, tag2.
TagMap.Builder simply records modifications in order, so the traversal order will be tag1 (first value), tag2, tag1 (second value). Arguably, this is closer to the intended semantics because it makes the behavior of the builder the same as calling setTag.
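The ordering difference described above can be reproduced with standard collections. In this sketch a plain list stands in for the builder's in-order record of modifications; OrderingSketch is a hypothetical name and nothing here uses the real TagMap.Builder.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class OrderingSketch {
  // LinkedHashMap keeps first-insertion order; re-inserting a key only replaces its value.
  static List<String> linkedHashMapOrder() {
    Map<String, Object> map = new LinkedHashMap<>();
    map.put("tag1", "a");
    map.put("tag2", "b");
    map.put("tag1", "c");
    List<String> seen = new ArrayList<>();
    for (Map.Entry<String, Object> e : map.entrySet()) {
      seen.add(e.getKey() + "=" + e.getValue());
    }
    return seen;
  }

  // A plain list records every modification in the order it was made.
  static List<String> ledgerOrder() {
    List<String> ledger = new ArrayList<>();
    ledger.add("tag1=a");
    ledger.add("tag2=b");
    ledger.add("tag1=c");
    return ledger;
  }

  public static void main(String[] args) {
    System.out.println(linkedHashMapOrder()); // [tag1=c, tag2=b]
    System.out.println(ledgerOrder()); // [tag1=a, tag2=b, tag1=c]
  }
}
```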
}
if (value == null) {
  tagMap.remove(tag);
  tagBuilder.remove(tag);
tagBuilder records removals as a special removal Entry object. Removal Entry objects are never stored in the TagMap.
This design allows tagBuilder to be an in-order ledger of modifications, but also allows Entry objects to be shared between the TagMap.Builder and the TagMap that is produced in the end.
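A rough sketch of an in-order ledger with removal markers is shown below. LedgerEntry and LedgerSketch are hypothetical, and representing a removal as an entry with a null value is an assumption made for this illustration; the PR only says that a special removal Entry exists.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical immutable entry; in this sketch a null value marks a removal.
final class LedgerEntry {
  final String tag;
  final Object value;

  LedgerEntry(String tag, Object value) {
    this.tag = tag;
    this.value = value;
  }

  boolean isRemoval() {
    return value == null;
  }
}

// Hypothetical ledger: an append-only record of modifications.
final class LedgerSketch {
  private final List<LedgerEntry> entries = new ArrayList<>();

  LedgerSketch set(String tag, Object value) {
    entries.add(new LedgerEntry(tag, value));
    return this;
  }

  LedgerSketch remove(String tag) {
    entries.add(new LedgerEntry(tag, null));
    return this;
  }

  // Replays the ledger in order. Removal entries delete the key and are never stored;
  // regular entries are stored as-is, so the ledger and the map share the same objects.
  Map<String, LedgerEntry> build() {
    Map<String, LedgerEntry> map = new HashMap<>();
    for (LedgerEntry e : entries) {
      if (e.isRemoval()) {
        map.remove(e.tag);
      } else {
        map.put(e.tag, e);
      }
    }
    return map;
  }
}
```

Because entries are immutable, handing the same object to both the ledger and the resulting map is safe; neither side can change it under the other.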
samplingPriority = PrioritySampling.UNSET;
origin = null;
coreTags = null;
coreTagsNeedsIntercept = false;
I could just as easily set needs intercept to true, since interception short-circuits on a null TagMap.
    + (null == coreTags ? 0 : coreTags.size())
    + (null == rootSpanTags ? 0 : rootSpanTags.size())
    + (null == contextualTags ? 0 : contextualTags.size());
final int tagsSize = 0;
As a further simplification, TagMaps are a fixed size (16) that's big enough to avoid a large number of collisions. But more importantly, TagMap doesn't currently implement an O(1) size method.
In short, the tagsSize calculation just isn't needed and it's costly.
If no one objects to this, then I'll probably eliminate tagsSize altogether. Right now, this variable still exists because there's an argument still being passed to the DDSpanContext constructor.
final Map<String, ?> mergedTracerTags = traceConfig.mergedTracerTags;
final TagMap mergedTracerTags = traceConfig.mergedTracerTags;
boolean mergedTracerTagsNeedsIntercept = traceConfig.mergedTracerTagsNeedsIntercept;
Similar pattern of checking if interception is needed in advance - in this case, traceConfig will be updated each time a new config is received by remote config.
if (contextualTags != null) {
  context.setAllTags(contextualTags);
}
context.setAllTags(mergedTracerTags, mergedTracerTagsNeedsIntercept);
These calls have been updated to call new versions of setAllTags that work with TagMap or TagMap.Builder. For the versions that take a TagMap, there's also the ability to pass along a previously computed "needs intercept" value.
context.setAllTags(tagBuilder);
context.setAllTags(coreTags, coreTagsNeedsIntercept);
context.setAllTags(rootSpanTags, rootSpanTagsNeedsIntercept);
context.setAllTags(contextualTags);
I haven't yet updated contextualTags to use a TagMap. That's an obvious next step.
if (null == oldSnapshot) {
  mergedTracerTags = CoreTracer.this.defaultSpanTags;
  mergedTracerTags = CoreTracer.this.defaultSpanTags.immutableCopy();
  this.mergedTracerTagsNeedsIntercept = CoreTracer.this.defaultSpanTagsNeedsIntercept;
When pulling from CoreTracer, use its "needs intercept" value.
  this.mergedTracerTagsNeedsIntercept = CoreTracer.this.defaultSpanTagsNeedsIntercept;
} else if (getTracingTags().equals(oldSnapshot.getTracingTags())) {
  mergedTracerTags = oldSnapshot.mergedTracerTags;
  mergedTracerTagsNeedsIntercept = oldSnapshot.mergedTracerTagsNeedsIntercept;
Prior config - just reuse "needs intercept" value from before
  mergedTracerTagsNeedsIntercept = oldSnapshot.mergedTracerTagsNeedsIntercept;
} else {
  mergedTracerTags = withTracerTags(getTracingTags(), CoreTracer.this.initialConfig, this);
  mergedTracerTagsNeedsIntercept = CoreTracer.this.tagInterceptor.needsIntercept(mergedTracerTags);
New tag set -- needs to be freshly analyzed
Debugger benchmarks
Summary: Found 0 performance improvements and 0 performance regressions! Performance is the same for 8 metrics, 7 unstable metrics.
Request duration reports for reports
reports - request duration [CI 0.99] : candidate=None, baseline=None (mean, with interval bounds in µs)
noprobe: baseline 317.967 µs (284 to 352), candidate 328.404 µs (276 to 381)
basic: baseline 276.316 µs (270 to 283), candidate 278.719 µs (273 to 285)
loop: baseline 8.965 ms (8960 to 8969), candidate 8.958 ms (8953 to 8963)
Benchmarks
Startup
Summary: Found 0 performance improvements and 0 performance regressions! Performance is the same for 46 metrics, 7 unstable metrics.
All startup values below are listed as baseline / candidate, with candidate=1.52.0-SNAPSHOT~148117965e and baseline=1.52.0-SNAPSHOT~b86f4f70d6.
Startup time reports for petclinic
petclinic - global startup overhead
tracing: Agent 1.0 s / 997.628 ms; Total 10.685 s / 10.679 s
appsec: Agent 1.174 s / 1.175 s; Total 10.749 s / 10.79 s
iast: Agent 1.131 s / 1.135 s; Total 10.885 s / 10.888 s
profiling: Agent 1.248 s / 1.256 s; Total 10.96 s / 10.945 s
petclinic - break down per module
tracing: BytebuddyAgent 690.215 ms / 687.238 ms; GlobalTracer 243.55 ms / 243.831 ms; AppSec 30.623 ms / 30.684 ms; Debugger 6.073 ms / 6.038 ms; Remote Config 677.245 µs / 680.364 µs; Telemetry 8.262 ms / 8.196 ms
appsec: BytebuddyAgent 708.53 ms / 708.455 ms; GlobalTracer 235.353 ms / 236.826 ms; IAST 23.457 ms / 23.407 ms; AppSec 170.982 ms / 171.378 ms; Debugger 5.707 ms / 5.756 ms; Remote Config 594.017 µs / 603.263 µs; Telemetry 7.956 ms / 8.049 ms
iast: BytebuddyAgent 805.287 ms / 806.812 ms; GlobalTracer 232.524 ms / 234.683 ms; IAST 26.495 ms / 29.091 ms; AppSec 31.349 ms / 28.951 ms; Debugger 5.69 ms / 5.742 ms; Remote Config 577.435 µs / 579.008 µs; Telemetry 7.904 ms / 7.934 ms
profiling: BytebuddyAgent 678.856 ms / 682.998 ms; GlobalTracer 363.094 ms / 365.515 ms; AppSec 31.98 ms / 31.582 ms; Debugger 11.376 ms / 11.46 ms; Remote Config 660.108 µs / 669.187 µs; Telemetry 9.633 ms / 10.274 ms; ProfilingAgent 103.255 ms / 104.297 ms; Profiling 103.28 ms / 104.322 ms
Startup time reports for insecure-bank
insecure-bank - global startup overhead
tracing: Agent 1.003 s / 1.003 s; Total 8.644 s / 8.6 s
iast: Agent 1.141 s / 1.141 s; Total 9.33 s / 9.334 s
insecure-bank - break down per module
tracing: BytebuddyAgent 692.139 ms / 690.329 ms; GlobalTracer 244.198 ms / 245.136 ms; AppSec 30.715 ms / 30.997 ms; Debugger 5.994 ms / 6.097 ms; Remote Config 678.101 µs / 687.589 µs; Telemetry 8.335 ms / 9.032 ms
iast: BytebuddyAgent 812.494 ms / 811.969 ms; GlobalTracer 234.303 ms / 235.342 ms; AppSec 30.941 ms / 30.032 ms; Debugger 5.77 ms / 5.768 ms; Remote Config 596.384 µs / 584.02 µs; Telemetry 7.952 ms / 7.96 ms; IAST 27.514 ms / 28.295 ms
Load
Summary: Found 1 performance improvements and 2 performance regressions! Performance is the same for 9 metrics, 12 unstable metrics.
All load values below are listed as baseline / candidate (mean, with interval bounds in µs), with candidate=1.52.0-SNAPSHOT~148117965e and baseline=1.52.0-SNAPSHOT~b86f4f70d6.
Request duration reports for petclinic
petclinic - request duration [CI 0.99]
no_agent: 36.464 ms (36176 to 36752) / 36.432 ms (36133 to 36732)
appsec: 47.596 ms (47169 to 48024) / 47.889 ms (47457 to 48321)
code_origins: 45.211 ms (44812 to 45610) / 45.408 ms (45010 to 45806)
iast: 43.459 ms (43088 to 43829) / 45.398 ms (44995 to 45800)
profiling: 48.239 ms (47789 to 48689) / 47.215 ms (46783 to 47647)
tracing: 42.222 ms (41874 to 42570) / 43.935 ms (43559 to 44311)
Request duration reports for insecure-bank
insecure-bank - request duration [CI 0.99]
no_agent: 4.391 ms (4343 to 4440) / 4.261 ms (4208 to 4314)
iast: 9.6 ms (9442 to 9758) / 9.43 ms (9271 to 9590)
iast_FULL: 13.801 ms (13526 to 14076) / 14.346 ms (14051 to 14640)
iast_GLOBAL: 10.359 ms (10178 to 10540) / 10.227 ms (10049 to 10405)
profiling: 8.42 ms (8275 to 8566) / 8.418 ms (8283 to 8553)
tracing: 7.763 ms (7644 to 7883) / 7.371 ms (7262 to 7480)
Dacapo
Summary: Found 0 performance improvements and 0 performance regressions! Performance is the same for 12 metrics, 0 unstable metrics.
All Dacapo values below are listed as baseline / candidate, with candidate=1.52.0-SNAPSHOT~148117965e and baseline=1.52.0-SNAPSHOT~b86f4f70d6.
Execution time for biojava
biojava - execution time [CI 0.99] (single value per configuration, no interval reported)
no_agent: 15.493 s / 15.611 s
appsec: 14.609 s / 14.758 s
iast: 18.364 s / 18.114 s
iast_GLOBAL: 17.934 s / 18.522 s
profiling: 15.234 s / 15.846 s
tracing: 14.789 s / 14.723 s
Execution time for tomcat
tomcat - execution time [CI 0.99] (mean, with interval bounds in µs)
no_agent: 1.473 ms (1461 to 1484) / 1.471 ms (1460 to 1483)
appsec: 2.396 ms (2346 to 2446) / 2.397 ms (2347 to 2447)
iast: 2.185 ms (2123 to 2248) / 2.18 ms (2117 to 2242)
iast_GLOBAL: 2.232 ms (2169 to 2295) / 2.234 ms (2172 to 2297)
profiling: 2.035 ms (1984 to 2085) / 2.048 ms (1996 to 2099)
tracing: 2.002 ms (1954 to 2051) / 2.001 ms (1953 to 2049)
I'd suggest a rebase to get an up-to-date benchmark run.
* Because the Entry objects can be shared between multiple TagMaps, the Entry objects cannot contain
* form a link list to handle collisions.
I think you need to choose only one verb between contain/form
public static final TagMap EMPTY = createEmpty();

private static final TagMap createEmpty() {
  return new TagMap().freeze();
Could we use the parameterized ctor to avoid allocating an array of 16 entries that'll never be used?
Suggested change:
- return new TagMap().freeze();
+ return new TagMap(null);
(Not sure if null is OK, I haven't read all the code yet :p)
So, yeah, not OK, but we could create an empty array.
public final boolean getBoolean(String tag) {
  Entry entry = this.getEntry(tag);
  return entry == null ? false : entry.booleanValue();
}

public final int getInt(String tag) {
  Entry entry = this.getEntry(tag);
  return entry == null ? 0 : entry.intValue();
}

public final long getLong(String tag) {
  Entry entry = this.getEntry(tag);
  return entry == null ? 0L : entry.longValue();
}

public final float getFloat(String tag) {
  Entry entry = this.getEntry(tag);
  return entry == null ? 0F : entry.floatValue();
}

public final double getDouble(String tag) {
  Entry entry = this.getEntry(tag);
  return entry == null ? 0D : entry.doubleValue();
}
It looks a bit dangerous to have default values for those: when we get 0, we cannot really know whether the value is really 0 or the key is absent 🤔
Maybe it's not an issue the way it's used... And I don't really have a better proposal other than throwing.
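To make the ambiguity concrete, here is a small sketch with two alternatives to throwing. PrimitiveLookupSketch is a hypothetical class backed by a plain HashMap, not the real TagMap; whether either alternative suits the tracer is an open question, and OptionalInt in particular allocates, which may work against the PR's allocation goals.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalInt;

final class PrimitiveLookupSketch {
  private final Map<String, Integer> ints = new HashMap<>();

  void set(String tag, int value) {
    ints.put(tag, value);
  }

  // Mirrors the reviewed getters: an absent tag collapses to 0.
  int getInt(String tag) {
    Integer v = ints.get(tag);
    return v == null ? 0 : v;
  }

  // Alternative 1: the caller picks the value that means "absent".
  int getIntOrDefault(String tag, int defaultValue) {
    Integer v = ints.get(tag);
    return v == null ? defaultValue : v;
  }

  // Alternative 2: OptionalInt keeps "absent" distinct from a stored zero.
  OptionalInt findInt(String tag) {
    Integer v = ints.get(tag);
    return v == null ? OptionalInt.empty() : OptionalInt.of(v);
  }
}
```

With getInt alone, a stored 0 and a missing tag are indistinguishable; the other two methods let the caller tell them apart when it matters.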
Object[] thisBuckets = this.buckets;

int hash = _hash(tag);
int bucketIndex = hash & (thisBuckets.length - 1);
is the speed gain from doing a bitmask rather than a modulo worth the downside of discarding the upper bits from the hash ?
It's pretty standard to use a bitmask to get the bucket index instead of modulo. div is not cheap on cpu.
I don't understand your comment about discarding the upper bits. modulo will give the same result.
the only constraint is the power of 2 for bucket array size.
yeah if we keep the power of two constraint it doesn't change anything, but if you use modulo you can use prime numbers for the array size, which will be better at shuffling hash bits when modulo-ed.
If you do a bitmask with a size that is a power of two, you effectively only look at the lower bits of the hash (the most significant bits are just masked away)
the hashes are already "folded in two" in the _hash() function, but that's still 16 bits, of which the upper bits will rarely see the light
This is what the standard HashMap in the JDK is doing, nothing new here.
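The points in this thread can be checked with a few lines of Java. The spread step below is the one java.util.HashMap applies to hash codes; assuming that the PR's _hash() does something similar is a guess based on the "folded in two" remark, and BucketIndexSketch is a hypothetical name.

```java
final class BucketIndexSketch {
  // Same spreading step java.util.HashMap uses: fold the upper 16 bits into the lower 16.
  static int spread(int h) {
    return h ^ (h >>> 16);
  }

  // Bucket index by bitmask; only valid when the length is a power of two.
  static int maskIndex(int hash, int powerOfTwoLength) {
    return hash & (powerOfTwoLength - 1);
  }

  // Bucket index by non-negative modulo; works for any positive length.
  static int modIndex(int hash, int length) {
    return Math.floorMod(hash, length);
  }

  public static void main(String[] args) {
    int length = 16;
    for (String tag : new String[] {"http.method", "http.url", "env", "version"}) {
      int hash = spread(tag.hashCode());
      System.out.println(
          tag + " -> mask=" + maskIndex(hash, length) + " mod=" + modIndex(hash, length));
    }
  }
}
```

For a power-of-two length the two index functions always agree, including for negative hashes. With a 16-slot table only the low 4 bits select the bucket, so the spread step is what lets higher bits influence the index at all.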
It seems it'd be relatively straightforward to keep track of the count in the map and avoid the O(n) price of computing it (or checking for emptiness). Was it a deliberate choice to save the memory of the count and the operations of incrementing/decrementing it?
I'll admit that the lack of count support is partially down to laziness so far. I'm happy to try implementing it and then we can measure the impact. I'm not really worried about the extra memory - mostly just the extra code to track it, but I imagine the cost is negligible.
I've gone ahead and added the count support. In some places updating the count is O(n) itself because of needing to traverse the collision chain. In practice, the chain shouldn't be deep and the constant factor is small, so I think it is a worthwhile change. I did rerun my benchmarks and adding count tracking had no real impact overall.
// e.g. size 0 will not work, it results in ArrayIndexOutOfBoundsException, but size 1 does
static final OptimizedTagMap EMPTY = new OptimizedTagMap(new Object[1], 0);
It took me 30 seconds to understand why it says "size 0 will not work" when, right after, we pass 0 to the size parameter. The comment refers to the size of the buckets array. Maybe the comment can be rephrased to be understood more easily.
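Why a zero-length buckets array fails while a length of 1 works follows from the bitmask lookup. The sketch below assumes the lookup has the `hash & (length - 1)` shape quoted earlier in the review; EmptyBucketsSketch is a hypothetical name.

```java
final class EmptyBucketsSketch {
  // Looks up a bucket slot the way the reviewed code does: hash & (length - 1).
  static Object lookup(Object[] buckets, int hash) {
    return buckets[hash & (buckets.length - 1)];
  }

  public static void main(String[] args) {
    // Length 1: the mask is 0, so every hash maps to slot 0, which holds null.
    System.out.println(lookup(new Object[1], 12345)); // null

    // Length 0: the mask is -1 (all bits set), so the index is the hash itself,
    // and any index is out of bounds for an empty array.
    try {
      lookup(new Object[0], 12345);
    } catch (ArrayIndexOutOfBoundsException e) {
      System.out.println("zero-length bucket array fails");
    }
  }
}
```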
This change introduces TagMap as a replacement for HashMap when working with tags. TagMap has two different implementations:
- one that extends a regular HashMap
- another that uses a different approach that allows for Entry sharing

This change currently uses the HashMap approach by default, but allows for switching to the optimized Map via a configuration flag.

The optimized TagMap is designed to be good at operations that the tracer performs regularly but HashMap isn't great at, specifically Map-to-Map copies and storing primitives.

To get the benefit of the optimized TagMap, calling code needs to use TagMap-s for both the source and destination map. The calling code also needs to make sure to use the bulk operations, which are the most optimized.

To take advantage of TagMap in span creation, a mechanism was introduced that allows for bypassing TagInterceptors. Bypassing TagInterceptors is done by analyzing a TagMap that is going to be reused ahead of time to determine if interception is needed. If interception isn't needed, then SpanBuilder can use a bulk operation to update the map.

To maintain the insertion order semantics of SpanBuilder, TagMap also includes a Ledger. A Ledger is a concatenative ledger of entry modifications to a map. A Ledger is now used in place of a LinkedHashMap in SpanBuilder to provide the insertion order semantics.

Since these changes primarily serve to reduce allocation, they should show the biggest gain in memory constrained environments. In memory constrained environments, these changes yield a 10% increase in sustainable throughput with spring-petclinic.
05204f1 to 98f71f6
This change reduces the overhead from constructing spans both in terms of CPU and memory.
The biggest gains come when making use of SpanBuilders for both constructing the span and manipulating the tags on the Span. Span creation throughput with SpanBuilders improves by as much as 45%. startSpan methods commonly used in instrumentations improve by around 20%. In applications, real median response time gains are around 5-10%.
More importantly, these changes reduce the amount of memory consumed by each Span, reducing allocation / garbage collection pressure.
In a "real" application, the change is less noticeable when memory is plentiful; however, the difference becomes more pronounced when memory is limited. spring-petclinic shows a 17% throughput improvement relative to the current release when memory is constrained to 192M or 128M. At 96M, the difference is a negligible 2-3% throughput gain. At 64M, this change becomes a detriment, showing a -5% change in throughput.
What Does This Do
These gains are accomplished by changing how tags are stored to use a new Map (TagMap) that excels at Map-to-Map copies. To fully realize the gain, there's additional work to skip tag interceptors when possible. With these changes, setting the shared tags on a Span is nearly allocation-free.
Motivation
The tracer regularly performs some Map operations that regular HashMaps aren't good at.
The primary operation of concern is copying Entry-s from Map to Map, where every copied Entry requires allocating a new Entry object in the destination Map.
The secondary concern is Builder patterns, which use defensive copying but also require in-order processing in the Tracer.
TagMap solves both those problems by using immutable Entry objects. By making the Entry objects immutable, the Entry objects can be freely shared between Map instances and between the Builder and a Map.
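The sharing idea can be sketched as follows. SharedEntry and SharingMapSketch are hypothetical; the backing HashMap here still allocates its own internal nodes on copy, so the sketch only illustrates why immutable entries are safe to share by reference, not the allocation savings of the real TagMap.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical immutable entry: once built, tag and value never change.
final class SharedEntry {
  final String tag;
  final Object value;

  SharedEntry(String tag, Object value) {
    this.tag = tag;
    this.value = value;
  }
}

// Hypothetical map that stores and copies entries by reference.
final class SharingMapSketch {
  private final Map<String, SharedEntry> entries = new HashMap<>();

  void set(String tag, Object value) {
    entries.put(tag, new SharedEntry(tag, value));
  }

  // Copies by reference: no new SharedEntry is created per tag.
  void putAll(SharingMapSketch source) {
    entries.putAll(source.entries);
  }

  SharedEntry getEntry(String tag) {
    return entries.get(tag);
  }
}
```

Overwriting a tag in the destination replaces the reference there without touching the source, which is what makes sharing safe without defensive copies.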
Additional Notes
To get the full benefit of this new TagMap, both the source Map and the destination Map need to be TagMap-s and the transfer needs to happen through putAll or the TagMap specific putEntry.
This means that quite a few files had to be modified to get a significant gain.
Contributor Checklist
Assign the type: and (comp: or inst:) labels in addition to any useful labels
Avoid close, fix or any linking keywords when referencing an issue. Use solves instead, and assign the PR milestone to the issue