Stop saving Runs in OldDataMonitor #26711
            r = rule;
        }

        @Disabled("constantly failing on CI builders, makes problems for memory()")

        @Issue("JENKINS-26718")
        @Test
        void unlocatableRun() throws Exception {

         */
        @NonNull
        static OldDataMonitor get(Jenkins j) throws IllegalStateException {
        static OldDataMonitor get() throws IllegalStateException {

        private static void remove(Saveable obj, boolean isDelete) {
            Jenkins j = Jenkins.get();
            OldDataMonitor odm = get(j);
            try (ACLContext ctx = ACL.as2(ACL.SYSTEM2)) {

        private static final Logger LOGGER = Logger.getLogger(OldDataMonitor.class.getName());

        private ConcurrentMap<SaveableReference, VersionRange> data = new ConcurrentHashMap<>();
        private ConcurrentMap<Saveable, VersionRange> data = new ConcurrentHashMap<>();

Pull request overview
This PR adjusts OldDataMonitor behavior to avoid retaining/loading Run (WorkflowRun) records, reducing memory pressure and disk I/O caused by repeatedly materializing historical builds when old-data entries are enumerated (e.g., for support bundles).
Changes:
- Stop tracking/reporting `Run` instances in `OldDataMonitor` and simplify storage to a `ConcurrentMap<Saveable, VersionRange>`.
- Simplify the discard handler signature and update call sites/tests accordingly.
- Remove tests which asserted behavior around `Run` retention/unlocatable runs, and update remaining tests to the new API.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `core/src/main/java/hudson/diagnosis/OldDataMonitor.java` | Removes `Run` tracking and `SaveableReference` indirection; changes discard handler signature; simplifies `getData()` implementation. |
| `test/src/test/java/hudson/diagnosis/OldDataMonitorTest.java` | Updates usages of `OldDataMonitor.get()` / `doDiscard()` and removes `Run`-focused tests. |
| `test/src/test/java/hudson/model/ViewTest.java` | Updates tests to call `odm.doDiscard()` with new signature. |
| `test/src/test/java/hudson/model/ItemGroupMixInTest.java` | Updates tests to call `odm.doDiscard()` with new signature. |
| `test/src/test/java/hudson/model/FreeStyleProjectTest.java` | Updates tests to call `odm.doDiscard()` with new signature. |
| `test/src/test/java/hudson/model/ComputerTest.java` | Updates tests to call `odm.doDiscard()` with new signature. |

Comments suppressed due to low confidence (1)
test/src/test/java/hudson/diagnosis/OldDataMonitorTest.java:83
- The Javadoc link still references the old `doDiscard(StaplerRequest2, StaplerResponse2)` signature, but `doDiscard` no longer takes request/response parameters. This should be updated to the new method signature to avoid broken Javadoc links / doclint warnings.
/**
* Note that this doesn't actually run slowly, it just ensures that
* the {@link OldDataMonitor#changeListener}'s {@code onChange()} can complete
* while {@link OldDataMonitor#doDiscard(org.kohsuke.stapler.StaplerRequest2, org.kohsuke.stapler.StaplerResponse2)}
* is still running.
         */
        @RequirePOST

Changing doDiscard from doDiscard(StaplerRequest2, StaplerResponse2) to a no-arg method removes a public method and is a binary-incompatible change for any plugins calling it directly. Consider keeping the old signature as a deprecated overload delegating to the new implementation (or vice versa) to preserve compatibility.

         */
        @RequirePOST
         */
        @Deprecated
        @RequirePOST
        public HttpResponse doDiscard(StaplerRequest2 req, StaplerResponse2 rsp) {
            return doDiscard();
        }
        /**
         * Save all files containing only unreadable data (no data upgrades), which discards this data.
         * Remove those items from the data map.
         */
        @RequirePOST

Not an API. Only apparent usage from plugin tests is in https://github.com/jenkinsci/freestyle-project-plugin/blob/8e20b062fc37a0213c154475016cf2292f7bcfad/src/test/java/hudson/model/FreeStyleProjectTest.java#L208 which is just moved from core and outdated. I think this repo should be archived unless and until someone actually follows through with removal from core.

        public static void report(Saveable obj, String version) {
            OldDataMonitor odm = get(Jenkins.get());
            if (obj instanceof Run<?, ?>) {
                return; // somewhat ephemeral
            }
            OldDataMonitor odm = get();

obj instanceof Run<?, ?> is now explicitly ignored, but there is no regression test asserting that OldDataMonitor.report(run, …) does not add an entry. Adding a small test would help prevent accidental reintroduction of run tracking (and the associated eager loading/memory issues).
Perhaps, but the code is sufficiently trivial that it seems improbable this clause would be deleted accidentally.
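
For readers following this thread, the guard under discussion can be modeled in isolation. This is only a minimal, self-contained sketch and not the Jenkins implementation: `OldDataSketch`, `Saveable`, `FakeRun`, `FakeJob`, and `Monitor` are hypothetical stand-ins, and the stored value is simplified to a plain version string rather than a `VersionRange`. A real regression test would presumably exercise `OldDataMonitor.report` against an actual build instead.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class OldDataSketch {
    /** Hypothetical stand-in for a saveable object; not the real Jenkins interface. */
    public interface Saveable {}

    /** Hypothetical stand-in for a build record (the kind of object the PR stops tracking). */
    public static final class FakeRun implements Saveable {}

    /** Hypothetical stand-in for configuration-like objects such as jobs. */
    public static final class FakeJob implements Saveable {}

    /** Simplified monitor: maps each reported object to the version string it was reported with. */
    public static final class Monitor {
        private final Map<Saveable, String> data = new ConcurrentHashMap<>();

        public void report(Saveable obj, String version) {
            if (obj instanceof FakeRun) {
                return; // builds are treated as ephemeral and never tracked
            }
            data.put(obj, version);
        }

        /** Returns a snapshot copy so callers cannot mutate the internal map. */
        public Map<Saveable, String> getData() {
            return new HashMap<>(data);
        }
    }

    public static void main(String[] args) {
        Monitor monitor = new Monitor();

        // Reporting a build must not add an entry.
        monitor.report(new FakeRun(), "1.0");
        if (!monitor.getData().isEmpty()) {
            throw new AssertionError("run should not be tracked");
        }

        // Reporting a configuration-like object is still tracked.
        FakeJob job = new FakeJob();
        monitor.report(job, "1.0");
        if (!monitor.getData().containsKey(job)) {
            throw new AssertionError("job should be tracked");
        }
    }
}
```

The two checks in `main` correspond to the assertions such a test would make: a reported run leaves the map empty, while other saveable objects are still recorded.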
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
/label ready-for-merge

This PR is now ready for merge, after ~24 hours, we will merge it if there's no negative feedback. Thanks!

(sorry, closed by accident)

The plugin BOM testing of Jenkins 2.563 shows that parameterized trigger plugin test

I'm one of the maintainers of the parameterized trigger plugin, so I can adjust the test to pass with new and old versions of Jenkins, but I'm wondering what the desired result should be. That test is checking SECURITY-2185, "Sensitive parameter values captured in build metadata files by Parameterized Trigger Plugin". Since the OldDataMonitor no longer records Runs, it can't be used to warn the user that one of their build records might include sensitive information.

Since we are intentionally not capturing Runs in OldDataMonitor any longer, should that test just be deleted? If not deleted, should it be adapted to only run the test when the Jenkins version is 2.562 or older?

@jglick, I'd appreciate your guidance on this one, since I feel "out of my league" on the topic.

I reviewed a heap dump from a Jenkins controller (2.541.x) which was having severe memory pressure problems; it contained hundreds of thousands of `WorkflowRun` objects, most of which were completed. `FINEST` logging on `RunMap` revealed that almost all of them were being loaded by one of the following code paths (or minor variants):

Reviewing the `admin-monitors.md` showed hundreds of thousands of records like

OldData

Manage Old Data

- folder/job #12345
- otherfolder/otherjob #67890

The reported old data appears to trace to jenkinsci/pipeline-groovy-lib-plugin#202 (comment) which indeed deleted this (apparently) unused field.

The trigger for the problem seems to have been misuse of `artifactNumToKeep`. If you configure this build discarder option but do not set it to just slightly less than `numToKeep`, and your jobs have thousands of builds, then every hour when the background discarder runs it will be forced to load thousands of historical builds which have not been deleted yet but which might still have artifacts (even though only an hour’s worth of builds could still match). Thus if XStream reported a problem loading those old builds, they would be listed in the monitor. Now 681a8ff (#14915) ensured that the memory held in `OldDataMonitor` itself for each build is minimal (just `folder/job#12345` in the example above), but jenkinsci/support-core-plugin@33475d1 means that every support bundle generated will also have to load all of those builds again, merely to call `Run.toString` on them! Taken together, these behaviors effectively defeat lazy loading or, worse, mean that builds do get evicted from heap during full GC cycles, but then soon reloaded, forcing more disk I/O.

At first I was considering exposing `SaveableReference` as an API for use by `AdministrativeMonitor`s to avoid forcing build loading when just printing data to a support bundle. However after some reflection I realized that there is no compelling reason to save `Run`s to begin with. These are not configuration like global settings or jobs (or even users), so there is no need to “clean up” after a plugin upgrade; in fact it is unusual for a `build.xml` to be resaved after the build is complete (the only common use cases are adding a description manually or marking it `keepLog`). Most builds are deleted eventually by a build discarder anyway. Dropping the requirement to track `Run`s here simplifies the implementation considerably and bypasses any possible unbounded memory consumption.

CloudBees-internal issue

Testing done

`OldDataMonitorTest` passes locally 57×.

Screenshots (UI changes only)
Before
After
Proposed changelog entries
Proposed changelog category
/label bug
Proposed upgrade guidelines
N/A
Desired reviewers
Before the changes are marked as `ready-for-merge`:

Maintainer checklist

- `upgrade-guide-needed` label is set and there is a Proposed upgrade guidelines section in the pull request title (see example).
- `lts-candidate` to be considered.