Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Conversation

@donoghuc
Copy link
Member

@donoghuc donoghuc commented Nov 11, 2025

Release notes

[rn:skip]

What does this PR do?

Managing a Go toolchain for persisting ENV vars in logstash container artifacts has become cumbersome. We already manage a java runtime so this commit presents a path forward to use that instead of Go. The Go binary is faster than java (in my testing Go would complete in around less than 200ms while java takes over 300ms). Given the container startup time is on the order of magnitute of seconds this change should be inperceptable to consumers. The benefit from consolidating in Java is worth the slightly lower performance.

Why is it important/What is the impact to the user?

This should not be noticeable, though technically starting logstash in a container will take about 200ms longer.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files (and/or docker env variables)
  • I have added tests that prove my fix is effective or that my feature works

How to test this PR locally

Build container:

➜  logstash git:(rewrite-env2yaml-in-java) βœ— ARCH="aarch64" rake artifact:docker
➜  logstash git:(rewrite-env2yaml-in-java) βœ— docker image ls
REPOSITORY                                 TAG              IMAGE ID       CREATED              SIZE
docker.elastic.co/logstash/logstash-full   9.3.0-SNAPSHOT   19652eb23245   About a minute ago   1.48GB

Run env2yaml directly or check at startup.

#!/bin/bash

JAVA_IMAGE="19652eb23245"
GO_IMAGE="docker.elastic.co/logstash/logstash:9.2.0"

echo "=== env2yaml comparison ==="

echo "Go:"
docker run --rm -e PIPELINE_WORKERS=4 --entrypoint="" $GO_IMAGE \
  bash -c "env2yaml /usr/share/logstash/config/logstash.yml 2>&1"

echo
echo "Java:"
docker run --rm -e PIPELINE_WORKERS=4 --entrypoint="" $JAVA_IMAGE \
  bash -c "env2yaml /usr/share/logstash/config/logstash.yml 2>&1"

echo
echo "=== Logstash startup comparison ==="

echo "Go startup:"
time timeout 10 docker run -e PIPELINE_WORKERS=4 --rm $GO_IMAGE

echo
echo "Java startup:"
time timeout 10 docker run -e PIPELINE_WORKERS=4 --rm $JAVA_IMAGE

Example:

=== env2yaml comparison ===
Go:
2025/11/11 23:52:04 Setting 'pipeline.workers' from environment.

Java:
2025/11/11 23:52:04 Setting 'pipeline.workers' from environment.

=== Logstash startup comparison ===
Go startup:
2025/11/11 23:52:04 Setting 'pipeline.workers' from environment.
Using bundled JDK: /usr/share/logstash/jdk
Sending Logstash logs to /usr/share/logstash/logs which is now configured via log4j2.properties
[2025-11-11T23:52:08,200][INFO ][logstash.runner          ] Log4j configuration path used is: /usr/share/logstash/config/log4j2.properties
[2025-11-11T23:52:08,202][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"9.2.0", "jruby.version"=>"jruby 9.4.13.0 (3.1.4) 2025-06-10 9938a3461f OpenJDK 64-Bit Server VM 21.0.8+9-LTS on 21.0.8+9-LTS +indy +jit [aarch64-linux]"}
[2025-11-11T23:52:08,203][INFO ][logstash.runner          ] JVM bootstrap flags: [-Xms1g, -Xmx1g, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djruby.compile.invokedynamic=true, -XX:+HeapDumpOnOutOfMemoryError, -Djava.security.egd=file:/dev/urandom, -Dlog4j2.isThreadContextMapInheritable=true, -Dls.cgroup.cpuacct.path.override=/, -Dls.cgroup.cpu.path.override=/, -Djruby.regexp.interruptible=true, -Djdk.io.File.enableADS=true, --add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.parser=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.tree=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED, --add-opens=java.base/java.security=ALL-UNNAMED, --add-opens=java.base/java.io=ALL-UNNAMED, --add-opens=java.base/java.nio.channels=ALL-UNNAMED, --add-opens=java.base/sun.nio.ch=ALL-UNNAMED, --add-opens=java.management/sun.management=ALL-UNNAMED, -Dio.netty.allocator.maxOrder=11]
[2025-11-11T23:52:08,217][INFO ][org.logstash.jackson.StreamReadConstraintsUtil] Jackson default value override `logstash.jackson.stream-read-constraints.max-string-length` configured to `200000000` (logstash default)
[2025-11-11T23:52:08,217][INFO ][org.logstash.jackson.StreamReadConstraintsUtil] Jackson default value override `logstash.jackson.stream-read-constraints.max-number-length` configured to `10000` (logstash default)
[2025-11-11T23:52:08,217][INFO ][org.logstash.jackson.StreamReadConstraintsUtil] Jackson default value override `logstash.jackson.stream-read-constraints.max-nesting-depth` configured to `1000` (logstash default)
[2025-11-11T23:52:08,220][INFO ][logstash.settings        ] Creating directory {:setting=>"path.queue", :path=>"/usr/share/logstash/data/queue"}
[2025-11-11T23:52:08,222][INFO ][logstash.settings        ] Creating directory {:setting=>"path.dead_letter_queue", :path=>"/usr/share/logstash/data/dead_letter_queue"}
[2025-11-11T23:52:08,286][INFO ][logstash.agent           ] No persistent UUID file found. Generating new UUID {:uuid=>"dc147b65-1477-40b7-a2db-214791cc5319", :path=>"/usr/share/logstash/data/uuid"}
[2025-11-11T23:52:08,453][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600, :ssl_enabled=>false}
[2025-11-11T23:52:08,513][INFO ][org.reflections.Reflections] Reflections took 37 ms to scan 1 urls, producing 162 keys and 557 values
[2025-11-11T23:52:08,610][INFO ][logstash.javapipeline    ][main] Pipeline `main` is configured with `pipeline.ecs_compatibility: v8` setting. All plugins in this pipeline will default to `ecs_compatibility => v8` unless explicitly configured otherwise.
[2025-11-11T23:52:08,620][INFO ][logstash.javapipeline    ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>500, "batch_metric_sampling"=>"minimal", "pipeline.sources"=>["/usr/share/logstash/pipeline/logstash.conf"], :thread=>"#<Thread:0x55ea7f5c /usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:147 run>"}
[2025-11-11T23:52:08,828][INFO ][logstash.javapipeline    ][main] Pipeline Java execution initialization time {"seconds"=>0.21}
[2025-11-11T23:52:08,829][INFO ][logstash.inputs.beats    ][main] Starting input listener {:address=>"0.0.0.0:5044"}
[2025-11-11T23:52:08,831][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
[2025-11-11T23:52:08,857][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2025-11-11T23:52:08,868][INFO ][org.logstash.beats.Server][main][0710cad67e8f47667bc7612580d5b91f691dd8262a4187d9eca8cf87229d04aa] Starting server on port: 5044
[2025-11-11T23:52:14,841][WARN ][logstash.runner          ] SIGTERM received. Shutting down.
[2025-11-11T23:52:14,849][WARN ][logstash.runner          ] SIGTERM received. Shutting down.
[2025-11-11T23:52:23,184][INFO ][logstash.javapipeline    ][main] Pipeline terminated {"pipeline.id"=>"main"}
[2025-11-11T23:52:24,102][INFO ][logstash.pipelinesregistry] Removed pipeline from registry successfully {:pipeline_id=>:main}
[2025-11-11T23:52:24,107][INFO ][logstash.runner          ] Logstash shut down.

real	0m19.434s
user	0m0.021s
sys	0m0.022s

Java startup:
2025/11/11 23:52:24 Setting 'pipeline.workers' from environment.
Using bundled JDK: /usr/share/logstash/jdk
Sending Logstash logs to /usr/share/logstash/logs which is now configured via log4j2.properties
[2025-11-11T23:52:27,737][INFO ][logstash.runner          ] Log4j configuration path used is: /usr/share/logstash/config/log4j2.properties
[2025-11-11T23:52:27,740][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"9.3.0", "jruby.version"=>"jruby 9.4.13.0 (3.1.4) 2025-06-10 9938a3461f OpenJDK 64-Bit Server VM 21.0.9+10-LTS on 21.0.9+10-LTS +indy +jit [aarch64-linux]"}
[2025-11-11T23:52:27,741][INFO ][logstash.runner          ] JVM bootstrap flags: [-Xms1g, -Xmx1g, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -XX:+HeapDumpOnOutOfMemoryError, -Djava.security.egd=file:/dev/urandom, -Dls.cgroup.cpuacct.path.override=/, -Dls.cgroup.cpu.path.override=/, -Djruby.regexp.interruptible=true, -Djruby.compile.invokedynamic=true, -Djdk.io.File.enableADS=true, -Dlog4j2.isThreadContextMapInheritable=true, --add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.parser=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.tree=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED, --add-opens=java.base/java.security=ALL-UNNAMED, --add-opens=java.base/java.io=ALL-UNNAMED, --add-opens=java.base/java.nio.channels=ALL-UNNAMED, --add-opens=java.base/sun.nio.ch=ALL-UNNAMED, --add-opens=java.management/sun.management=ALL-UNNAMED, -Dio.netty.allocator.maxOrder=11]
[2025-11-11T23:52:27,755][INFO ][org.logstash.jackson.StreamReadConstraintsUtil] Jackson default value override `logstash.jackson.stream-read-constraints.max-string-length` configured to `200000000` (logstash default)
[2025-11-11T23:52:27,755][INFO ][org.logstash.jackson.StreamReadConstraintsUtil] Jackson default value override `logstash.jackson.stream-read-constraints.max-number-length` configured to `10000` (logstash default)
[2025-11-11T23:52:27,755][INFO ][org.logstash.jackson.StreamReadConstraintsUtil] Jackson default value override `logstash.jackson.stream-read-constraints.max-nesting-depth` configured to `1000` (logstash default)
[2025-11-11T23:52:27,758][INFO ][logstash.settings        ] Creating directory {:setting=>"path.queue", :path=>"/usr/share/logstash/data/queue"}
[2025-11-11T23:52:27,760][INFO ][logstash.settings        ] Creating directory {:setting=>"path.dead_letter_queue", :path=>"/usr/share/logstash/data/dead_letter_queue"}
[2025-11-11T23:52:27,822][INFO ][logstash.agent           ] No persistent UUID file found. Generating new UUID {:uuid=>"25df194b-1282-49b1-945f-26f145f99d1e", :path=>"/usr/share/logstash/data/uuid"}
[2025-11-11T23:52:27,984][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600, :ssl_enabled=>false}
[2025-11-11T23:52:28,045][INFO ][org.reflections.Reflections] Reflections took 39 ms to scan 1 urls, producing 162 keys and 558 values
[2025-11-11T23:52:28,153][INFO ][logstash.javapipeline    ][main] Pipeline `main` is configured with `pipeline.ecs_compatibility: v8` setting. All plugins in this pipeline will default to `ecs_compatibility => v8` unless explicitly configured otherwise.
[2025-11-11T23:52:28,164][INFO ][logstash.javapipeline    ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>500, "batch_metric_sampling"=>"minimal", "pipeline.sources"=>["/usr/share/logstash/pipeline/logstash.conf"], :thread=>"#<Thread:0x775975c0 /usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:147 run>"}
[2025-11-11T23:52:28,351][INFO ][logstash.javapipeline    ][main] Pipeline Java execution initialization time {"seconds"=>0.19}
[2025-11-11T23:52:28,352][INFO ][logstash.inputs.beats    ][main] Starting input listener {:address=>"0.0.0.0:5044"}
[2025-11-11T23:52:28,354][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
[2025-11-11T23:52:28,359][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2025-11-11T23:52:28,376][INFO ][org.logstash.beats.Server][main][0710cad67e8f47667bc7612580d5b91f691dd8262a4187d9eca8cf87229d04aa] Starting server on port: 5044
[2025-11-11T23:52:34,268][WARN ][logstash.runner          ] SIGTERM received. Shutting down.
[2025-11-11T23:52:34,278][WARN ][logstash.runner          ] SIGTERM received. Shutting down.
[2025-11-11T23:52:42,608][INFO ][logstash.javapipeline    ][main] Pipeline terminated {"pipeline.id"=>"main"}
[2025-11-11T23:52:43,552][INFO ][logstash.pipelinesregistry] Removed pipeline from registry successfully {:pipeline_id=>:main}
[2025-11-11T23:52:43,558][INFO ][logstash.runner          ] Logstash shut down.

real	0m19.441s
user	0m0.019s
sys	0m0.019s

@github-actions
Copy link
Contributor

πŸ€– GitHub comments

Just comment with:

  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)
  • /run exhaustive tests : Run the exhaustive tests Buildkite pipeline.

@mergify
Copy link
Contributor

mergify bot commented Nov 11, 2025

This pull request does not have a backport label. Could you fix it @donoghuc? πŸ™
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit.
  • If no backport is necessary, please add the backport-skip label

@donoghuc
Copy link
Member Author

/run exhaustive tests

@donoghuc
Copy link
Member Author

Failure looks suspiciously related to a treetop gem release from yesterday https://rubygems.org/gems/treetop I started a build off main to see if the same failure appears there: https://buildkite.com/elastic/logstash-pull-request-pipeline/builds/3850

Managing a Go toolchain for persisting ENV vars in logstash container artifacts
has become cumbersome. We already manage a java runtime so this commit presents
a path forward to use that instead of Go. The Go binary is faster than java (in
my testing Go would complete in around less than 200ms while java takes over
300ms). Given the container startup time is on the order of magnitute of seconds
this change should be inperceptable to consumers. The benefit from consolidating
in Java is worth the slightly lower performance.
@donoghuc donoghuc force-pushed the rewrite-env2yaml-in-java branch from 1a3eb07 to aa61b56 Compare November 12, 2025 19:49
@donoghuc donoghuc changed the title WIP: Rewrite Env2yaml in java instead of Go Rewrite Env2yaml in java instead of Go Nov 12, 2025
@donoghuc
Copy link
Member Author

Double checked what copilot came up with. Also to show how insignificant the performance is i did this simple ruby test:

#!/usr/bin/env ruby

GO_IMAGE = "docker.elastic.co/logstash/logstash:9.2.0"
JAVA_IMAGE = "772bb6342461"

def measure_env2yaml(image, runs = 100)
  times = []
  runs.times do
    start = Time.now
    system("docker run --rm -e PIPELINE_WORKERS=4 --entrypoint='' #{image} env2yaml /usr/share/logstash/config/logstash.yml > /dev/null 2>&1")
    times << ((Time.now - start) * 1000).to_i
  end
  times.sum / times.size
end

puts "=== env2yaml execution time (#{100} runs) ==="
puts "Go:   #{measure_env2yaml(GO_IMAGE)}ms avg"
puts "Java: #{measure_env2yaml(JAVA_IMAGE)}ms avg"

example run:

=== env2yaml execution time (100 runs) ===
Go:   213ms avg
Java: 308ms avg

@donoghuc donoghuc marked this pull request as ready for review November 12, 2025 20:01
@donoghuc donoghuc marked this pull request as draft November 12, 2025 20:15
@donoghuc
Copy link
Member Author

Still need to do some more go removal stuff and sort out the ironbank container, back to WIP.

@donoghuc
Copy link
Member Author

Still working through how ironbank is different. Will push an update when possible. The java re-write is ready for review (and works in the other container), just making sure to get all the golang stuff out of the repo.

@donoghuc
Copy link
Member Author

/run exhaustive tests

@donoghuc
Copy link
Member Author

donoghuc commented Nov 13, 2025

ironbank has been sorted out and i think i've got all the other references to the golang verions removed. Kicked off exhaustive test run https://buildkite.com/elastic/logstash-exhaustive-tests-pipeline/builds/2897

@donoghuc donoghuc marked this pull request as ready for review November 13, 2025 22:14
@donoghuc
Copy link
Member Author

Exhaustive tests for packages (unrelated to container only env2yaml) are failing due to a separate issue #18424

@@ -0,0 +1,203 @@
import org.yaml.snakeyaml.Yaml;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we use snakeyaml-engine instaed of snakeyaml? couple of reasons:

  1. we don't need a lot of the higher level snakeyaml machinery for this very simple yaml parsing
  2. reduces surface for vulnerabilities and cve findings, given most are typically found in snakeyaml itself and not snakeyaml-engine

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didnt even know about this πŸ˜… I see it ships with jruby. I pushed a commit using that instead. The only difference i can spot is that it seems to want to quote values (which in my testing does not matter). Take a look, we can always revert back to snakeyaml if needed (assuming we continue to ship that with logstash).

# The .class file is compiled during Docker build and placed in /usr/local/bin

exec /usr/share/logstash/jdk/bin/java \
-cp "/usr/share/logstash/logstash-core/lib/jars/*:/usr/local/bin" \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have tested a series of JVM flags to make startup faster. overall I can see ever-so-slight benefits with -XX:+UseSerialGC -Xms16m -Xmx16m, I tried messing with tiered complation settings but no obvious benefit. best tests are likely run on a very limited container.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, updated with these. I think 16m is probably fine, do you feel comfortable with that or should we bump it a bit?

Use snakeyaml-engine and some java flags for faster execution
@elasticmachine
Copy link
Collaborator

πŸ’š Build Succeeded

History

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants