Add docs about configuring LS for use with Filebeat modules #6832
Conversation
| contain ingest node pipelines, Elasticsearch templates, Filebeat prospector
| configurations, and Kibana dashboards.
|
| Filebeat modules do not currently provide Logstash pipeline configurations.
Can we add a sub-section called "Migrating from ingest pipelines"? In this section, we can add context on why a user would want to graduate to using LS instead of the ingest pipelines. The following reasons are all valid:
- Use multiple outputs. Ingest was designed to support only ES as an output. For example, users may want to archive their incoming data to S3 as well as index it in ES.
- Use the persistent queue feature to handle spikes when ingesting data (from beats and other sources).
- Take advantage of the richer transformation capabilities in Logstash.
+1, I actually prefer the term "Graduating" in this context.
For the last bullet, we can extend to say "like external lookups."
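(For illustration, a minimal sketch of the "multiple outputs" point above, assuming a Beats input on port 5044 and a hypothetical S3 bucket name; the logstash-output-s3 plugin and AWS credentials would be required.)

    input {
      beats {
        port => 5044
      }
    }
    output {
      elasticsearch {
        hosts => "localhost:9200"
      }
      s3 {
        # Hypothetical bucket/region; archives a copy of the events alongside indexing in ES.
        bucket => "my-log-archive"
        region => "us-east-1"
      }
    }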
| build Logstash pipeline configurations that are equivalent to the ingest
| node pipelines available with the Filebeat modules.
|
| Then you'll be able to use the sample Kibana dashboards available with Filebeat
How about changing "sample Kibana" to "same Kibana"?
By providing the LS configs, we are telling users they can continue to use the dashboards set up by the Filebeat modules.
+1, the point is it's a "seamless" transition.
@dedemorton @acchen97 I'm a bit confused on how to message this to our users. IMO, if someone is moving from
| Filebeat needs to create the index pattern and load the sample dashboards into the
| Kibana index.
| +
| After the dashboards are loaded, you'll see the message
Hmm, this error is weird. Does this happen only with the LS output enabled?
Can we maybe have Logstash just define this index mapping and pattern on LS startup?
@suyograo You can't use the -setup step with Logstash configured as the output because the dashboards need to be loaded into an Elasticsearch instance.
My mistake. This error happens only when you have both Elasticsearch and Logstash configured as outputs. I'll remove that sentence from the topic.
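(For context, a minimal sketch of the combination being described, i.e. a filebeat.yml with both outputs enabled; the host values are placeholders.)

    # Hypothetical fragment: both outputs configured at once, the case that triggers the message above.
    output.elasticsearch:
      hosts: ["http://localhost:9200"]
    output.logstash:
      hosts: ["localhost:5044"]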
|
| . Configure Filebeat to send log lines to Logstash.
| +
| In version 5.3, Filebeat modules won't work when Logstash is configured as
I think this line is confusing. We should just remove it.
I guess from this point on -- once you've run setup from FBM -- you are using FB as a regular shipper to LS. We should say, "remove the ES output from filebeat.yml and add LS as an output".
    output.logstash:
      hosts: ["localhost:5044"]
That should just work then.
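(On the Logstash side, this assumes a pipeline listening with the beats input on the same port, roughly:)

    input {
      beats {
        # Must match the port in the Filebeat output.logstash setting above.
        port => 5044
      }
    }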
+1
See earlier comment. The problem here is that the configuration options get set dynamically when you run the Filebeat modules (I don't think the config for the module is persisted anywhere). So you can't just run a module once to get Filebeat configured. If you want to use Logstash as the output in 5.3, my understanding is that you need to configure Filebeat, too. @tsg Can you confirm that this is true and that I'm not misunderstanding something fundamental here?
| * <<parsing-nginx>>
| * <<parsing-system>>
|
| //REVIEWERS: Do we want to add an example that shows how to conditionally select the grok pattern? If not, what guidance should we provide to help users understand how to build a config that works with more than one type of log file?
I am not sure why we need to talk about conditionals here. Can you explain more? Each FBM handles a different log type (mysql, apache2, ...), so we would do the same in LS as well, no?
@suyograo For example, let's say that a user wants to build a Logstash pipeline that parses both apache2 error and access logs. Wouldn't they need conditional logic or something similar to do that? Right now, we show a separate Logstash config for each fileset.
See Tudor's comment here: #6542 (comment)
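(For illustration, the kind of conditional logic being discussed; a minimal sketch that assumes each Filebeat prospector sets a distinguishing type, e.g. via document_type, and the grok patterns shown are placeholders rather than the patterns shipped with the modules.)

    filter {
      # Route each event to a different grok pattern based on the type set by Filebeat.
      if [type] == "nginx-access" {
        grok {
          match => { "message" => "%{COMBINEDAPACHELOG}" }
        }
      } else if [type] == "nginx-error" {
        grok {
          # Placeholder pattern; the real error-log pattern would go here.
          match => { "message" => "%{GREEDYDATA:error_message}" }
        }
      }
    }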
| +
| [source,shell]
| ----------------------------------------------------------------------
| ./filebeat -e -setup -E "output.elasticsearch.hosts=["http://localhost:9200"]"
Is this missing the -modules=nginx flag?
@suyograo That's how things should work, but that's not how they work right now. In 5.3, if you run a Filebeat module with the logstash output enabled, you see the error CRIT Exiting: Filebeat modules configured but the Elasticsearch output is not configured/enabled:
    Rhodas-MacBook-Pro:filebeat-5.3.0-darwin-x86_64 dedemorton$ sudo ./filebeat -e -modules=system
    2017/03/23 17:03:16.622482 beat.go:285: INFO Home path: [/Users/dedemorton/BuildTesting/5.3.0_e5a8c674/filebeat-5.3.0-darwin-x86_64] Config path: [/Users/dedemorton/BuildTesting/5.3.0_e5a8c674/filebeat-5.3.0-darwin-x86_64] Data path: [/Users/dedemorton/BuildTesting/5.3.0_e5a8c674/filebeat-5.3.0-darwin-x86_64/data] Logs path: [/Users/dedemorton/BuildTesting/5.3.0_e5a8c674/filebeat-5.3.0-darwin-x86_64/logs]
    2017/03/23 17:03:16.622515 beat.go:186: INFO Setup Beat: filebeat; Version: 5.3.0
    2017/03/23 17:03:16.622554 metrics.go:23: INFO Metrics logging every 30s
    2017/03/23 17:03:16.622572 logstash.go:90: INFO Max Retries set to: 3
    2017/03/23 17:03:16.622625 outputs.go:108: INFO Activated logstash as output plugin.
    2017/03/23 17:03:16.622682 publish.go:295: INFO Publisher name: Rhodas-MacBook-Pro.local
    2017/03/23 17:03:16.622771 async.go:63: INFO Flush Interval set to: 1s
    2017/03/23 17:03:16.622778 async.go:64: INFO Max Bulk Size set to: 2048
    2017/03/23 17:03:16.623475 beat.go:221: INFO filebeat start running.
    2017/03/23 17:03:16.623535 metrics.go:51: INFO Total non-zero values:
    2017/03/23 17:03:16.623542 metrics.go:52: INFO Uptime: 4.588856ms
    2017/03/23 17:03:16.623546 beat.go:225: INFO filebeat stopped.
    2017/03/23 17:03:16.623551 beat.go:339: CRIT Exiting: Filebeat modules configured but the Elasticsearch output is not configured/enabled
    Exiting: Filebeat modules configured but the Elasticsearch output is not configured/enabled
That's why I've outlined the steps the way I have. TBH, I think the workflow is kind of janky overall.
I think @suyograo is right here; the -modules=nginx flag is needed and works because this step is specifically meant to use the ES output (-E "output.elasticsearch.hosts=["http://localhost:9200"]").
whoops...you're right. I was looking at this out of context. :-)
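(For reference, a sketch of what the command under discussion would look like with the module flag added; exact quoting depends on your shell, and the host value is the one from the snippet above.)

    ./filebeat -e -modules=nginx -setup -E "output.elasticsearch.hosts=[\"http://localhost:9200\"]"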
|
| [source,yml]
| ----------------------------------------------------------------------
| filebeat.prospectors:
Do we need to specify the filebeat.yml here? If a user has already done ./filebeat -modules=nginx, they should have the relevant filebeat.yml file, right?
I would think so, if they already set the LS output explicitly.
@suyograo I don't think it works that way. The filebeat.yml isn't updated when you run the module. The settings are determined at runtime based on what's in the manifest file (and what the user passes as variables on the command line). @tsg Can you confirm? It's possible that I've missed something big here. :-)
Yeah, this is needed for 5.3, but hopefully we'll be able to simplify it with 5.4. I'd say a single filebeat.yml example per module is enough.
I had missed this in my understanding of how FBM works. I agree with what Dede has here, and with making it better for 5.4.
| == Working with Filebeat Modules
|
| Starting with version 5.3, Filebeat comes packaged with pre-built
| {filebeat}filebeat-modules.html[modules] that contain the configuration needed
Maybe "configurations needed to collect, parse, enrich, and visualize data from various log file formats out-of-the-box"?
Sure, but I'll remove "out-of-the-box" because it's grammatically ambiguous (it could be misread--and mistranslated--as modifying "log file formats").
@dedemorton thanks for this, left a few comments, but looks good overall. We should link off to these docs from the FBM documentation as well.
| paths:
|   - /var/log/mysql/mysql-slow.log*
|   - /var/lib/mysql/hostname-slow.log
| exclude_files: [".gz$"]
Here the multiline configuration is missing. You can take it from: https://github.com/elastic/beats/blob/master/filebeat/module/mysql/slowlog/config/slowlog.yml#L7
It would be good if these samples resembled what the FBMs do as much as possible. I can help with that if you want.
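(For illustration, the prospector above with a multiline section added; the values below are an approximation, so the exact pattern should be taken from the linked module source.)

    filebeat.prospectors:
    - input_type: log
      paths:
        - /var/log/mysql/mysql-slow.log*
        - /var/lib/mysql/hostname-slow.log
      exclude_files: [".gz$"]
      multiline:
        # Approximate values; see the slowlog.yml linked above for the shipped pattern.
        pattern: "^# User@Host: "
        negate: true
        match: after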
@suyograo I've updated the Filebeat configs, but I still don't think this is 100% there. Here's a list of the problems that I'm seeing:
@suyograo I did a bit more testing of the LS configs today. I didn't want to merge this while you were away, but I'd like to get it merged soon. For the problems I listed in the previous comment:

One other thing: I think the setup step is a little hacky. This step may actually result in a filebeat index getting created for system events...because the default Filebeat config harvests the system log by default. I'm not sure if that matters...or whether I should mention it. But I think some people might wonder why the system info is being captured. If I comment out that section in the filebeat.yml, I think the setup still happens, but Filebeat stops with a critical error. That's not necessarily bad because we want users to stop Filebeat anyhow, but it still seems kind of hacky. Wondering what you and Tudor think about this.
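(The section being referred to is presumably the default prospector in the stock filebeat.yml, which harvests the system logs; roughly:)

    # Default prospector shipped in filebeat.yml (approximate); commenting this out
    # is the step described above.
    filebeat.prospectors:
    - input_type: log
      paths:
        - /var/log/*.log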
Apologies for the delay; I'll re-run this today and update here.
I tested the syslog config and it does not work, as you mentioned. I don't know what the issue is here since the multiline config looks ok. I am thinking we should merge the PR as is, which will give users some "template" to use for migration, and we can iterate on it. I have a feeling this is one of the differences between the grok implementations in ES and LS.
LGTM
@suyograo Here's what I'm planning to say about Filebeat modules in the 5.3 Logstash docs. I have the LS configs commented out, but will add them after you merge your PR. Let me know if the messaging here is in line with what you expect for 5.3.
IMO the manual configuration experience would be a lot easier if Filebeat allowed Logstash as an output. Instead of generating an error and closing, Filebeat should generate a warning when Logstash is configured as the output. Of course, if -setup is specified and ES is not configured as the output, Filebeat should generate an error.

Right now, FBM users who want to use Logstash instead of ingest node miss out on the benefit that FBMs provide for setting up the Filebeat config automatically. The only real benefit of using Filebeat modules at all for Logstash users (AFAIK) is that they can run the -setup command to get the dashboards.

Note that I've added some questions to reviewers within the text of the document.

The Logstash configs that will be included in this doc are in this PR: #6791