Performance and Scalability #8445
Replies: 15 comments · 9 replies
-
Hi @Tommolo, thanks for opening the issue. What version of Kiali are you using?
-
I'm using Kiali v2.0.0.
-
This probably should not be a separate issue. We already have an epic on this with related issues, so unless you have a specific issue where you can pinpoint what is causing the slowdown, I would recommend participating in the already existing epic/issues for performance-related work. Please see:
-
As for this question: see the link above to the performance test doc page - that's what it is geared to answer.
-
@jmazzitelli at least it's another helpful data point.

@Tommolo for performance issues it's also helpful to include as much context as you can, to help us get an idea of where the bottleneck is. You've included how many workloads/services this is for, which is very helpful. How many namespaces are selected on the graph page? Roughly how many Istio configuration objects (DestinationRules, VirtualServices, etc.) are there? For graph generation, the bottleneck could be rendering the graph (UI), generating the graph (Kiali backend), Prometheus, or the connection between your browser and the Kiali API. We're actively working on making these issues easier to diagnose and report; see #8345. In the meantime, Kiali emits some metrics about graph generation that you can query in Prometheus.

There's not really a magic configuration knob to make performance better. You can try scaling Kiali up (more CPU/memory) or scaling up Prometheus, but without knowing exactly where the bottleneck is I can't guarantee that will help.
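One low-effort way to narrow down the UI-vs-backend question is to time the graph API call outside the browser. Below is a hypothetical sketch (not from the thread); the base URL, token, and namespace values are placeholders you would replace for your own environment.

```python
# Hypothetical sketch: time the Kiali graph API call directly, bypassing the UI,
# to separate backend/Prometheus time from browser rendering time.
# KIALI_URL, TOKEN, and the namespace value are placeholders for your environment.
import time
import requests

KIALI_URL = "https://kiali.example.com/kiali"  # assumption: your Kiali base URL
TOKEN = "..."                                  # assumption: a valid bearer token
params = {
    "namespaces": "my-namespace",              # the namespace(s) selected on the graph page
    "duration": "60s",
    "graphType": "versionedApp",
}

start = time.time()
resp = requests.get(
    f"{KIALI_URL}/api/namespaces/graph",
    params=params,
    headers={"Authorization": f"Bearer {TOKEN}"},
)
elapsed = time.time() - start
print(f"HTTP {resp.status_code}, {len(resp.content)} bytes in {elapsed:.2f}s")
```

If this call returns quickly but the page still takes a long time, the bottleneck is likely UI rendering; if the call itself is slow, look at the Kiali backend and Prometheus.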
-
I'm also having issues with this and had a couple of questions:
-
That would seem likely, though I can't give you a reason why. I don't think we have had any users up to this point with 10 clusters in the mesh (none that told us about it anyway).
No, the installation mechanism shouldn't make a difference when looking at the performance of the server.
-
I'd also suggest always using the most recent version of Kiali that is compatible with your Istio version. Also, Kiali performance is very tied to Prometheus query performance, so you may also want to look at https://kiali.io/docs/configuration/p8s-jaeger-grafana/prometheus/#prometheus-tuning.
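If you suspect Prometheus is the slow half, its own self-metrics can indicate how long its queries are taking. A small sketch, assuming Prometheus scrapes itself (most default installs do) and is reachable at the placeholder URL below:

```python
# Sketch: inspect Prometheus's own query latency to see whether Kiali's queries
# are slow because Prometheus itself is slow. The URL is a placeholder.
import requests

PROM_URL = "http://prometheus.istio-system:9090"  # assumption: in-cluster Prometheus

# prometheus_engine_query_duration_seconds is a built-in Prometheus summary metric.
resp = requests.get(
    f"{PROM_URL}/api/v1/query",
    params={"query": 'prometheus_engine_query_duration_seconds{quantile="0.9"}'},
)
for series in resp.json()["data"]["result"]:
    print(series["metric"].get("slice"), series["value"][1])
```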
-
There is no specific issue here, and nothing we can really act on. I'm going to convert this to a discussion...
-
@jmazzitelli, in our multi-cluster configuration we have Thanos connected to Kiali, with about ten clusters. When we try to access the Kiali dashboard without having selected any namespace yet, the browser freezes. At this point we wondered if it could be a problem related to the number of metrics, so we tried connecting a local cluster's Prometheus instead of Thanos, with significantly fewer metrics, and despite some slowness it worked better. We were even able to load the traffic graph of a namespace.

Given that, is there a mechanism that performs preemptive queries, and if so, is it possible to disable or configure it somehow?

P.S. We followed the instructions you sent us and reduced the number of metrics to a minimum, but even leaving only istio_requests_total, istio_tcp_received_bytes_total, and istio_tcp_sent_bytes_total, the performance remains unchanged.
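To put a number on the "too many metrics" hypothesis, you can compare how many istio_requests_total series Kiali has to work with when pointed at Thanos versus a single cluster's Prometheus. A hypothetical sketch; both URLs are placeholders for your environment:

```python
# Hypothetical sketch: compare series cardinality seen through Thanos vs. a
# single cluster's Prometheus, using the Prometheus-compatible query API.
import requests

def series_count(prom_url: str, metric: str = "istio_requests_total") -> int:
    """Count the currently active series for a metric via /api/v1/query."""
    resp = requests.get(f"{prom_url}/api/v1/query", params={"query": f"count({metric})"})
    resp.raise_for_status()
    result = resp.json()["data"]["result"]
    return int(float(result[0]["value"][1])) if result else 0

endpoints = {
    "thanos": "http://thanos-query.monitoring:9090",      # assumption: Thanos query endpoint
    "local-prom": "http://prometheus.istio-system:9090",  # assumption: one cluster's Prometheus
}
for name, url in endpoints.items():
    print(f"{name}: {series_count(url)} series")
```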
-
I'll add that with fewer connected clusters the performance was sufficient for general use.
-
We've always had Kiali metrics, but Prometheus and Kiali need to be configured to collect them (check your Prometheus data and see if you have any metrics with the `kiali_` prefix).

In 2.10, you can now set ...
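A quick way to check whether those server metrics are actually being collected is to list the metric names Prometheus knows about and filter for the Kiali prefix. A small sketch, assuming the `kiali_` prefix and a placeholder Prometheus URL:

```python
# Hypothetical sketch: list the metric names Prometheus currently knows about and
# keep those that look like Kiali server metrics. The kiali_ prefix and the URL
# are assumptions about a typical setup.
import requests

PROM_URL = "http://prometheus.istio-system:9090"  # placeholder

names = requests.get(f"{PROM_URL}/api/v1/label/__name__/values").json()["data"]
kiali_metrics = sorted(n for n in names if n.startswith("kiali_"))
print("\n".join(kiali_metrics) or "no kiali_* metrics found - check your scrape config")
```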
-
I'm the same person (@RobyBobby24). Upon analyzing the traffic, we identified that the call slowing down the dashboard is the one that retrieves the JSON containing the information required to reconstruct the traffic graph.

From issue #5743 it is clear that the call consists of a request to Kubernetes (k8s) and one to Prometheus. We tested, as explained in that issue, the URL that makes the call only to Prometheus. This raises the following questions:
-
I'm a colleague of @ro-distefano working on the same project. We tried using an interceptor that, through a regex, extracts the parameters from the HTTP request Kiali uses to generate the graph. We immediately noticed a significant improvement: although it no longer loads the Kubernetes information, the graph is still fully rendered and easy to interpret. With the new request, despite the high load (13 namespaces selected), the graph loads in about 1.12 seconds.
-
If I interpret what you are saying correctly (and I may not - I don't think I fully grok what you are doing with that interceptor), it looks like all you did was remove the bulk of the appenders from the appenders query parameter in the URL. Removing appenders is definitely expected to speed up the request (the appenders do a lot of work, and many access the k8s API). Start removing appenders one-by-one in your URL and see which one(s) cause the slowdown. That can help the Kiali devs narrow down where the performance issue is, and maybe something can be done to speed up that appender in the code.

Some appenders are turned off by de-selecting the checkbox options in the graph's Display menu. Or you could bookmark those URLs with the small appenders list and just use that bookmark to access the large graphs.
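For anyone wanting to run that experiment systematically, here is a hypothetical sketch of the "remove appenders one-by-one" loop. The base URL, token, namespaces, and appender names are all placeholders; copy the real appenders list from the graph request your browser makes (DevTools → Network), since the exact names vary by Kiali version.

```python
# Hypothetical sketch of the "remove appenders one-by-one" experiment.
# KIALI_URL, TOKEN, NAMESPACES, and the appender names are placeholders.
import time
import requests

KIALI_URL = "https://kiali.example.com/kiali"
TOKEN = "..."
NAMESPACES = "ns-a,ns-b"
APPENDERS = ["deadNode", "istio", "serviceEntry", "sidecarsCheck", "responseTime"]  # example list

def time_graph(appenders):
    """Time one graph API request with the given appenders enabled."""
    start = time.time()
    requests.get(
        f"{KIALI_URL}/api/namespaces/graph",
        params={"namespaces": NAMESPACES, "duration": "60s", "appenders": ",".join(appenders)},
        headers={"Authorization": f"Bearer {TOKEN}"},
    ).raise_for_status()
    return time.time() - start

print(f"all appenders: {time_graph(APPENDERS):.2f}s")
for a in APPENDERS:
    print(f"without {a}: {time_graph([x for x in APPENDERS if x != a]):.2f}s")
```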
-
I'm trying to use Kiali in an environment where I have a namespace with a high number of deployments and services (around 320 deployments and 200 services). I've noticed that Kiali is very slow when loading the traffic graph. Is there a way to reduce the loading time?