Home
Some discussions and questions I've had that may be of interest:
@sonia: https://github.com/soniah/gosnmp/issues/191
We are using gosnmp to collect data from millions of devices. As we scan more and more devices, timeouts increase while CPU and memory usage stay low.
On a VM with 4 vCPUs and 16 GB of RAM we can poll around 1500 devices per second. When we push for higher throughput, timeout errors become more frequent.
Any recommendations on how to use this library effectively under this kind of load?
Current optimizations on our end:
- Raising the Linux file descriptor limit to a couple hundred thousand; we never reach this limit
- Using sync.Pool to reuse instances of GoSNMP
- Keeping retries at 1 with a 6-second timeout; more retries make requests take longer
- Using GetBulk and BulkWalk as much as possible
We are trying to figure out what the bottleneck might be if it is not CPU, memory, or network bandwidth.
@sonia: thanks for your question. Unfortunately I don't have access to a large network of devices at the moment, but I've posted the above in case others have suggestions.
@ps0296 Is there a way to re-use sockets instead of opening and closing for each device?
@int3rlop3r Found this interesting comment. Not sure if it's the same issue. https://github.com/microsoft/ethr/blob/master/ethr.go#L206
// Set GOMAXPROCS to 1024 as running large number of goroutines in a loop
// to send network traffic results in timer starvation, as well as unfair
// processing time across goroutines resulting in starvation of many TCP
// connections. Using a higher number of threads via GOMAXPROCS solves this
// problem.
//
runtime.GOMAXPROCS(1024)