jkool702
Owner

@jkool702 jkool702 commented Feb 27, 2025

Summary by Sourcery

Re-implement dynamic worker coproc spawning in forkrun.bash. This change dynamically adjusts the number of worker coprocs based on system load and data processing rate, optimizing resource utilization.

@sourcery-ai
Contributor

sourcery-ai bot commented Feb 27, 2025

Reviewer's Guide by Sourcery

This pull request re-implements dynamic worker coproc spawning in forkrun.bash to optimize resource utilization based on system load and input data rate. It also includes improvements to CPU load calculation accuracy and efficiency in _forkrun_get_load and _forkrun_get_load_pid functions. Additionally, it updates default settings and logic for finding the default directory in forkrun.speedtest.hyperfine.bash and removes an unused test file.

Sequence diagram for dynamic worker coproc spawning

sequenceDiagram
  participant MainThread as Main Thread
  participant pSpawnCoproc as pSpawn Coproc
  participant WorkerCoproc as Worker Coproc

  MainThread->>pSpawnCoproc: Spawns pSpawn coproc
  activate pSpawnCoproc
  pSpawnCoproc->>MainThread: Waits for fd_nSpawn0 signal
  MainThread->>pSpawnCoproc: Sends fd_nSpawn0 signal
  loop until quit or max procs
    pSpawnCoproc->>MainThread: Reads runLines and runTime from fd_nSpawn
    alt System load < pLOAD_max and processing slower than input
      pSpawnCoproc->>pSpawnCoproc: Calculates pAdd (number of workers to add)
      loop pAdd times
        pSpawnCoproc->>MainThread: source coprocSrcCode
        MainThread->>WorkerCoproc: Spawns Worker Coproc
      end
      pSpawnCoproc->>MainThread: Updates worker count
    else System load > pLOAD_max
      pSpawnCoproc->>pSpawnCoproc: Continues to next iteration
    end
  end
  deactivate pSpawnCoproc
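
The `alt` branch in the diagram boils down to a two-part test. A minimal sketch of that test is below, assuming integer inputs; the function name `should_spawn` and its argument order are hypothetical and are not taken from forkrun.bash.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from forkrun.bash): decide whether the spawner
# should add workers on this iteration. All arguments are integers.
#   $1 current system load        $2 maximum allowed load (pLOAD_max in the diagram)
#   $3 lines processed per second $4 lines arriving per second
should_spawn() {
  local load=$1 load_max=$2 rate_run=$3 rate_in=$4
  # System load is already at or above the ceiling: skip this iteration.
  (( load >= load_max )) && return 1
  # Workers already keep up with the input: more workers would not help.
  (( rate_run >= rate_in )) && return 1
  return 0
}

if should_spawn 40 90 1000 2500; then
  echo "spawn"
else
  echo "wait"
fi
```

Returning a status rather than printing a value lets the spawn loop use the helper directly in an `if` or with `|| continue`.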

Updated class diagram for CPU load calculation functions

classDiagram
  class _forkrun_get_load {
    -loadMaxVal: int
    -cpu_user: int
    -cpu_nice: int
    -cpu_system: int
    -cpu_idle: int
    -cpu_IOwait: int
    -cpu_irq: int
    -cpu_softirq: int
    -cpu_steal: int
    -cpu_guest: int
    -cpu_guestnice: int
    -tLOAD: int
    -tALL: int
    -tALL0: int
    -cpu_ALL: int
    -cpu_ALL0: int
    -cpu_LOAD: int
    -cpu_LOAD0: int
    -pLOAD: int
    -pLOAD0: int
    -argCount: int
    -initFlag: bool
    -echoFlag: bool
    +():
  }
  note for _forkrun_get_load "Calculates average CPU load"

  class _forkrun_get_load_pid {
    -loadMaxVal: int
    -tLOAD: int
    -tALL0: int
    -tALL: int
    -cpu_ALL: int
    -cpu_ALL0: int
    -cpu_LOAD: int
    -cpu_LOAD0: int
    -pLOAD: int
    -pLOAD0: int
    -argCount: int
    -u0: int
    -s0: int
    -u1: int
    -s1: int
    -initFlag: bool
    -echoFlag: bool
    -grep_str: string
    -pidA: array
    -cpu_ALLA: array
    -cpu_LOADA: array
    +():
  }
  note for _forkrun_get_load_pid "Calculates CPU load for specific PIDs"

File-Level Changes

Change Details Files
Reimplemented dynamic worker coproc spawning to optimize resource utilization based on system load and input data rate.
  • Introduced adaptive adjustment of the number of worker coprocs based on real-time system load and input data processing rate.
  • Implemented logic to prevent excessive system load by dynamically adjusting the target load and limiting coproc spawning.
  • Added rate-based adjustments considering both system load and the rate at which data is processed versus the rate at which data arrives on stdin.
  • Refactored load calculation and coproc management for improved efficiency and responsiveness.
  • Added logic to avoid spawning new workers if the current processing rate is faster than the input rate, or if processing is slower than before the last worker was spawned.
  • Added logic to dynamically adjust the target load based on the current system load.
  • Added logic to estimate how many additional workers are needed to hit the target load.
  • Added logic to estimate how many additional workers are needed to process lines as fast as they arrive on stdin.
  • Added logic to take the harmonic average of the two estimates to put more weight on the smaller of the two values.
  • Added logic to compare how much the line rate increased to how much the worker count increased.
  • Added logic to slightly reduce the number of new workers to spawn, since workers cannot be unspawned if too many are started.
  • Added logic to reset the point from which system load is measured, since new workers are about to be spawned.
  • Added logic to compare the current system load to the load measured just before the most recent group of new coprocs was spawned.
  • Added logic to update previous load and worker count variables.
  • Added logic to update the load-per-coproc-worker estimate smoothly.
  • Added logic to dynamically adjust the target load and abort if system load is above threshold.
  • Added logic to update counts of lines run and run times for the current batch size.
  • Added logic to get average system load since the last time a new worker coproc was spawned.
  • Added logic to get background load from non-worker coprocs.
  • Added logic to get extra load from coprocs forked from the main thread and use it to estimate extra CPU load per coproc worker.
  • Added logic to record the average load at the current worker coproc count.
  • Added logic to fetch coproc source code from tmpdir.
  • Added logic to initialize CPU load calculation.
  • Added logic to wait until the helper coprocs spawned by the main thread have started.
  • Added logic to wait until the worker coprocs spawned by the main thread have started.
forkrun.bash
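
The harmonic-average step in the list above can be illustrated with plain integer arithmetic. This is only a sketch: the function name `harmonic_mean` is hypothetical, and the actual expression used in forkrun.bash may differ (for example in scaling or rounding).

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from forkrun.bash): harmonic mean of two
# non-negative integer estimates, truncated to an integer.
harmonic_mean() {
  local a=$1 b=$2
  # If either estimate is zero the harmonic mean is zero; this also avoids 0/0.
  if (( a == 0 || b == 0 )); then
    echo 0
    return 0
  fi
  echo $(( ( 2 * a * b ) / ( a + b ) ))
}

# Load-based estimate says 2 more workers, rate-based estimate says 6.
harmonic_mean 2 6     # prints 3 (the arithmetic mean would be 4)
harmonic_mean 10 10   # prints 10
```

Because the result is pulled toward the smaller input, an optimistic estimate from one signal cannot by itself cause a large batch of workers to be spawned.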
Modified _forkrun_get_load and _forkrun_get_load_pid functions to improve CPU load calculation accuracy and efficiency.
  • Adjusted scaling factor for CPU load representation to improve precision.
  • Implemented averaging of CPU load measurements to smooth out fluctuations.
  • Optimized process statistics gathering to reduce overhead.
  • Added logic to handle cases where process statistics are unavailable.
  • Added logic to use grep to get process statistics if available.
  • Added logic to read process statistics from /proc//stat.
  • Added logic to initialize CPU load calculation.
forkrun.bash
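
For readers unfamiliar with how a load figure is derived from process statistics, here is a rough sketch that compares two samples of the aggregate `cpu` line of `/proc/stat`. The function name `cpu_load_between`, the parts-per-10000 scale, and the choice to count iowait as idle time are all assumptions made for illustration; `_forkrun_get_load` may use a different scale and different rules.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from forkrun.bash): CPU load between two samples of
# the "cpu ..." line from /proc/stat, in parts per 10000 (scale chosen here).
cpu_load_between() {
  local -a s0 s1
  read -r -a s0 <<<"$1"
  read -r -a s1 <<<"$2"
  local i all0=0 all1=0
  # Fields 1..8: user nice system idle iowait irq softirq steal.
  # guest and guest_nice are left out because Linux already counts them
  # inside user and nice.
  for i in 1 2 3 4 5 6 7 8; do
    all0=$(( all0 + ${s0[i]:-0} ))
    all1=$(( all1 + ${s1[i]:-0} ))
  done
  # Assumption: idle and iowait both count as "not loaded".
  local idle0=$(( ${s0[4]:-0} + ${s0[5]:-0} ))
  local idle1=$(( ${s1[4]:-0} + ${s1[5]:-0} ))
  local dAll=$(( all1 - all0 )) dIdle=$(( idle1 - idle0 ))
  # No elapsed ticks between samples: report zero rather than divide by zero.
  if (( dAll <= 0 )); then
    echo 0
    return 0
  fi
  echo $(( ( ( dAll - dIdle ) * 10000 ) / dAll ))
}

# Two made-up samples: 1000 ticks elapsed, 850 of them idle or iowait.
cpu_load_between "cpu 100 0 50 800 50 0 0 0 0 0" "cpu 200 0 100 1600 100 0 0 0 0 0"
# prints 1500, i.e. 15.00 percent
```

On a Linux system the two arguments would typically come from `read -r line </proc/stat` executed at the start and end of the measurement interval.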
Updated default settings and logic for finding the default directory in forkrun.speedtest.hyperfine.bash.
  • Modified the default directory search logic to prioritize /mnt/ramdisk/usr if it exists.
  • Adjusted variable initialization to ensure correct behavior when no directory is specified.
  • Added logic to set the ramdisk transfer flag to false if the default directory is /mnt/ramdisk/usr.
hyperfine_benchmark/forkrun.speedtest.hyperfine.bash
Removed unused test file.
  • Deleted new-test-code.tmp.bash.
new-test-code.tmp.bash


Contributor

@sourcery-ai sourcery-ai bot left a comment

Hey @jkool702 - I've reviewed your changes - here's some feedback:

Overall Comments:

  • It looks like you're trying to optimize coproc spawning based on system load and input data rate, which is a great idea, but the logic is very complex and hard to follow.
  • Consider breaking down the _forkrun_get_load_pid function into smaller, more manageable pieces to improve readability.
Here's what I looked at during the review
  • 🟡 General issues: 1 issue found
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟡 Complexity: 1 issue found
  • 🟢 Documentation: all looks good

Comment on lines +47 to +50
[[ -d /mnt/ramdisk/usr ]] && [[ -z "${findDir}" ]] && {
    findDir='/mnt/ramdisk/usr'
    ramdiskTransferFlag=false
}
Contributor

question (bug_risk): Review conditional logic for setting findDir.

The new conditional sets findDir based on the existence of /mnt/ramdisk/usr. Ensure that this logic correctly covers all intended cases, especially when findDir is already set or when directory availability changes, to avoid unexpected behavior.
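
One way to check the cases raised here is to wrap the same condition in a function that takes the directory as a parameter, so it can be exercised without a real `/mnt/ramdisk/usr`. The wrapper name `choose_find_dir`, the `<unset>` placeholder, and the initial `ramdiskTransferFlag=true` are assumptions for illustration only; the script's actual default for the flag is not shown in this diff.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper (not from the PR) around the condition under review.
#   $1 value the caller supplied for findDir (may be empty)
#   $2 candidate ramdisk directory
# Prints the resulting findDir and ramdiskTransferFlag.
choose_find_dir() {
  local findDir=$1 ramdiskDir=$2
  local ramdiskTransferFlag=true   # assumed initial value
  if [[ -d "${ramdiskDir}" ]] && [[ -z "${findDir}" ]]; then
    findDir="${ramdiskDir}"
    ramdiskTransferFlag=false
  fi
  printf '%s %s\n' "${findDir:-<unset>}" "${ramdiskTransferFlag}"
}

demoDir=$(mktemp -d)
choose_find_dir ''     "${demoDir}"           # directory exists, nothing supplied: default is used
choose_find_dir '/usr' "${demoDir}"           # caller's value wins, flag untouched
choose_find_dir ''     "${demoDir}/missing"   # directory absent: findDir stays empty
rmdir "${demoDir}"
```

The third case is the one to look at: when the ramdisk directory is absent and nothing was supplied, `findDir` remains empty, so some later fallback has to set it.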

}

# get average system load since the last time a new worker coproc was spawned
mapfile -t pLOADA < <(_forkrun_get_load "${pLOADA0[@]}")
Contributor

issue (complexity): Consider extracting the worker-spawn calculation and CPU load update logic into separate helper functions to improve readability and maintainability.

Consider extracting the nested arithmetic and conditional blocks into separate helper functions. For example, you could pull the worker‐spawn calculation logic into its own function and similarly the CPU load estimation into another. This not only clarifies the intent but also reduces the “inlined” complexity.

For instance, you could refactor the worker count calculation like this:

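calculate_worker_addition() {
  # Arguments: current load, target load, estimated load per worker,
  # maximum worker count, current worker count, number of CPUs.
  local pLoadA="$1"
  local pLoadMax="$2"
  local pLoad1="$3"
  local nProcsMax="$4"
  local kkProcs="$5"
  local nCPU="$6"
  local pAddMax=$(( nProcsMax - kkProcs ))
  local pAddCap=$(( ( 1 + ( 2 * nCPU ) ) / 3 ))

  # Never add more than roughly two thirds of the CPU count in one step.
  (( pAddMax > pAddCap )) && pAddMax=$pAddCap
  (( pAddMax < 0 )) && pAddMax=0

  # Guard against division by zero when no per-worker estimate exists yet.
  (( pLoad1 > 0 )) || { echo 0; return 0; }

  local pAdd=$(( ( pLoadMax - pLoadA ) / pLoad1 ))
  (( pAdd < 1 )) && pAdd=0
  (( pAdd > pAddMax )) && pAdd=$pAddMax

  # Additional harmonizing of estimates can go here.
  echo "$pAdd"
}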

Then in your main loop, simply call:

new_workers=$(calculate_worker_addition "$pLOADA" "$pLOAD_max" "$pLOAD1" "$nProcsMax" "$kkProcs" "$nCPU")
(( new_workers < 1 )) && continue

Similarly, consider extracting the CPU load update logic into a helper function. This will make each piece self-contained and easier to manage while keeping functionality intact.
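
As a rough illustration of what such a CPU load helper might look like, the sketch below blends a new per-worker load sample into a running estimate. The function name `update_load_per_worker`, its arguments, and the 3:1 weighting are hypothetical; the smoothing actually used in forkrun.bash is not shown in this review.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from forkrun.bash): smooth update of the estimated
# CPU load contributed by one worker. All values are integers on the same scale.
#   $1 previous estimate    $2 load before the last spawn
#   $3 load now             $4 number of workers added by the last spawn
update_load_per_worker() {
  local prev=$1 load_before=$2 load_now=$3 added=$4
  # Nothing was spawned, so there is no new information: keep the old estimate.
  if (( added <= 0 )); then
    echo "$prev"
    return 0
  fi
  local sample=$(( ( load_now - load_before ) / added ))
  # Keep the sample positive so the estimate stays usable as a divisor.
  (( sample < 1 )) && sample=1
  # Weighted blend: 3 parts old estimate, 1 part new sample (arbitrary weights).
  echo $(( ( 3 * prev + sample ) / 4 ))
}

# Load rose from 4000 to 5200 after adding 4 workers: sample is 300.
update_load_per_worker 200 4000 5200 4   # prints 225
```

Keeping the estimate strictly positive matters because a function like `calculate_worker_addition` above divides by it.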
